Data Dictionary

Pair with alphaswarm_docs/erd.md (visual schema) and alphaswarm_docs/domain-model.md (narrative). Doc map: alphaswarm_docs/index.md.

Authoritative table-and-column reference for the AlphaSwarm persistence layer plus the Iceberg catalog. Updated whenever a model file or migration ships — see "Adding a new model" for the workflow.

Conventions

PK: primary key column.
FK: foreign key (→ table.column).
Type: SQLAlchemy column type. String(N) is VARCHAR(N) in Postgres. JSON is JSONB. DateTime is timezone-naive UTC.
Null: Y/N. Defaults are listed where present.
Notes: extra constraints, indexes, or invariants.

All id columns are String(36) UUIDs generated by _uuid() unless noted. All created_at / updated_at columns default to datetime.utcnow server-side.

1. Sessions + chat — models.py

`sessions`

The conversational shell that chat messages and agent runs live under.

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
user	String(120)	N	–	default `"local"`
title	String(240)	Y	–	–
created_at	DateTime	N	–	default now
closed_at	DateTime	Y	–	–
meta	JSON	Y	–	default `{}`

`chat_messages`

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
session_id	UUID	N	→ sessions.id	cascade delete
role	String(32)	N	–	`user
content	Text	N	–	–
created_at	DateTime	N	–	default now
meta	JSON	Y	–	default `{}`

2. Strategies + backtests — models.py

`strategies`

The top-level strategy header. Versions live in strategy_versions.

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
name	String(120)	N	–	–
version	Integer	N	–	default 1
config_yaml	Text	N	–	full YAML config
created_at	DateTime	N	–	default now
created_by	String(120)	N	–	default `"system"`
status	String(32)	N	–	`draft
meta	JSON	Y	–	default `{}`

`strategy_versions`

Immutable YAML snapshot of a strategy.

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
strategy_id	UUID	N	→ strategies.id	cascade
version	Integer	N	–	–
config_yaml	Text	N	–	–
author	String(120)	N	–	default `"system"`
created_at	DateTime	N	–	–
dataset_hash	String(64)	Y	–	bind to data version
notes	Text	Y	–	–

Index: ix_strategy_versions_strategy_version (strategy_id, version).

`strategy_tests`

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
strategy_id	UUID	N	→ strategies.id	–
version_id	UUID	Y	→ strategy_versions.id	–
backtest_id	UUID	Y	→ backtest_runs.id	–
status	String(32)	N	–	default `pending`
start, end	DateTime	Y	–	window
sharpe, sortino, max_drawdown, total_return, final_equity	Float	Y	–	summary metrics
engine	String(64)	Y	–	–

`backtest_runs`

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
strategy_id	UUID	Y	→ strategies.id	–
task_id	String(120)	Y	–	Celery task id
status	String(32)	N	–	default `pending`
start, end	DateTime	Y	–	window
initial_cash, final_equity	Float	Y	–	–
sharpe, sortino, max_drawdown, total_return	Float	Y	–	metrics
mlflow_run_id	String(120)	Y	–	links to MLflow UI
dataset_hash	String(64)	Y	–	–
metrics	JSON	Y	–	full metrics blob
error	Text	Y	–	–
model_version_id	UUID	Y	→ model_versions.id	Alembic 0025 — model that produced the alpha
ml_experiment_run_id	UUID	Y	→ ml_experiment_runs.id	Alembic 0025 — training run lineage
experiment_plan_id	UUID	Y	→ experiment_plans.id	Alembic 0025 — experiment plan lineage
model_deployment_id	UUID	Y	→ model_deployments.id	Alembic 0025 — deployment that wired the model into the strategy

`ml_alpha_backtest_runs`

Combined experiment row joining a training run to a downstream backtest. Persisted by AlphaBacktestExperiment.

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
task_id	String(120)	Y	–	Celery task id
run_name	String(240)	N	–	default `alpha-backtest`
status	String(32)	N	–	`queued
ml_experiment_run_id	UUID	Y	→ ml_experiment_runs.id	–
backtest_run_id	UUID	Y	→ backtest_runs.id	–
model_version_id	UUID	Y	→ model_versions.id	–
model_deployment_id	UUID	Y	→ model_deployments.id	–
experiment_plan_id	UUID	Y	→ experiment_plans.id	–
mlflow_run_id	String(120)	Y	–	parent MLflow run id
dataset_hash	String(64)	Y	–	–
ml_metrics	JSON	Y	–	IC / RMSE / hit-rate / etc
trading_metrics	JSON	Y	–	Sharpe / Sortino / Calmar / etc
combined_metrics	JSON	Y	–	rolled-up scalar `score` + selected ML/trading keys
attribution	JSON	Y	–	conviction-vs-PnL attribution
params	JSON	Y	–	full input-config snapshot
error	Text	Y	–	–

`ml_prediction_audit`

Per-bar prediction audit for an alpha-backtest run. Opt-in via ALPHASWARM_ML_PREDICTION_AUDIT_ENABLED; capped at ALPHASWARM_ML_PREDICTION_AUDIT_MAX_ROWS rows per run.

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
alpha_backtest_run_id	UUID	N	→ ml_alpha_backtest_runs.id	cascade
vt_symbol	String(40)	N	–	–
ts	DateTime	N	–	–
prediction	Float	N	–	–
label	Float	Y	–	–
position_after	Float	Y	–	–
pnl_after_bar	Float	Y	–	–

`optimization_runs`

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
task_id	String(120)	Y	–	–
strategy_id	UUID	Y	→ strategies.id	–
run_name	String(240)	N	–	default `"sweep"`
method	String(32)	N	–	`grid
metric	String(64)	N	–	default `"sharpe"`
status	String(32)	N	–	`queued
n_trials, n_completed	Integer	N	–	–
best_trial_id	String(36)	Y	–	–
best_metric_value	Float	Y	–	–
parameter_space, base_config, summary	JSON	Y	–	–

`optimization_trials`

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
run_id	UUID	N	→ optimization_runs.id	cascade
backtest_id	UUID	Y	→ backtest_runs.id	–
trial_index	Integer	N	–	–
parameters	JSON	Y	–	–
status	String(32)	N	–	–
metric_value, sharpe, sortino, total_return, max_drawdown, final_equity	Float	Y	–	–
error	Text	Y	–	–

3. Ledger — models.py

`signals`

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
strategy_id	UUID	Y	→ strategies.id	–
backtest_id	UUID	Y	→ backtest_runs.id	–
vt_symbol	String(40)	N	–	indexed
direction	String(10)	N	–	`long
strength	Float	N	–	–
confidence	Float	Y	–	default 1.0
rationale	Text	Y	–	–

Composite index: ix_signals_symbol_ts (vt_symbol, created_at).

`orders`

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
backtest_id	UUID	Y	→ backtest_runs.id	–
strategy_id	UUID	Y	→ strategies.id	–
vt_symbol	String(40)	N	–	indexed
side	String(8)	N	–	`buy
order_type	String(16)	N	–	`market
quantity, price	Float	varies	–	–
status	String(16)	N	–	default `submitting`
reference	String(120)	Y	–	`paper:<run_id>` for paper trading

`fills`

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
order_id	UUID	Y	→ orders.id	–
vt_symbol	String(40)	N	–	indexed
side	String(8)	N	–	–
quantity, price	Float	N	–	–
commission, slippage	Float	Y	–	default 0

`ledger_entries`

The canonical audit trail. Every action goes through LedgerWriter.

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
backtest_id	UUID	Y	→ backtest_runs.id	–
strategy_id	UUID	Y	→ strategies.id	–
entry_type	String(32)	N	–	`SIGNAL
level	String(16)	N	–	`debug
message	Text	N	–	–
payload	JSON	Y	–	default `{}`

Index: ix_ledger_type_ts (entry_type, created_at).

4. Instruments — models.py + models_instruments.py

`instruments` (parent)

The polymorphic root. instrument_class is the discriminator; subclass rows live in instrument_<class> tables keyed on instruments.id.

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
vt_symbol	String(64)	N	unique	`Symbol.format()`
ticker	String(64)	N	indexed	–
exchange	String(32)	Y	–	–
asset_class	String(32)	Y	–	`equity
security_type	String(32)	Y	–	`equity
instrument_class	String(32)	Y	indexed	discriminator
issuer_id	UUID	Y	→ issuers.id	–
identifiers	JSON	Y	–	`{ticker, isin, cusip, …}`
sector, industry, region, currency	String	Y	–	–
tick_size, multiplier, min_quantity, max_quantity, lot_size	Float	Y	–	exchange specs
price_precision, size_precision	Integer	Y	–	–
is_active	Boolean	N	–	default true
tags	JSON	Y	–	default `[]`
meta	JSON	Y	–	default `{}`

Joined-table subclasses (`instrument_<class>`)

All share id PK that's also a FK to instruments.id. Each table adds shape-specific columns. For full column lists see models_instruments.py; the ERD in alphaswarm_docs/erd.md lists key columns per subclass.

Subclass table	Polymorphic identity	Distinctive columns
`instrument_equity`	`spot`	`isin`, `cusip`, `figi`, `lei`, `gics_sector`, `shares_outstanding`, `is_adr`
`instrument_etf`	`etf`	`inception_date`, `aum`, `expense_ratio`, `is_leveraged`, `replication`
`instrument_index`	`index`	`administrator`, `methodology`, `constituent_count`, `base_value`
`instrument_bond`	`bond`	`coupon`, `maturity`, `rating_sp`, `rating_moodys`, `callable`, `convertible`
`instrument_future`	`future`	`underlying`, `expiry`, `contract_size`, `cycle`, `delivery_month`
`instrument_option`	`option`	`strike`, `expiry`, `kind` (call/put), `style`, `occ_symbol`
`instrument_fx_pair`	`fx_pair`	`base_currency`, `quote_currency`, `pip_size`
`instrument_crypto`	`crypto_token`	`subtype`, `chain`, `contract_address`, `max_leverage`, `funding_interval`
`instrument_cfd`	`cfd`	`underlying`, `margin_rate`, `financing_rate`
`instrument_commodity`	`spot_commodity`	`grade`, `unit_of_measure`, `delivery`
`instrument_synthetic`	`synthetic`	`legs`, `leg_weights`, `formula`
`instrument_betting`	`betting`	`event_name`, `market_type`, `selection_id`
`instrument_tokenized_asset`	`nft`	`chain`, `contract_address`, `token_standard`

5. Dataset lineage — models.py

`dataset_catalogs`

Logical dataset descriptor. Iceberg-related columns added in migration 0011.

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
name	String(160)	N	indexed	–
provider	String(80)	N	indexed	–
domain	String(120)	N	–	default `"market.bars"`
frequency	String(32)	Y	–	–
storage_uri	String(512)	Y	–	–
schema_json	JSON	Y	–	–
description	Text	Y	–	–
tags	JSON	Y	–	–
meta	JSON	Y	–	–
iceberg_identifier	String(240)	Y	indexed	`<ns>.<table>`
load_mode	String(32)	N	–	`managed
source_uri	String(1024)	Y	–	–
llm_annotations	JSON	Y	–	from `annotate_table`
column_docs	JSON	Y	–	–

Composite index: ix_dataset_catalog_name_provider.

`dataset_versions`

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
catalog_id	UUID	N	→ dataset_catalogs.id (cascade)	–
version	Integer	N	–	default 1
status	String(32)	N	–	`active
as_of, start_time, end_time	DateTime	Y	–	–
row_count, symbol_count, file_count	Integer	N	–	default 0
dataset_hash	String(64)	Y	indexed	SHA-256 of inputs
materialization_uri	String(512)	Y	–	–
columns	JSON	Y	–	–
schema_json	JSON	Y	–	–
meta	JSON	Y	–	–

Composite index: ix_dataset_versions_catalog_version.

`data_sources`

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
name	String	N	unique	`yfinance
kind	String	Y	–	`rest
base_url	String	Y	–	–
meta	JSON	Y	–	–

`data_links`

Edges between dataset versions and entities (instruments, series).

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
dataset_version_id	UUID	N	→ dataset_versions.id (cascade)	–
source_id	UUID	Y	→ data_sources.id	–
instrument_id	UUID	Y	→ instruments.id	–
entity_kind	String	N	–	`instrument
entity_id	String	N	–	–
coverage_start, coverage_end	DateTime	Y	–	–
row_count	Integer	Y	–	–
meta	JSON	Y	–	–

`identifier_links`

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
instrument_id	UUID	Y	→ instruments.id	–
source_id	UUID	Y	→ data_sources.id	–
identifier_kind	String	N	–	`cik
identifier_value	String	N	–	–

`split_plans`, `split_artifacts`, `pipeline_recipes`, `experiment_plans`, `model_versions`, `model_deployments`

ML lineage tables. See full column lists in models.py (search for the class name). One-liner summary:

Table	Purpose
`split_plans`	Train/val/test split design (method, segments, FK to dataset_version)
`split_artifacts`	Materialised fold boundaries + index sets per split plan
`pipeline_recipes`	Preprocessing recipes (shared/learn/infer processors)
`experiment_plans`	Ties dataset_version + split + recipe + model config + status
`model_versions`	One row per trained MLflow registry version
`model_deployments`	Active inference deployments (one model_version may have many)

6. Agentic — models.py

`agent_runs`

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
session_id	UUID	Y	→ sessions.id	–
task_id	String(120)	Y	indexed	–
crew	String(120)	N	–	–
status	String(32)	N	–	–
prompt	Text	N	–	–
result	JSON	Y	–	–
error	Text	Y	–	–
llm_model	String(120)	Y	–	–
token_usage	JSON	Y	–	–

`crew_runs`

Lightweight index for the Crew Trace UI.

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
task_id	String(120)	N	unique	–
crew_name	String(120)	N	–	default `"research"`
crew_type	String(32)	N	indexed	`research
status	String(32)	N	indexed	–
prompt	Text	N	–	–
session_id	String(36)	Y	indexed	–
agent_run_id	UUID	Y	→ agent_runs.id	–
result, events	JSON	Y	–	–
error	Text	Y	–	–
cost_usd	Float	N	–	default 0

`agent_decisions`, `debate_turns`, `agent_backtests`, `agent_judge_reports`, `agent_replay_runs`, `backtest_interrupts`

The agentic-backtest audit trail.

Table	Purpose
`agent_decisions`	One row per long/short/flat decision (links backtest, strategy, crew_run)
`debate_turns`	Multi-turn debate transcripts under a decision
`agent_backtests`	Crew-level metrics rolled up per backtest
`agent_judge_reports`	Judge LLM's evaluation of a backtest
`agent_replay_runs`	Replays of a judged backtest with adjusted prompts
`backtest_interrupts`	User pause/resume markers during a long backtest

7. Feature sets — models.py

`feature_sets`

Column	Type	Null	FK	Notes
id	UUID	N	–	PK
name	String	N	–	–
description	Text	Y	–	–
kind	String	N	–	`composite
specs	JSON	N	–	list of indicator/transformation strings
tags	JSON	Y	–	–
default_lookback_days	Integer	Y	–	–

`feature_set_versions`

Immutable snapshot keyed on content_hash so the same spec rendered twice deduplicates.

`feature_set_usages`

Records of which backtests / deployments consumed which feature-set versions (for reverse lineage).

8. Reports + paper — models.py

Table	Purpose	Key columns
`equity_reports`	Markdown equity research reports generated by the report-writer crew	`vt_symbol`, `cohort`, `markdown`, `cost_usd`
`paper_trading_runs`	One row per paper or live session	`brokerage`, `feed`, `last_heartbeat_at`, `bars_seen`, `orders_submitted`, `fills`, `state`
`rl_episodes`	Snapshot of an RL training episode	`run_id`, `episode`, `mean_reward`, `portfolio_value`

8a. Bots — models_bots.py

Tables introduced by the Bot Entity Refactor (Alembic 0020_bots). Mirror the proven agent_specs / agent_spec_versions / agent_runs_v2 pattern.

Table	Purpose	Key columns
`bots`	Logical bot row (latest active version of a named spec inside a project)	`id`, `name`, `slug`, `kind` (`trading
`bot_versions`	Immutable, hash-locked snapshot of every `BotSpec` change	`id`, `bot_id` FK, `version`, `spec_hash`, `payload`, `notes`, `created_by`, `(bot_id, spec_hash)` UNIQUE, `(bot_id, version)` UNIQUE
`bot_deployments`	One row per backtest / paper / chat / k8s invocation; references the version that produced it	`id`, `bot_id` FK, `version_id` FK, `target` (`paper_session

All three tables carry ProjectScopedMixin (owner_user_id, workspace_id, project_id).

9. News — models_news.py

Table	Key columns
`news_items`	`url`, `source`, `published_at`, `headline`, `body`
`news_item_entities`	`news_item_id`, `vt_symbol`, `entity_kind` (`instrument
`news_sentiments`	`news_item_id`, `scorer` (`finbert

10. Events — models_events.py

corporate_events is the parent; the per-type tables FK back to it.

Table	Key columns
`corporate_events`	`vt_symbol`, `event_type` (`earnings
`earnings_event_rows`	`event_id`, `eps_actual`, `eps_estimate`, `revenue_actual`
`dividend_event_rows`	`event_id`, `amount`, `ex_date`, `pay_date`
`split_event_rows`	`event_id`, `ratio`
`ipo_event_rows`	`event_id`, `offer_price`, `shares_offered`
`merger_event_rows`	`event_id`, `acquirer`, `target`, `terms`
`calendar_event_rows`	`event_id`, `event_kind`, `expected_time`
`analyst_estimates`	`vt_symbol`, `analyst`, `target_price`, `forecast_date`
`price_targets`	`vt_symbol`, `analyst`, `target_price`, `period`
`forward_estimates`	`vt_symbol`, `analyst`, `metric`, `value`
`regulatory_event_rows`	`event_id`, `regulator`, `summary`
`esg_event_rows`	`event_id`, `category`, `score`

11. Fundamentals — models_fundamentals.py

Table	Key columns
`financial_statements`	`issuer_id`, `period` (`Q
`financial_ratios`	`issuer_id`, `period_end`, `pe`, `pb`, `roe`, `roa`, `debt_to_equity`
`key_metrics`	`issuer_id`, `period_end`, `revenue`, `net_income`, `free_cash_flow`
`historical_market_caps`	`issuer_id`, `as_of`, `market_cap`
`revenue_breakdowns`	`issuer_id`, `period_end`, `segment`, `region`, `revenue`
`earnings_call_transcripts`	`issuer_id`, `call_date`, `content`
`management_discussion_analysis`	`issuer_id`, `period_end`, `mda_text`
`reported_financials`	`issuer_id`, `period_end`, `xbrl_payload`

12. Macro — models_macro.py

Table	Key columns
`economic_series`	`series_id` (`FRED:GDP`), `title`, `frequency`, `units`, `source`
`economic_observations`	`series_id`, `observation_date`, `value`
`cot_reports`	`report_date`, `instrument`, `positions`
`bls_series`	`series_id`, `title`, `frequency`
`treasury_rates`	`date`, `rate_3m`, `rate_2y`, `rate_10y`, `rate_30y`
`yield_curves`	`date`, `tenors`
`option_series`	`instrument_id`, `expiry`, `style`
`option_chain_snapshots`	`series_id`, `as_of`, `chain_payload`
`futures_curves`	`as_of`, `front_month`, `tenor_prices`
`market_holidays`	`exchange`, `date`, `name`
`market_status_history`	`exchange`, `as_of`, `status`

13. Entities + ownership — models_entities.py + models_ownership.py

Table	Key columns
`issuers`	`name`, `lei`, `country`, `entity_kind`
`government_entities`	`id` (PK_FK), `country_code`, `level`
`funds`	`id` (PK_FK), `fund_family`, `fund_type`
`sectors`	`code`, `name`
`industries`	`code`, `name`, `sector_id`
`industry_classifications`	`issuer_id`, `industry_id`, `as_of`
`entity_relationships`	`parent_id`, `child_id`, `kind`
`locations`	`issuer_id`, `country`, `city`
`key_executives`	`issuer_id`, `name`, `title`
`executive_compensation`	`executive_id`, `year`, `total_comp`
`insider_transactions`	`vt_symbol`, `insider_name`, `transaction_date`, `quantity`
`institutional_holdings`	`vt_symbol`, `holder_name`, `as_of`, `quantity`
`form_13f_holdings`	`filer_cik`, `vt_symbol`, `period_end`
`short_interest`	`vt_symbol`, `settlement_date`, `short_interest`
`shares_float_snapshots`	`vt_symbol`, `as_of`, `float_shares`
`politician_trades`	`politician`, `vt_symbol`, `trade_date`, `amount`
`fund_holdings`	`fund_id`, `vt_symbol`, `as_of`, `position`

14. Taxonomy — models_taxonomy.py

Table	Key columns
`taxonomy_schemes`	`name` (`GICS
`taxonomy_nodes`	`scheme_id`, `parent_id`, `code`, `label`
`entity_tags`	`node_id`, `entity_kind`, `entity_id`
`entity_crosswalks`	`from_kind`, `from_id`, `to_kind`, `to_id`

15. External-source indexes — models.py

Table	Purpose	Key columns
`fred_series`	FRED metadata index	`series_id`, `title`, `units`, `frequency`
`sec_filings`	SEC EDGAR filing index	`instrument_id`, `accession`, `form`, `filing_date`
`gdelt_mentions`	GDelt GKG mention index	`instrument_id`, `mention_time`, `gkg_payload`

Iceberg namespace conventions

Iceberg tables sit alongside the Postgres schema; their identifiers are stored in dataset_catalogs.iceberg_identifier for cross-lookup.

Namespace	Source
`alphaswarm`	Generic / default (fallback when no `--namespace` provided)
`alphaswarm_smoke`	Smoke-test namespace (scripts/iceberg_smoke.py)
`alphaswarm_cfpb`	CFPB regulatory ingest
`alphaswarm_uspto`	USPTO regulatory ingest
`alphaswarm_fda`	openFDA regulatory ingest
`alphaswarm_sec`	SEC quarterly data sets
`alphaswarm_bars`	(reserved) generic OHLCV cache
`alphaswarm_features`	(reserved) feature-set materialisations

Naming rules:

Namespace: aqp_<source>, lower-snake-case, ≤32 chars.
Table: lower-snake-case, ≤48 chars, descriptive nouns (hmda_lar, device_event, broker_dealers).
The Director (Nemotron) decides the final table name within these rules; identity-plan fallback uses the discovered family name as-is.

Layout:

C:/alphaswarm-warehouse/iceberg/
├── catalog.db                                # SQLite metadata
└── <namespace>/
    └── <table>/
        ├── data/00000-0-<uuid>.parquet
        ├── data/00001-0-<uuid>.parquet
        └── metadata/
            ├── 00000-<uuid>.metadata.json
            ├── 00001-<uuid>.metadata.json
            ├── <uuid>-m0.avro                # manifest list
            └── snap-<snap_id>-...avro        # snapshot

Snapshots are append-only — every append_arrow produces a new metadata.json revision. Old snapshots can be expired with PyIceberg's Table.expire_snapshots(...) (not exposed via API yet).

Updating this dictionary:

When you add an ORM column or a new table:

Update the corresponding section above.
If you added a table to a per-domain ERD scope, update alphaswarm_docs/erd.md too.
Cross-link the migration that introduced the change.

Conventions
1. Sessions + chat — models.py
- sessions
- chat_messages
2. Strategies + backtests — models.py
3. Ledger — models.py
4. Instruments — models.py + models_instruments.py
- instruments (parent)
- Joined-table subclasses (instrument_<class>)
5. Dataset lineage — models.py
6. Agentic — models.py
7. Feature sets — models.py
8. Reports + paper — models.py
8a. Bots — models_bots.py
9. News — models_news.py
10. Events — models_events.py
11. Fundamentals — models_fundamentals.py
12. Macro — models_macro.py
13. Entities + ownership — models_entities.py + models_ownership.py
14. Taxonomy — models_taxonomy.py
15. External-source indexes — models.py
Iceberg namespace conventions

Conventions​

1. Sessions + chat — models.py​

sessions​

chat_messages​

2. Strategies + backtests — models.py​

strategies​

strategy_versions​

strategy_tests​

backtest_runs​

ml_alpha_backtest_runs​

ml_prediction_audit​

optimization_runs​

optimization_trials​

3. Ledger — models.py​

signals​

orders​

fills​

ledger_entries​

4. Instruments — models.py + models_instruments.py​

instruments (parent)​

Joined-table subclasses (instrument_<class>)​

5. Dataset lineage — models.py​

dataset_catalogs​

dataset_versions​

data_sources​

data_links​

identifier_links​

split_plans, split_artifacts, pipeline_recipes, experiment_plans, model_versions, model_deployments​

6. Agentic — models.py​

agent_runs​

crew_runs​

agent_decisions, debate_turns, agent_backtests, agent_judge_reports, agent_replay_runs, backtest_interrupts​

7. Feature sets — models.py​

feature_sets​

feature_set_versions​

feature_set_usages​

8. Reports + paper — models.py​

8a. Bots — models_bots.py​

9. News — models_news.py​

10. Events — models_events.py​

11. Fundamentals — models_fundamentals.py​

12. Macro — models_macro.py​

13. Entities + ownership — models_entities.py + models_ownership.py​

14. Taxonomy — models_taxonomy.py​

15. External-source indexes — models.py​

Iceberg namespace conventions​