Entity Relationship Diagram
Pair with alphaswarm_docs/data-dictionary.md (column-level detail) and alphaswarm_docs/domain-model.md (narrative). Doc map: alphaswarm_docs/index.md.
The Postgres schema has ~110 ORM classes spread across 11 model files under alphaswarm/persistence/. One mega-ERD would be unreadable, so this doc breaks the schema into focused diagrams by domain. The first section is a global FK-only map that shows just the cross-domain joins.
Each per-domain ERD lists table names with the primary key (PK) and a
short subset of columns. For full column lists, see
data-dictionary.md.
Global FK map
Cross-domain edges only — pick a starting table and trace where it fans out.
Core / Instruments
Joined-table inheritance. Every concrete instrument subclass shares the
parent instruments row and adds shape-specific columns in its own
table keyed on instruments.id. The discriminator is
instruments.instrument_class.
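At the SQL level, the joined-table layout can be sketched as follows. This is a minimal illustration using the standard-library sqlite3 module rather than Postgres. Only instruments.id and instruments.instrument_class come from this document; the equity_instruments table, the symbol and exchange columns, and the CHILD_TABLES mapping are hypothetical names chosen for the example.

```python
import sqlite3

# Hypothetical mapping from discriminator value to child table name.
CHILD_TABLES = {"equity": "equity_instruments"}


def build_demo_db():
    """Create an in-memory parent/child pair in the joined-table shape."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.executescript(
        """
        CREATE TABLE instruments (
            id INTEGER PRIMARY KEY,
            instrument_class TEXT NOT NULL,  -- discriminator
            symbol TEXT NOT NULL             -- hypothetical shared column
        );
        CREATE TABLE equity_instruments (
            id INTEGER PRIMARY KEY REFERENCES instruments(id),
            exchange TEXT NOT NULL           -- hypothetical subclass column
        );
        INSERT INTO instruments VALUES (1, 'equity', 'ACME');
        INSERT INTO equity_instruments VALUES (1, 'XNYS');
        """
    )
    return conn


def load_instrument(conn, instrument_id):
    """Read the parent row, then the child row chosen by the discriminator."""
    parent = conn.execute(
        "SELECT * FROM instruments WHERE id = ?", (instrument_id,)
    ).fetchone()
    if parent is None:
        return None
    # The table name comes from the fixed mapping above, never from user input.
    child_table = CHILD_TABLES[parent["instrument_class"]]
    child = conn.execute(
        f"SELECT * FROM {child_table} WHERE id = ?", (instrument_id,)
    ).fetchone()
    return {**dict(parent), **(dict(child) if child is not None else {})}
```

An ORM configured for joined-table inheritance typically performs the equivalent join itself; the two explicit queries here only make the parent/child split visible.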
Market data lineage + Iceberg catalog
How AlphaSwarm tracks every dataset that flows into Iceberg. The
iceberg_identifier column on dataset_catalogs was added in
alembic/versions/0011_iceberg_catalog_columns.py.
Agentic + ML
Strategies, backtests, agent crews, ML deployments, and feature sets.
Ledger (signals / orders / fills / entries)
Every signal, order, fill, and free-form audit entry written by
LedgerWriter.
News / Events / Fundamentals
Macro / FRED / GDelt
Entities / Issuers / Ownership
Taxonomy
Free-form tagging for issuers, instruments, and themes.
Sessions / Chat / Optimization
The conversational + experimentation layer.
Bots
Tables introduced by the Bot Entity Refactor (Alembic
0020_bots).
- (project_id, slug) is unique on bots.
- (bot_id, spec_hash) is unique on bot_versions (immutable snapshots).
- bot_deployments.target is one of paper_session / kubernetes / backtest_only / chat / backtest.
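The uniqueness rules on bots and bot_versions can be sketched as below. This is an illustration against an in-memory SQLite database, not the real migration. The column types and the helper names make_bots_db, snapshot_version, and check_target are assumptions; the constraint tuples and the target values are taken from the notes above.

```python
import sqlite3

# Allowed bot_deployments.target values, as listed in this document.
DEPLOYMENT_TARGETS = frozenset(
    {"paper_session", "kubernetes", "backtest_only", "chat", "backtest"}
)


def make_bots_db():
    """Create simplified bots / bot_versions tables with the documented constraints."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE bots (
            id INTEGER PRIMARY KEY,
            project_id INTEGER NOT NULL,
            slug TEXT NOT NULL,
            UNIQUE (project_id, slug)
        );
        CREATE TABLE bot_versions (
            id INTEGER PRIMARY KEY,
            bot_id INTEGER NOT NULL REFERENCES bots(id),
            spec_hash TEXT NOT NULL,
            UNIQUE (bot_id, spec_hash)
        );
        """
    )
    return conn


def snapshot_version(conn, bot_id, spec_hash):
    """Return the id of the (bot_id, spec_hash) snapshot, creating it if new."""
    row = conn.execute(
        "SELECT id FROM bot_versions WHERE bot_id = ? AND spec_hash = ?",
        (bot_id, spec_hash),
    ).fetchone()
    if row is not None:
        return row[0]  # existing snapshot is reused, never modified
    cur = conn.execute(
        "INSERT INTO bot_versions (bot_id, spec_hash) VALUES (?, ?)",
        (bot_id, spec_hash),
    )
    return cur.lastrowid


def check_target(target):
    """Reject deployment targets outside the documented set."""
    if target not in DEPLOYMENT_TARGETS:
        raise ValueError(f"unknown deployment target: {target!r}")
    return target
```

The same slug can exist in two different projects, while a second insert of the same (project_id, slug) pair raises sqlite3.IntegrityError.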
Data layer expansion (sinks, producers, streaming links)
Tables introduced by the Data Pipelines Hub work (Alembic
0024_data_layer_expansion).
All four tables use ProjectScopedMixin.
Notes:
- (project_id, name) is unique on sinks and market_data_producers.
- (sink_id, spec_hash) and (sink_id, version) are unique on sink_versions (mirrors the bot_versions pattern).
- (dataset_catalog_id, kind, target_ref, direction) is unique on streaming_dataset_links so the refresh_links task can be re-run idempotently.
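The idempotent re-run enabled by the streaming_dataset_links constraint can be sketched as follows. This is not the actual refresh_links task. It assumes an insert that skips conflicting rows, shown here against in-memory SQLite; the ON CONFLICT ... DO NOTHING clause is also valid Postgres syntax. The column types and the sample kind, target_ref, and direction values used with it are hypothetical.

```python
import sqlite3

# Insert a link unless the same 4-tuple already exists.
UPSERT_LINK = """
INSERT INTO streaming_dataset_links
    (dataset_catalog_id, kind, target_ref, direction)
VALUES (?, ?, ?, ?)
ON CONFLICT (dataset_catalog_id, kind, target_ref, direction) DO NOTHING
"""


def make_links_db():
    """Create a simplified streaming_dataset_links table with the documented constraint."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE streaming_dataset_links (
            id INTEGER PRIMARY KEY,
            dataset_catalog_id INTEGER NOT NULL,
            kind TEXT NOT NULL,
            target_ref TEXT NOT NULL,
            direction TEXT NOT NULL,
            UNIQUE (dataset_catalog_id, kind, target_ref, direction)
        )
        """
    )
    return conn


def refresh_links(conn, links):
    """Insert any missing links and return the total row count afterwards."""
    with conn:  # commit on success, roll back on error
        conn.executemany(UPSERT_LINK, links)
    return conn.execute(
        "SELECT COUNT(*) FROM streaming_dataset_links"
    ).fetchone()[0]
```

Calling refresh_links twice with the same list leaves the row count unchanged, which is the property the unique constraint is there to guarantee.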
ML alpha-backtest linkage (Alembic 0025)
The four new FKs on backtest_runs (added by Alembic 0025) close the
loop from a backtest result back to the trained model that produced
its alpha:
- model_version_id: the registered ModelVersion row.
- ml_experiment_run_id: the MLExperimentRun that trained it.
- experiment_plan_id: the ExperimentPlan lineage row.
- model_deployment_id: the ModelDeployment used to wire the model into the strategy via DeployedModelAlpha.
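The sketch below shows how these four columns might be used to walk from a backtest to its model lineage and back. It runs against an in-memory SQLite table that contains only the FK columns. It assumes the columns are nullable, so that older backtests without a model still load, and it omits the referenced tables entirely. The helper names lineage_for and backtests_for_model are hypothetical.

```python
import sqlite3

# The four FK columns named in this section.
LINEAGE_COLUMNS = (
    "model_version_id",
    "ml_experiment_run_id",
    "experiment_plan_id",
    "model_deployment_id",
)


def make_backtest_db():
    """Create a stripped-down backtest_runs table with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.executescript(
        """
        CREATE TABLE backtest_runs (
            id INTEGER PRIMARY KEY,
            model_version_id INTEGER,       -- nullable by assumption
            ml_experiment_run_id INTEGER,
            experiment_plan_id INTEGER,
            model_deployment_id INTEGER
        );
        INSERT INTO backtest_runs VALUES (1, 10, 20, 30, 40);
        INSERT INTO backtest_runs VALUES (2, NULL, NULL, NULL, NULL);
        INSERT INTO backtest_runs VALUES (3, 10, 21, 30, NULL);
        """
    )
    return conn


def lineage_for(conn, backtest_run_id):
    """Return the lineage FKs that are set on one backtest run."""
    row = conn.execute(
        f"SELECT {', '.join(LINEAGE_COLUMNS)} FROM backtest_runs WHERE id = ?",
        (backtest_run_id,),
    ).fetchone()
    if row is None:
        return None
    return {col: row[col] for col in LINEAGE_COLUMNS if row[col] is not None}


def backtests_for_model(conn, model_version_id):
    """Return ids of backtest runs whose alpha came from the given model version."""
    rows = conn.execute(
        "SELECT id FROM backtest_runs WHERE model_version_id = ? ORDER BY id",
        (model_version_id,),
    ).fetchall()
    return [r["id"] for r in rows]
```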
Adding a new model
When you add a new ORM class:
- Add the class to the appropriate alphaswarm/persistence/models_*.py (or models.py for cross-domain things).
- Add an Alembic migration (alembic revision --autogenerate -m "add foo"). Never edit a shipped migration.
- Update alphaswarm_docs/data-dictionary.md with the new table's columns.
- Add the table to the relevant per-domain ERD above (or open a new one if it's a new domain).
- If it has FKs into other domains, add those edges to the global FK map at the top of this file.
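One way to catch a missed documentation step in the checklist above is a small check that every table name appears in the docs. The helper below is a hypothetical sketch, not part of the repository. It takes a list of table names and the text of a doc file and reports the names that are never mentioned as a whole word.

```python
import re


def tables_missing_from_docs(table_names, doc_text):
    """Return the table names that never appear as a whole word in doc_text."""
    missing = []
    for name in table_names:
        # Look-arounds stop "bot" from matching inside "bots" or "bot_versions".
        pattern = r"(?<![A-Za-z0-9_])" + re.escape(name) + r"(?![A-Za-z0-9_])"
        if not re.search(pattern, doc_text):
            missing.append(name)
    return sorted(missing)
```

In practice the table names would come from the ORM metadata and the text from data-dictionary.md or this file; how those are loaded is left out here because it depends on project details this document does not specify.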