Glossary
Project-specific jargon used across AlphaSwarm, with a definition and a pointer to the canonical file. New contributors and AI agents should treat this as the single source of truth for terminology — if you find a mismatch between this glossary and the code, file an issue.
See also: alphaswarm_docs/index.md for the full doc map.
Core domain
- vt_symbol — Composite symbol id with the shape {TICKER}.{EXCHANGE} (vnpy convention), e.g. AAPL.NASDAQ, BTCUSDT.BINANCE, ESM4.CME. Always created via Symbol.parse(...) / Symbol.format(...) in alphaswarm/core/types.py; never hand-split.
- Symbol — Immutable dataclass that bundles ticker, exchange, asset_class, security_type, optional contract spec. The atom flowing through every data feed, strategy, and broker. Defined in alphaswarm/core/types.py.
- AssetClass vs SecurityType — AssetClass is the broad category (equity, crypto, fx, future, option, index, commodity, bond). SecurityType is the Lean-style finer-grained enum (equity, option, future_option, crypto_future, index_option, …). The _polymorphic_identity_for helper in alphaswarm/data/catalog.py maps SecurityType to a joined-table subclass of Instrument.
- Resolution — Lean-style bar cadence (Tick, Second, Minute, Hour, Daily); see alphaswarm/core/types.py.
- Interval — Short-code bar cadence (vnpy style, 1m, 5m, 1h, 1d). Same idea as Resolution, kept for vnpy back-compat.
- SubscriptionDataConfig — The data-plane routing key. Combines Symbol + Resolution + TickType + DataNormalizationMode. See alphaswarm_docs/core-types.md.
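To make the {TICKER}.{EXCHANGE} shape concrete, here is a minimal sketch of the split and join. It is an illustration only: VtSymbolParts, split_vt_symbol, and join_vt_symbol are hypothetical names, and splitting on the last dot is an assumption rather than the documented behaviour of Symbol.parse. Project code should keep going through Symbol.parse(...) / Symbol.format(...).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VtSymbolParts:
    """Hypothetical stand-in for the two fields encoded in a vt_symbol."""

    ticker: str
    exchange: str


def split_vt_symbol(vt_symbol: str) -> VtSymbolParts:
    """Split '{TICKER}.{EXCHANGE}' on the last dot (assumed rule)."""
    ticker, sep, exchange = vt_symbol.rpartition(".")
    if not sep or not ticker or not exchange:
        raise ValueError(f"not a vt_symbol: {vt_symbol!r}")
    return VtSymbolParts(ticker=ticker, exchange=exchange)


def join_vt_symbol(parts: VtSymbolParts) -> str:
    """Inverse of split_vt_symbol."""
    return f"{parts.ticker}.{parts.exchange}"


if __name__ == "__main__":
    parts = split_vt_symbol("BTCUSDT.BINANCE")
    print(parts.ticker, parts.exchange)
    print(join_vt_symbol(parts))
```

Splitting on the last dot keeps a ticker that itself contains a dot intact, which is one reason hand-splitting with a plain split(".") is fragile.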
Persistence + data plane
- Execution Ledger — The Postgres tables under alphaswarm/persistence/models.py + alphaswarm/persistence/ledger.py that record every signal, order, fill, agent decision, and backtest run. Authoritative for "what did the system actually do?".
- LedgerWriter — Façade over the ledger tables. Always go through it instead of writing to ORM models directly so audit messages get emitted. alphaswarm/persistence/ledger.py.
- Instrument joined-table inheritance — instruments is the parent table; each subclass (InstrumentEquity, InstrumentOption, …) lives in its own joined-table row keyed on instruments.id. The instrument_class discriminator selects the subclass at load time. See alphaswarm_docs/erd.md and alphaswarm/persistence/models_instruments.py.
- polymorphic_identity — SQLAlchemy mapper arg that ties a subclass to a discriminator value (e.g. InstrumentEquity.__mapper_args__ = {"polymorphic_identity": "spot"}). When you add a new instrument subclass you must also extend the mapping dict in _polymorphic_identity_for.
- DatasetCatalog — Parent row describing a logical dataset (HMDA LAR, FDA device events, etc.) with provider/domain/tags.
- DatasetVersion — Per-materialisation row beneath DatasetCatalog. Captures row count, dataset hash, schema snapshot, Iceberg identifier.
- DataLink — Edge between a DatasetVersion and an entity (Instrument, Issuer, EconomicSeries). Use this for "which symbols does this dataset cover?" queries.
- DataSource — Logical provider record (Yahoo, Alpha Vantage, IBKR, openFDA). Datasets and data-links reference a DataSource.
- IcebergCatalog (the wrapper) — PyIceberg handle from alphaswarm/data/iceberg_catalog.py. Always go through append_arrow, read_arrow, iceberg_to_duckdb_view; never call PyIceberg's Catalog.create_table directly.
- aqp_<source> namespace — Iceberg namespace convention for the regulatory ingest: alphaswarm_cfpb, alphaswarm_uspto, alphaswarm_fda, alphaswarm_sec. New corpora pick a new aqp_<source> slug.
- Persistent host warehouse — C:/alphaswarm-warehouse on Windows, bind-mounted into alphaswarm-api and alphaswarm-worker at /warehouse. Holds the PyIceberg SQL catalog (catalog.db), Parquet data files, staging dir, and ingest audit logs. See alphaswarm_docs/data-catalog.md.
- legacy profile — Docker Compose profile that bundles the older REST + MinIO catalog topology (off by default). Bring it up with docker compose --profile legacy up -d.
Strategies + backtest
- BaseStrategy — Abstract strategy contract under alphaswarm/strategies/. Subclasses implement on_bar, on_signal, etc. See alphaswarm_docs/backtest-engines.md.
- MLAlphaStrategy / MLSelectorAlpha — Strategies that wrap an ML model (deployed via ModelDeployment) and emit signals.
- EnsembleAlpha — Weighted combination of multiple alphas. alphaswarm/strategies/ml_alphas.py.
- IBrokerage / IDataQueueHandler — Lean-style interfaces consumed by backtest, paper, and live engines without modification (the same strategy code runs against all three). See alphaswarm_docs/paper-trading.md.
- BacktestRun — Postgres row describing one backtest invocation (Sharpe, Sortino, drawdown, MLflow run id, dataset hash). The backtest UI's history view is just a query against this table.
- MLflow run id — Foreign id stored on BacktestRun.mlflow_run_id pointing at the MLflow tracking server. Click-through from the UI opens the MLflow UI in a new tab.
- dataset_hash — Deterministic SHA-256 of the input bars used in a backtest. Lets the UI flag "two backtests with the same hash = identical inputs".
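The dataset_hash entry above can be pictured with a short sketch. The glossary only says the hash is a deterministic SHA-256 of the input bars; the canonical JSON encoding, the dict-per-bar layout, and the function body below are assumptions made for illustration, not the project's actual scheme.

```python
import hashlib
import json


def dataset_hash(bars: list[dict]) -> str:
    """SHA-256 over a canonical JSON encoding of each bar (assumed scheme)."""
    digest = hashlib.sha256()
    for bar in bars:
        # sort_keys and fixed separators make the encoding independent of
        # dict insertion order and whitespace.
        line = json.dumps(bar, sort_keys=True, separators=(",", ":"))
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


if __name__ == "__main__":
    bars = [
        {"ts": "2024-01-02T14:30:00Z", "open": 100.0, "close": 100.5},
        {"ts": "2024-01-02T14:31:00Z", "open": 100.5, "close": 100.2},
    ]
    print(dataset_hash(bars))
```

Bar order still matters in this sketch: the same bars in a different order give a different hash, which is usually what you want for time-series inputs.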
ML + agents
- Tier (deep / quick) — Two LLM tiers in the agentic crews. deep = high-capability (Nemotron 70B / GPT-4-class) for analysis; quick = small/fast (Llama 3.2 / Mini) for control-flow decisions. Provider per tier is in settings.llm_provider_deep / _quick; model per tier in llm_deep_model / llm_quick_model.
- router_complete — One-shot LLM completion through LiteLLM exposed by alphaswarm/llm/providers/router.py. All AlphaSwarm code goes through this; never call litellm.completion or the Ollama client directly.
- Director — Nemotron-driven planner + verifier in alphaswarm/data/pipelines/director.py. Sits between discovery and materialisation in generic file ingestion.
- IngestionPlan / PlannedDataset — Director output dataclass. One PlannedDataset per discovered family with target namespace, table name, expected_min_rows, domain hint, and skip list.
- VerifierVerdict — Director's post-materialise judgement (accept or retry with adjusted knobs).
- __assets__ family — Synthetic DiscoveredDataset carrying the non-tabular inventory (PDFs, XML, images) found during discovery. Never materialised; surfaced under IngestionReport.extras for visibility.
- AgentDecision / DebateTurn — Agent crew audit trail rows.
- CrewRun — One full agentic crew invocation (planner → research → execution sub-agents).
- Alpha158 — Microsoft Qlib's 158-feature factor zoo, ported to AlphaSwarm under alphaswarm/data/indicators_zoo.py.
- FeatureSet / FeatureSetVersion — Composable feature spec (list of IndicatorZoo expressions + transformations) versioned in Postgres, materialised on demand.
- ModelDeployment / MLDeployment — A trained ML model that has been registered for inference (rows in alphaswarm/persistence/models.py).
Bots
- Bot — Smallest self-contained, deployable unit on AlphaSwarm. Aggregates a universe + data pipeline + strategy + backtest engine + optional ML deployments + optional agent specs + RAG plan + metrics + risk caps + deployment target. Lives under a Project and is uniquely identified by (project_id, slug). See alphaswarm_docs/bots.md.
- BotSpec — Pydantic blueprint for a bot. Hashed via snapshot_hash() to drive immutable bot_versions snapshots. Defined in alphaswarm/bots/spec.py.
- TradingBot / ResearchBot — Bot subclasses selected by BotSpec.kind. TradingBot does backtest / paper / deploy; ResearchBot does chat (and optional backtest if a strategy block is set).
- BotRuntime — Single sanctioned execution entry point for any bot lifecycle action. Snapshots specs into bot_versions, opens bot_deployments rows, and emits progress through alphaswarm/tasks/_progress.py.
- bot_versions — Immutable, hash-locked spec snapshots (mirrors agent_spec_versions). Never mutated in place.
- bot_deployments — Ledger of every backtest / paper / chat / k8s invocation for a bot. References the BotVersion that produced it so a run can be replayed.
- Deployment target (paper_session / kubernetes / backtest_only) — Selected via BotSpec.deployment.target. Backed by alphaswarm/bots/deploy.py::DeploymentDispatcher.
Provider catalog
- LLMProvider — Lightweight handle around a LiteLLM provider spec. Registered in alphaswarm/llm/providers/catalog.py::PROVIDERS.
- ProviderSpec — Static config for a provider slug (LiteLLM prefix, env-var name, default models).
- vllm provider — OpenAI-compatible vLLM endpoint behind LiteLLM's openai/ adapter. Empty ALPHASWARM_VLLM_BASE_URL disables.
- nemotron-3-nano:30b — Default Director model on Ollama (NVIDIA Nemotron Nano v3, 31.6B params). Pull with ollama pull nemotron-3-nano:30b. Configurable via ALPHASWARM_LLM_DIRECTOR_MODEL.
Streaming + live
- KafkaDataFeed — In-process Kafka consumer that hands bars/quotes to the IDataQueueHandler interface.
- features.indicators.v1, market.bar.v1, … — Versioned Kafka topics. Naming pattern is <domain>.<entity>.v<n>.
- StreamingIngester — alphaswarm-stream-ingest CLI that publishes to Kafka topics from Alpaca / IBKR.
- Heartbeat / kill-switch — Periodic Redis publish from the paper-trading session; a missing heartbeat causes the runner to halt. Configured via ALPHASWARM_RISK_KILL_SWITCH_KEY (default alphaswarm:kill_switch).
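The <domain>.<entity>.v<n> topic pattern lends itself to a quick validation helper. The sketch below is hypothetical: parse_topic is not a project function, and the allowed character set (lowercase letters, digits, underscores) and the rule that versions start at 1 are assumptions that merely fit the two example topics.

```python
import re

# Assumed grammar: lowercase domain and entity, positive integer version.
TOPIC_RE = re.compile(
    r"^(?P<domain>[a-z][a-z0-9_]*)"
    r"\.(?P<entity>[a-z][a-z0-9_]*)"
    r"\.v(?P<version>[1-9][0-9]*)$"
)


def parse_topic(name: str) -> tuple[str, str, int]:
    """Return (domain, entity, version) or raise ValueError."""
    match = TOPIC_RE.match(name)
    if match is None:
        raise ValueError(f"topic does not match <domain>.<entity>.v<n>: {name!r}")
    return match["domain"], match["entity"], int(match["version"])


if __name__ == "__main__":
    print(parse_topic("market.bar.v1"))
    print(parse_topic("features.indicators.v1"))
```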
Observability
- OTEL endpoint — ALPHASWARM_OTEL_ENDPOINT (empty by default, which disables tracing). When set, every Celery task and HTTP request emits OpenTelemetry spans via alphaswarm/observability/.
- Progress bus — Redis pub/sub channel alphaswarm:task:<task_id> carrying {stage, message, timestamp, **extra} payloads. UIs subscribe via the WebSocket relay at /chat/stream/{task_id}. See alphaswarm/ws/broker.py and alphaswarm/tasks/_progress.py.
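As a rough picture of what travels on the progress bus, the sketch below builds the channel name and a JSON payload with the {stage, message, timestamp, **extra} shape. The helper names, the JSON encoding, and the epoch-seconds timestamp are assumptions; the real helpers live in alphaswarm/tasks/_progress.py and may differ. Publishing to Redis is left out so the example stays dependency-free.

```python
import json
import time


def progress_channel(task_id: str) -> str:
    """Channel name following the alphaswarm:task:<task_id> convention."""
    return f"alphaswarm:task:{task_id}"


def progress_payload(stage: str, message: str, **extra: object) -> str:
    """Serialise one progress event (JSON encoding is an assumption)."""
    payload = {
        "stage": stage,
        "message": message,
        "timestamp": time.time(),  # assumed format: epoch seconds
        **extra,
    }
    return json.dumps(payload)


if __name__ == "__main__":
    print(progress_channel("abc123"))
    print(progress_payload("materialise", "wrote table", rows=1200))
```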
Configuration
- settings — Cached Settings instance from alphaswarm/config.py. Always import as from alphaswarm.config import settings and never construct Settings() directly; the cache is backed by lru_cache(maxsize=1).
- ALPHASWARM_* env namespace — Every settable knob takes the ALPHASWARM_ prefix. Bools accept true/false/1/0. Paths are resolved by _coerce_path.
- host-downloads — /host-downloads:ro bind mount in alphaswarm_platform/compose/docker-compose.yml exposing the user's local Downloads/ directory for CLI ingest jobs.
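The two configuration rules above (prefixed env vars with true/false/1/0 bools, and a single cached settings object) can be sketched as follows. Everything here is illustrative: env_bool, get_settings, the stand-in Settings class, and the ALPHASWARM_DEBUG knob are hypothetical, and case-insensitive parsing is an assumption. The real class lives in alphaswarm/config.py.

```python
import os
from functools import lru_cache

_TRUE = {"true", "1"}
_FALSE = {"false", "0"}


def env_bool(name: str, default: bool = False) -> bool:
    """Read ALPHASWARM_<name> as a bool; accepts true/false/1/0."""
    raw = os.environ.get(f"ALPHASWARM_{name}")
    if raw is None:
        return default
    value = raw.strip().lower()  # case-insensitivity is an assumption
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"ALPHASWARM_{name} must be true/false/1/0, got {raw!r}")


class Settings:
    """Hypothetical stand-in for the real Settings class."""

    def __init__(self) -> None:
        self.debug = env_bool("DEBUG")  # hypothetical knob


@lru_cache(maxsize=1)
def get_settings() -> Settings:
    """Build Settings once; later calls return the same object."""
    return Settings()


if __name__ == "__main__":
    os.environ["ALPHASWARM_DEBUG"] = "true"
    print(get_settings().debug)
    print(get_settings() is get_settings())
```

Because the instance is cached, environment changes made after the first call are not picked up, which is the practical reason for not constructing Settings() ad hoc in different places.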
Inspiration rehydration (Phase 2026-04-29)
- Microprice — (P_ask * Q_bid + P_bid * Q_ask) / (Q_bid + Q_ask). Volume-weighted refinement of mid-price; it leans toward the ask when the bid queue is deeper and toward the bid when the ask queue is deeper. Implemented in alphaswarm/data/microstructure.py.
- OBI (Order Book Imbalance) — (Q_bid - Q_ask) / (Q_bid + Q_ask), range [-1, +1]. Positive = bid-side pressure. Used as a quote skew signal in the LOB market-making strategies under alphaswarm/strategies/hft/.
- VPIN — Volume-synchronized probability of informed trading (Easley, López de Prado, O'Hara). Re-buckets trade flow into equal-volume buckets and takes the rolling mean of |buy - sell| / |buy + sell|. See alphaswarm/data/microstructure.py.
- Sample-aware Sharpe — Annualised Sharpe ratio that annualises with the actual sampling frequency of a returns series instead of assuming 252 daily observations per year. Required for HFT strategies with sub-daily bars. See alphaswarm/backtest/hft_metrics.py.
- Walk-forward — Training scheme where the model is re-fit on a rolling (or anchored) window and tested on the immediately following slice. Implemented in alphaswarm/ml/walk_forward.py.
- Bachelier (Normal) model — Options pricing model assuming the underlying follows arithmetic Brownian motion (dF = sigma dW). Appropriate for low-priced or near-zero underlyings (rates, basis spreads). See alphaswarm/options/normal_model.py.
- Inverse option — Option settled in the underlying asset (e.g. BTC) rather than quote currency (USD). Common on crypto venues like Deribit. See alphaswarm/options/inverse_options.py.
- Regime classifier — Lightweight classifier that labels each bar as trending vs ranging using ADX threshold (default 25) or as bull/bear/neutral via multi-MA slope vote. See alphaswarm/data/regime.py.
- Factor expression — Tiny Polars-based DSL covering Alpha101 primitives (Ts_Mean, Ts_Std, Rank, Decay_Linear, Delta, Ts_Corr). See alphaswarm/data/factor_expression.py.
- Engle-Granger cointegration — Two-step test for cointegrated pairs: OLS hedge ratio + ADF test on the residual. See alphaswarm/data/cointegration.py.
- Triple-barrier label — Lopez de Prado labeling: look forward horizon bars, label +1 if upper barrier hit first, -1 if lower, 0 if horizon reached. See alphaswarm/data/labels.py.
- Yang-Zhang volatility — OHLC vol estimator combining overnight, open-to-close, and Rogers-Satchell components. The most efficient of the OHLC family. See alphaswarm/data/realised_volatility.py.
- LobStrategy — ABC for limit-order-book strategies; subclasses emit OrderIntent lists in response to LobState updates. Engine integration is deferred; see alphaswarm_snippets/extractions/_FUTURE_PROMPTS/lob_adapter_prompt.md.
- Dataset preset — Curated declarative spec for a one-click ingestion (e.g. intraday_momentum_etf, crypto_majors_intraday). See alphaswarm/data/dataset_presets.py.
- Inspiration source — One of seven external repos under alphaswarm_snippets/inspiration/ from which strategies / models / agents were rehydrated. Tracked via the source kwarg on alphaswarm.core.registry.register and surfaced as the source:* tag.
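Three of the definitions in this section (microprice, OBI, and the triple-barrier label) are small enough to write out directly. The functions below are a plain-Python sketch of the formulas as stated in the glossary, not the code in alphaswarm/data/microstructure.py or alphaswarm/data/labels.py. In particular, expressing the barriers as fractional returns relative to the entry price is an assumption.

```python
from collections.abc import Sequence


def microprice(p_bid: float, q_bid: float, p_ask: float, q_ask: float) -> float:
    """(P_ask * Q_bid + P_bid * Q_ask) / (Q_bid + Q_ask)."""
    total = q_bid + q_ask
    if total <= 0:
        raise ValueError("book has no size on either side")
    return (p_ask * q_bid + p_bid * q_ask) / total


def order_book_imbalance(q_bid: float, q_ask: float) -> float:
    """(Q_bid - Q_ask) / (Q_bid + Q_ask), in [-1, +1]."""
    total = q_bid + q_ask
    if total <= 0:
        raise ValueError("book has no size on either side")
    return (q_bid - q_ask) / total


def triple_barrier_label(
    prices: Sequence[float], start: int, horizon: int, upper: float, lower: float
) -> int:
    """+1 if the upper barrier is hit first, -1 if the lower, 0 at the horizon.

    upper and lower are positive fractional returns relative to
    prices[start] (assumed convention).
    """
    entry = prices[start]
    end = min(start + horizon, len(prices) - 1)
    for i in range(start + 1, end + 1):
        ret = prices[i] / entry - 1.0
        if ret >= upper:
            return 1
        if ret <= -lower:
            return -1
    return 0


if __name__ == "__main__":
    # Bid queue is three times deeper than the ask queue.
    print(microprice(100.0, 300.0, 101.0, 100.0))  # 100.75, above the 100.5 mid
    print(order_book_imbalance(300.0, 100.0))      # 0.5
    print(triple_barrier_label([100, 101, 103, 99], 0, 3, 0.02, 0.02))  # 1
```

In the first call the microprice sits above the mid-price because the deeper bid queue pushes the estimate toward the ask, which matches a positive OBI for the same book.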
Testing
- tests/data/test_pipelines_smoke.py — Reference test for the Iceberg ingestion path. New ingest features should add a test in this directory.
- director_enabled=False — Pass when constructing IngestionPipeline in tests so the real LLM is bypassed in favour of the deterministic identity plan.
Cross-repo
- agentic_assistants — Sibling repo providing the cross-system lineage API (ALPHASWARM_AGENTIC_ASSISTANTS_API).
- rpi_kubernetes — Sibling repo with the k8s deployment manifests under alphaswarm_platform/deploy/k8s/.