
Architecture

Human entry point. Pair with the AI-agent entry point at AGENTS.md and the doc map at /intro.

Cold-start path: /intro/quickstart. Deployment path: how-to/operations/local-setup or how-to/operations/kubernetes-deploy.

AlphaSwarm is a local-first, agentic quantitative research and trading platform. Every LLM call, every backtest, every reinforcement-learning rollout, and every piece of metadata stays on local hardware — no proprietary alpha leaves the box. The codebase distills patterns from Microsoft Qlib, AI4Finance FinRL, QuantConnect Lean, OpenBB, vnpy, and TradingAgents into one coherent platform.

The platform is organised around four invariants that hold across every subsystem:

  1. Hash-locked spec runtimes. AgentSpec, BotSpec, RLExperimentSpec, and AnalysisSpec each have a single sanctioned executor (AgentRuntime / BotRuntime / RLRuntime / AnalysisRuntime). Any spec change creates a new immutable *_spec_versions row; old versions stay forever for replay.
  2. Medallion lakehouse. Every Iceberg write goes through iceberg_catalog.append_arrow with a declared bronze / silver / gold layer; agents read through data.* MCP tools, never raw ORM.
  3. One LLM gateway, one progress bus. Every model call routes through router_complete; every Celery task emits canonical progress frames through alphaswarm.tasks._progress.
  4. Topology is data, not code. Service URLs, MCP audiences, and credential references resolve through alphaswarm_platform/configs/deployment/topology.yaml.

System component diagram

Solid lines are default-profile data paths; dotted lines are opt-in / asynchronous.

The four edge surfaces

AlphaSwarm exposes four hostnames, each behind its own Cloudflare property:

  • alpha-swarm.ai — operator UI (alphaswarm_client). Vite + React 19 + Tailwind 4 + shadcn/ui. Serves the topbar KillSwitch, paper-trading dashboards, RL Lab, Analysis Lab, Workflow Studio, and Data Hub.
  • api.alpha-swarm.ai — public API (alphaswarm/api). FastAPI gateway, 30+ route modules, Stripe-style date epochs (first epoch 2026-06-01).
  • manage.alpha-swarm.ai — control plane (alphaswarm_controller). Workload lifecycle, TerraformRuntime, IdP wiring. Never imports alphaswarm.*.
  • docs.alpha-swarm.ai — documentation (alphaswarm_docs). Docusaurus 3 on Cloudflare Pages. Pages Functions for content-negotiation, sanitised page fragments, and the "Was this helpful?" feedback loop. Standalone MCP Worker at /mcp (RFC 9728 + 8707 compliant per AGENTS rule 49).

Plus two adjacent zones:

  • status.alpha-swarm.ai — Instatus status page. Separate Cloudflare zone so it stays up when the cluster is degraded.
  • archive.alpha-swarm.ai — frozen Stripe-style API epochs after the 12-month sunset window.

Request lifecycle

Every spec-driven dispatch (backtest, agent run, RL training, analysis flow, workflow) follows the same canonical shape. Two contracts are new since the prior version of this doc:

  • Hash-lock first. Before any work happens, the runtime computes the spec's SHA-256, looks for a matching *_spec_versions row, and inserts a new immutable row if the content changed.
  • Kill switch reachable. Every long-running runtime is in the topbar KillSwitch fan-out list. The runtime checks should_halt on every step.
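The hash-lock step can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform's implementation: the in-memory dict stands in for a *_spec_versions table, the canonical JSON encoding is one possible way to make the hash independent of key order, and the names spec_hash and lock_spec are hypothetical.

```python
import hashlib
import json

# Hypothetical in-memory stand-in for a *_spec_versions table:
# maps content hash -> stored spec payload.
spec_versions: dict[str, dict] = {}


def spec_hash(spec: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, fixed separators)."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def lock_spec(spec: dict) -> tuple[str, bool]:
    """Return (hash, created). Existing rows are never modified."""
    digest = spec_hash(spec)
    if digest in spec_versions:
        return digest, False
    # Store a copy so later mutation of the caller's dict cannot alter the row.
    spec_versions[digest] = json.loads(json.dumps(spec))
    return digest, True


first, created_first = lock_spec({"name": "momentum", "lookback": 20})
# Same content with a different key order: resolves to the existing row.
same, created_same = lock_spec({"lookback": 20, "name": "momentum"})
# Changed content: a new version row is created, the old one stays.
changed, created_changed = lock_spec({"name": "momentum", "lookback": 30})
```

Because rows are keyed by content hash, replaying an old run only needs the hash that was recorded at dispatch time.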

The frame envelope is {task_id, stage, message, timestamp, **extras} per AGENTS rule 4. The should_halt check makes every spec-runtime an immediate stop target for the topbar kill switch.
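A rough sketch of how a runtime loop might combine the frame envelope with a per-step should_halt check. Everything beyond the envelope keys and the start / bar.processed / done stage names is an assumption made for illustration: the helper names, the "halted" stage, the epoch-seconds timestamp, and the callable-based halt flag are all hypothetical.

```python
import time
from typing import Callable, Iterator


def make_frame(task_id: str, stage: str, message: str, **extras) -> dict:
    """Build a frame in the {task_id, stage, message, timestamp, **extras} shape."""
    return {"task_id": task_id, "stage": stage, "message": message,
            "timestamp": time.time(), **extras}


def run_steps(task_id: str, n_steps: int,
              should_halt: Callable[[], bool]) -> Iterator[dict]:
    """Yield one frame per step, checking the halt flag before each step."""
    yield make_frame(task_id, "start", "run started")
    for step in range(n_steps):
        if should_halt():
            # "halted" is a hypothetical stage name used only in this sketch.
            yield make_frame(task_id, "halted", "kill switch engaged", step=step)
            return
        yield make_frame(task_id, "bar.processed", f"step {step}", step=step)
    yield make_frame(task_id, "done", "run finished")


# Hypothetical halt flag that flips to True on the fourth check.
checks = {"count": 0}


def halt_after_three() -> bool:
    checks["count"] += 1
    return checks["count"] > 3


frames = list(run_steps("task-1", 10, halt_after_three))
stages = [frame["stage"] for frame in frames]
```

Checking the flag before each step, rather than once per run, is what bounds the stop latency to a single step.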

Repository map

The monorepo is organised by responsibility. Each top-level package has its own AGENTS.md enforcing strict boundaries; cross-package imports are blocked in CI.

| Package | Role | Owner | Public-surface contract |
| --- | --- | --- | --- |
| alphaswarm/ | Quant runtime: strategies, backtests, agents, RAG, Iceberg | platform-team | alphaswarm/api/main.py::create_app |
| alphaswarm_controller/ | Workload lifecycle + Terraform driver + provider adapters | platform-team | alphaswarm_controller/main.py::create_app; NEVER imports alphaswarm.* |
| alphaswarm_core/ | Shared value types, ABCs, auth/resource filters, topology | platform-team | Dependency-light; consumed by both alphaswarm/ and alphaswarm_controller/ |
| alphaswarm_client/ | Active Vite + React 19 operator UI at alpha-swarm.ai | platform-team | pnpm --filter alphaswarm_client dev |
| alphaswarm_ui/ | Cloud-hosted Next.js PaaS frontend (dual Auth0 + Entra) | platform-team | Never imports alphaswarm.* / alphaswarm_controller.* |
| alphaswarm_admin/ | Internal admin at manage.alpha-swarm.ai (audit-first) | platform-team | Mirrors alphaswarm_controller boundary |
| alphaswarm_rl/ | RL stack: RLExperimentSpec + RLRuntime + Iceberg trajectories | rl-team | Legacy alphaswarm.rl.* is a deprecation shim |
| alphaswarm_models/ | ML framework, custom model serving (vLLM + Ollama), AlphaBacktestExperiment | ml-team | Legacy alphaswarm.ml.* + alphaswarm/llm/{vllm_runner,ollama_client}.py are deprecation shims |
| alphaswarm_bots/ | Bot templates + BotRuntime (smallest deployable unit) | agentic-team | YAML at alphaswarm_bots/templates/{trading,research}/ |
| alphaswarm_ide/ | Theia 1.72 IDE + six AlphaSwarm extensions | platform-team | Canonical entrypoint: alphaswarm-cli ide |
| alphaswarm_cli/ | Standalone operator CLI (HTTP-only, device-flow auth) | platform-team | Never imports alphaswarm.* / alphaswarm_controller.* |
| alphaswarm_platform/ | Hosted-platform deployment + IaC + build assets | infra-team | No import alphaswarm.*; TerraformRuntime-only |
| alphaswarm_index/ | Curator-owned single source of truth | docs-team | Sole-writer is the alphaswarm-index-curator subagent |
| alphaswarm_docs/ | This site (Docusaurus 3 on Cloudflare Pages) | docs-team | Quality gates in .github/workflows/docs-ci.yml |
| alphaswarm_snippets/ | Curated knowledge + extractions + inspiration trees | docs-team | Runtime code MUST NOT import this tree |

Inside alphaswarm/, each subsystem maps to one or more concept docs:

| alphaswarm/<package>/ | Doc |
| --- | --- |
| agents/ | agentic-pipeline, agents, workflow-studio, multi-agent-patterns |
| analysis/ | analysis-framework, analysis-lab, analysis-flows |
| api/ | reference/api (auto-generated) |
| backtest/ | backtest-engines, vbtpro-integration, hft-backtest |
| cli/ | providers |
| codebase/ | codebase-mcp |
| core/ | core-types |
| data/ | data-plane, data-catalog, data-mcp, datasets-catalog, data-discovery, airbyte-builder, dagster-sandbox |
| llm/ | providers, sera |
| persistence/ | domain-model, erd, reference/data-dictionary |
| providers/ | data-plane |
| risk/ | paper-trading |
| streaming/ | streaming, streaming-admin, live-market |
| tasks/ | agent-watchdog |
| trading/ | paper-trading, paper-metadata-gate |
| ws/ | observability |
| ui/ | Deprecated (legacy Solara), rollback only |

For the full canonical repository-split contract (boundaries, import guards, future extraction map) read repository-split. For the file-by-file path contract for cross-repo references read alphaswarm-monorepo-paths.

Hard rules (cardinal subset)

Every contributor reads the full 55 hard rules in AGENTS.md. The cardinal subset that surfaces in this doc:

  • Rule 1. Symbol.parse(vt_symbol) only. Never split a vt_symbol on the "." character.
  • Rule 2. All LLM calls go through router_complete.
  • Rule 3. All Iceberg writes go through iceberg_catalog.append_arrow.
  • Rule 4. All progress emits use the canonical frame envelope.
  • Rule 5. All cross-task state goes through Postgres; never pickle ORM objects.
  • Rules 12-19, 23-25, 40-41. The five spec runtimes (AgentRuntime, BotRuntime, RLRuntime, AnalysisRuntime, WorkflowRuntime) are the only sanctioned executors for their respective specs. Specs are immutable once committed; behaviour changes always create a new version row.
  • Rule 22. Agents NEVER read Postgres / Iceberg directly. Every catalog / dataset / entity read goes through a registered DataMCPTool.
  • Rules 42-45. TerraformRuntime owns all terraform apply; WorkloadRuntime owns all runtime workload ops; both write to the workload_runs + terraform_runs audit ledgers before executing.
  • Rule 47. Service URLs resolve through the topology service; AlphaSwarm is cluster-agnostic.
  • Rule 49. Every MCP server is RFC 9728 + 8707 conformant.
  • Rule 52. Step-up MFA (RFC 9470) on every halt + every destructive surface.

Worked example: trace your first request

Goal: dispatch a backtest, watch the WebSocket frames, inspect the ledger row and the Iceberg gold output — without leaving this page.

Step 1 — dispatch

The example below targets your local compose stack at http://localhost:8000. Hit "Run" to fire a sample momentum backtest.

Step 2 — tail the WebSocket

Switch to your terminal and tail the canonical progress frames:

curl -N http://localhost:8000/chat/stream/<task_id>

You will see frames in the {task_id, stage, message, timestamp, **extras} shape. Stages: start → bar.processed (×N) → done (carries the final BacktestResult).
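If you would rather consume the stream from a script, the frames can be tallied by stage. The sketch below assumes each frame arrives as one JSON object per line, which this page does not specify, and it runs on hard-coded sample lines rather than a live connection. The summarise helper and the sample values are hypothetical.

```python
import json
from collections import Counter


def summarise(lines):
    """Count frames per stage and return the done frame, if one was seen."""
    counts = Counter()
    final = None
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank keep-alive lines
        frame = json.loads(raw)
        counts[frame["stage"]] += 1
        if frame["stage"] == "done":
            final = frame
            break  # nothing meaningful follows the terminal frame
    return counts, final


# Invented sample frames in the documented envelope shape.
sample = [
    '{"task_id": "t1", "stage": "start", "message": "begin", "timestamp": 0}',
    '{"task_id": "t1", "stage": "bar.processed", "message": "bar 1", "timestamp": 1}',
    '{"task_id": "t1", "stage": "bar.processed", "message": "bar 2", "timestamp": 2}',
    '{"task_id": "t1", "stage": "done", "message": "finished", "timestamp": 3, "run_id": "r1"}',
]
counts, final = summarise(sample)
```

The run_id carried in the extras of the done frame is the value to reuse in the ledger and Iceberg steps that follow.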

Step 3 — inspect the ledger

Pyodide can run this synchronous SQL via DuckDB against a small parquet snapshot of backtest_runs:
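As a stand-in that runs anywhere, the sketch below swaps DuckDB and the parquet snapshot for Python's built-in sqlite3 module and an inline list of rows. The rows and the strategy column are invented for illustration; only the backtest_runs table name and the run_id and sharpe fields come from this page.

```python
import sqlite3

# Invented sample rows standing in for a backtest_runs snapshot.
rows = [
    ("run_a", "momentum", 1.42),
    ("run_b", "momentum", None),  # sharpe not written yet
    ("run_c", "mean_reversion", 0.87),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE backtest_runs (run_id TEXT, strategy TEXT, sharpe REAL)")
con.executemany("INSERT INTO backtest_runs VALUES (?, ?, ?)", rows)

# Completed runs are the ones whose sharpe has been filled in.
completed = con.execute(
    "SELECT run_id, sharpe FROM backtest_runs "
    "WHERE sharpe IS NOT NULL ORDER BY sharpe DESC"
).fetchall()
con.close()
print(completed)
```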

When pointed at the real platform, replace the inline list with a /data/exports MCP call and the same SQL works against the actual ledger snapshot.

Step 4 — read the Iceberg gold output

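from pyiceberg.catalog import load_catalog

run_id = "<run_id>"  # placeholder: use the run_id from the stage=done frame
# Load the catalog named "alphaswarm" from the local PyIceberg configuration.
cat = load_catalog("alphaswarm")
table = cat.load_table(f"alphaswarm_gold_backtests.run_{run_id}")
# Materialise the full scan as a pandas DataFrame and print the last ten rows.
df = table.scan().to_pandas()
print(df[["timestamp", "equity", "drawdown"]].tail(10))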

Step 5 — verify

  • A backtest_runs row with non-NULL sharpe exists.
  • The WebSocket emitted a stage=done frame with the same run_id.
  • An alphaswarm_gold_backtests.run_<run_id> Iceberg table is queryable.
  • The KillSwitch topbar element shows a green status.

What next

Deployment modes

docker-compose (default)

docker compose up -d

Brings up redis, postgres, alphaswarm-core, alphaswarm-worker, alphaswarm-beat, alphaswarm-client, chromadb, mlflow, otel-collector, jaeger. The Iceberg catalog runs in PyIceberg SQL mode against the host bind mount under data/iceberg/. Optional profiles:

  • --profile streaming — adds Redpanda + Flink for live market data.
  • --profile vllm — adds a containerised vLLM inference server.
  • --profile legacy — restores the older MinIO + iceberg-rest topology for rollback only.

Native dev (no Docker)

pip install -e ".[full,dev]"
alembic upgrade head
uvicorn alphaswarm.api.main:app --reload
celery -A alphaswarm.tasks.celery_app worker --loglevel=info

Kubernetes

make deploy-k8s ENV=prod

Manifests live under alphaswarm_platform/deployments/kubernetes/. The TerraformRuntime owns every terraform apply; see how-to/operations/kubernetes-deploy and how-to/operations/alphaswarm-fund-blue-green-cutover.

Cloudflare Pages (docs only)

docs.alpha-swarm.ai deploys via the cloudflare_pages_docs Terraform module — out of cluster, on the edge, behind Cloudflare Access for /internal/* and /enterprise/*.

Where to start

| If you want to... | Read |
| --- | --- |
| Get the platform running locally | intro/quickstart |
| Understand the doc conventions | intro/conventions |
| See the canonical repository layout | repository-split |
| Run a backtest end-to-end | tutorials/first-backtest |
| Promote a bot from backtest to paper | tutorials/first-bot |
| Train an RL agent | tutorials/first-rl-experiment |
| Compose an agent workflow | tutorials/first-agent-workflow |
| Browse the API surface | reference/api |
| Browse the Python surface | reference/python |
| Inspect tables and columns | reference/data-dictionary |
| Author a new strategy | how-to/recipes/add-a-strategy |
| Query data without touching ORM | how-to/recipes/query-data-via-mcp |
| Snapshot an agent spec | how-to/recipes/snapshot-an-agent-spec |
| Trigger a kill switch | how-to/operations/kill-switch-incident-response |
| Deploy to Kubernetes | how-to/operations/kubernetes-deploy |
| Read the agentic-coding contract | concepts/agentic/agentic-development |
| Run docs from an AI agent | /llms.txt, /llms-full.txt, /mcp |

Deeper reads