
Backtest engines

Doc map: intro · vbt-pro deep dive: vbtpro-integration · LOB / tick-replay: hft-backtest · Class hierarchy: class-diagram · Worked tutorial: tutorials/first-backtest · Recipe: how-to/recipes/run-a-backtest-from-yaml.

AlphaSwarm runs every backtest through one of seven interchangeable engines behind the BaseBacktestEngine ABC. The runner, persistence, MLflow tracking, and UI never branch on which engine produced a run — every engine returns the same BacktestResult.

The seven engines fall into three tiers so you can pick one without scanning a 7-row table every time:

Tier 1 — Vectorised primary (VectorbtProEngine)

Default for research workloads, parameter screens, walk-forward optimisation, factor studies, and any backtest that does not need per-bar Python.

Five constructor modes select the inner vbt-pro path:

  • signals — array-based entries / exits / sizing
  • orders — column-of-orders DataFrame
  • optimizer — built-in vbt-pro Param sweeps
  • holding — buy-and-hold baseline
  • random — random-signal baseline

Implementation: alphaswarm/backtest/vbtpro/engine.py::VectorbtProEngine. Full mode dispatch + Numba-jit constraints in vbtpro-integration.

Tier 2 — Per-bar Python loop

Two engines run a true Python on_bar callback. Use them when you need synchronous decisions inside the inner loop — agent dispatch, event-sourced LOB replay, custom callbacks vbt-pro can't represent.

  • EventDrivenBacktester — the only engine that exposes context['agents'] to strategies via AgentDispatcher, with TTL + LRU dedup of LLM calls.
  • LobBacktestEngine — hftbacktest-driven LOB tick replay; latency + queue models; market-making + execution strategies.

Tier 3 — Fallback cascade

FallbackBacktestEngine tries primary first, then walks fallbacks until one returns a BacktestResult. The OSS engines exist mainly as cascade fallbacks and for license-constrained deployments:

  • VectorbtEngine — OSS vectorbt; signals only (Apache-2.0).
  • BacktestingPyEngine — single-symbol with .optimize(...) grid + SAMBO (AGPL-3.0).
  • ZvtBacktestEngine — permissive-licence CN-bar fallback (MIT).
  • AatBacktestEngine — async / synthetic LOB fallback (Apache-2.0).

NautilusTrader is not wired in (LGPL-3.0; out of scope).
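The cascade behaviour described above can be sketched in a few lines. This is a minimal illustration, not the FallbackBacktestEngine implementation: the `Engine` callable signature, the `run_cascade` helper, and the choice to treat any exception as "try the next engine" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class BacktestResult:
    """Stand-in for the real result type; only `summary` is modelled here."""
    summary: dict


# Hypothetical engine signature: takes a config dict, returns a result or raises.
Engine = Callable[[dict], BacktestResult]


def run_cascade(primary: Engine, fallbacks: Sequence[Engine], config: dict) -> BacktestResult:
    """Try `primary`, then each fallback in order; return the first BacktestResult."""
    errors = []
    for engine in (primary, *fallbacks):
        name = getattr(engine, "__name__", repr(engine))
        try:
            result = engine(config)
        except Exception as exc:  # assumption: any failure moves on to the next engine
            errors.append(f"{name}: {exc}")
            continue
        if isinstance(result, BacktestResult):
            return result
        errors.append(f"{name}: did not return a BacktestResult")
    raise RuntimeError("all engines failed: " + "; ".join(errors))


def broken_primary(config: dict) -> BacktestResult:
    # Simulates a deployment where the primary engine's dependency is missing.
    raise ImportError("vectorbtpro is not installed")


def event_engine(config: dict) -> BacktestResult:
    return BacktestResult(summary={"engine": "EventDrivenBacktester"})


result = run_cascade(broken_primary, [event_engine], {"initial_cash": 100_000})
print(result.summary["engine"])
```

Collecting the per-engine errors before raising keeps the failure message useful when the whole chain is exhausted.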

EngineCapabilities

Every engine declares its surface via an EngineCapabilities class attribute. Agents introspect it through the engine_capabilities tool; humans can call alphaswarm.backtest.engine_capabilities_index().

Pick by capability:

  • Vectorised research / parameter screens / WFO → VectorbtProEngine
  • Per-bar agent dispatch (LLM in the loop) → EventDrivenBacktester
  • LOB tick replay, latency + queue modelling → LobBacktestEngine
  • Synthetic LOB realism (OSS path) → AatBacktestEngine
  • Chinese-market data → ZvtBacktestEngine
  • Single-symbol grid optimisation → BacktestingPyEngine with .optimize(ranges, method="grid"|"sambo", ...)
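Picking by capability amounts to filtering an index of declared flags. The sketch below shows the idea with a toy index. Only `supports_rl_injection` is a field named on this page; the other two fields, the `engines_with` helper, and the flag values are hypothetical and do not reflect the real declarations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineCapabilities:
    # Named on this page:
    supports_rl_injection: bool = False
    # Hypothetical fields, invented for this example:
    supports_per_bar_agents: bool = False
    supports_lob_replay: bool = False


# Illustrative index only; the real one comes from engine_capabilities_index().
CAPABILITIES = {
    "VectorbtProEngine": EngineCapabilities(),
    "EventDrivenBacktester": EngineCapabilities(supports_per_bar_agents=True),
    "LobBacktestEngine": EngineCapabilities(supports_lob_replay=True),
}


def engines_with(index: dict, **required: bool) -> list:
    """Return the engine names whose capabilities match every required flag."""
    return sorted(
        name
        for name, caps in index.items()
        if all(getattr(caps, flag) == value for flag, value in required.items())
    )


print(engines_with(CAPABILITIES, supports_per_bar_agents=True))
print(engines_with(CAPABILITIES, supports_lob_replay=True))
```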

When NOT to use the primary engine

The vbt-pro inner loop is Numba-jit compiled — signal_func_nb / order_func_nb cannot call Python objects per bar. Two patterns this rules out:

  1. Per-bar agent consults. Switch to EventDrivenBacktester and call context['agents'].consult(spec_name, inputs, ttl=...) from inside on_bar. The AgentDispatcher handles TTL + LRU dedup so the LLM gateway is not hammered.
  2. Per-bar custom Python that vbt-pro cannot express. If the inner loop needs a stateful Python object (custom risk model, bespoke order book heuristics), use event-driven.
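To show why TTL + LRU dedup keeps per-bar consults cheap, here is a toy dispatcher with the same `consult(spec_name, inputs, ttl=...)` call shape. It is a sketch under assumptions, not the AgentDispatcher code: the cache key (spec name plus canonical JSON of the inputs), the `max_entries` limit, the `fake_llm` callable, and the `momentum_reviewer` spec name are all invented for illustration.

```python
import json
import time
from collections import OrderedDict


class DedupingDispatcher:
    """Toy stand-in for AgentDispatcher: caches consult() results by (spec, inputs)."""

    def __init__(self, call_llm, max_entries=128, clock=time.monotonic):
        self._call_llm = call_llm
        self._max_entries = max_entries
        self._clock = clock
        self._cache = OrderedDict()  # key -> (expires_at, value)

    def consult(self, spec_name, inputs, ttl=60.0):
        key = (spec_name, json.dumps(inputs, sort_keys=True))
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and hit[0] > now:
            self._cache.move_to_end(key)  # mark as most recently used
            return hit[1]
        # Miss or expired entry: call the model once and cache the answer.
        value = self._call_llm(spec_name, inputs)
        self._cache[key] = (now + ttl, value)
        self._cache.move_to_end(key)
        while len(self._cache) > self._max_entries:
            self._cache.popitem(last=False)  # evict the least recently used entry
        return value


calls = []


def fake_llm(spec_name, inputs):
    # Records each real call so the dedup effect is visible.
    calls.append((spec_name, inputs))
    return {"action": "hold"}


def on_bar(bar, context):
    # Same call shape as described above; the spec name is hypothetical.
    return context["agents"].consult("momentum_reviewer", {"symbol": bar["symbol"]}, ttl=300)


context = {"agents": DedupingDispatcher(fake_llm)}
for _ in range(3):
    decision = on_bar({"symbol": "AAPL.NASDAQ"}, context)

print(len(calls), decision)
```

Three bars with identical inputs trigger a single underlying call; the other two are served from the cache until the TTL lapses.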

If you can vectorise, or precompute a panel of decisions ahead of time, stay on vbt-pro and use AgenticVbtAlpha in precompute mode. The vbt-pro mode dispatch is documented in vbtpro-integration.

Dispatching from YAML

Three equivalent ways to pick an engine inside a strategy recipe:

# 1) Engine shortcut (cleanest).
backtest:
  engine: vbt-pro:signals  # or vbt-pro:orders / :optimizer / :holding / :random
  kwargs:
    initial_cash: 100000
    fees: 0.0005

# 2) Explicit class + module.
backtest:
  class: VectorbtProEngine
  module_path: alphaswarm.backtest.vbtpro.engine
  kwargs:
    mode: orders
    initial_cash: 100000

# 3) Fallback cascade.
backtest:
  engine: fallback
  primary: vbt-pro
  fallbacks: [event, aat, zvt, vectorbt]

| Shortcut | Resolves to | Notes |
| --- | --- | --- |
| default / event / event-driven | EventDrivenBacktester | Backward-compatible default. |
| primary / vbt-pro / vectorbt-pro | VectorbtProEngine | Tier 1. |
| vbt-pro:signals / :orders / :optimizer / :holding / :random | VectorbtProEngine | Mode injection. |
| vectorbt / vbt | VectorbtEngine | OSS fallback. |
| backtesting / bt | BacktestingPyEngine | Single-symbol. |
| zvt | ZvtBacktestEngine | Lazy import; CN bars. |
| aat | AatBacktestEngine | Lazy import; async LOB. |
| hft / lob | LobBacktestEngine | Tick replay. |
| fallback / cascade | FallbackBacktestEngine | Cascade with DEFAULT_FALLBACK_CHAIN = ("event", "aat", "zvt", "vectorbt"). |

alphaswarm.backtest.runner.run_backtest_from_config routes every YAML through the right engine and stamps engine into BacktestRun.metrics.
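The shortcut table can be read as a lookup plus an optional `:mode` suffix. The sketch below mirrors that table; the `resolve_engine` helper and its error handling are hypothetical and are not the runner's actual resolver. It also assumes, for simplicity, that the mode suffix is accepted on any alias that resolves to VectorbtProEngine.

```python
# Alias -> engine class name, copied from the shortcut table.
SHORTCUTS = {
    "default": "EventDrivenBacktester",
    "event": "EventDrivenBacktester",
    "event-driven": "EventDrivenBacktester",
    "primary": "VectorbtProEngine",
    "vbt-pro": "VectorbtProEngine",
    "vectorbt-pro": "VectorbtProEngine",
    "vectorbt": "VectorbtEngine",
    "vbt": "VectorbtEngine",
    "backtesting": "BacktestingPyEngine",
    "bt": "BacktestingPyEngine",
    "zvt": "ZvtBacktestEngine",
    "aat": "AatBacktestEngine",
    "hft": "LobBacktestEngine",
    "lob": "LobBacktestEngine",
    "fallback": "FallbackBacktestEngine",
    "cascade": "FallbackBacktestEngine",
}

VBT_PRO_MODES = {"signals", "orders", "optimizer", "holding", "random"}


def resolve_engine(shortcut: str):
    """Split 'name[:mode]' and return (engine class name, injected kwargs)."""
    name, _, mode = shortcut.strip().lower().partition(":")
    if name not in SHORTCUTS:
        raise KeyError(f"unknown engine shortcut: {shortcut!r}")
    class_name = SHORTCUTS[name]
    kwargs = {}
    if mode:
        # Only the vbt-pro engine takes a mode suffix in the table above.
        if class_name != "VectorbtProEngine" or mode not in VBT_PRO_MODES:
            raise ValueError(f"unsupported mode in shortcut: {shortcut!r}")
        kwargs["mode"] = mode
    return class_name, kwargs


print(resolve_engine("vbt-pro:orders"))
print(resolve_engine("lob"))
```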

Agent + ML components

Strategies plug agents and ML models into either path:

  • Vectorised (vbt-pro) — panel components in alphaswarm/strategies/vbtpro/:
    • AgenticVbtAlpha — precompute or per-window agent dispatch into wide entries / exits / size arrays.
    • MLVbtAlpha — wraps any alphaswarm_models.base.Model (or MLflow URI) and emits arrays via threshold / top-k / rank policies.
    • AgenticOrderModel — drives Portfolio.from_orders from cached agent decisions.
  • Event-driven — context['agents'] exposes AgentDispatcher. See AgentAwareMomentumAlpha for a worked example.

For RL injection, every engine that declares EngineCapabilities.supports_rl_injection=True accepts the WeightCentricPipeline output through context['rl_agent'] (AGENTS rule 38).

Unified result shape

Every engine returns a BacktestResult with:

  • equity_curve: pd.Series indexed by timestamp.
  • trades: pd.DataFrame with timestamp, vt_symbol, side, quantity, price, commission, slippage, strategy_id.
  • orders: pd.DataFrame.
  • summary: dict — sharpe, sortino, max_drawdown, calmar, total_return, final_equity, n_bars, volatility_ann, n_trades, turnover, engine. Engine-specific keys live under vbt_*, bt_*, zvt_*, aat_*, hft_* so downstream code can light up native stats without re-running.
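Because engine-specific keys share the summary dict with the common metrics, downstream code typically needs to separate the two. A minimal sketch of that split follows, assuming the prefixes listed above; the `split_summary` helper, the `vbt_win_rate` key, and the sample values are made up for illustration.

```python
ENGINE_PREFIXES = ("vbt_", "bt_", "zvt_", "aat_", "hft_")


def split_summary(summary: dict):
    """Separate engine-agnostic metrics from engine-native ones, grouped by prefix."""
    common, native = {}, {}
    for key, value in summary.items():
        prefix = next((p for p in ENGINE_PREFIXES if key.startswith(p)), None)
        if prefix is None:
            common[key] = value
        else:
            # Group under the prefix without its trailing underscore, e.g. "vbt".
            native.setdefault(prefix[:-1], {})[key[len(prefix):]] = value
    return common, native


# Sample values are illustrative, not real backtest output.
summary = {
    "sharpe": 1.2,
    "max_drawdown": -0.08,
    "engine": "VectorbtProEngine",
    "vbt_win_rate": 0.55,  # hypothetical engine-native key
}

common, native = split_summary(summary)
print(common)
print(native)
```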

Hash-locked specs + audit ledger

Every dispatched backtest writes a row to backtest_runs with experiment_id (AGENTS rule 34) and a reference to the hash-locked StrategySpec version. The same spec hash returns the same *_spec_versions row on re-dispatch; content changes always create a new version. This makes every backtest replayable.
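The "same hash, same version row" behaviour can be illustrated with a content hash over a canonical serialisation. This page does not say how the hash is computed, so SHA-256 over sorted-key JSON is an assumption, and `SpecVersionStore` is a hypothetical in-memory stand-in for the *_spec_versions table.

```python
import hashlib
import json


def spec_hash(spec: dict) -> str:
    """Hash a spec by content; key order must not affect the digest."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class SpecVersionStore:
    """In-memory stand-in: one version number per distinct content hash."""

    def __init__(self):
        self._versions = {}  # digest -> version number

    def get_or_create(self, spec: dict):
        digest = spec_hash(spec)
        if digest not in self._versions:
            self._versions[digest] = len(self._versions) + 1
        return self._versions[digest], digest


store = SpecVersionStore()
v1, h1 = store.get_or_create({"engine": "vbt-pro:signals", "fees": 0.0005})
# Same content with a different key order resolves to the same version.
v1_again, h1_again = store.get_or_create({"fees": 0.0005, "engine": "vbt-pro:signals"})
# A content change creates a new version.
v2, h2 = store.get_or_create({"engine": "vbt-pro:signals", "fees": 0.001})

print(v1, v1_again, v2)
```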

Gold-tier output lands at alphaswarm_gold_backtests.run_<run_id> via iceberg_catalog.append_arrow with medallion_layer="gold" (AGENTS rule 3, rule 21).

Worked example: dispatch + tearsheet

Goal: dispatch a backtest, tail its WebSocket frames, list the ledger row via DataMCP, and render an equity curve in your browser.

Step 1 — dispatch

Step 2 — tail the WebSocket

curl -N http://localhost:8000/chat/stream/<task_id>

Frames arrive in the canonical {task_id, stage, message, timestamp, **extras} envelope (AGENTS rule 4). Expected stages: start → bar.processed (×N) → metrics.computed → done.
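A small checker makes the envelope and stage order concrete. The sketch assumes frames arrive as one JSON object per line, which this page does not state; `check_frames` is a hypothetical helper, and the sample frames (task id, messages, timestamps) are invented.

```python
import json

REQUIRED_KEYS = {"task_id", "stage", "message", "timestamp"}


def check_frames(lines) -> int:
    """Validate envelope keys and stage order; return the number of bar frames."""
    stages = []
    for line in lines:
        frame = json.loads(line)
        missing = REQUIRED_KEYS - frame.keys()
        if missing:
            raise ValueError(f"frame is missing keys: {sorted(missing)}")
        stages.append(frame["stage"])
    bars = stages.count("bar.processed")
    expected = ["start"] + ["bar.processed"] * bars + ["metrics.computed", "done"]
    if stages != expected:
        raise ValueError(f"unexpected stage order: {stages}")
    return bars


def make_frame(stage: str, **extras) -> str:
    # Invented sample data; extra keys ride along as the **extras part of the envelope.
    frame = {"task_id": "t-1", "stage": stage, "message": "", "timestamp": "2024-01-01T00:00:00Z"}
    frame.update(extras)
    return json.dumps(frame)


sample = [
    make_frame("start"),
    make_frame("bar.processed", bar_index=0),
    make_frame("bar.processed", bar_index=1),
    make_frame("metrics.computed"),
    make_frame("done", run_id="r-1"),
]

print(check_frames(sample))
```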

Step 3 — list via DataMCP

The data.backtests.list tool is the agent-safe alternative to a raw SELECT * FROM backtest_runs. From any MCP client:

curl -X POST http://localhost:8000/mcp/data/tools/data.backtests.list/invoke \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(alphaswarm-cli auth token)" \
  -d '{"limit": 5, "order_by": "started_at_desc"}'

Step 4 — equity curve in Pyodide

Render the equity curve client-side from inline sample points so the snippet stays self-contained. Replace with a fetch to /analytics/portfolio/<run_id>/equity-curve.json when running against the real platform.
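As a dependency-free sketch of that step, the snippet below turns inline sample points into an SVG polyline. It is not the page's original snippet: `equity_curve_svg` is a hypothetical helper, the equity values are made up, and a real Pyodide cell might prefer a plotting library.

```python
def equity_curve_svg(points, width=400, height=120, pad=10):
    """Scale equity values into an SVG polyline (y axis flipped so higher is up)."""
    lo, hi = min(points), max(points)
    span = (hi - lo) or 1.0  # avoid division by zero on a flat curve
    step = (width - 2 * pad) / max(len(points) - 1, 1)
    coords = []
    for i, value in enumerate(points):
        x = pad + i * step
        y = height - pad - (value - lo) / span * (height - 2 * pad)
        coords.append(f"{x:.1f},{y:.1f}")
    polyline = " ".join(coords)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f'<polyline fill="none" stroke="steelblue" stroke-width="2" points="{polyline}"/>'
        "</svg>"
    )


# Inline sample points (illustrative values, not real backtest output).
sample_equity = [100000.0, 100850.0, 100420.0, 101900.0, 101300.0, 103250.0]

svg = equity_curve_svg(sample_equity)
print(svg)
```

In a browser-hosted Pyodide session the resulting string can be attached to the page as HTML; against the real platform, the inline list would be replaced by the points fetched from the equity-curve endpoint mentioned above.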

Step 5 — verify

  • backtest_runs row with non-NULL sharpe, engine='VectorbtProEngine'.
  • WebSocket emitted a stage=done frame with the matching run_id.
  • alphaswarm_gold_backtests.run_<run_id> Iceberg table exists.
  • data.backtests.describe { run_id } MCP call returns the full row.

What next

Deeper reads

  • vbtpro-integration — vbt-pro mode dispatch, Numba constraints, hooks, walk-forward, Param sweeps, IndicatorFactory bridge.
  • hft-backtest — LOB engine, latency profiles, queue models, the five HFT strategies under alphaswarm/strategies/hft/.
  • strategy-lifecycle — draft → backtested → paper → live.
  • strategy-development — composer / simulation / ideation / single / batch / compare routes in the operator UI.
  • factor-research — building factor / alpha strategies.
  • ml-alpha-backtest — AlphaBacktestExperiment orchestrator + MLAlphaBacktestRun schema.
  • class-diagram — full engine class hierarchy + BacktestResult shape.
  • reference/api — the backtest tag (interactive playground).
  • reference/python — auto-generated reference for alphaswarm.backtest and alphaswarm.strategies.