Agentic pipeline

Doc map: intro · Sequence diagrams: flows · Spec-pattern primer: agentic-development · Multi-agent topologies: multi-agent-patterns · Orchestration adapters: workflow-studio · Worked tutorial: tutorials/first-agent-workflow.

This page walks through the AlphaSwarm agentic-trading lifecycle: pick a model, register a data source, snapshot the spec, dispatch through the workflow runtime, and review the run. Every action has a REST + CLI surface so you can script the same flow; every action also has an alphaswarm_client (Vite UI) route at alpha-swarm.ai so a human can drive it.

The pipeline has five stages. The stage added since the prior version of this doc is Spec snapshot: every spec-driven run now hash-locks into an immutable *_spec_versions row before any work happens.

1 - Models and providers

Open /models in the operator UI (alphaswarm_client). The page lives at alphaswarm_client/src/routes/models/ and exposes three tabs:

Ollama (host): type a model tag in Pull a model (e.g. nemotron, llama3.2, qwen2:7b) and click Pull. A Celery task streams progress over the canonical /chat/stream/{task_id} envelope so the page shows a real-time download bar.
vLLM: every YAML under configs/llm/ becomes a profile card showing compose status, served models, and Start / Stop buttons. Starting a profile auto-saves its base_url as the active vLLM endpoint.
SERA-32B: opt-in Ai2 Open Coding model for the codebase MCP elaborator (see sera). Set ALPHASWARM_SERA_ENABLED=true and ALPHASWARM_SERA_ENDPOINT in your environment.

Every model call routes through router_complete (AGENTS rule 2). Provider selection is declared in AgentSpec.model and the runtime drives the call, so never call router_complete directly from inside an agent body (AGENTS rule 12).

REST equivalents (the streaming endpoints return TaskAccepted):

curl -X POST localhost:8000/agentic/models/pull \
    -H 'content-type: application/json' \
    -d '{"name":"llama3.2"}'

curl -X DELETE localhost:8000/agentic/models/llama3.2
curl -X GET    localhost:8000/agentic/models/running
curl -X GET    localhost:8000/agentic/vllm/profiles
curl -X POST   localhost:8000/agentic/vllm/start \
    -H 'content-type: application/json' \
    -d '{"profile":"vllm_nemotron"}'
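The same calls can be scripted from Python. The sketch below only builds the requests with the standard library and does not send them. It assumes the API listens on http://localhost:8000 as in the curl examples; the helper names build_pull_request and build_vllm_start_request are hypothetical, not part of any AlphaSwarm client.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: same host and port as the curl examples


def _post_json(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the given path."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_pull_request(name: str) -> urllib.request.Request:
    """Request for POST /agentic/models/pull with the model tag to pull."""
    return _post_json("/agentic/models/pull", {"name": name})


def build_vllm_start_request(profile: str) -> urllib.request.Request:
    """Request for POST /agentic/vllm/start with the profile to start."""
    return _post_json("/agentic/vllm/start", {"profile": profile})
```

Passing either request to urllib.request.urlopen would send it; keeping construction separate from sending makes the payloads easy to inspect before they hit a live server.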

2 - Data sources

Open /data/hub in the operator UI. This is the active replacement for the legacy Solara explorer pages.

The Hub exposes the four data-plane tiers (see data-plane):

Discovery browser: unified ingested / pending / orphan / external_only entries; filter chips drive the DiscoveryService.
Iceberg Editor: namespace browser + parquet preview + column profiling.
Airbyte builder: schema-driven connector editor at alphaswarm_client/src/components/airbyte/builder/. Emits either Airbyte YAML or an AlphaSwarm-native Fetcher stub. There are no free-text credential fields; every secret resolves through <EntityPicker kind="credentials" /> (AGENTS rule 31).
Dagster sandbox: ephemeral per-session Dagster + Airbyte environment (AGENTS rule 32).

REST surface:

curl -X GET  http://localhost:8000/discovery/entries
curl -X POST http://localhost:8000/sources/alpha_vantage/probe
curl -X POST http://localhost:8000/discovery/entries/<id>/promote
curl -X POST http://localhost:8000/dagster/sandbox/sessions

Or invoke the data MCP tools directly:

curl -X POST http://localhost:8000/mcp/data/tools/data.discovery.browse/invoke \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $(alphaswarm-cli auth token)" \
    -d '{"namespace_prefix":"alphaswarm_silver_yfinance"}'
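A Python equivalent of the MCP tool call, as a minimal sketch. The URL pattern /mcp/data/tools/<tool>/invoke and the bearer header come from the curl example above; the function name build_mcp_invoke is hypothetical, and the token is taken as a plain argument (for instance the output of alphaswarm-cli auth token) rather than fetched here.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000"  # assumption: same host and port as the curl examples


def build_mcp_invoke(tool: str, arguments: dict, token: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST to /mcp/data/tools/<tool>/invoke."""
    headers = {"Content-Type": "application/json"}
    if token:
        # Bearer token as in the curl example; omitted when no token is given.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{BASE_URL}/mcp/data/tools/{tool}/invoke",
        data=json.dumps(arguments).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

For example, build_mcp_invoke("data.discovery.browse", {"namespace_prefix": "alphaswarm_silver_yfinance"}, token) mirrors the curl call.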

3 - Spec snapshot

Every spec-driven run hash-locks the spec into a *_spec_versions row before any work happens. The same content always returns the same version_id; any field change creates a new row; old rows stay forever for replay. This is the invariant that makes the entire agentic pipeline auditable.
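The invariant can be illustrated with a small content-addressed store. This is a sketch under stated assumptions, not AlphaSwarm's implementation: the page does not say how the hash is computed, so SHA-256 over canonical JSON is assumed here, the hash is used directly as the version_id, and SpecVersionStore is a hypothetical in-memory stand-in for a *_spec_versions table.

```python
import copy
import hashlib
import json


def spec_hash(spec: dict) -> str:
    """Hash a spec by content; key order must not affect the result."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class SpecVersionStore:
    """Hypothetical append-only stand-in for a *_spec_versions table."""

    def __init__(self) -> None:
        self._rows = {}

    def snapshot(self, spec: dict) -> str:
        version_id = spec_hash(spec)  # assumption: version_id is the content hash
        # Same content: the existing row is returned untouched.
        # Any field change: a new hash, hence a new row. Old rows are never removed.
        self._rows.setdefault(version_id, copy.deepcopy(spec))
        return version_id

    def __len__(self) -> int:
        return len(self._rows)
```

Snapshotting the same content twice leaves one row; changing any field adds a second row while the first stays available for replay.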

Five hash-locked spec types ship today:

Spec	Runtime	Versions table	AGENTS rule
`AgentSpec`	`AgentRuntime`	`agent_spec_versions`	12-13
`BotSpec`	`BotRuntime`	`bot_versions`	14-15
`RLExperimentSpec`	`RLRuntime`	`rl_experiment_versions`	16-17
`AnalysisSpec`	`AnalysisRuntime`	`analysis_spec_versions`	23-24
`WorkflowSpec`	`WorkflowRuntime`	`workflow_spec_versions`	40-41

Plus two additive ones from the management engine:

Spec	Runtime	Versions table	AGENTS rule
`TerraformStackSpec`	`TerraformRuntime`	`terraform_stack_spec_versions`	42-43
(workload ops)	`WorkloadRuntime`	`workload_runs` (write-only ledger)	45

REST:

# AgentSpec
# --data-binary sends the file byte-for-byte; plain -d strips newlines, which would corrupt YAML.
curl -X POST http://localhost:8000/agents/specs \
    -H "Content-Type: application/json" \
    --data-binary @configs/agents/research_lite.yaml

# WorkflowSpec
curl -X POST http://localhost:8000/workflows/specs \
    -H "Content-Type: application/json" \
    --data-binary @configs/workflows/my-research-loop.yaml

4 - Workflow dispatch

WorkflowRuntime is the additive control plane that composes every spec runtime into multi-node DAGs. It ships with seven OrchestrationAdapter kinds (AGENTS rule 40):

graph: LangGraph state machine
crew: CrewAI manager-pattern crew
debate: bounded debate with N participants
fusion: fan-out / fan-in
execution: wraps an RLRuntime / BotRuntime / AnalysisRuntime as a single node
schedule: cron-triggered, idempotent
studio: operator-driven UI wiring at /workflows

Dispatch:

curl -X POST http://localhost:8000/workflows/<name>/run \
    -H "Content-Type: application/json" \
    -d '{"inputs": {...}}'

The runtime:

Re-hash-locks every referenced spec (idempotent).
Opens a workflow_runs row with status=pending.
Builds the adapter DAG.
Walks nodes; for each, opens an agent_runs_v2 row and delegates to the relevant runtime.
Emits canonical progress frames at every transition.
Calls should_halt() before every step, so the topbar KillSwitch reaches every node within ~250 ms.
Enforces cost_caps (per_node_max_tokens, per_run_max_usd) per AGENTS rule 12.
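The node walk, kill-switch check, and cost caps can be sketched as a simple loop. This is an illustration under assumptions, not the WorkflowRuntime code: should_halt, per_node_max_tokens, and per_run_max_usd are names from the list above, while walk_nodes, the "halted" / "failed" / "completed" statuses, and the halted and cost_cap_exceeded stage names are hypothetical. For brevity the caps are checked after each node finishes.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class CostCaps:
    per_node_max_tokens: int
    per_run_max_usd: float


def walk_nodes(
    nodes: Iterable[Tuple[str, Callable[[], Tuple[int, float]]]],
    caps: CostCaps,
    should_halt: Callable[[], bool],
) -> Tuple[str, List[str]]:
    """Run nodes in order; each node callable returns (tokens, usd) spent."""
    frames: List[str] = []
    spent_usd = 0.0
    for name, run_node in nodes:
        # Kill switch is consulted before every step.
        if should_halt():
            frames.append(f"node.{name}.halted")  # hypothetical stage name
            return "halted", frames
        frames.append(f"node.{name}.started")
        tokens, usd = run_node()
        spent_usd += usd
        if tokens > caps.per_node_max_tokens or spent_usd > caps.per_run_max_usd:
            frames.append(f"node.{name}.cost_cap_exceeded")  # hypothetical stage name
            return "failed", frames
        frames.append(f"node.{name}.completed")
    return "completed", frames
```

Checking should_halt at the top of each iteration is what bounds the kill-switch latency to one step; a real runtime would presumably also check inside long-running nodes.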

Replay:

curl -X POST http://localhost:8000/workflows/runs/<run_id>/replay

Replay reuses the same workflow_spec_versions row + every referenced *_spec_versions row; a new workflow_runs row lands with a parent_run_id pointer.
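As a rough model of that behaviour, the sketch below creates a replay row that points back at its parent and keeps the same spec version. The parent_run_id field and the pending status come from this page; the WorkflowRun class, the workflow_spec_version_id field name, and the use of random UUIDs for run ids are assumptions for illustration.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WorkflowRun:
    """Hypothetical in-memory shape of a workflow_runs row."""
    run_id: str
    workflow_spec_version_id: str  # assumed column name
    parent_run_id: Optional[str] = None
    status: str = "pending"


def replay(original: WorkflowRun) -> WorkflowRun:
    """Create a new run row that reuses the original's spec version."""
    return WorkflowRun(
        run_id=str(uuid.uuid4()),
        workflow_spec_version_id=original.workflow_spec_version_id,
        parent_run_id=original.run_id,
    )
```

The original row is never modified; the audit chain is the linked list formed by parent_run_id.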

5 - Review

Three review surfaces, each consuming the same canonical ledger:

WebSocket stream

The frame envelope is {task_id, stage, message, timestamp, **extras} per AGENTS rule 4. Subscribe from any client:

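// task_id identifies the task whose progress you want to follow.
const ws = new WebSocket(`ws://localhost:8000/chat/stream/${task_id}`);
ws.onmessage = (e) => {
  // Assumption: **extras means extra keys sit at the top level of the frame,
  // so gather them with a rest pattern instead of reading f.extras.
  const { task_id: frameTaskId, stage, message, timestamp, ...extras } = JSON.parse(e.data);
  console.log(timestamp, stage, message, extras);
};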
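For Python consumers, a frame can be parsed into a typed object along these lines. This is a sketch that assumes the **extras part of the envelope means additional keys appear at the top level of the JSON object; the Frame class and parse_frame function are hypothetical, and the timestamp is left untyped because its format is not specified here.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict

CORE_KEYS = ("task_id", "stage", "message", "timestamp")


@dataclass
class Frame:
    task_id: str
    stage: str
    message: str
    timestamp: Any  # format not specified on this page
    extras: Dict[str, Any] = field(default_factory=dict)


def parse_frame(raw: str) -> Frame:
    """Split a JSON frame into the four core fields plus everything else."""
    data = json.loads(raw)
    missing = [key for key in CORE_KEYS if key not in data]
    if missing:
        raise ValueError(f"frame is missing keys: {missing}")
    extras = {k: v for k, v in data.items() if k not in CORE_KEYS}
    return Frame(**{k: data[k] for k in CORE_KEYS}, extras=extras)
```

Rejecting frames that lack a core key keeps downstream code from silently handling malformed messages.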

`agent_runs_v2` + `workflow_runs` ledger

Agent-safe reads via DataMCP:

curl -X POST http://localhost:8000/mcp/data/tools/data.workflows.describe/invoke \
    -H "Content-Type: application/json" \
    -d '{"workflow_run_id": "<id>"}'

curl -X POST http://localhost:8000/mcp/data/tools/data.agents.list_runs/invoke \
    -H "Content-Type: application/json" \
    -d '{"workflow_run_id": "<id>", "limit": 20}'

Each row carries experiment_id + test_id (AGENTS rule 34), total_tokens, total_cost_usd, and a full per-step breakdown under agent_run_steps.

Inkeep AI assistant + docs MCP server

Two new surfaces in 2026-05:

Inkeep widget in-product. The "Ask AI" button in alphaswarm_client routes to an Inkeep agent that has the entire docs corpus + every public AlphaSwarm API spec ingested. It cites by URL and never invents references.
Docs MCP server at docs.alpha-swarm.ai/mcp. An RFC 9728 + 8707 compliant Cloudflare Worker (AGENTS rule 49). Cursor / Claude / Continue / custom scripts connect to it for search, fetch_page, and list_pages over the same corpus. In-platform agents reach it through the bridged data.docs.* MCP tools.

Both surfaces compose with the workflow runtime: a workflow node can call Inkeep / the docs MCP server as an external tool, and the agent_runs_v2 row records the call.

Worked example: build a research workflow

Goal: snapshot an AgentSpec + WorkflowSpec, dispatch the workflow, tail progress, and inspect the ledger, all from this page.

Step 1 - snapshot an `AgentSpec`

Re-running with identical content returns the same (spec_id, version_id); the runtime treats it as a no-op.

Step 2 - snapshot a `WorkflowSpec` that references it

Step 3 - dispatch

Step 4 - tail progress

curl -N http://localhost:8000/chat/stream/<task_id>

You will see frames in the canonical envelope. Expected stages: workflow.started → node.research.started → agent.token (×N) → node.research.completed → workflow.completed.
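When scripting this check, the repeated agent.token frames can be collapsed before comparing against the expected sequence. A minimal sketch, assuming at least one agent.token frame arrives; the helper names are hypothetical.

```python
from itertools import groupby
from typing import Iterable, List

EXPECTED_STAGES = [
    "workflow.started",
    "node.research.started",
    "agent.token",
    "node.research.completed",
    "workflow.completed",
]


def collapse_repeats(stages: Iterable[str]) -> List[str]:
    """Collapse consecutive duplicates so N agent.token frames count once."""
    return [stage for stage, _ in groupby(stages)]


def follows_expected_order(stages: Iterable[str]) -> bool:
    """True when the observed stages match the expected single-node sequence."""
    return collapse_repeats(stages) == EXPECTED_STAGES
```

Feed it the stage field of each frame in arrival order; a False result means a stage was skipped, reordered, or the run stopped early.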

Step 5 - inspect the ledger

Call data.workflows.describe for the run's workflow_run_id, then compare the token and cost totals in the response against the workflow's cost caps.
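The payload below is a hypothetical sample, not captured output: the field names experiment_id, test_id, total_tokens, total_cost_usd, and agent_run_steps follow the ledger description earlier on this page, while the overall shape (an agent_runs list), the per-step keys, and every value are invented for illustration. The summarize helper is likewise an assumption.

```python
# Hypothetical sample of a describe result; all values are made up.
sample = {
    "workflow_run_id": "<id>",
    "agent_runs": [
        {
            "experiment_id": "exp-demo",
            "test_id": "test-demo",
            "total_tokens": 1200,
            "total_cost_usd": 0.018,
            "agent_run_steps": [
                {"step": 1, "tokens": 700, "cost_usd": 0.0105},
                {"step": 2, "tokens": 500, "cost_usd": 0.0075},
            ],
        }
    ],
}


def summarize(describe_result: dict, per_run_max_usd: float) -> dict:
    """Total tokens and cost across agent runs, checked against the run cap."""
    runs = describe_result["agent_runs"]
    total_tokens = sum(run["total_tokens"] for run in runs)
    total_cost = sum(run["total_cost_usd"] for run in runs)
    return {
        "total_tokens": total_tokens,
        "total_cost_usd": total_cost,
        # Treats spending exactly at the cap as within budget (an assumption).
        "under_cap": total_cost <= per_run_max_usd,
    }
```

Calling summarize(sample, per_run_max_usd=0.05) reports the run as under its cap; the same pattern applies to a real response once its actual shape is known.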

Step 6 - verify

agent_spec_versions row exists with the recorded spec_hash.
workflow_spec_versions row exists; its content references the agent_spec_versions row from Step 1.
One workflow_runs row + one agent_runs_v2 row (one node).
total_cost_usd is under the workflow's per_run_max_usd cap.
Re-dispatching by triggering Step 3 again creates a NEW workflow_runs row but reuses ALL the same *_spec_versions rows.

What next

Walk the full tutorial: tutorials/first-agent-workflow.
Add a second node: concepts/agentic/workflow-studio (the seven adapter kinds).
Read the topology catalogue: concepts/agentic/multi-agent-patterns.
Snapshot an agent spec from the CLI: how-to/recipes/snapshot-an-agent-spec.

The four-runtime story

This pipeline is one of four overlapping execution surfaces. Each has its own concept doc but they all share the same hash-lock invariant, the same canonical progress frame, the same kill-switch fan-out, and the same experiment_id audit chain.

Runtime	Lifecycle surface	Worked tutorial	Concept doc
`AgentRuntime`	Single agent, single spec	(covered here)	agents
`BotRuntime`	Bot = universe + strategy + ML + agents + RAG + risk	tutorials/first-bot	bots
`RLRuntime`	Train / evaluate / paper / replay / walk-forward	tutorials/first-rl-experiment	concepts/rl/rl-framework
`WorkflowRuntime`	Composition layer over the other three	tutorials/first-agent-workflow	workflow-studio

Hard rules (agentic-pipeline scope)

The full set is in AGENTS.md. The agentic-pipeline subset:

Rules 12-13: All spec-driven agent runs go through AgentRuntime; agent_spec_versions rows are immutable.
Rule 22: Agents never read Postgres / Iceberg directly; every read goes through a DataMCPTool.
Rule 34: Every run-producing flow populates experiment_id.
Rule 40: All workflow lifecycle actions go through WorkflowRuntime.
Rule 41: workflow_spec_versions rows are immutable hash-locked snapshots.
Rule 49: Every MCP server is RFC 9728 + 8707 conformant.
Rule 54: Delegated agent tokens for HTTP MCP calls go through TokenExchangeBroker (RFC 8693 + Auth0 Custom Token Exchange Profile alphaswarm-agent-delegation).

Deeper reads

agentic-development: AlphaSwarm's spec-pattern mapped to the broader agentic-coder vocabulary.
agents: AgentSpec schema + AgentRuntime lifecycle.
multi-agent-patterns: sequential / parallel / debate / coordinator / ReAct topologies.
workflow-studio: the additive WorkflowRuntime + seven adapter kinds.
orchestration-refactor-rollout: operator rollout / rollback runbook.
alpha-researcher-agent, research-agents, selection-agents, trader-agents, analysis-agents: domain agent suites.
bots: bot entity (TradingBot / ResearchBot) and BotRuntime.
agent-watchdog: Celery beat task that halts stalled agent_runs_v2 rows.
reference/api: the agents + workflows tags (interactive playground).
reference/python/alphaswarm/agents: auto-generated Python reference.
