Agentic development for AlphaSwarm

The single doc that connects AlphaSwarm's existing primitives to the broader "agentic-coder" vocabulary, plus the consolidated security manifesto. Doc map: alphaswarm_docs/index.md · Workflow: ../WORKFLOW.md · Hard rules: ../AGENTS.md.

What this doc is for

The agentic-coder research literature talks about "skill artifacts", "skill graphs", "Memento-skills", "auditable execution trails", and "MCP control planes" as if they were novel patterns to invent. AlphaSwarm already implements every one of them under different names, with stronger invariants and ledger-backed audit chains. This doc makes that mapping explicit so you don't waste time inventing a parallel "skill" surface alongside the spec runtimes that already exist.

The doc has three sections:

AlphaSwarm's spec-pattern is the skill-artifact pattern. The spec-runtime architecture (Agent / Bot / RL / Analysis / Workflow / Terraform) is the skill-graph + Memento-skill equivalent. This section also covers where AlphaSwarm deliberately diverges from research recommendations.
Working with Cursor agents in AlphaSwarm. Static channel + dynamic channel + plan-mode vs agent-mode usage.
The ADLC security manifesto. Consolidated for the first time.

1. The spec-pattern is the skill-artifact pattern

The five spec runtimes

Spec	Runtime	Versions table	Canonical doc
`AgentSpec`	`AgentRuntime`	`agent_spec_versions`	agents.md
`BotSpec`	`BotRuntime`	`bot_versions`	bots.md
`RLExperimentSpec`	`RLRuntime`	`rl_experiment_versions`	rl-framework.md
`AnalysisSpec`	`AnalysisRuntime`	`analysis_spec_versions`	analysis-framework.md
`WorkflowSpec`	`WorkflowRuntime`	`workflow_spec_versions`	workflow-studio.md

WorkflowSpec (Phase 5 of the additive orchestration refactor) sits above the four classic runtimes: it composes them through the OrchestrationAdapter registry. A workflow can wrap an existing AgentRuntime invocation (via the LangGraphAdapter / CrewProcessAdapter / DialecticalDebateAdapter) or chain deterministic fusion + risk-overlay execution (via SignalFusionAdapter + WeightCentricExecutionAdapter). All five runtimes share the same hash-locked + immutable + ledger-backed semantics described below.

Each is:

Declarative — a Pydantic model with strict types.
Hash-locked — the SHA-256 of the canonical-JSON-serialized spec is the version key.
Auto-versioned — first run snapshots a row in the *_versions table; behaviour changes produce new rows; old rows are immutable.
Ledger-backed — every run records spec_version_id so the exact run can be deterministically replayed against historical data.
Discoverable — the registry pattern (built-ins + YAML auto-loading) means new specs come online without touching the runtime.
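
The hash-locking property can be illustrated with a minimal sketch. It assumes that "canonical JSON" means sorted keys and compact separators, and it uses a plain dict as a stand-in for the Pydantic model. `canonical_json` and `spec_hash` are hypothetical helper names, and the spec fields are illustrative; AlphaSwarm's actual canonicalization may differ.

```python
import hashlib
import json
from typing import Any


def canonical_json(spec: dict[str, Any]) -> str:
    """Serialize with sorted keys and fixed separators so equal specs give equal text."""
    return json.dumps(spec, sort_keys=True, separators=(",", ":"), ensure_ascii=True)


def spec_hash(spec: dict[str, Any]) -> str:
    """SHA-256 hex digest of the canonical JSON form, used here as the version key."""
    return hashlib.sha256(canonical_json(spec).encode("utf-8")).hexdigest()


# Hypothetical spec payloads (field names are illustrative only).
a = {"name": "momentum", "lookback_days": 20}
b = {"lookback_days": 20, "name": "momentum"}   # same content, different key order
c = {"name": "momentum", "lookback_days": 30}   # behaviour change

same_key = spec_hash(a) == spec_hash(b)         # key order does not affect the hash
changed_key = spec_hash(a) != spec_hash(c)      # a changed value yields a new key
```

Because the key depends only on content, re-submitting an unchanged spec maps to the existing version, while any behaviour change yields a new one.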

Mapping to research vocabulary

The 2024–2026 agentic-coder literature uses several overlapping terms. Here's how each maps onto AlphaSwarm's primitives:

Research term	AlphaSwarm equivalent	Notes
"Skill artifact"	One row in a `*_versions` table	The artifact has a semantic interface (the Pydantic spec), preconditions (the spec's input schema), an executable payload (the runtime invocation), and deterministic postconditions (the run row + Iceberg outputs).
"Skill graph"	The full registry across the active spec runtimes	Each runtime hosts one graph; `BotSpec` references `AgentSpec`s, `RLExperimentSpec` references data pipelines, `AnalysisSpec` references flows, and orchestration/deployment specs compose the runtime graph at higher levels.
"Auditable execution trail"	`*_runs` ledger rows + Iceberg outputs + per-step result tables	E.g. `analysis_runs` + `analysis_step_results` + `alphaswarm_gold_analysis_<flow.namespace>`
"MCP control plane"	The DataMCPTool catalog	One catalog, three transports (in-process bridge + FastAPI router + stdio binary). See data-mcp.md.
"Memento-skill / continual learning"	Re-snapshot on change	When a spec changes, `persist_spec` inserts a new version row — old versions stay for replay. The "memory" is the immutable history.
"Verifiable rewards"	The `*_runs` ledger + cost caps + guardrails on the runtime	Telemetry covers cost, latency, and outcome metrics.

Where AlphaSwarm deliberately diverges

The research recommends some patterns that AlphaSwarm rejects on purpose:

"Rewrite the skill on failure" / self-modifying skills. The research literature (e.g. "new framework lets AI agents rewrite their own skills without retraining") advocates patching a failing skill in place. AlphaSwarm forbids this. Reasons:
- Auditability: every behaviour change must be a new hash-locked version row, not an in-place mutation.
- Replay: runs reference spec_version_id for replay; mutating the spec breaks the replay invariant.
- Compliance: financial systems need an append-only audit trail.
- Risk: a self-mutating spec next to live capital is a non-starter.
The right pattern in AlphaSwarm: when a spec fails, author a new spec version (manually or via tooling), snapshot it, and switch traffic. The previous version remains for forensics.
"Skill graph self-improvement loops" that mutate skill metadata across runs. AlphaSwarm's metadata is owned by the active metadata layer (alphaswarm.data.catalog.register_dataset) and updated through explicit upserts — never as a side effect of a run.
"Free-form SQL tools for agents" to "let the model figure it out". AlphaSwarm requires every read to go through a registered DataMCPTool with a strict args schema and policy check. See data-mcp.md and the data-mcp.mdc Cursor rule.
"Auto-update implementation when intent changes" (intent-driven development with bidirectional updates). AlphaSwarm's docs are updated in the same PR that touches the code, by humans or under explicit human review. Drift detection is welcome; automatic mutation is not.

Adding a new spec — the canonical flow

Pick the right runtime by the question being answered:
- "What should an LLM-driven agent do?" → AgentSpec
- "What should a deployable bot (universe + strategy + risk + ML + agents + RAG) do?" → BotSpec
- "What should an RL experiment train / evaluate?" → RLExperimentSpec
- "What statistical / numerical analysis flow should run on a dataset?" → AnalysisSpec
Author the YAML or programmatic Pydantic instance.
Call the right persist_spec(...) (or let the registry do it on first lookup).
Run via the runtime — the first run snapshots a *_versions row.
The run row records spec_version_id and emits progress through alphaswarm/tasks/_progress.py.

If you find yourself wanting to "add a new skill artifact" outside this pattern — stop, read this section again, pick the right spec runtime.

2. Working with Cursor agents in AlphaSwarm

The two-channel context strategy

AlphaSwarm follows the static / dynamic context bifurcation pattern recommended for Cursor agents:

Static channel — what doesn't change between sessions:
- AGENTS.md — 45 hard rules
- .cursor/rules/ — glob-scoped rule files
- alphaswarm_docs/ — narrative architecture
Dynamic channel — what changes session-to-session:
- DataMCPTool catalog (live database schemas, dataset lineage, entity catalog)
- The agent_runs_v2 / bot_deployments / rl_runs / analysis_runs ledger rows
- The Cursor environment's recently-edited / open files / terminal state

The Cursor agent should treat the static channel as authoritative for rules and architecture, and the dynamic channel as authoritative for live state (don't guess a table schema — query the MCP catalog).

Plan mode vs agent mode

Mode	When	Restrictions
Plan mode	Complex / ambiguous tasks, architectural decisions, large refactors, anything with > 1 valid implementation	Read-only — cannot edit files
Agent mode	Single clear task, post-plan implementation, debugging once root cause is known	Full tool access
Background mode	Long-running tasks (Docker stack rebuild, full test suite, training runs)	Runs in parallel; non-blocking
Ask mode	"How does X work?" / read-only exploration	Cannot edit; can search

The ../WORKFLOW.md document has the full plan→act→reflect cadence including FAST vs SLOW velocity calibration and intervention nodes.

Reading the agent's plan output as a structured spec

When Cursor's plan mode produces a .cursor/plans/*.plan.md file, treat it like a *Spec artifact: the human reviews and approves it, then the agent executes the plan one task at a time, updating todos as it goes. The plan file is the contract.

3. ADLC security manifesto

The Agentic Development Life Cycle (ADLC) framing says: as agentic autonomy expands, the security posture must scale with it. AlphaSwarm already enforces several layers; this section consolidates them in one place so you can audit the surface in one read.

Layer 1 — Kill-switch (ultimate human override)

Code: alphaswarm/risk/kill_switch.py, alphaswarm/risk/manager.py
Wired endpoint today: POST /portfolio/kill_switch in alphaswarm/api/routes/portfolio.py
Frontend topbar component: alphaswarm_client/src/components/common/KillSwitch.tsx
Design contract for per-runtime fan-out — /agents/halt, /paper/stop-all, /bots/halt-all, /rl/halt-all — see frontend.mdc (wire as the endpoints come online; add them to KillSwitch in the same PR).
All paper sessions halt within one heartbeat and cancel open orders. The Meta-Agent can flip the switch; an operator can flip it; the agent is never allowed to flip it without explicit human acknowledgement (per WORKFLOW.md#intervention-nodes).
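
As a rough illustration of the contract above, the sketch below models a process-wide halt flag that sessions poll once per heartbeat. Every name in it (`KillSwitchSketch`, `PaperSessionSketch`, the actor strings) is hypothetical and not taken from alphaswarm/risk/kill_switch.py. It also assumes that "the agent" means any caller other than an operator or the Meta-Agent.

```python
import threading
from dataclasses import dataclass, field


class KillSwitchSketch:
    """Process-wide halt flag; all names here are illustrative."""

    def __init__(self) -> None:
        self._tripped = threading.Event()

    def trip(self, actor: str, human_ack: bool = False) -> None:
        # Operators and the Meta-Agent may trip directly; any other caller
        # needs an explicit human acknowledgement first.
        if actor not in ("operator", "meta_agent") and not human_ack:
            raise PermissionError(f"{actor} needs human acknowledgement")
        self._tripped.set()

    @property
    def tripped(self) -> bool:
        return self._tripped.is_set()


@dataclass
class PaperSessionSketch:
    switch: KillSwitchSketch
    open_orders: list = field(default_factory=list)
    halted: bool = False

    def heartbeat(self) -> None:
        # Checked once per heartbeat, so a tripped switch takes effect
        # no later than the next beat.
        if self.switch.tripped and not self.halted:
            self.open_orders.clear()   # stand-in for cancelling at the broker
            self.halted = True


switch = KillSwitchSketch()
session = PaperSessionSketch(switch, open_orders=["ord-1", "ord-2"])

agent_blocked = False
try:
    switch.trip(actor="agent")         # rejected: no acknowledgement given
except PermissionError:
    agent_blocked = True

session.heartbeat()                    # switch not tripped, session keeps running
switch.trip(actor="operator")
session.heartbeat()                    # halts and cancels on this beat
```

Polling a shared flag on every heartbeat is what bounds the halt latency to one beat; a push-based design would need its own delivery guarantees.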

Layer 2 — Immutable spec versions (audit trail)

agent_spec_versions, bot_versions, rl_experiment_versions, analysis_spec_versions, and workflow_spec_versions are append-only.
Each spec is hash-locked (SHA-256 of canonical JSON).
Every run records spec_version_id for replay.
This guarantees: every behaviour change has a permanent record identifying who introduced it (via the commit) and what the spec looked like at that moment.
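
The append-only and replay guarantees can be sketched with an in-memory store. This is a minimal illustration under stated assumptions, not AlphaSwarm's persistence code: `VersionStoreSketch` and its methods are hypothetical, dicts stand in for the `*_versions` and `*_runs` tables, and the spec fields are invented for the example.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VersionStoreSketch:
    """Append-only: rows are inserted, never updated or deleted."""
    rows: dict = field(default_factory=dict)   # spec_version_id -> frozen JSON text
    runs: list = field(default_factory=list)   # stand-in for a *_runs ledger
    active: Optional[str] = None               # version currently taking traffic

    def persist(self, spec: dict) -> str:
        text = json.dumps(spec, sort_keys=True, separators=(",", ":"))
        version_id = hashlib.sha256(text.encode("utf-8")).hexdigest()
        self.rows.setdefault(version_id, text)  # existing rows are left untouched
        return version_id

    def promote(self, version_id: str) -> None:
        if version_id not in self.rows:
            raise KeyError(version_id)
        self.active = version_id

    def record_run(self) -> dict:
        run = {"run_id": len(self.runs) + 1, "spec_version_id": self.active}
        self.runs.append(run)
        return run

    def spec_for(self, run: dict) -> dict:
        """Replay lookup: recover the exact spec a run executed with."""
        return json.loads(self.rows[run["spec_version_id"]])


store = VersionStoreSketch()
store.promote(store.persist({"name": "mean_reversion", "threshold": 2.0}))
old_run = store.record_run()

# A behaviour change is a new row plus a traffic switch, not an edit of the old row.
store.promote(store.persist({"name": "mean_reversion", "threshold": 2.5}))
new_run = store.record_run()

replayed = store.spec_for(old_run)   # still the original spec
```

Since no code path overwrites a row, a run recorded against the first version can always be traced back to the spec it actually used.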

Layer 3 — DataMCPTool boundary (no direct catalog reads)

Agents MUST NOT import alphaswarm.persistence.models... or call iceberg_catalog / duckdb_provider directly inside their body.
All reads go through registered DataMCPTools, exposed via in-process bridge + FastAPI /mcp/data router + alphaswarm-data-mcp stdio binary.
See data-mcp.md and data-mcp.mdc.
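
To make the boundary concrete, here is a hedged sketch of a tool registry with a strict args schema and a policy check. It is not the DataMCPTool API: `ToolSketch`, `register`, `call_tool`, the role names, and the `describe_dataset` tool are all hypothetical. A canned dictionary replaces the live catalog.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ToolSketch:
    name: str
    args_schema: dict          # exact argument names mapped to required types
    allowed_roles: frozenset   # stand-in for the policy check
    handler: Callable[..., Any]


REGISTRY: dict = {}


def register(tool: ToolSketch) -> None:
    REGISTRY[tool.name] = tool


def call_tool(name: str, args: dict, role: str) -> Any:
    tool = REGISTRY.get(name)
    if tool is None:
        raise LookupError(f"tool {name!r} is not registered")
    if role not in tool.allowed_roles:
        raise PermissionError(f"role {role!r} may not call {name!r}")
    # Strict schema: no missing args, no extras, exact types.
    if set(args) != set(tool.args_schema):
        raise ValueError(f"expected args {sorted(tool.args_schema)}, got {sorted(args)}")
    for key, expected in tool.args_schema.items():
        if type(args[key]) is not expected:
            raise TypeError(f"{key} must be {expected.__name__}")
    return tool.handler(**args)


# Hypothetical tool backed by a canned catalog instead of a live database.
_SCHEMAS = {"prices_daily": ["symbol", "date", "close"]}

register(ToolSketch(
    name="describe_dataset",
    args_schema={"dataset": str},
    allowed_roles=frozenset({"research_agent"}),
    handler=lambda dataset: _SCHEMAS[dataset],
))

columns = call_tool("describe_dataset", {"dataset": "prices_daily"}, role="research_agent")
```

Rejecting unexpected arguments outright, rather than ignoring them, is what keeps a free-form SQL string from slipping through a tool that was never meant to accept one.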

Layer 4 — Single LLM entry-point (router_complete)

All LLM calls go through router_complete.
No direct litellm.completion / OllamaClient / vendor SDKs.
The router enforces tier policies, cost caps, and provider fallback. Bypassing it strips those guardrails.
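
The sketch below shows one way a single entry-point could combine tier policy, a cost cap, and provider fallback. It does not reproduce `router_complete`'s real signature, which this document does not show. `RouterSketch`, the tier name, the per-call costs in cents, and the two fake providers are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


class CostCapExceeded(RuntimeError):
    pass


@dataclass
class RouterSketch:
    # tier -> ordered list of (provider name, cost per call in cents, callable)
    tiers: dict
    cost_cap_cents: int
    spent_cents: int = 0

    def complete(self, prompt: str, tier: str) -> tuple:
        if tier not in self.tiers:
            raise ValueError(f"unknown tier {tier!r}")
        last_error = None
        for name, cost, provider in self.tiers[tier]:
            if self.spent_cents + cost > self.cost_cap_cents:
                raise CostCapExceeded(f"cap of {self.cost_cap_cents} cents would be exceeded")
            try:
                text = provider(prompt)
            except ConnectionError as exc:   # provider down: try the next one
                last_error = exc
                continue
            self.spent_cents += cost         # charge only for successful calls
            return name, text
        raise RuntimeError("all providers failed") from last_error


def flaky(prompt: str) -> str:
    raise ConnectionError("primary provider unavailable")


def echo(prompt: str) -> str:
    return f"echo: {prompt}"


router = RouterSketch(
    tiers={"cheap": [("primary", 1, flaky), ("fallback", 2, echo)]},
    cost_cap_cents=3,
)
provider_used, reply = router.complete("hello", tier="cheap")
```

A caller that reaches a vendor SDK directly skips the cap check and the fallback loop, which is the loss of guardrails described above.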

Layer 5 — Single Iceberg entry-point + medallion enforcement

All writes go through iceberg_catalog.append_arrow / create_or_replace_table.
The wrapper validates that the namespace prefix matches the declared medallion_layer (bronze / silver / gold).
BusinessMetadata is mandatory on first write — agents query this surface to know what a dataset is for.
See data-layer-unification.md and iceberg.mdc.
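
The layer check might look roughly like the sketch below. The naming convention `alphaswarm_<layer>_<rest>` is an assumption inferred from the example namespace `alphaswarm_gold_analysis_<flow.namespace>` in the mapping table. `validate_namespace`, `checked_append`, and the metadata field are hypothetical; the real wrapper lives behind iceberg_catalog and may work differently.

```python
import re
from typing import Optional

MEDALLION_LAYERS = ("bronze", "silver", "gold")

# Assumed convention: alphaswarm_<layer>_<rest>.
_NAMESPACE_RE = re.compile(r"^alphaswarm_(bronze|silver|gold)_\w+$")


def validate_namespace(namespace: str, medallion_layer: str) -> None:
    """Raise ValueError unless the namespace prefix matches the declared layer."""
    if medallion_layer not in MEDALLION_LAYERS:
        raise ValueError(f"unknown medallion layer {medallion_layer!r}")
    match = _NAMESPACE_RE.match(namespace)
    if match is None:
        raise ValueError(f"namespace {namespace!r} does not follow the assumed convention")
    if match.group(1) != medallion_layer:
        raise ValueError(
            f"namespace {namespace!r} is {match.group(1)}, declared layer is {medallion_layer}"
        )


def checked_append(namespace: str, table: str, medallion_layer: str,
                   business_metadata: Optional[dict], known_tables: set) -> None:
    """Gate a write: layer check always, metadata check on first write only."""
    validate_namespace(namespace, medallion_layer)
    key = (namespace, table)
    if key not in known_tables and not business_metadata:
        raise ValueError("BusinessMetadata is mandatory on first write")
    known_tables.add(key)   # the actual Arrow append would happen here


known = set()
checked_append("alphaswarm_gold_analysis_vol", "results", "gold",
               {"description": "volatility analysis outputs"}, known)
# Later writes to the same table no longer need to resend the metadata.
checked_append("alphaswarm_gold_analysis_vol", "results", "gold", None, known)
```

Putting both checks in the one write path means a caller cannot land gold data in a bronze namespace or create an undocumented table by accident.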

Layer 6 — Secrets and configuration

Configuration through alphaswarm.config.settings only — never construct a fresh Settings(), never read os.environ directly.
New env vars are ALPHASWARM_*-prefixed fields on the Settings class in alphaswarm/config/settings.py and added to .env.example.
Credentials use the helpers in alphaswarm/utils/keys.py; never add credentials to .env beyond the keys already listed in .env.example.
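
The prefix rule can be illustrated with a standard-library stand-in. The real Settings class in alphaswarm/config/settings.py is not shown in this document and is likely built differently; `SettingsSketch`, `from_env`, and both field names are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Mapping

ENV_PREFIX = "ALPHASWARM_"


@dataclass(frozen=True)
class SettingsSketch:
    # Each field maps to ALPHASWARM_<FIELD_NAME_UPPERCASED>.
    log_level: str = "INFO"
    llm_cost_cap_usd: float = 5.0

    @classmethod
    def from_env(cls, environ: Mapping[str, str]) -> "SettingsSketch":
        overrides = {}
        for f in fields(cls):
            raw = environ.get(ENV_PREFIX + f.name.upper())
            if raw is not None:
                # Cast using the type of the default (str or float here).
                overrides[f.name] = type(f.default)(raw)
        return cls(**overrides)


# Built once; in real code the mapping would be the process environment and
# every other module would import this single instance.
settings = SettingsSketch.from_env({
    "ALPHASWARM_LOG_LEVEL": "DEBUG",
    "LOG_LEVEL": "ERROR",   # ignored: missing the prefix
})
```

Keeping the environment lookup inside one class is what lets the rest of the codebase avoid reading os.environ directly.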

Layer 7 — Migration immutability

See migrations-persistence.mdc.
Shipped migrations are never edited. Schema bugs are fixed forward, never backward.

Layer 8 — Pre-merge checklist (human-driven)

The checklist in CONTRIBUTING.md is the last line of defence:

Tests pass locally
Docs updated (data-dictionary, ERD, glossary)
New env vars in .env.example
New deps in pyproject.toml
Migration applied + reviewed (autogenerate footguns checked)
For SLOW-mode work: TDD-loop followed (see WORKFLOW.md)

Recommended (not yet enforced) — red-team review

For any new AgentSpec that gains broker-API or live-trading tools, run a red-team review before promoting from paper to live:

Adversarial prompt simulation
Boundary-violation tests (does the agent try to escape its tool catalog?)
Cost-cap stress (does it loop?)
Margin / risk-limit interaction (does the spec respect alphaswarm/risk/ constraints?)

Today this is documentation, not automation. Future work: a POST /agents/red-team-review task that takes an AgentSpec and runs a fixed adversarial battery against it before promotion.

When in doubt

Read ../AGENTS.md — the canonical 45 rules.
Read ../WORKFLOW.md — the cadence.
Read multi-agent-patterns.md — when you're scaling the agent topology.
Read glossary.md — for terminology.
Search the code: rg "<symbol>" alphaswarm/.
