# alphaswarm-cp

The standalone control plane. Owns every workload-lifecycle action,
the unified identity broker, the connection-manager, and the
Phase 5 connection-proxy mesh. Does NOT import `alphaswarm.*` runtime
code — the boundary is enforced by
`alphaswarm-control-plane.mdc`.
## Identity
| Field | Value |
|---|---|
| Service id | alphaswarm-cp |
| Role | control-plane |
| Package | alphaswarm_controller/ |
| Image (key) | cp |
| Built from | alphaswarm_controller/Dockerfile (multi-stage Wolfi + uv) |
## Wire
| Field | Value |
|---|---|
| Protocol | HTTP/1.1 + HTTP/2 + WebSocket |
| Port | 9000 |
| Health | GET /manage/readyz (ready) / GET /manage/healthz (live) |
| Public URL | https://manage.alpha-swarm.ai (behind Cloudflare tunnel + Pomerium IAP) |
| Identity for incoming | per-route: /manage/* requires admin:cluster; /auth/* is unauthenticated up to /callback; /proxy/* requires the same scopes as the destination |
## Surfaces

| Prefix | Purpose | Code |
|---|---|---|
| `/manage/*` | Workload lifecycle (start/stop/scale/restart/exec/logs/apply_config/rotate_secret), credentials, terraform passthrough, topology, MFA, billing | `alphaswarm_controller/api/routers/` |
| `/auth/m2m/token`, `/auth/agent-identity/token` | Phase 1 identity broker — M2M + Entra Agent Identity tokens | `api/routers/auth.py` |
| `/auth/.well-known/openid-configuration` | OIDC discovery (canonical location) | same |
| `/auth/login`, `/callback`, `/logout`, `/refresh`, `/me`, `/stepup`, `/device/start`, `/device/poll` | Phase 3 BFF + RFC 8628 device flow | `api/routers/bff.py` |
| `/manage/connections`, `/manage/connections/{id}` | Phase 2 connection manager — typed ConnectionDescriptor for any topology service | `api/routers/connections.py` + `services/connections.py` |
| `/proxy/{service_id}/{path}` | Phase 5 connection-proxy mesh (SPIFFE-mediated mTLS in 5b) | `api/routers/proxy.py` |
## Embedded operator

When the operator extra is installed, the same image hosts the
`aqp-controller-operator` — a kopf
process reconciling the eight `AQP*` CRDs. Single-replica
(Recreate strategy) so reconciliation order stays deterministic.
The bare `alphaswarm-controller` image keeps booting on
memory-constrained nodes that don't run the operator.
## Deployment surfaces
| Surface | Where |
|---|---|
| Compose | service alphaswarm-cp in deployments/compose/docker-compose.admin.yml (admin overlay) |
| Kustomize | deployments/kubernetes/base/alphaswarm-cp/ — Deployment + Service + PDB |
| AQP operator (Phase 4) | deployments/kubernetes/aqp-controller-operator/ — kopf reconciler kustomize tree |
| Terraform module | alphaswarm_platform/terraform/modules/alphaswarm_workloads/ (workload), terraform_runner/ (paired pod) |
## Dependencies

Upstream:

- `postgres` — `workload_runs`, `terraform_runs`, `EntraTenantLink`, session store (Phase 5+).
- `redis` — kill-switch key, BFF session store, M2M token cache.
- The cluster API (kubernetes / docker / aws / azure / gcp) through per-provider adapters under `alphaswarm_controller/providers/`.

Downstream:

- `alphaswarm-core` calls into `/manage/*` for cluster-internal lookups.
- `alphaswarm-client`, `alphaswarm-ui`, `alphaswarm-admin`, `alphaswarm-cli` use `/auth/*` once their `AUTH_BFF_ENABLED` flag is on.
- `alphaswarm-cli launch` hits the operator route to render `AQP*` CRs.
## Operations

- HA: `replicas: 2` in prod; 1 in dev. PDB `minAvailable=1`.
- Single operator: the kopf process is single-replica regardless of cp replicas — operator pods run as a separate Deployment.
- Step-up MFA: every `/manage/terraform/apply`, `/manage/credentials/cloud-cli/*`, and `/halt` route requires RFC 9470 `acr=high`.
- Audit: every `/manage/*` action lands a `workload_runs` row; every `/auth/*` token mint lands a `security_audit_events` row. Redaction is enforced by `alphaswarm-management-engine`.
- Pomerium IAP: the public ingress wraps `/manage/*` with Pomerium so the Entra-staff group is the only authenticated path.
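The per-route identity contract from the Wire table and the step-up rule above, written out as an added Python example (not `alphaswarm_controller` code): `authorize`, `Principal`, `STEP_UP_ROUTES`, `PUBLIC_AUTH_ROUTES` and `DESTINATION_SCOPES` are assumed names and a constructed policy table. Which `/auth/*` routes are the unauthenticated ones and that `/halt` is held to `admin:cluster` are assumptions.

```python
from __future__ import annotations
from dataclasses import dataclass, field
from fnmatch import fnmatchcase

# Routes the page says need RFC 9470 step-up (acr=high) on top of their scopes.
STEP_UP_ROUTES = ("/manage/terraform/apply", "/manage/credentials/cloud-cli/*", "/halt")
# Assumption: "unauthenticated up to /callback" means the login redirect, the OIDC
# callback and discovery; every other /auth/* route needs an authenticated session.
PUBLIC_AUTH_ROUTES = ("/auth/login", "/auth/callback", "/auth/.well-known/*")
# Constructed sample of what each proxy destination requires (not project data).
DESTINATION_SCOPES = {"alphaswarm-core": {"core:read"}, "alphaswarm-admin": {"admin:cluster"}}


@dataclass
class Principal:
    scopes: set[str] = field(default_factory=set)
    acr: str | None = None  # RFC 9470 authentication context class reference


def authorize(path: str, principal: Principal | None,
              destination_scopes: dict[str, set[str]]) -> str:
    """Return an allow / deny / step_up decision for one request path."""
    if path.startswith("/auth/"):
        if any(fnmatchcase(path, p) for p in PUBLIC_AUTH_ROUTES):
            return "allow"
        return "allow" if principal else "deny:unauthenticated"
    if principal is None:
        return "deny:unauthenticated"
    if path.startswith("/proxy/"):
        # /proxy/{service_id}/{path}: the caller must already hold the destination's
        # own scopes, so the proxy never widens what the destination would accept.
        parts = path.split("/", 3)
        needed = destination_scopes.get(parts[2] if len(parts) > 2 else "")
        if needed is None:
            return "deny:unknown_destination"
    elif path.startswith("/manage/") or path == "/halt":
        # Assumption: /halt is held to the same admin:cluster scope as /manage/*.
        needed = {"admin:cluster"}
    else:
        return "deny:unknown_route"
    if not needed <= principal.scopes:
        return "deny:scope"
    # Scope is sufficient but the session was not authenticated strongly enough:
    # RFC 9470 answers with insufficient_user_authentication and the wanted acr_values.
    if principal.acr != "high" and any(fnmatchcase(path, p) for p in STEP_UP_ROUTES):
        return "step_up:insufficient_user_authentication"
    return "allow"
```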
## See also

- `control-plane-topology.md` — topology and URL fallback contract; cp is the sole topology server.
- `terraform-control-plane.md` — `TerraformRuntime` runs inside cp.
- `identity.md` — IdentityProvider chain.
- `alphaswarm_controller/AGENTS.md` — hard rules for the standalone control plane.