ADR 006: alphaswarm_admin overhaul (multi-cloud control plane)
- Status: Proposed
- Date: 2026-05-25
- Supersedes: none (extends ADR 002 single-container client; the Solara legacy half is deprecated by this overhaul)
- Superseded by: none
Context
The alphaswarm_admin internal admin surface predates the overhaul:
- Backend was already a stateless FastAPI BFF brokering audit-first calls to alphaswarm_controller and the AlphaSwarm monolith.
- Frontend was a Vite + React Router SPA at alphaswarm_admin/alphaswarm_admin_ui/.
- Six modules from the blueprint were missing: secrets-manager, lineage-explorer, model-registry, paper-trading-control, rbac-admin, account-mode-switcher.
- Multi-account AWS topology was not provisioned.
- CI used KUBECONFIG_* base64 secrets instead of GitHub Actions OIDC.
- Only the bot fleet had ArgoCD; the main stack was kubectl-push.
- No S3 WORM mirror for security_audit_events.
Decision
Frontend: migrate to Next.js 15 App Router.
Even though alphaswarm_client/ (the canonical Vite operator UI) and
alphaswarm_ui/ (the customer-facing PaaS) keep their existing
frameworks, the admin surface migrates to Next.js because:
- Server Components reduce the bundle on read-heavy admin pages.
- Server Actions remove API-route boilerplate for mutations.
- File-system routing maps cleanly onto the sidebar information architecture (one folder per module).
- Middleware-based auth with one-shot RFC 9470 step-up retries composes better than the Vite + React Router pattern.
The legacy alphaswarm_admin_ui/ stays deployable behind a feature flag
for a 30-day rollback window during the cutover. The new Next.js
app lives at alphaswarm_admin/frontend/.
Backend: extend, don't rewrite.
The existing six routers are kept. Six new module routers are
added under the established audit-first / M2M-broker /
require_admin_scope pattern. Step-up MFA per AGENTS rule 52 is
attached to every new mutating endpoint.
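The pattern above can be sketched in plain Python. This is a minimal illustration, not the project's implementation: it assumes "audit-first" means the audit entry is written before the downstream call is brokered, and it models the step-up challenge on the RFC 9470 insufficient_user_authentication error. Principal, StepUpRequired, audited_mutation, AUDIT_LOG and the ACR strings are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical ACR identifiers; the real values depend on the IdP.
PASSWORD_ACR = "urn:example:acr:password"
MFA_ACR = "urn:example:acr:mfa"

# Stand-in for the audit table; a real system would write to a database.
AUDIT_LOG: list[dict] = []


@dataclass(frozen=True)
class Principal:
    subject: str
    scopes: frozenset
    acr: str = PASSWORD_ACR


class StepUpRequired(Exception):
    """Carries an RFC 9470-style challenge for the caller to act on."""

    def __init__(self, acr_values: str) -> None:
        super().__init__("step-up authentication required")
        self.status_code = 401
        self.www_authenticate = (
            'Bearer error="insufficient_user_authentication", '
            f'acr_values="{acr_values}"'
        )


def audited_mutation(
    action: str, principal: Principal, broker_call: Callable[[], Any]
) -> Any:
    """Gate a mutation on admin scope and MFA, then audit before brokering."""
    if "admin" not in principal.scopes:
        raise PermissionError("admin scope required")
    if principal.acr != MFA_ACR:
        # Rejected before any side effect; this sketch does not audit denials.
        raise StepUpRequired(MFA_ACR)

    # Audit first: record the intent before touching the downstream service.
    entry = {"action": action, "actor": principal.subject, "outcome": "pending"}
    AUDIT_LOG.append(entry)
    try:
        result = broker_call()
    except Exception:
        entry["outcome"] = "failed"
        raise
    entry["outcome"] = "ok"
    return result
```

A caller that receives StepUpRequired would re-authenticate at the requested ACR once and retry; a second failure should surface to the operator rather than loop.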
RBAC: stay on the existing 4-role lattice.
The blueprint suggested Casbin. We reject that: AlphaSwarm's canonical
RBAC is the alphaswarm_core.auth.rbac 4-role lattice plus the existing
Membership table. Adding Casbin would create a parallel policy
source of truth that fragments rule 27. The new
/admin/rbac/* router builds on expand_role and the existing
require_scope / require_membership dependencies.
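To illustrate why a small role lattice needs no external policy engine, here is a minimal sketch. It assumes the four roles form a simple chain in which each role implies every role below it; the role names and the expand_role signature are hypothetical stand-ins, since this ADR does not show the real alphaswarm_core.auth.rbac definitions.

```python
# Hypothetical role names, ordered from least to most privileged.
ROLE_ORDER = ("viewer", "operator", "admin", "owner")


def expand_role(role: str) -> frozenset:
    """Return the role plus every role it implies in the chain."""
    if role not in ROLE_ORDER:
        raise ValueError(f"unknown role: {role!r}")
    idx = ROLE_ORDER.index(role)
    return frozenset(ROLE_ORDER[: idx + 1])


def has_role(granted: str, required: str) -> bool:
    """True if a membership holding `granted` satisfies `required`."""
    return required in expand_role(granted)
```

With this shape, an authorization check is a single set-membership test against one source of truth, which is the property a parallel Casbin policy store would undermine.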
Multi-account AWS: code now, apply later.
A new top-level infrastructure/ directory ships the full module
library (landing-zone, account, vpc, eks-cluster, eks-node-groups,
karpenter-bootstrap, ecr-repositories, rds-postgres, s3-data-lake,
msk-kafka, airflow, eso-bootstrap, argocd-bootstrap,
observability-stack, iam-irsa-roles, route53-zones,
acm-certificates, acm-pca, github-oidc, codepipeline, codebuild,
codeartifact) plus per-environment compositions. Every
composition assumes a role in a workload account from
shared-services, guarded by an external_id. Running terraform apply against the cloud accounts is deferred to operators; the PR ships only the code.
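The cross-account hop relies on a trust policy on the workload-account role. The sketch below builds such a policy document in Python to show the shape of the external ID condition; the function name, the account ID, and the external ID value are placeholders, and the actual module in infrastructure/ may express this differently in Terraform.

```python
import json


def workload_trust_policy(shared_services_account_id: str, external_id: str) -> dict:
    """Trust policy letting shared-services assume a workload-account role.

    The sts:ExternalId condition means AssumeRole calls that omit or
    mismatch the external ID are denied.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": f"arn:aws:iam::{shared_services_account_id}:root"
                },
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    policy = workload_trust_policy("111122223333", "example-external-id")
    print(json.dumps(policy, indent=2))
```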
CI/CD: GitHub OIDC + SLSA L3 + Cosign keyless.
Four composite actions ship under
.github/actions/{aws-oidc-assume,build-sign-push,slsa-provenance,kubectl-via-irsa},
alongside the new workflows pr-validate.yml, build-publish.yml,
argocd-trigger.yml, terraform-pipeline.yml, ml-pipeline.yml,
paper-config-validate.yml, and alembic-immutability.yml. Renovate
is wired to auto-merge into main only for minor and patch updates.
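The auto-merge rule can be made concrete with a small sketch. Renovate applies this classification itself; the Python below only illustrates which version bumps would qualify, assuming plain three-part semantic versions. Both function names are hypothetical.

```python
def update_type(current: str, candidate: str) -> str:
    """Classify a version bump as 'major', 'minor', or 'patch'."""
    cur = tuple(int(part) for part in current.split("."))
    new = tuple(int(part) for part in candidate.split("."))
    if len(cur) != 3 or len(new) != 3:
        raise ValueError("expected MAJOR.MINOR.PATCH versions")
    if new <= cur:
        raise ValueError("candidate must be newer than current")
    if new[0] != cur[0]:
        return "major"
    if new[1] != cur[1]:
        return "minor"
    return "patch"


def automerge_allowed(current: str, candidate: str) -> bool:
    """Only minor and patch bumps merge without human review."""
    return update_type(current, candidate) in {"minor", "patch"}
```

Major bumps still open a PR under this rule; they simply wait for a reviewer.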
Observability + cost.
Linkerd (chosen over Istio Ambient and App Mesh because of its
~6x lower proxy memory and ~10x lower p99 latency overhead) is
the service mesh; Falco, Velero, and Kubecost ship as Helm-chart
wrappers. Karpenter v1 is self-managed (NOT EKS Auto Mode) so that the
NodePool specs are recorded under
terraform_stack_spec_versions.
Audit WORM.
alphaswarm/tasks/audit_log_export_tasks.py::export_audit_log_window
exports security_audit_events + audit_log nightly to
s3://alphaswarm-audit-archive-${ACCOUNT_ID}/ with
ObjectLockMode=COMPLIANCE + 7-year retention per FINRA Rule
4511 + SEC Rule 17a-4(f)(2)(i)(B).
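As a rough sketch of the write path, the helper below assembles the arguments an S3 put_object call would need for a COMPLIANCE-mode lock with a 7-year retain-until date. The bucket name follows the pattern above; the helper names, key layout, and account ID are assumptions, and the real export task may batch or structure objects differently.

```python
from datetime import datetime, timezone

RETENTION_YEARS = 7


def retain_until(exported_at: datetime, years: int = RETENTION_YEARS) -> datetime:
    """Return the lock expiry, `years` after the export timestamp."""
    try:
        return exported_at.replace(year=exported_at.year + years)
    except ValueError:
        # Feb 29 with a non-leap target year: roll forward to Mar 1 so the
        # retention period is never shorter than requested.
        return exported_at.replace(year=exported_at.year + years, month=3, day=1)


def put_object_kwargs(
    account_id: str, key: str, body: bytes, exported_at: datetime
) -> dict:
    """Keyword arguments for an S3 put_object call with Object Lock set."""
    return {
        "Bucket": f"alphaswarm-audit-archive-{account_id}",
        "Key": key,
        "Body": body,
        "ObjectLockMode": "COMPLIANCE",
        "ObjectLockRetainUntilDate": retain_until(exported_at),
    }


if __name__ == "__main__":
    # Placeholder account ID and key; with boto3 this dict would be passed
    # as s3_client.put_object(**kwargs).
    kwargs = put_object_kwargs(
        "111122223333",
        "security_audit_events/2026-05-25.jsonl",
        b"{}\n",
        datetime(2026, 5, 25, tzinfo=timezone.utc),
    )
    print(kwargs["ObjectLockRetainUntilDate"].isoformat())
```

In COMPLIANCE mode the retain-until date cannot be shortened and the object version cannot be deleted by any user until it passes, which is what makes the mirror write-once.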
IdP support.
Two new IdentityProvider subclasses ship under
alphaswarm/auth/providers/:
- aws_iam_identity_center.py
- aws_cognito.py
Both subclass GenericOidcProvider and auto-register through
IdentityProviderMeta. IAM Identity Center is the recommended IdP
for multi-account; Cognito is the documented fallback for the
single-account path.
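Auto-registration through a metaclass can be sketched as follows. The class names mirror those in this ADR, but the bodies are hypothetical stand-ins: the provider_id attribute, the registry dict, and the duplicate check are assumptions made for illustration, not the real alphaswarm/auth/providers/ code.

```python
class IdentityProviderMeta(type):
    """Metaclass that records every concrete provider class it creates."""

    registry: dict = {}

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        provider_id = namespace.get("provider_id")
        # Base classes leave provider_id unset (None) and are not registered.
        if provider_id is not None:
            if provider_id in mcls.registry:
                raise ValueError(f"duplicate provider_id: {provider_id!r}")
            mcls.registry[provider_id] = cls
        return cls


class GenericOidcProvider(metaclass=IdentityProviderMeta):
    provider_id = None  # abstract base: not registered


class AwsIamIdentityCenterProvider(GenericOidcProvider):
    provider_id = "aws_iam_identity_center"


class AwsCognitoProvider(GenericOidcProvider):
    provider_id = "aws_cognito"


def get_provider(provider_id: str) -> type:
    """Look up a registered provider class by its identifier."""
    return IdentityProviderMeta.registry[provider_id]
```

Defining the subclass is the only registration step in this sketch; importing the module is enough for the provider to become selectable.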
Consequences
- The 6 missing modules ship with full audit-first wiring + step-up MFA + WS multiplexing.
- Frontend bundles get smaller; SSR'd admin pages enable better caching.
- Multi-account topology is one terraform apply away.
- CI gains SLSA L3 attestations + Cosign keyless verification.
- Audit ledger is FINRA-compliant via WORM mirroring.
- The alphaswarm_admin_ui/ Vite tree adds maintenance debt for the duration of the rollback window. Cleanup PR scheduled after 30-day burn-in.
- The legacy alphaswarm/ui/ Solara dashboard remains in place; a separate alphaswarm_admin-overhaul-cleanup PR handles its removal + the FastAPI/Starlette unpin.