CI/CD pipelines

The AlphaSwarm AWS deployment is driven by CI/CD: GitHub Actions orchestrates the pipeline and AWS CodeBuild runs the heavy in-VPC work (multi-arch buildx builds to ECR, and the alphaswarm deploy app-tier apply). There are no static AWS keys anywhere in the pipeline — every cloud step authenticates through GitHub OIDC.

This page explains the topology, the trust model, and the workflows. For the task-oriented steps (creating environments, triggering a deploy, approving a prod release, rolling back) see the companion runbook Operations runbook — CI/CD deploy. For the deeper deploy walkthroughs see AWS Hybrid Deployment Guide and AWS Hybrid Operational Runbook.

Topology — GitHub Actions, CodeBuild, OIDC

GitHub Actions is the control plane: it reacts to pushes, tags, pull requests, and repository_dispatch, then either runs lightweight Terraform directly or delegates the in-VPC heavy lifting to CodeBuild via aws codebuild start-build. The GitHub Actions job first assumes an AWS role over OIDC, so the start-build call (and everything CodeBuild does downstream) runs under short-lived credentials.
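As a rough illustration of the delegation step, the sketch below assembles the `aws codebuild start-build` command line that a workflow job might run once it holds the OIDC-derived credentials. The project name `alphaswarm-app-tier-deploy` and the `DEPLOY_ENV` variable are hypothetical placeholders, not names taken from the pipeline.

```python
import shlex


def start_build_argv(project, env, source_version):
    """Build the `aws codebuild start-build` argv for one delegated deploy."""
    return [
        "aws", "codebuild", "start-build",
        "--project-name", project,           # hypothetical project name
        "--source-version", source_version,  # branch, tag, or commit to build
        "--environment-variables-override",
        # DEPLOY_ENV is an assumed variable name that the buildspec would read.
        f"name=DEPLOY_ENV,value={env},type=PLAINTEXT",
    ]


argv = start_build_argv("alphaswarm-app-tier-deploy", "dev", "main")
print(shlex.join(argv))
```

Keeping the command as a list (rather than one string) avoids shell-quoting surprises if it is later passed to `subprocess.run`.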

Why split the work this way:

- GitHub Actions is cheap, parallel, and is where the promotion gates (GitHub Environments + required reviewers) live.
- CodeBuild runs inside the workload VPC, so it can reach private subnets, the internal CodeArtifact PyPI, and the app-tier resources that alphaswarm deploy manages. It also gives multi-arch buildx a beefy, in-account builder close to ECR.

Authentication — GitHub OIDC, no static keys

Trust is configured per account via the infrastructure/modules/github-oidc module, which registers the GitHub OIDC provider and the IAM roles. The provider trusts both deploying repos:

- Alpha-Swarm-ai/alphaswarm_platform
- Alpha-Swarm-ai/alphaswarm_admin

Plan role vs apply role

The module emits two roles per account, with different trust conditions on the OIDC sub claim:

Plan role — read-only. Trusted on pull-request refs so that PR validation can run terraform plan / validate without any mutate permission. Example trusted subjects:
```
repo:Alpha-Swarm-ai/alphaswarm_platform:pull_request
repo:Alpha-Swarm-ai/alphaswarm_platform:ref:refs/heads/main
```
Apply role — read-write. Trusted only on refs/heads/main and scoped to a GitHub Environment, so an apply cannot run until the Environment's required reviewers approve. Example trusted subjects:
```
repo:Alpha-Swarm-ai/alphaswarm_platform:ref:refs/heads/main
repo:Alpha-Swarm-ai/alphaswarm_platform:environment:prod
```
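To make the trust split concrete, here is a minimal sketch of how a token's `sub` claim could be checked against each role's trusted subjects. It approximates an IAM `StringLike` condition with `fnmatchcase`; whether the module uses `StringLike` or `StringEquals` is an assumption, and the `role_trusts` helper is purely illustrative.

```python
from fnmatch import fnmatchcase

REPO = "repo:Alpha-Swarm-ai/alphaswarm_platform"

# Trusted subjects copied from the examples above, keyed by role.
TRUSTED_SUBJECTS = {
    "plan": [f"{REPO}:pull_request", f"{REPO}:ref:refs/heads/main"],
    "apply": [f"{REPO}:ref:refs/heads/main", f"{REPO}:environment:prod"],
}


def role_trusts(role, sub):
    """Return True if the OIDC `sub` claim matches any subject the role trusts."""
    return any(fnmatchcase(sub, pattern) for pattern in TRUSTED_SUBJECTS[role])


print(role_trusts("plan", f"{REPO}:pull_request"))   # pull-request token, plan role
print(role_trusts("apply", f"{REPO}:pull_request"))  # pull-request token, apply role
```

When a job declares `environment: prod`, GitHub issues the token with the environment form of the `sub` claim, which is what lets the apply role be pinned to an Environment.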

The apply role ARN is published as the AWS_DEPLOYER_ROLE_ARN variable, with one value per GitHub Environment; the plan role ARN is published alongside it. A workflow job declares its target Environment, picks up that Environment's role ARN, and assumes the role over OIDC.

```
permissions:
  id-token: write   # required to mint the GitHub OIDC token
  contents: read

jobs:
  apply:
    runs-on: ubuntu-latest   # any runner works; added so the job is valid
    environment: prod        # gates on the Environment's required reviewers
    steps:
      # Exchange the GitHub OIDC token for short-lived AWS credentials.
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.AWS_DEPLOYER_ROLE_ARN }}
          aws-region: us-east-1
```

Hybrid Terraform boundary

There are two Terraform trees and they are applied two different ways. The boundary is deliberate.

| Tree | What it owns | Applied by | Auth | Audit |
| --- | --- | --- | --- | --- |
| `infrastructure/` | Landing zone: VPC, `ECR`, RDS, EKS, OIDC provider, observability, the `CodeBuild`/`CodeArtifact` plumbing | Native `terraform plan` / `terraform apply` | OIDC into `AqpTerraformExecutionRole` | Terraform state only |
| `terraform/` | App tier: the per-env application composition deployed onto the platform | `alphaswarm deploy plan` / `alphaswarm deploy up` (`TerraformRuntime`) | `TerraformRuntime` in `CodeBuild` | Writes a `terraform_runs` audit row |

The app tree is never applied with a bare terraform apply. It goes through alphaswarm deploy, which drives TerraformRuntime and writes a terraform_runs audit row for every plan and apply (platform AGENTS rule 42). That keeps the app-tier change history in the same ledger as every other runtime action. See Terraform IaC control plane for how TerraformRuntime works and IaC runbook for the provisioning recipes.

```
# Landing zone (infrastructure/): native terraform, OIDC -> AqpTerraformExecutionRole
terraform -chdir=infrastructure/envs/dev init
terraform -chdir=infrastructure/envs/dev plan

# App tier (terraform/): alphaswarm deploy, writes a terraform_runs row
alphaswarm deploy plan --env dev
alphaswarm deploy up   --env dev
```

CodeArtifact for alphaswarm-core and the CLI

alphaswarm-core and the alphaswarm CLI are not installed from public PyPI in CI or in the Docker images. They are pulled from the platform's AWS CodeArtifact internal PyPI repository, alphaswarm-pypi. CI (and every Dockerfile build step that needs the CLI) authenticates to CodeArtifact over the same OIDC-derived credentials and configures it as the pip index:

```
aws codeartifact login --tool pip \
  --domain alphaswarm --repository alphaswarm-pypi
pip install alphaswarm-core "alphaswarm[deploy]"
```

This keeps the internal packages private and gives CI a stable, in-account index that does not depend on public PyPI availability.
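Where `aws codeartifact login` is not convenient (for example inside a Docker build step), the pip index URL can be assembled by hand from an authorization token. The sketch below follows the documented CodeArtifact PyPI endpoint shape; the account ID `111122223333`, the region, and the `TOKEN` value are placeholders, not values from this deployment.

```python
def codeartifact_index_url(domain, owner, region, repository, token):
    """Build the pip index URL for a CodeArtifact PyPI repository."""
    host = f"{domain}-{owner}.d.codeartifact.{region}.amazonaws.com"
    # CodeArtifact expects the literal user name "aws" and the auth token as password.
    return f"https://aws:{token}@{host}/pypi/{repository}/simple/"


# Placeholder account ID and token; a real token would come from
# `aws codeartifact get-authorization-token`.
url = codeartifact_index_url(
    "alphaswarm", "111122223333", "us-east-1", "alphaswarm-pypi", "TOKEN"
)
print(url)
```

The resulting URL can be passed to pip via `--index-url` or the `PIP_INDEX_URL` environment variable. Because it embeds a credential, it should be treated as a secret and kept out of image layers and logs.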

The three canonical workflows

These names match compliance/soc2-evidence-map.md, how-to/operations/aws-deploy.md, how-to/operations/aws-runbook.md, and ADR 006 — alphaswarm_admin overhaul.

terraform-pipeline.yml

The deploy workflow for both Terraform trees.

- Inputs: tree ∈ {infrastructure, alphaswarm_platform}, env ∈ {dev, staging, prod}, action ∈ {plan, apply}.
- push to main: runs a plan against dev automatically.
- Dispatch (apply): assumes the env's apply role and applies the selected tree. For tree=infrastructure it runs native terraform apply; for tree=alphaswarm_platform it delegates to CodeBuild, which runs alphaswarm deploy up (and lands the terraform_runs row).
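The routing described above can be sketched as a small function. This assumes the inputs range over the values used elsewhere on this page (tree `infrastructure` or `alphaswarm_platform`; env `dev`, `staging`, or `prod`; action `plan` or `apply`) and that each environment has an `infrastructure/envs/<env>` directory; the `route` helper itself is hypothetical.

```python
TREES = {"infrastructure", "alphaswarm_platform"}
ENVS = {"dev", "staging", "prod"}
ACTIONS = {"plan", "apply"}


def route(tree, env, action):
    """Return (runner, argv) for one terraform-pipeline invocation."""
    if tree not in TREES or env not in ENVS or action not in ACTIONS:
        raise ValueError(f"unsupported inputs: {tree!r}, {env!r}, {action!r}")
    if tree == "infrastructure":
        # Landing zone: native terraform, run directly by the GitHub Actions job.
        return ("github-actions", ["terraform", f"-chdir=infrastructure/envs/{env}", action])
    # App tier: never bare terraform; CodeBuild runs alphaswarm deploy instead.
    verb = "up" if action == "apply" else "plan"
    return ("codebuild", ["alphaswarm", "deploy", verb, "--env", env])


print(route("infrastructure", "dev", "plan"))
print(route("alphaswarm_platform", "prod", "apply"))
```

Rejecting unknown inputs up front keeps a mistyped dispatch from silently falling through to the app-tier branch.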

build-publish.yml

The image release workflow. Triggers on a v* tag and, for each service, performs a supply-chain-hardened build:

- multi-arch buildx build, pushed to ECR;
- Cosign keyless signature (OIDC, no long-lived keys);
- syft SBOM generation;
- SLSA provenance attestation;
- Trivy and Grype vulnerability scans.

The per-service build/sign/push logic is factored into the composite action .github/actions/build-sign-push/, so every service builds identically.
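One way to picture what the composite action does per service is as an ordered list of tool invocations. The sketch below is an assumption-laden outline, not the action's actual contents: the platform list, the file names `sbom.spdx.json` and `provenance.json`, and the `release_steps` helper are all illustrative, and in practice the digest is only known after the push completes.

```python
def release_steps(repo, tag, digest, platforms=("linux/amd64", "linux/arm64")):
    """Return the ordered command lines for one service's hardened release."""
    tagged = f"{repo}:{tag}"
    pinned = f"{repo}@{digest}"  # sign and scan by digest, not by mutable tag
    return [
        ["docker", "buildx", "build", "--platform", ",".join(platforms),
         "--tag", tagged, "--push", "."],
        ["cosign", "sign", "--yes", pinned],              # keyless, via OIDC
        ["syft", pinned, "-o", "spdx-json=sbom.spdx.json"],
        ["cosign", "attest", "--yes", "--type", "slsaprovenance",
         "--predicate", "provenance.json", pinned],       # assumed predicate file
        ["trivy", "image", pinned],
        ["grype", pinned],
    ]


for step in release_steps("example.dkr.ecr.us-east-1.amazonaws.com/svc", "v1.0.0", "sha256:abc"):
    print(" ".join(step))
```

Signing and scanning the digest rather than the tag means the evidence stays attached to exactly the bytes that were pushed, even if the tag is later moved.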

pr-validate.yml

The pull-request gate. On every PR it runs terraform fmt -check, terraform validate, tfsec, and conftest (OPA) policy checks, then a terraform plan using the plan role (read-only). It never holds mutate permission, so a PR can be validated safely from a fork or feature branch.
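The ordering of the checks can be sketched as a fail-fast loop. The fail-fast behaviour, the exact arguments passed to each tool, and the `run_gate` helper are assumptions made for illustration; the workflow itself may run the checks as separate steps or jobs.

```python
# Checks in the order listed above; arguments are illustrative defaults.
PR_CHECKS = [
    ["terraform", "fmt", "-check", "-recursive"],
    ["terraform", "validate"],
    ["tfsec", "."],
    ["conftest", "test", "."],
    ["terraform", "plan", "-input=false"],  # runs under the read-only plan role
]


def run_gate(checks, run):
    """Run checks in order; return the first failing argv, or None if all pass.

    `run` takes an argv list and returns an exit code, so a real caller can
    pass `lambda argv: subprocess.run(argv).returncode`.
    """
    for argv in checks:
        if run(argv) != 0:
            return argv
    return None


# Dry run with a stand-in runner that reports success for every check.
print(run_gate(PR_CHECKS, lambda argv: 0))
```

Injecting the runner keeps the ordering logic testable without Terraform or the scanners installed.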

Promotion — dev to staging to prod

Promotion is enforced by GitHub Environments with required reviewers, layered on top of the OIDC apply-role trust (the apply role is only assumable inside the matching Environment):

| Environment | Approval | Trigger |
| --- | --- | --- |
| `dev` | Auto (no reviewers) | `push` to `main` plans `dev`; apply on dispatch |
| `staging` | 1 reviewer | Dispatch `terraform-pipeline.yml` with `env=staging` |
| `prod` | 2 reviewers (4-eyes) | Dispatch `terraform-pipeline.yml` with `env=prod` |

Because the gate lives in the GitHub Environment, a prod job cannot mint the apply-role credential until two distinct reviewers have approved the run.
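The approval rule in the table reduces to a count of distinct approvers per environment. The sketch below models only that rule; GitHub enforces it natively, so the `may_assume_apply_role` helper is a hypothetical illustration rather than code the pipeline runs.

```python
# Required reviewer counts taken from the promotion table.
REQUIRED_REVIEWERS = {"dev": 0, "staging": 1, "prod": 2}


def may_assume_apply_role(env, approvers):
    """Return True once enough distinct reviewers have approved the run."""
    return len(set(approvers)) >= REQUIRED_REVIEWERS[env]


print(may_assume_apply_role("prod", ["alice", "alice"]))  # same reviewer twice
print(may_assume_apply_role("prod", ["alice", "bob"]))    # two distinct reviewers
```

Deduplicating with `set` is what captures the 4-eyes requirement: one reviewer approving twice does not satisfy the prod gate.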

alphaswarm_admin — two images, then a dispatch handoff

alphaswarm_admin is built and deployed slightly differently from the platform itself.

1. A push to the admin repo's main (or a v* tag) builds two images and pushes them to ECR:
   - alphaswarm-admin (the FastAPI backend)
   - alphaswarm-admin-frontend (the Next.js frontend)
2. After both images land, the admin workflow fires a cross-repo repository_dispatch event named admin-image-published at alphaswarm_platform.
3. That dispatch triggers the platform's app-tier redeploy, which rolls the admin service onto ECS Fargate (Cognito + ALB) via the platform's terraform/environments/{dev,staging,prod} app tier (generalized from the existing minimum env).
4. The app tier reads its infra handles from SSM under /alphaswarm/<env>/*, published by infrastructure/envs/admin-{dev,staging,prod}.

The cross-repo dispatch requires a token (PLATFORM_DISPATCH_TOKEN) configured as a secret in the admin repo — see the runbook for setup. For what the admin service itself is, see alphaswarm-admin.
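For orientation, the handoff corresponds to a single call to GitHub's `POST /repos/{owner}/{repo}/dispatches` endpoint. The sketch below only constructs the request; the `client_payload` field `image_tag` is a hypothetical example, since the actual payload the admin workflow sends is not described here.

```python
import json
import urllib.request

DISPATCH_URL = "https://api.github.com/repos/Alpha-Swarm-ai/alphaswarm_platform/dispatches"


def dispatch_request(token, image_tag):
    """Build (but do not send) the repository_dispatch request."""
    body = {
        "event_type": "admin-image-published",
        "client_payload": {"image_tag": image_tag},  # assumed payload shape
    }
    return urllib.request.Request(
        DISPATCH_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",  # value of PLATFORM_DISPATCH_TOKEN
            "Content-Type": "application/json",
        },
    )


req = dispatch_request("example-token", "v1.2.3")
print(req.get_method(), req.full_url)
# Sending it would be: urllib.request.urlopen(req)
```

On the platform side, a workflow would pick this up with an `on: repository_dispatch` trigger filtered to the `admin-image-published` type.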
