
Silo-per-tenant IaC

Section G of the alphaswarm_kb blueprint is implemented as a Terragrunt tree under alphaswarm_platform/terragrunt/.

Layout

alphaswarm_platform/
├── terraform/modules/
│   ├── tenant_kb_silo/                # canonical wrapper (dispatches by var.cloud)
│   ├── tenant_kb_silo_aws/            # AWS: ECS Fargate + RDS + S3 + KMS
│   ├── tenant_kb_silo_azure/          # Azure: ACA + Flex Postgres + Blob + Key Vault
│   ├── tenant_kb_silo_gcp/            # GCP: Cloud Run + Cloud SQL + GCS + KMS
│   ├── kb_global_corpus/              # central read-only stack + CDN
│   ├── kb_marketplace_federation/     # federation gateway + OpenFGA + NATS
│   ├── kb_identity_pool/              # OpenFGA Postgres + OPA bundle bucket
│   └── kb_global_observability/       # OTEL collector
└── terragrunt/
    ├── terragrunt.hcl                 # root backend + provider generators
    ├── _envcommon/                    # shared inputs (networking, observability)
    ├── global/prod/terragrunt.hcl     # kb_global_corpus
    ├── marketplace/prod/terragrunt.hcl    # kb_marketplace_federation
    ├── identity_pool/prod/terragrunt.hcl  # kb_identity_pool
    └── tenants/_template/             # copy → tenants/<slug>/ to onboard

Identical outputs

Every cloud-parallel sibling exposes the SAME outputs so the Python adapters never branch on cloud:

| Output | Description |
| --- | --- |
| relational_dsn | Postgres DSN for kb_corpora + kb_runs + kb_silo_registry. |
| vector_endpoint | pgvector / Qdrant / Cognitive Search endpoint. |
| graph_endpoint | Neo4j / Kuzu / Neptune endpoint. |
| container_runtime | ECS Fargate / ACA / Cloud Run identifier. |
| object_store_uri | S3 / Blob / GCS bucket URI. |
| kms_key_id | Per-tenant CMK identifier. |
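To illustrate why a shared output contract keeps adapters cloud-agnostic, here is a minimal sketch of how a Python adapter might load the six outputs. It assumes the values arrive in the JSON shape produced by terraform output -json (each output name maps to an object with a "value" key). The names SiloOutputs and parse_silo_outputs are hypothetical and do not come from the alphaswarm codebase.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SiloOutputs:
    """Hypothetical container for the six outputs every cloud sibling exposes."""
    relational_dsn: str
    vector_endpoint: str
    graph_endpoint: str
    container_runtime: str
    object_store_uri: str
    kms_key_id: str


def parse_silo_outputs(raw_json: str) -> SiloOutputs:
    """Parse `terraform output -json` style text into SiloOutputs.

    Raises KeyError if any output in the contract is missing, so a module
    that drifts from the shared contract fails loudly instead of forcing
    the adapter to special-case a cloud.
    """
    doc = json.loads(raw_json)
    names = [f.name for f in fields(SiloOutputs)]
    missing = [n for n in names if n not in doc]
    if missing:
        raise KeyError(f"missing silo outputs: {', '.join(missing)}")
    # Each entry looks like {"value": ..., "type": ..., "sensitive": ...}.
    return SiloOutputs(**{n: doc[n]["value"] for n in names})
```

Because the same six field names are read regardless of var.cloud, nothing in this loader needs to know whether the silo runs on AWS, Azure, or GCP.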

Onboarding a tenant

T=acme-corp
mkdir -p alphaswarm_platform/terragrunt/tenants/${T}/prod
cp -r alphaswarm_platform/terragrunt/tenants/_template/* \
alphaswarm_platform/terragrunt/tenants/${T}/

# Edit tenants/${T}/tenant.hcl with the real UUID, cloud, region.

# Production path goes through alphaswarm-cli (runs server-side via
# TerraformRuntime per rule 42; lands a workload_runs row + a
# terraform_runs row):
alphaswarm-cli kb tenant onboard ${T} --cloud aws --region us-east-1

# Break-glass operator path (skips audit; for ops emergencies only):
terragrunt run-all init --terragrunt-working-dir alphaswarm_platform/terragrunt/tenants/${T}/prod
terragrunt run-all apply --terragrunt-working-dir alphaswarm_platform/terragrunt/tenants/${T}/prod

Per-tenant state isolation

Each tenant has its own state file under the configured backend:

  • S3: s3://alphaswarm-kb-tfstate-prod/tenants/<id>/prod/terraform.tfstate
  • Azure Blob: alphaswarm-kb-state/tenants/<id>/prod/terraform.tfstate
  • GCS: gs://alphaswarm-kb-tfstate-prod/tenants/<id>/prod/terraform.tfstate

For regulated tenants, swap the per-tenant backend block to assume a role in a dedicated cloud account or subscription, so that physical isolation matches the silo's logical boundary.
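The state paths listed above all share the key layout tenants/<id>/prod/terraform.tfstate. A small sketch of that mapping follows. The function name tenant_state_location and the ID validation rule are assumptions made for illustration; the bucket and container names are copied from the list above, and the dictionary keys are the standard Terraform backend type names.

```python
import re

# Bucket/container prefixes taken from the list above, keyed by Terraform
# backend type name.
_STATE_PREFIXES = {
    "s3": "s3://alphaswarm-kb-tfstate-prod/",
    "azurerm": "alphaswarm-kb-state/",
    "gcs": "gs://alphaswarm-kb-tfstate-prod/",
}

# Assumed rule: IDs are lowercase alphanumerics and hyphens. Rejecting "/"
# and "." keeps one tenant's key from landing inside another tenant's prefix.
_TENANT_ID_RE = re.compile(r"^[a-z0-9][a-z0-9-]*$")


def tenant_state_location(backend: str, tenant_id: str, env: str = "prod") -> str:
    """Return the state file location for one tenant under one backend."""
    if backend not in _STATE_PREFIXES:
        raise ValueError(f"unknown backend: {backend!r}")
    if not _TENANT_ID_RE.match(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return f"{_STATE_PREFIXES[backend]}tenants/{tenant_id}/{env}/terraform.tfstate"
```

Since the tenant ID is the only varying path segment, two tenants can never resolve to the same state file as long as their IDs differ.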

Offboarding

alphaswarm-cli kb tenant offboard ${T}
# wait for kb_runs to drain
terragrunt run-all destroy --terragrunt-working-dir alphaswarm_platform/terragrunt/tenants/${T}/prod

cognee.forget --tenant ${T} --hard runs first, so per-tenant data is purged before the underlying storage is torn down.
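The "wait for kb_runs to drain" step between offboard and destroy can be scripted as a polling loop. The sketch below is one possible shape, not the platform's own tooling: wait_for_drain is a hypothetical helper, and how active runs are counted (for example a query against kb_runs) is left to the caller-supplied count_active_runs callable.

```python
import time
from typing import Callable


def wait_for_drain(
    count_active_runs: Callable[[], int],
    timeout_s: float = 600.0,
    poll_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Block until count_active_runs() returns 0, or raise TimeoutError.

    count_active_runs is supplied by the caller; this helper only handles
    the polling and the deadline.
    """
    deadline = clock() + timeout_s
    while True:
        remaining = count_active_runs()
        if remaining == 0:
            return
        if clock() >= deadline:
            raise TimeoutError(
                f"{remaining} kb_runs still active after {timeout_s}s"
            )
        sleep(poll_s)
```

Running terragrunt run-all destroy only after this returns keeps the teardown from racing in-flight runs; a TimeoutError leaves the infrastructure in place for an operator to inspect.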