
Control-plane topology

This page covers Phase 0 of the AlphaSwarm infra-expansion plan. The single source of truth for which services exist, where they live, and which URLs they expose is alphaswarm_platform/configs/deployment/topology.yaml. Both the AlphaSwarm monolith (alphaswarm/) and the standalone control plane (alphaswarm_controller/) read the same YAML through the shared loader at alphaswarm_core.topology.load_topology.

Resolution order

  1. Hardcoded default in Settings.
  2. ALPHASWARM_* environment variable.
  3. alphaswarm_platform/configs/deployment/topology.yaml fallback (this layer).

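The Phase 0 fallback ONLY fires when the corresponding ALPHASWARM_* env var is unset (checked via Settings.model_fields_set). Operators who explicitly set an env var keep their override, so the effective precedence is: env var first, then the topology.yaml value, then the hardcoded default.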
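The rule above can be sketched in plain Python. This is a minimal illustration, not the actual implementation: resolve_setting, the fields_set argument (standing in for Settings.model_fields_set), and the example_cache_url field name are all hypothetical.

```python
from typing import Mapping, Set


def resolve_setting(
    field: str,
    current_value: str,
    fields_set: Set[str],
    topology_urls: Mapping[str, str],
) -> str:
    """Return the effective value for one Settings field.

    fields_set stands in for Settings.model_fields_set: the names of
    fields that were explicitly provided, for example via an env var.
    """
    if field in fields_set:
        # Explicit override: the topology fallback never replaces it.
        return current_value
    # The field was left unset, so the topology value (if any) takes over;
    # otherwise the hardcoded default stays in place.
    return topology_urls.get(field, current_value)


# Hypothetical field name and URLs, for illustration only.
topology = {"example_cache_url": "redis://topology-host:6379"}
overridden = resolve_setting(
    "example_cache_url", "redis://override:6379", {"example_cache_url"}, topology
)
fallback = resolve_setting("example_cache_url", "", set(), topology)
```

In this sketch, overridden keeps the operator-supplied URL, while fallback picks up the value declared in the topology.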

URL fallback table

The mapping lives in alphaswarm/config/topology_fallback.py::URL_FALLBACK_FIELDS. Each row says: when the topology declares endpoints[<endpoint_name>] on the service whose id is <service_id>, use that URL as the fallback for the matching Settings field. Adding a new service means adding a new row to the table and a new services: entry in topology.yaml.
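To make the row semantics concrete, here is a hedged sketch of how such a table could be applied. The FallbackRow shape, the example row, and the assumption that the loaded topology is a dict with a services list of {id, endpoints} entries are illustrative only; the real structure of URL_FALLBACK_FIELDS may differ.

```python
from typing import Any, Dict, Iterable, Mapping, NamedTuple


class FallbackRow(NamedTuple):
    settings_field: str  # name of the Settings attribute to fill
    service_id: str      # id of the service in topology.yaml
    endpoint_name: str   # key under that service's endpoints


# Hypothetical row; the real table lives in URL_FALLBACK_FIELDS.
EXAMPLE_ROWS = [FallbackRow("example_cache_url", "example-cache", "primary")]


def collect_fallbacks(
    rows: Iterable[FallbackRow], topology: Mapping[str, Any]
) -> Dict[str, str]:
    """Map each Settings field to the URL its row points at, if declared."""
    services = {svc["id"]: svc for svc in topology.get("services", [])}
    fallbacks: Dict[str, str] = {}
    for row in rows:
        service = services.get(row.service_id)
        if service is None:
            continue  # service not declared in this topology
        url = service.get("endpoints", {}).get(row.endpoint_name)
        if url:
            fallbacks[row.settings_field] = url
    return fallbacks


example_topology = {
    "services": [
        {"id": "example-cache", "endpoints": {"primary": "redis://cache.internal:6379"}}
    ]
}
resolved = collect_fallbacks(EXAMPLE_ROWS, example_topology)
```

Rows whose service or endpoint is missing from the topology are skipped, so the corresponding Settings field simply keeps its default.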

Control-plane routes

alphaswarm_controller exposes the topology over HTTP:

| Route | Purpose |
| --- | --- |
| GET /manage/topology | Full snapshot (services + targets). |
| GET /manage/topology/services | Filterable service list (?role=, ?cluster=). |
| GET /manage/topology/services/{id} | Single descriptor (matched by id or alias). |
| GET /manage/topology/services/{id}/endpoint?name= | Resolve a named URL. |
| GET /manage/topology/services/{id}/health | Live provider probe. |
| GET /manage/topology/targets | List deployment targets. |
| POST /manage/topology/reload | Drop the cache and reload from disk (admin:cluster). |
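As an illustration of how a client might address these routes, the sketch below only builds request URLs using the standard library. The base URL, the helper names, and the example service id are assumptions; authentication and the response format are not covered here.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def topology_url(base_url: str, path: str, **query: Optional[str]) -> str:
    """Join the controller base URL, a route path, and optional query params."""
    params = {key: value for key, value in query.items() if value is not None}
    url = base_url.rstrip("/") + path
    return url + ("?" + urlencode(params) if params else "")


def services_url(
    base_url: str, role: Optional[str] = None, cluster: Optional[str] = None
) -> str:
    """URL for the filterable service list."""
    return topology_url(
        base_url, "/manage/topology/services", role=role, cluster=cluster
    )


def endpoint_url(base_url: str, service_id: str, name: str) -> str:
    """URL that resolves a named endpoint on one service."""
    path = "/manage/topology/services/" + quote(service_id, safe="") + "/endpoint"
    return topology_url(base_url, path, name=name)


# Hypothetical controller address and service id.
BASE = "http://controller.example:8080"
list_by_role = services_url(BASE, role="cache")
resolve_primary = endpoint_url(BASE, "example-cache", "primary")
```

The resulting strings can be passed to any HTTP client; the reload route additionally requires the admin:cluster permission noted in the table.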

The frontend at /admin/topology renders the topology grouped by role with a "Probe health" button per service.

Adding a new shared service

  1. Append a services: entry to alphaswarm_platform/configs/deployment/topology.yaml with cluster, namespace, protocols, and endpoints populated.
  2. Add the new Settings field in alphaswarm/config/settings.py (default "").
  3. Add a row to URL_FALLBACK_FIELDS mapping the new Settings field to the topology endpoint name.
  4. Add the namespace to targets.<env>.services so the topology round-trips for that environment.
  5. (Optional) Add a /cache/<category> populator on the MetadataPrefetcher so the <EntityPicker kind="<category>" /> in the frontend has dropdown data.
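Step 1 of the checklist above lends itself to a quick pre-commit check. The following is a minimal sketch, assuming a services: entry has already been parsed into a dict; missing_service_keys and the example entry are hypothetical and not part of the shared loader.

```python
from typing import Any, List, Mapping

# Keys that step 1 says must be populated on every services: entry.
REQUIRED_KEYS = ("cluster", "namespace", "protocols", "endpoints")


def missing_service_keys(entry: Mapping[str, Any]) -> List[str]:
    """Return the required keys that are absent or empty on a service entry."""
    return [key for key in REQUIRED_KEYS if not entry.get(key)]


# Hypothetical entry, shaped like one item under services: in topology.yaml.
complete_entry = {
    "id": "example-cache",
    "cluster": "example-cluster",
    "namespace": "example-namespace",
    "protocols": ["redis"],
    "endpoints": {"primary": "redis://cache.internal:6379"},
}
incomplete_entry = {"id": "example-cache", "cluster": "example-cluster", "endpoints": {}}

problems = missing_service_keys(incomplete_entry)
```

An empty result means the entry has all four fields populated; a non-empty list names what still needs to be filled in before continuing with steps 2 to 4.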