grafana

The platform's primary dashboard surface. Bundled with the kube-prometheus-stack and pre-loaded with datasources for Prometheus, VictoriaMetrics, Loki (logs), and Jaeger (traces).

Identity

Field       Value
Service id  grafana
Role        observability
Image       grafana/grafana (managed by kube-prometheus-stack)
Port        3000
Health      /api/health
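
As an illustration of the port and health path listed above, here is a minimal probe sketch. It assumes the service is reachable at a hypothetical base URL (for example through a local port-forward) and that the health endpoint returns Grafana's usual JSON body with a "database" field; the function names are illustrative, not part of the platform.

```python
import json
from urllib.request import urlopen


def is_healthy(payload: dict) -> bool:
    """Treat the instance as healthy when its database check reports 'ok'."""
    return payload.get("database") == "ok"


def check_grafana(base_url: str = "http://localhost:3000", timeout: float = 5.0) -> bool:
    """Call /api/health and return True only for a parseable, healthy response.

    base_url is an assumption (e.g. a port-forward to port 3000).
    """
    try:
        with urlopen(f"{base_url}/api/health", timeout=timeout) as resp:
            return is_healthy(json.load(resp))
    except (OSError, ValueError):
        # OSError covers connection failures and HTTP errors raised by urllib;
        # ValueError covers a body that is not valid JSON.
        return False


if __name__ == "__main__":
    print("healthy" if check_grafana() else "unhealthy")
```

Keeping the payload check separate from the network call makes the health rule easy to exercise without a running cluster.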

Deployment surfaces

Surface     Where
Kustomize   folded into observability/kube-prometheus-stack/ (Helm chart bundles Grafana)
Standalone  (none; Grafana is always shipped with the stack)

Datasources

Datasource              Backend
Prometheus              in-cluster Prometheus
VictoriaMetrics         in-cluster VM
Loki                    in-cluster Loki
Jaeger                  in-cluster Jaeger
Phoenix (when enabled)  Phoenix's Postgres backend (read-only)

Dashboards

Provisioned via ConfigMaps under observability/kube-prometheus-stack/dashboards/. Default set covers:

  • Cluster health (kube-state-metrics).
  • AlphaSwarm runtime (API latency, Celery queue depth, kill-switch state, terraform_run lag).
  • Per-service Linkerd proxy metrics.
  • Per-cell tenant overlays.

Custom dashboards land via PR, never via the Grafana UI alone: UI edits are wiped on the next reconciliation.
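
To make the ConfigMap-based provisioning concrete, the sketch below wraps a dashboard JSON model in a ConfigMap manifest that could be committed in a PR. The namespace, the `grafana_dashboard: "1"` discovery label, and the helper name are assumptions for illustration; the label mirrors a common kube-prometheus-stack sidecar default, but the value this repository actually uses should be checked against its Helm values.

```python
import json


def dashboard_configmap(
    name: str,
    dashboard: dict,
    namespace: str = "observability",   # assumed namespace
    label: str = "grafana_dashboard",   # assumed sidecar discovery label
) -> dict:
    """Build a ConfigMap manifest carrying one dashboard JSON document."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "labels": {label: "1"},
        },
        # One data key per dashboard file; sorted keys keep PR diffs stable.
        "data": {f"{name}.json": json.dumps(dashboard, indent=2, sort_keys=True)},
    }


if __name__ == "__main__":
    manifest = dashboard_configmap(
        "example-dashboard",  # hypothetical dashboard name
        {"title": "Example", "panels": []},
    )
    print(json.dumps(manifest, indent=2))
```

Generating the manifest from an exported dashboard model keeps the reviewed file and the provisioned dashboard identical, which is the point of routing changes through a PR.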

Operations

  • Auth: OIDC against the staff Entra tenant; the alphaswarm-staff group maps to admin, alphaswarm-operators to editor, and any other authenticated user to viewer.
  • Persistence: Grafana DB is SQLite by default (folded into the Helm chart); production cells point at a per-cell Postgres schema.
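
The group-to-role mapping described under Auth can be sketched as a small function. This only illustrates the precedence (admin first, then editor, then viewer as the fallback); how the groups arrive in the token, and the claim name, are assumptions. In Grafana itself this kind of mapping is typically expressed through the OAuth `role_attribute_path` setting rather than application code.

```python
from typing import Iterable

# Group names come from the Auth bullet above; role names follow Grafana's
# built-in organization roles.
ADMIN_GROUP = "alphaswarm-staff"
EDITOR_GROUP = "alphaswarm-operators"


def grafana_role(groups: Iterable[str]) -> str:
    """Map a user's group memberships to a Grafana role.

    The highest-privilege match wins; any other authenticated user
    falls through to Viewer.
    """
    members = set(groups)
    if ADMIN_GROUP in members:
        return "Admin"
    if EDITOR_GROUP in members:
        return "Editor"
    return "Viewer"


if __name__ == "__main__":
    for sample in (["alphaswarm-staff"], ["alphaswarm-operators"], ["other-team"], []):
        print(sample, "->", grafana_role(sample))
```

A user in both groups resolves to Admin here, since the admin check runs first.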

See also