grafana
The platform's primary dashboard surface. Bundled with the kube-prometheus-stack and pre-loaded with datasources for Prometheus, VictoriaMetrics, Loki (logs), and Jaeger (traces).
Identity
| Field | Value |
|---|---|
| Service id | grafana |
| Role | observability |
| Image | grafana/grafana (managed by kube-prometheus-stack) |
| Port | 3000 |
| Health | /api/health |
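
As a rough illustration of how the port and health path above might be probed, the sketch below requests `/api/health` and treats the instance as healthy only when the JSON payload reports `"database": "ok"`. The base URL `http://localhost:3000` is an assumption (for example, a local port-forward to the Grafana service), and the function names are hypothetical helpers, not part of the platform.

```python
import json
import urllib.request


def is_healthy(body: bytes) -> bool:
    """Return True when a health payload reports a working database."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not JSON (or not decodable): treat as unhealthy.
        return False
    return isinstance(payload, dict) and payload.get("database") == "ok"


def check_grafana(base_url: str = "http://localhost:3000", timeout: float = 5.0) -> bool:
    """Probe <base_url>/api/health; network or HTTP errors count as unhealthy.

    base_url is an assumed address, e.g. a local port-forward to port 3000.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/api/health", timeout=timeout) as resp:
            return resp.status == 200 and is_healthy(resp.read())
    except OSError:
        # URLError and HTTPError are both subclasses of OSError.
        return False
```

Keeping the payload check separate from the network call makes the parsing logic easy to exercise without a running Grafana.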
Deployment surfaces
| Surface | Where |
|---|---|
| Kustomize | folded into observability/kube-prometheus-stack/ (Helm chart bundles Grafana) |
| Standalone | (none — Grafana is always shipped with the stack) |
Datasources
| Datasource | Backend |
|---|---|
| Prometheus | in-cluster Prometheus |
| VictoriaMetrics | in-cluster VM |
| Loki | in-cluster Loki |
| Jaeger | in-cluster Jaeger |
| Phoenix (when enabled) | Phoenix's Postgres backend (read-only) |
Dashboards
Provisioned via ConfigMaps under observability/kube-prometheus-stack/dashboards/. The default set covers:
- Cluster health (kube-state-metrics).
- AlphaSwarm runtime (API latency, Celery queue depth, kill-switch state, terraform_run lag).
- Per-service Linkerd proxy metrics.
- Per-cell tenant overlays.
Custom dashboards land via PR — never via the Grafana UI alone (UI edits are wiped on the next reconciliation).
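
To make the ConfigMap route concrete, here is a minimal sketch that wraps a dashboard JSON model in a ConfigMap manifest suitable for committing in a PR. Several details are assumptions rather than facts from this page: the `grafana_dashboard: "1"` label (a commonly used default for the chart's dashboard sidecar, so check the stack's values), the `monitoring` namespace, the `grafana-dashboard-` name prefix, and the helper name itself.

```python
import json


def dashboard_configmap(name: str, dashboard: dict, namespace: str = "monitoring") -> dict:
    """Wrap a Grafana dashboard model in a ConfigMap manifest (as a dict).

    The label and namespace are assumptions; match them to what the
    stack's dashboard sidecar is actually configured to watch.
    """
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {
            "name": f"grafana-dashboard-{name}",
            "namespace": namespace,
            "labels": {"grafana_dashboard": "1"},  # assumed sidecar label
        },
        # One key per dashboard file; the value is the serialized JSON model.
        "data": {f"{name}.json": json.dumps(dashboard, indent=2, sort_keys=True)},
    }


if __name__ == "__main__":
    manifest = dashboard_configmap("celery-queues", {"title": "Celery queues", "panels": []})
    # JSON is valid YAML, so the output can be committed or applied as-is.
    print(json.dumps(manifest, indent=2))
```

Serializing with `sort_keys=True` keeps the generated file stable across runs, which keeps PR diffs small.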
Operations
- Auth: OIDC against the staff Entra tenant; the alphaswarm-staff group maps to admin, alphaswarm-operators to editor, and any other authenticated user to viewer.
- Persistence: Grafana DB is SQLite by default (folded into the Helm chart); production cells point at a per-cell Postgres schema.
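
The group-to-role rule in the Auth item can be sketched as a small lookup. This only illustrates the intended precedence and is not how Grafana itself is configured. It assumes that a user in both groups receives the higher role, and that group membership arrives as a list of group names; the function name is hypothetical.

```python
from typing import Iterable

# Ordered from highest to lowest privilege; the first match wins (assumed precedence).
ROLE_BY_GROUP = [
    ("alphaswarm-staff", "Admin"),
    ("alphaswarm-operators", "Editor"),
]


def grafana_role(groups: Iterable[str]) -> str:
    """Map an authenticated user's group names to a Grafana role."""
    members = set(groups)
    for group, role in ROLE_BY_GROUP:
        if group in members:
            return role
    # Any other authenticated user falls back to read-only access.
    return "Viewer"
```

For example, `grafana_role(["alphaswarm-operators"])` yields `"Editor"`, while a user with no matching group gets `"Viewer"`.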
See also
- prometheus.md, victoriametrics.md, loki.md, jaeger.md: backing datasources.
- observability-stack.md: stack composition.