prometheus

Prometheus is the cluster-internal metrics scraper. It is deployed via kube-prometheus-stack, which also installs the Prometheus Operator, Alertmanager, and the Grafana sidecar.

Identity

Field       Value
Service id  prometheus
Role        observability
Image       prom/prometheus (managed by kube-prometheus-stack)
Port        9090
Health      /-/ready
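
A minimal sketch of how the port and health path above combine into a readiness probe. The host name is an assumption (it presumes something like a port-forward to the Prometheus pod is already open), and the helper names ready_url and is_ready are hypothetical, not part of any existing tooling.

```python
import http.client
import urllib.error
import urllib.request


def ready_url(host: str, port: int = 9090) -> str:
    """Build the readiness URL from the Port and Health values in the Identity table."""
    return f"http://{host}:{port}/-/ready"


def is_ready(url: str, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers HTTP 200, False on any error or timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Non-2xx responses, refused connections, and timeouts all count as "not ready".
        return False


if __name__ == "__main__":
    # Assumption: "localhost" reaches Prometheus, e.g. through a local port-forward.
    url = ready_url("localhost")
    print(url, "ready" if is_ready(url) else "not ready")
```

The probe treats every failure mode the same way on purpose: a caller that only needs a yes/no answer should not have to distinguish a 503 from a refused connection.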

Deployment surfaces

Surface    Where
Kustomize  observability/kube-prometheus-stack/ (Helm-managed via kustomize HelmCharts overlay)
Compose    not in compose; local dev relies on victoriametrics for the small footprint

Scrape targets

The kube-prometheus-stack installs a default ServiceMonitor set; we extend it with:

  • alphaswarm-core /metrics (every API pod).
  • alphaswarm-worker /metrics (per Celery worker).
  • alphaswarm-cp /metrics.
  • KEDA metrics adapter on aqp-controller-operator and bots-operator.
  • Linkerd proxy metrics (mTLS-side).
  • Per-data-plane service exporters (Postgres exporter, Redis exporter, Kafka exporter, etc.).
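
The extensions listed above are typically expressed as ServiceMonitor objects. The sketch below builds one such manifest per AlphaSwarm service. The apiVersion, kind, and spec fields follow the Prometheus Operator ServiceMonitor resource; everything else is an assumption made for illustration: the app.kubernetes.io/name selector label, the "metrics" port name, the 30s interval, and the release label that the stack's Prometheus is assumed to select on.

```python
import json


def service_monitor(name: str, app_label: str, port_name: str = "metrics",
                    path: str = "/metrics", interval: str = "30s") -> dict:
    """Return a ServiceMonitor manifest as a plain dict."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {
            "name": name,
            # Assumption: Prometheus selects ServiceMonitors by this release label.
            "labels": {"release": "kube-prometheus-stack"},
        },
        "spec": {
            # Assumption: target Services carry this label; adjust to the real one.
            "selector": {"matchLabels": {"app.kubernetes.io/name": app_label}},
            "endpoints": [{"port": port_name, "path": path, "interval": interval}],
        },
    }


# Service names taken from the list above; one monitor per service.
SERVICES = ["alphaswarm-core", "alphaswarm-worker", "alphaswarm-cp"]

if __name__ == "__main__":
    for svc in SERVICES:
        # JSON output is accepted by kubectl in place of YAML.
        print(json.dumps(service_monitor(svc, svc), indent=2))
```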

Long-term storage

Prometheus runs with 30-day local retention; VictoriaMetrics is the long-term store and the remote-write target. During the parallel cutover, both sides receive samples; once the cutover is declared, local Prometheus retention drops to 7 days.
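
As a rough illustration of the two retention states, the sketch below renders the relevant slice of kube-prometheus-stack Helm values before and after the cutover. The prometheus.prometheusSpec.retention and remoteWrite keys are the chart's own; the VictoriaMetrics host name is hypothetical, and the port and path assume a single-node VictoriaMetrics with its default remote-write endpoint, which may not match the actual deployment.

```python
import json

# Assumption: single-node VictoriaMetrics on its default port; the host is hypothetical.
REMOTE_WRITE_URL = "http://victoriametrics.observability.svc:8428/api/v1/write"


def prometheus_values(cutover_declared: bool,
                      remote_write_url: str = REMOTE_WRITE_URL) -> dict:
    """Return the Helm values fragment for retention and remote-write."""
    # 30 days while both sides run in parallel, 7 days once the cutover is declared.
    retention = "7d" if cutover_declared else "30d"
    return {
        "prometheus": {
            "prometheusSpec": {
                "retention": retention,
                # Remote-write stays enabled in both states.
                "remoteWrite": [{"url": remote_write_url}],
            }
        }
    }


if __name__ == "__main__":
    for declared in (False, True):
        print(json.dumps(prometheus_values(declared), indent=2))
```

Keeping the retention switch in one function makes the cutover a single-flag change rather than an edit to two places in the values.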

Operations

  • Alertmanager: receives alerts fired by the kube-prometheus-stack default alert set plus the AlphaSwarm-specific rules under observability/kube-prometheus-stack/alerts/.
  • Federation: disabled. The long-term path is remote-write to VictoriaMetrics, not federation.
  • PromQL recording rules: kept under observability/kube-prometheus-stack/rules/; agent-emitted ad-hoc rules are forbidden.
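
To make the recording-rules item concrete, here is a hedged sketch of what a checked-in rule could look like as a PrometheusRule manifest. The resource shape (spec.groups[].rules[] with record and expr) follows the Prometheus Operator, and the name check follows the level:metric:operations convention from the Prometheus documentation. The rule itself, the group name, and the http_requests_total metric are hypothetical examples, not rules from the repository.

```python
import json
import re

# level:metric:operations, the naming convention recommended for recording rules.
RECORD_NAME = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]*:[a-zA-Z_][a-zA-Z0-9_]*:[a-zA-Z0-9_]+$")


def recording_rule_manifest(name: str, group: str, rules: dict) -> dict:
    """Build a PrometheusRule manifest from a {record_name: promql_expr} mapping."""
    for record in rules:
        if not RECORD_NAME.match(record):
            raise ValueError(f"recording rule name {record!r} is not level:metric:operations")
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {"name": name},
        "spec": {
            "groups": [{
                "name": group,
                "rules": [{"record": r, "expr": e} for r, e in rules.items()],
            }]
        },
    }


if __name__ == "__main__":
    # Hypothetical rule: per-job request rate over five minutes.
    manifest = recording_rule_manifest(
        "alphaswarm-recording",
        "alphaswarm.rules",
        {"job:http_requests:rate5m": "sum by (job) (rate(http_requests_total[5m]))"},
    )
    print(json.dumps(manifest, indent=2))
```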

See also