Runbook — QuestDB WAL apply stall
Symptoms:
- The questdb_wal_apply_lag_seconds Prometheus metric is above 60s.
- New dbt model materialization runs hang on INSERT.
- The QuestDB UI shows that WAL applied = N is no longer advancing.
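The first symptom can be checked by hand against a scrape of the metrics endpoint. The sketch below is a minimal illustration, not the project's alerting logic: it assumes the metric is exposed in the Prometheus text format with a table label (as the alert label in the Recovery section suggests). The function name stalled_tables is hypothetical.

```python
import re

# Matches sample lines such as:
#   questdb_wal_apply_lag_seconds{table="equities_minute_bars"} 182.4
# The metric name comes from this runbook; the label layout is an assumption.
_SAMPLE = re.compile(
    r'^questdb_wal_apply_lag_seconds\{[^}]*table="(?P<table>[^"]+)"[^}]*\}'
    r'\s+(?P<value>\S+)'
)


def stalled_tables(metrics_text, threshold_s=60.0):
    """Return {table: lag_seconds} for tables whose lag exceeds threshold_s."""
    stalled = {}
    for line in metrics_text.splitlines():
        match = _SAMPLE.match(line.strip())
        if match is None:
            continue  # comment, blank line, or a different metric
        lag = float(match.group("value"))
        if lag > threshold_s:
            stalled[match.group("table")] = lag
    return stalled
```

Passing the raw scrape body to stalled_tables returns only the tables above the 60s threshold, which is the same table name used in the recovery steps.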
Root cause
A long-running query or an external table lock has blocked the WAL
apply worker. The QuestDB documentation explicitly warns:
"Non-partitioned tables cannot use WAL". The AlphaSwarm custom
questdb_table materialization forces PARTITION BY DAY to
avoid the most common form of this failure, but mis-configured
external tables can still trip the apply loop.
Recovery
- Identify the offending table from the Prometheus alert label:
  {table="equities_minute_bars"}
- Suspend writers to that table:
  alphaswarm ratelimit admin halt-pool questdb_writer:equities_minute_bars
- Resume the WAL apply loop:
  ALTER TABLE equities_minute_bars RESUME WAL;
- Once the lag drops back below 5s, lift the writer halt:
  alphaswarm ratelimit admin resume-pool questdb_writer:equities_minute_bars
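The ordering of the recovery steps matters: writers stay halted until the lag has actually recovered. A minimal sketch of that sequencing follows. It assumes the caller supplies three hypothetical callables (run_cli, run_sql, read_lag_s) for executing the CLI, running SQL, and reading the lag metric; the timeout and poll interval are illustrative defaults, not values from this runbook.

```python
import time


def recover_table(table, run_cli, run_sql, read_lag_s,
                  resume_below_s=5.0, poll_s=5.0, timeout_s=600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Run the recovery steps in order; return True once writers are resumed.

    run_cli, run_sql and read_lag_s are hypothetical callables supplied by
    the caller; this sketch does not assume any particular client library.
    `table` must be a trusted identifier, since it is interpolated into SQL.
    """
    pool = f"questdb_writer:{table}"
    # Suspend writers to the table.
    run_cli(["alphaswarm", "ratelimit", "admin", "halt-pool", pool])
    # Resume the WAL apply loop.
    run_sql(f"ALTER TABLE {table} RESUME WAL;")

    # Lift the halt only after the lag drops below the threshold.
    deadline = clock() + timeout_s
    while clock() < deadline:
        if read_lag_s(table) < resume_below_s:
            run_cli(["alphaswarm", "ratelimit", "admin", "resume-pool", pool])
            return True
        sleep(poll_s)
    # Lag never recovered: leave writers halted so an operator can escalate.
    return False
```

A False return leaves the pool halted on purpose, so a stall that does not clear is never masked by resuming writers.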
Prevention
The Phase 2 alphaswarm/dagster/dagster.yaml reserves a per-table
questdb_writer:<table> pool with limit=1, so concurrent
writers to the same table are impossible. Verify that the pool is
present and has limit=1.
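The verification can be expressed as a small check. The sketch below assumes the pool definitions have already been extracted from dagster.yaml into a plain mapping of pool name to limit. The layout of that file is not shown in this runbook, so the extraction step and the name check_writer_pool are hypothetical.

```python
def check_writer_pool(pools, table):
    """Return a list of problems with the table's writer pool (empty means OK).

    `pools` is assumed to be a mapping of pool name to limit, already
    extracted from dagster.yaml by the caller.
    """
    name = f"questdb_writer:{table}"
    if name not in pools:
        return [f"pool {name} is missing"]
    if pools[name] != 1:
        return [f"pool {name} has limit={pools[name]}, expected 1"]
    return []
```

An empty list means the pool exists with limit=1. Any other result names the specific problem to fix before writers are re-enabled.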