Saltar al contenido principal

KBRuntime + KBCorpusSpec

Hash-locked spec

KBCorpusSpec is a Pydantic v2 model that describes a single corpus: the memory engine alias + kwargs, vector / graph / relational store aliases, ACL evaluator + policy engine, layer scope, extraction knobs, retention policy, and an optional Iceberg namespace for gold-tier mirroring.

name: research_papers
tenant_id: 00000000-0000-0000-0000-000000000010
memory_engine: { kb_alias: hierarchical_rag }
vector_store: { kb_alias: pgvector, collection: research_papers }
graph_store: { kb_alias: neo4j }
acl: { evaluator_alias: native, policy_alias: opa }
layer: { scope: private, marketplace_publishable: false }
extraction: { enable_spacy: true, enable_llm: true }
retention: { soft_delete_after_days: 90, hard_delete_after_days: 1095 }

The SHA-256 of the canonical JSON dump (sorted keys, UTF-8) anchors the immutable kb_corpus_spec_versions row. Re-snapshotting via registry.persist_spec(spec) inserts a new version row when the hash changes (hard rule 57). The previous version stays for replay / audit.

KBRuntime

KBRuntime.execute(req, ctx) is the only sanctioned path through which the five lifecycle actions (remember, recall, compose_recall, improve, forget) run:

from alphaswarm_kb.runtime import KBRunRequest, runtime_for

runtime = runtime_for("research_papers")
result = await runtime.execute(
KBRunRequest(action="recall", corpus_name="research_papers",
payload="What is GraphRAG?", top_k=5),
tenant_ctx,
)

Every call:

  1. Halts if the kill-switch flag is set (trigger_halt()kb_runs row with status="halted").
  2. Snapshots the spec via persist_spec so the resulting kb_runs row references the immutable spec version.
  3. Resolves the IMemoryEngine via the composition-root container.
  4. Executes the requested action.
  5. Writes the kb_runs row carrying experiment_id + test_id (rule 34) and the elapsed-ms / status / error envelope.

Wrappers

  • Celery: alphaswarm_kb.tasks.kb_tasks.{remember_async,recall_async,improve_async,forget_async,evaluate_async,compose_recall_async} wrap KBRuntime.execute with _progress.emit (rule 4).
  • REST: POST /kb/corpora/{name}/{remember,recall,improve,forget} mounted at /kb by the monolith FastAPI app.
  • DataMCP: data.kb.* tools (rule 59) — the only path agents should use.
  • WebSocket: /kb/corpora/{name}/recall/stream for live recall streaming.

Halt + kill-switch

POST /kb/halt sets the in-process halt flag. Any subsequent KBRuntime.execute call raises HaltedError and writes a kb_runs row with status="halted". The topbar KillSwitch component fans out to /kb/halt alongside every other halt endpoint.