Model serving
AlphaSwarm ships three serving adapters. All three share the same
ModelDeployment protocol, so call sites, the CLI (alphaswarm serve ...),
and the REST API speak one vocabulary regardless of the runtime underneath.
| Backend | Adapter | CLI | Best for |
|---|---|---|---|
| MLflow Models | MLflowServeDeployment | alphaswarm serve mlflow <uri> | any flavor logged with mlflow.log_model, low-throughput research |
| Ray Serve | RayServeDeployment | alphaswarm serve ray <uri> | horizontally scaled batch inference |
| TorchServe | TorchServeDeployment | alphaswarm serve torchserve <uri> | low-latency PyTorch endpoints + batching |
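To make the idea of a shared protocol concrete, here is a minimal sketch of how one interface can sit in front of several backends. The method names (deploy, predict, shutdown), the model_uri attribute, and the EchoDeployment toy backend are assumptions made for illustration; the actual ModelDeployment definition in AlphaSwarm may differ.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class ModelDeployment(Protocol):
    """Hypothetical shape of the shared protocol; member names are assumptions."""

    model_uri: str

    def deploy(self) -> None: ...

    def predict(self, payload: Any) -> Any: ...

    def shutdown(self) -> None: ...


class EchoDeployment:
    """Toy in-process backend, used only to show structural conformance."""

    def __init__(self, model_uri: str) -> None:
        self.model_uri = model_uri
        self.running = False

    def deploy(self) -> None:
        self.running = True

    def predict(self, payload: Any) -> Any:
        if not self.running:
            raise RuntimeError("deploy() must be called before predict()")
        return payload  # a real backend would run the model here

    def shutdown(self) -> None:
        self.running = False


def serve_once(deployment: ModelDeployment, payload: Any) -> Any:
    """Call-site code depends only on the protocol, not on a concrete backend."""
    deployment.deploy()
    try:
        return deployment.predict(payload)
    finally:
        deployment.shutdown()
```

Because the protocol is structural, a backend does not need to inherit from anything; it only has to expose the expected members, which is what lets the CLI and REST layers stay backend-agnostic.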
Model URIs
All adapters accept three URI shapes:
- Filesystem path: ./data/models/alpha_v1.pkl
- MLflow run: runs:/<run-id>/<artifact>
- MLflow registry: models:/<name>/<stage> or models:/<name>/<version>
MLflow URIs are resolved via alphaswarm.mlops.serving.base.resolve_model, which
either downloads the artifact locally when a backend needs filesystem
access (TorchServe packaging) or passes the URI through unchanged (MLflow Serve).
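The resolution rule above can be sketched as a small decision function. This is not the implementation of resolve_model; classify_model_uri, plan_resolution, and the returned labels are hypothetical names, and the sketch only decides what should happen without contacting MLflow.

```python
def classify_model_uri(uri: str) -> str:
    """Map a model URI onto one of the three accepted shapes."""
    if uri.startswith("runs:/"):
        return "mlflow_run"
    if uri.startswith("models:/"):
        return "mlflow_registry"
    # Anything without an MLflow scheme is treated as a filesystem path.
    return "filesystem"


def plan_resolution(uri: str, needs_filesystem: bool) -> str:
    """Decide how a backend should obtain the model (hypothetical labels).

    needs_filesystem is True for backends that must read files from disk,
    such as TorchServe packaging, and False for backends that accept an
    MLflow URI directly, such as MLflow Serve.
    """
    if classify_model_uri(uri) == "filesystem":
        return "use_path"  # already on disk, nothing to resolve
    if needs_filesystem:
        return "download"  # fetch the artifact to a local directory first
    return "pass_through"  # hand the URI to the backend unchanged


print(plan_resolution("models:/alphaswarm-lstm/Production", needs_filesystem=True))
print(plan_resolution("models:/alphaswarm-lgbm/Production", needs_filesystem=False))
```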
PreprocessingSpec propagation
Every adapter honours the PreprocessingSpec attached to the model. At inference time:
- MLflow Serve: flavor-specific (pyfunc handlers are expected to re-apply preprocessing inside the model class).
- Ray Serve: the generated deployment loads the pickle and delegates to model.predict(df); when model.preprocessing_spec is set, the apply call happens in __call__ before predict.
- TorchServe: the auto-generated AqpBaseHandler checks for a preprocessing_spec attribute and runs spec.apply(df) before every call.
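The apply-before-predict pattern used by the Ray Serve and TorchServe adapters can be illustrated without either runtime. In this sketch, ScaleSpec, SumModel, and PreprocessingDeployment are hypothetical stand-ins, and plain lists of dicts take the place of a DataFrame so the example has no dependencies; only the preprocessing_spec attribute and the apply and predict method names come from the description above.

```python
from typing import Dict, List, Optional

Rows = List[Dict[str, float]]


class ScaleSpec:
    """Hypothetical stand-in for a PreprocessingSpec: multiplies one column."""

    def __init__(self, column: str, factor: float) -> None:
        self.column = column
        self.factor = factor

    def apply(self, rows: Rows) -> Rows:
        # Return new rows so the caller's input is not mutated.
        return [{**row, self.column: row[self.column] * self.factor} for row in rows]


class SumModel:
    """Toy model whose prediction is the sum of each row's features."""

    def __init__(self, preprocessing_spec: Optional[ScaleSpec] = None) -> None:
        self.preprocessing_spec = preprocessing_spec

    def predict(self, rows: Rows) -> List[float]:
        return [sum(row.values()) for row in rows]


class PreprocessingDeployment:
    """Applies the model's spec, if one is attached, before delegating to predict."""

    def __init__(self, model: SumModel) -> None:
        self.model = model

    def __call__(self, rows: Rows) -> List[float]:
        spec = getattr(self.model, "preprocessing_spec", None)
        if spec is not None:
            rows = spec.apply(rows)
        return self.model.predict(rows)


rows = [{"a": 1.0, "b": 2.0}]
print(PreprocessingDeployment(SumModel())(rows))
print(PreprocessingDeployment(SumModel(ScaleSpec("a", 10.0)))(rows))
```

Looking the spec up with getattr keeps the wrapper usable for models that were saved without any preprocessing attached.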
Quick start
# Train something and log to MLflow
python scripts/train_agent.py --config configs/ml/lgbm.yaml
# Serve the latest production version via MLflow
alphaswarm serve mlflow models:/alphaswarm-lgbm/Production --port 5001
# Or via Ray Serve
alphaswarm serve ray models:/alphaswarm-lgbm/Production --num-replicas 4
# Or package for TorchServe
alphaswarm serve torchserve models:/alphaswarm-lstm/Production --model-name alphaswarm-lstm
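Once the MLflow backend is running, a client can score rows over HTTP. The sketch below assumes that alphaswarm serve mlflow exposes the standard MLflow 2.x scoring endpoint (POST /invocations with a dataframe_split JSON body); that assumption, the host, and the feature column names are illustrative and not confirmed by this page. The request is only built here; the commented lines show how it might be sent.

```python
import json
import urllib.request
from typing import List


def build_invocation_request(
    host: str, port: int, columns: List[str], rows: List[List[float]]
) -> urllib.request.Request:
    """Build a POST request in MLflow's dataframe_split scoring format."""
    body = json.dumps({"dataframe_split": {"columns": columns, "data": rows}})
    return urllib.request.Request(
        url=f"http://{host}:{port}/invocations",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Port 5001 matches the quick start; the column names are hypothetical.
request = build_invocation_request(
    "127.0.0.1", 5001, ["feature_a", "feature_b"], [[0.1, 0.2]]
)

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     predictions = json.loads(response.read())
```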
Kubernetes
Manifests and Helm values for deploying each backend to the
rpi_kubernetes cluster live under deploy/kubernetes/serving/ and are
described in alphaswarm_docs/docs/how-to/mlops/k8s-deployment.md
(Phase 5).