Interactive ML testing workbench

Superseded by strategy-development.md. The webui /ml/test page is preserved for legacy bookmarks, but the canonical surfaces now live as sibling sub-routes of /strategy-development/* on the new Vite frontend. The endpoint table below remains authoritative; only the frontend changed.

The /ml/test page lets users validate deployed models with single rows, batch slices, A/B comparisons, perturbation sweeps, CSV uploads, and live streaming — all wired through the same DeployedModelAlpha runtime that production strategies use.

Tabs

| Tab | Endpoint(s) | Behaviour |
|---|---|---|
| Single Predict | POST /ml/test/single (sync) | Score one row, render score + sign |
| Batch | POST /ml/test/batch (Celery) + POST /ml/test/upload-csv | Iceberg slice or uploaded CSV scoring |
| A/B Compare | POST /ml/test/compare (Celery) | Side-by-side signals + agreement rate |
| Scenario / What-if | POST /ml/test/scenario (sync) | Per-feature ±N% perturbation table + heatmap |
| Historical | POST /ml/evaluate (Celery) | Existing offline eval flow |
| Live | POST /ml/live-test/start + WS bridge | Stream bars / signals from a venue |
| Models | n/a | Tabular ModelVersion browser |
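
The A/B Compare tab reports an agreement rate, but the page does not define it. The sketch below shows one plausible reading, assuming that two models "agree" on a row when their scores have the same sign. The helper names `sign` and `agreement_rate` are hypothetical and are not taken from the alphaswarm codebase.

```python
from typing import Sequence


def sign(x: float) -> int:
    """Return -1, 0, or +1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def agreement_rate(scores_a: Sequence[float], scores_b: Sequence[float]) -> float:
    """Fraction of rows where both models emit the same signal sign.

    Assumption: "agreement" means matching sign; the real task may
    use a different definition (e.g. a threshold on the score gap).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score series must have the same length")
    if not scores_a:
        return 0.0
    matches = sum(sign(a) == sign(b) for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)


# Three of the four rows share a sign, so the rate is 0.75.
rate = agreement_rate([0.2, -0.1, 0.5, -0.3], [0.1, 0.4, 0.9, -0.2])
```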

Backend

alphaswarm/tasks/ml_test_tasks.py hosts the Celery tasks (queue ml):

  • predict_single — single-row inference
  • predict_batch — Iceberg slice scoring
  • compare_models — A/B between two model_version_ids
  • scenario_perturbation — sensitivity table

Each task routes through DeployedModelAlpha._predict, so both the dataset-driven and the legacy indicator-zoo paths work.
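
To make the sensitivity table concrete, here is a minimal sketch of a per-feature perturbation sweep. It assumes each perturbation is a relative change (so -0.1 scales a feature by 0.9) and that a delta is the perturbed score minus the baseline score. `perturbation_table` and `toy_predict` are hypothetical names; `toy_predict` is a stand-in linear scorer, not DeployedModelAlpha._predict.

```python
from typing import Callable, Dict, List

FeatureRow = Dict[str, float]


def perturbation_table(
    predict: Callable[[FeatureRow], float],
    feature_row: FeatureRow,
    perturbations: List[float],
) -> List[dict]:
    """Score the row once per (feature, perturbation) pair.

    Assumption: each perturbation is a relative change, so -0.1
    multiplies the feature value by 0.9.
    """
    baseline = predict(feature_row)
    rows = []
    for name, value in feature_row.items():
        for pct in perturbations:
            perturbed = dict(feature_row)  # copy so the input row is untouched
            perturbed[name] = value * (1.0 + pct)
            score = predict(perturbed)
            rows.append(
                {
                    "feature": name,
                    "perturbation": pct,
                    "score": score,
                    "delta": score - baseline,
                }
            )
    return rows


def toy_predict(row: FeatureRow) -> float:
    """Hypothetical stand-in for the deployed model: a fixed linear scorer."""
    return 2.0 * row["f1"] - 1.0 * row["f2"]


table = perturbation_table(toy_predict, {"f1": 0.1, "f2": -0.4}, [-0.1, 0.0, 0.1])
```

The result has one row per feature and perturbation (six rows here), which is the shape a sortable table or a heatmap can consume directly.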

Sample REST calls

# Single prediction (sync)
curl -X POST http://localhost:8000/ml/test/single \
  -H 'content-type: application/json' \
  -d '{"deployment_id": "...", "feature_row": {"f1": 0.1, "f2": -0.4}, "sync": true}'

# Scenario sweep
curl -X POST http://localhost:8000/ml/test/scenario \
  -H 'content-type: application/json' \
  -d '{"deployment_id": "...", "feature_row": {"f1": 0.1, "f2": -0.4}, "perturbations": [-0.1, 0, 0.1]}'

# CSV upload (multipart)
curl -X POST 'http://localhost:8000/ml/test/upload-csv?deployment_id=...' \
  -F 'file=@features.csv'

The CSV upload path is capped via settings.ml_workbench_max_csv_mb (default 20 MB).
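
A client can check a file against the cap before uploading. The sketch below is a hypothetical client-side helper, not the server's enforcement code; it assumes 1 MB means 1024 * 1024 bytes, which the server may or may not share.

```python
MAX_CSV_MB_DEFAULT = 20  # default of settings.ml_workbench_max_csv_mb per the text above


def csv_within_cap(size_bytes: int, max_csv_mb: int = MAX_CSV_MB_DEFAULT) -> bool:
    """Return True when an upload of size_bytes fits under the cap.

    Assumption: 1 MB is treated as 1024 * 1024 bytes; the server
    might use 10**6 instead.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return size_bytes <= max_csv_mb * 1024 * 1024
```

In practice the size would come from something like `os.path.getsize("features.csv")` before the multipart request is sent.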

Visualisations

The webui renders results with recharts (already a dependency):

  • Single Predict — Descriptions card with score + bias tag.
  • Scenario — BarChart of deltas + sortable Ant Design table.
  • Live — line chart overlaying bar close and signal strength, plus a list of recent events.

Where this gets called from

  • Standalone: /ml/test.
  • ML Builder: a Test* node on the canvas serializes to the matching /ml/test/* endpoint.
  • AlphaBacktestExperiment: when train_first=true, it stamps the new deployment ID on MLAlphaBacktestRun, so the next visit to /ml/test can score against that deployment directly.