Lightweight drift and anomaly monitoring for production ML models.

canary-ml

Drop-in drift and anomaly monitoring for production ML models.

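One line wraps your model. Every .predict() call logs drift metrics, detects anomalies, and fires an alert when something shifts. Monitoring is queued to a background thread, so .predict() returns without waiting for the drift checks. No external services are required: reports are written to a local log directory, and the dashboard uses the standard-library HTTP server.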


Install

pip install canary-ml

Requires Python 3.9+. Dependencies: numpy, scipy, scikit-learn, rich, tensorflow.


Quickstart

from canary_ml import ModelMonitor

monitor = ModelMonitor(
    model=your_model,           # any sklearn-compatible model
    reference_data=X_train,     # baseline distribution
    alert_threshold=0.2,        # PSI threshold for alerts
    log_path="./canary_logs",
    verbose=True,
)

# Drop-in replacement — monitoring runs in the background
predictions = monitor.predict(X_new)

# Inspect the latest report
report = monitor.get_report()
print(report.summary())
# DriftReport | psi=0.41 | features_drifted=3/8 | anomaly_rate=3.2% | ALERT

# Launch the live dashboard
monitor.serve_dashboard(port=8501)
# → http://localhost:8501

What it monitors

  • PSI — global distribution shift. < 0.1 stable · 0.1–0.2 moderate · > 0.2 alert. Requires ≥ 200 samples per batch; use drift_detected (KS-based) for smaller batches.
  • KS test — per-feature Kolmogorov-Smirnov (continuous features, p < 0.05 = drift). Note: with many features, expect ~5% false positives per feature under the null; drift_detected and features_drifted will occasionally fire on clean data at scale.
  • Chi² test — per-feature chi-squared (categorical features, ≤ 20 unique values).
  • Anomaly detection — ensemble of Isolation Forest + z-score (|z| > 3).
  • Confidence estimate — label-free accuracy proxy from predicted probabilities. Accurate when probabilities are well-calibrated; overestimates if the model is overconfident.
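
For readers unfamiliar with PSI, the thresholds above can be tried out with a small NumPy sketch. This is a textbook Population Stability Index over quantile bins of the reference sample, not canary-ml's internal code; the function name psi, the 10-bin default, and the eps clipping value are all assumptions made for illustration.

```python
import numpy as np

def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two 1-D samples (illustrative sketch)."""
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)
    # Interior bin edges are reference quantiles, so each bin holds roughly
    # the same share of the reference sample.
    interior = np.quantile(reference, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    # searchsorted maps every value to a bin index in [0, n_bins - 1];
    # values outside the reference range fall into the first or last bin.
    ref_counts = np.bincount(np.searchsorted(interior, reference), minlength=n_bins)
    cur_counts = np.bincount(np.searchsorted(interior, current), minlength=n_bins)
    # Clip the bin fractions to avoid log(0) and division by zero for empty bins.
    ref_frac = np.clip(ref_counts / reference.size, eps, None)
    cur_frac = np.clip(cur_counts / current.size, eps, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))

rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, size=5000)
same_dist = rng.normal(0.0, 1.0, size=5000)   # drawn from the same distribution
shifted = rng.normal(1.0, 1.0, size=5000)     # mean moved by one standard deviation

psi_same = psi(baseline, same_dist)
psi_shifted = psi(baseline, shifted)
```

With small batches the per-bin fractions become noisy, which is one plausible reason for the minimum batch size noted in the PSI bullet.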

Alert callback

def my_alert(report):
    send_slack(f"Drift alert: PSI={report.psi_score:.2f}")

monitor = ModelMonitor(..., on_alert=my_alert)
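
A slightly fuller callback might report why the alert fired. The sketch below uses the standard logging module in place of send_slack and a SimpleNamespace stand-in for a real report. The attribute names psi_score, alert_reasons, and timestamp come from the DriftReport table; format_alert, log_alert, and the logger name are hypothetical.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("canary_alerts")  # hypothetical logger name

def format_alert(report):
    """Build a one-line message from DriftReport-style attributes."""
    reasons = ", ".join(report.alert_reasons) or "unspecified"
    return f"[{report.timestamp}] canary alert ({reasons}): PSI={report.psi_score:.2f}"

def log_alert(report):
    """Callback that could be passed as on_alert; it swallows its own errors."""
    try:
        logger.warning(format_alert(report))
    except Exception:
        logger.exception("alert callback failed")

# Stand-in object for illustration only; a real DriftReport comes from the monitor.
fake_report = SimpleNamespace(
    psi_score=0.41,
    alert_reasons=["drift", "anomaly"],
    timestamp="2024-01-01T00:00:00",
)
message = format_alert(fake_report)
log_alert(fake_report)
```

Keeping the callback short and exception-safe is a design suggestion, not a documented requirement of the library.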

Dashboard

monitor.serve_dashboard(port=8501)

Stdlib HTTP server, no extra dependencies. Auto-refreshes every 5 seconds. Can also run standalone:

python -m canary_ml.server ./canary_logs 8501

API reference

ModelMonitor

ModelMonitor(
    model,                      # sklearn-compatible model with .predict()
    reference_data,             # np.ndarray or pd.DataFrame, shape (n, features)
    alert_threshold=0.2,        # PSI threshold for drift alert
    performance_threshold=0.05, # accuracy drop (pp) below reference that fires a perf alert
    anomaly_contamination=0.05, # expected fraction of anomalies; alert fires at 4× this rate
    categorical_threshold=20,   # max unique values for a feature to be treated as categorical
    store_samples=True,         # set False to skip storing raw feature rows (recommended in PII-sensitive envs)
    log_path="./canary_logs",
    verbose=False,
    on_alert=None,              # callable(DriftReport) fired on alert
)

Method                       Returns             Description
.predict(X)                  same as model       Runs model; monitoring queued in background thread
.get_report()                DriftReport | None  Latest monitoring report
.serve_dashboard(port=8501)                      Starts dashboard server in background thread
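
The "monitoring queued in background thread" behaviour can be pictured as a queue plus one worker thread. The sketch below shows that general pattern only; it is not canary-ml's implementation, and BackgroundMonitor, flush, MeanModel, and the analyze callable are hypothetical names.

```python
import queue
import threading

import numpy as np

class BackgroundMonitor:
    """Hypothetical wrapper: predict on the caller's thread, analyse on a worker."""

    def __init__(self, model, analyze):
        self._model = model
        self._analyze = analyze              # callable(batch) -> report
        self._queue = queue.Queue()
        self._latest = None
        self._lock = threading.Lock()
        threading.Thread(target=self._run, daemon=True).start()

    def predict(self, X):
        preds = self._model.predict(X)       # returned without waiting for analysis
        self._queue.put(np.array(X, copy=True))  # the worker gets its own copy
        return preds

    def get_report(self):
        with self._lock:
            return self._latest

    def flush(self):
        """Block until every queued batch has been analysed (handy in tests)."""
        self._queue.join()

    def _run(self):
        while True:
            batch = self._queue.get()
            try:
                report = self._analyze(batch)
                with self._lock:
                    self._latest = report
            except Exception:
                pass  # a monitoring failure should not kill the worker
            finally:
                self._queue.task_done()

class MeanModel:
    """Toy model used only to exercise the wrapper."""
    def predict(self, X):
        return np.asarray(X).mean(axis=1)

monitor = BackgroundMonitor(MeanModel(), analyze=lambda batch: {"n_rows": len(batch)})
preds = monitor.predict(np.ones((4, 3)))
monitor.flush()
report = monitor.get_report()
```

In this pattern the latest report can lag behind the most recent predict() call, and get_report() returns None until the first batch has been processed, which matches the DriftReport | None return type listed above.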

DriftReport

Attribute            Type          Description
psi_score            float         Global PSI vs reference
drift_detected       bool          True if any feature's KS/chi² p < 0.05 (soft warning)
ks_results           dict          Per-feature {statistic, p_value, drifted}
features_drifted     int           Count of features with p < 0.05 (computed property)
anomaly_rate         float         Fraction of samples flagged as anomalies
alert_triggered      bool          True if PSI > threshold, anomaly rate is high, or performance drops
alert_reasons        list[str]     Which conditions fired: "drift", "anomaly", "performance"
estimated_accuracy   float | None  Confidence estimate; None if no predict_proba
reference_accuracy   float | None  Confidence estimate on reference data
performance_delta    float | None  estimated_accuracy − reference_accuracy
performance_alert    bool          True if delta < −performance_threshold
timestamp            str           ISO 8601
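
To make the relationship between estimated_accuracy, reference_accuracy, performance_delta, and performance_alert concrete, here is a minimal sketch. It assumes the confidence estimate is the mean top-class probability, a common label-free proxy; whether canary-ml computes it exactly this way is an assumption, and both function names are hypothetical.

```python
import numpy as np

def estimated_accuracy(proba):
    """Mean top-class probability; tracks accuracy only if probabilities are calibrated."""
    proba = np.asarray(proba, dtype=float)
    return float(proba.max(axis=1).mean())

def performance_check(current_proba, reference_proba, performance_threshold=0.05):
    """Compare the proxy on a new batch against the proxy on reference data."""
    est = estimated_accuracy(current_proba)
    ref = estimated_accuracy(reference_proba)
    delta = est - ref
    return {
        "estimated_accuracy": est,
        "reference_accuracy": ref,
        "performance_delta": delta,
        # Alert only when the proxy drops by more than the threshold.
        "performance_alert": delta < -performance_threshold,
    }

# Toy predict_proba outputs: the new batch is noticeably less confident.
reference_proba = np.array([[0.90, 0.10], [0.20, 0.80], [0.95, 0.05]])
current_proba = np.array([[0.60, 0.40], [0.55, 0.45], [0.70, 0.30]])
result = performance_check(current_proba, reference_proba)
```

An overconfident model pushes both proxies up, so the delta can stay informative even when the absolute estimate is too optimistic.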

DriftReport is not directly JSON-serialisable. For logging, call report.to_dict() and pass the result to json.dumps(). Dict-style access (report["psi_score"]) is also supported.
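
One way to use to_dict() is to append each report to a JSON Lines file. The sketch below uses a small dataclass as a stand-in for DriftReport so that it runs without the library; FakeReport, append_jsonl, and the file name are hypothetical.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass

@dataclass
class FakeReport:
    """Stand-in for DriftReport, for illustration only."""
    psi_score: float
    alert_triggered: bool
    timestamp: str

    def to_dict(self):
        return asdict(self)

def append_jsonl(report, path):
    """Append one report to the file as a single JSON line."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(report.to_dict()) + "\n")

path = os.path.join(tempfile.mkdtemp(), "reports.jsonl")
append_jsonl(FakeReport(0.41, True, "2024-01-01T00:00:00"), path)

# Read the log back: one dict per line.
with open(path, encoding="utf-8") as fh:
    rows = [json.loads(line) for line in fh]
```

With a real monitor, the object passed to append_jsonl would be the value returned by monitor.get_report(), after checking that it is not None.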


Testing

pip install -e ".[dev]"
pytest                        # 52 tests
pytest --cov=canary_ml

License

MIT © Aitor Bazo
