drift-watchdog 🐕

Lightweight ML model drift detection — CLI, Prometheus metrics, and alerts. No platform required.

License: Apache 2.0. Python 3.9+.


The problem

Your model was accurate last month. Now it's quietly wrong — and you don't know why.

Input distributions shift, upstream data pipelines change schema, and feature encodings drift. Most teams either have no drift detection at all or rely on heavyweight ML platforms that take weeks to set up. drift-watchdog fills that gap: a single binary or Python sidecar that monitors your model's input/output distributions and fires alerts when they drift away from a reference baseline.


Features

  • Statistical drift detection — PSI, KS-test, Jensen-Shannon divergence, and Wasserstein distance out of the box
  • Prometheus exporter — exposes /metrics endpoint, plug straight into your existing Grafana stack
  • CLI first — run ad-hoc drift checks in CI/CD or cron without writing any code
  • Alert integrations — Slack, PagerDuty, and webhook support
  • Framework agnostic — works with scikit-learn, XGBoost, PyTorch, TensorFlow, or any model that takes tabular input
  • Reference baseline management — store, version, and compare against baselines in local files, S3, or GCS
  • Lightweight — no database, no server, no orchestrator required

Quickstart

pip install drift-watchdog

1. Capture a reference baseline

drift-watchdog baseline create \
  --data reference_data.csv \
  --output baselines/v1.json \
  --name "production-v1"
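The on-disk format of the baseline file is not documented here, so the following is only a conceptual sketch of what capturing a baseline can involve: summarizing each numeric column as bin edges plus the fraction of reference rows in each bin. The function name `build_baseline`, the equal-width binning, and the `edges`/`fractions` keys are assumptions made for illustration, not the tool's actual schema.

```python
import csv
import io
import json


def build_baseline(csv_text, n_bins=10):
    """Summarize each numeric column as equal-width bin edges plus bin fractions."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    baseline = {}
    for column in rows[0].keys():
        try:
            values = [float(row[column]) for row in rows]
        except ValueError:
            continue  # this sketch skips non-numeric columns
        lo, hi = min(values), max(values)
        width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant columns
        counts = [0] * n_bins
        for v in values:
            # Clamp so the maximum value lands in the last bin
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
        baseline[column] = {
            "edges": [lo + i * width for i in range(n_bins + 1)],
            "fractions": [c / len(values) for c in counts],
        }
    return baseline


# Hypothetical reference data, inlined so the example is self-contained
sample = "age,income\n25,40000\n35,52000\n45,61000\n55,75000\n"
print(json.dumps(build_baseline(sample, n_bins=2), indent=2))
```

Storing fractions rather than raw rows keeps the baseline small, which fits the "no database" goal; a later check only needs to bin the current batch with the same edges.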

2. Run a drift check

drift-watchdog check \
  --baseline baselines/v1.json \
  --current current_batch.csv \
  --threshold 0.2

Example output:

✓ feature: age           PSI=0.04  [OK]
✓ feature: income        PSI=0.09  [OK]
⚠ feature: loan_amount   PSI=0.31  [DRIFT DETECTED]
✗ feature: credit_score  PSI=0.58  [SEVERE DRIFT]

Overall drift score: 0.43 — ALERT
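For CI or cron use, the report above can be post-processed. The sketch below parses the per-feature lines and derives an exit code. It assumes the text layout shown in the example output stays stable, and the helper names `parse_check_output` and `drifting_features` are hypothetical. Whether the CLI itself returns a non-zero exit code on drift is not stated here, so this does not rely on it.

```python
import re

# Matches lines such as: "⚠ feature: loan_amount   PSI=0.31  [DRIFT DETECTED]"
LINE_RE = re.compile(r"feature:\s+(\S+)\s+PSI=([0-9.]+)\s+\[(.+?)\]")


def parse_check_output(text):
    """Return (feature, psi, status) tuples for every per-feature line."""
    results = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            results.append((match.group(1), float(match.group(2)), match.group(3)))
    return results


def drifting_features(results):
    """Names of features whose status is anything other than OK."""
    return [name for name, _psi, status in results if status != "OK"]


# Output copied from the example run above
sample_output = """\
✓ feature: age           PSI=0.04  [OK]
✓ feature: income        PSI=0.09  [OK]
⚠ feature: loan_amount   PSI=0.31  [DRIFT DETECTED]
✗ feature: credit_score  PSI=0.58  [SEVERE DRIFT]
"""

results = parse_check_output(sample_output)
drifting = drifting_features(results)
exit_code = 1 if drifting else 0  # a CI step could pass this to sys.exit()
print(drifting, exit_code)
```

In a pipeline, the captured stdout of `drift-watchdog check` would replace `sample_output`.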

3. Run as a Prometheus exporter

drift-watchdog serve \
  --baseline baselines/v1.json \
  --data-source s3://my-bucket/inference-logs/ \
  --port 9090 \
  --interval 300

Metrics are now available at http://localhost:9090/metrics.


Python API

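import pandas as pd

from drift_watchdog import DriftDetector, BaselineStore

# Load the baseline captured in the quickstart
store = BaselineStore("baselines/v1.json")
detector = DriftDetector(baseline=store.load())

# Load the batch to compare (assumed here to be a pandas DataFrame read from CSV)
current_df = pd.read_csv("current_batch.csv")
result = detector.check(current_df)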

for feature, report in result.features.items():
    print(f"{feature}: PSI={report.psi:.3f}, drift={report.is_drift}")

if result.overall_drift:
    result.alert()  # fires configured alert channels

Configuration

Create a watchdog.yaml in your project root:

baseline:
  path: baselines/v1.json
  storage: s3                      # local | s3 | gcs
  bucket: my-model-baselines

detection:
  methods: [psi, ks_test]
  thresholds:
    psi: 0.2                       # < 0.1 stable, 0.1-0.2 monitor, > 0.2 alert
    ks_pvalue: 0.05
  features:
    exclude: [id, timestamp]       # columns to skip

alerts:
  slack:
    webhook_url: ${SLACK_WEBHOOK_URL}
    channel: "#ml-alerts"
  pagerduty:
    routing_key: ${PD_ROUTING_KEY}
    severity: warning
  webhook:
    url: https://your-endpoint.com/drift-event

exporter:
  port: 9090
  interval_seconds: 300
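The `${SLACK_WEBHOOK_URL}` and `${PD_ROUTING_KEY}` placeholders keep secrets out of the file. How drift-watchdog resolves them internally is not described here; the sketch below shows one plausible approach, walking an already-parsed config and substituting `${NAME}` from the environment. The helpers `expand_env` and `expand_config` and the example webhook URL are hypothetical.

```python
import os
import re

_VAR_RE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Replace every ${NAME} in a string, failing loudly if NAME is unset."""
    env = os.environ if env is None else env

    def replace(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _VAR_RE.sub(replace, value)


def expand_config(node, env=None):
    """Recursively expand placeholders in dicts, lists, and strings."""
    if isinstance(node, dict):
        return {key: expand_config(val, env) for key, val in node.items()}
    if isinstance(node, list):
        return [expand_config(val, env) for val in node]
    if isinstance(node, str):
        return expand_env(node, env)
    return node  # numbers, booleans, None pass through unchanged


# A fragment of the config above, as it would look after YAML parsing
config = {
    "alerts": {"slack": {"webhook_url": "${SLACK_WEBHOOK_URL}", "channel": "#ml-alerts"}},
    "exporter": {"port": 9090, "interval_seconds": 300},
}
resolved = expand_config(config, env={"SLACK_WEBHOOK_URL": "https://hooks.example.invalid/T000"})
print(resolved["alerts"]["slack"]["webhook_url"])
```

Raising on a missing variable, rather than substituting an empty string, surfaces a misconfigured secret at startup instead of at the first failed alert.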

Prometheus metrics

| Metric | Type | Description |
|---|---|---|
| drift_watchdog_psi | Gauge | PSI score per feature |
| drift_watchdog_ks_statistic | Gauge | KS-test statistic per feature |
| drift_watchdog_feature_drift | Gauge | 1 if drift detected, 0 if not |
| drift_watchdog_overall_drift | Gauge | 1 if any feature is drifting |
| drift_watchdog_check_duration_seconds | Histogram | Time taken per drift check |
| drift_watchdog_last_check_timestamp | Gauge | Unix timestamp of last check |

All metrics carry feature, model, and baseline_version labels.
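To illustrate how these series might be consumed outside Grafana, here is a minimal sketch that parses Prometheus text exposition and lists the features flagged as drifting. The sample payload is hand-written: the label values `credit-model` and `production-v1` are made up, and the parser handles only labeled samples without timestamps.

```python
import re

SAMPLE_RE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Return (metric_name, labels_dict, value) for each labeled sample line."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match:
            labels = dict(LABEL_RE.findall(match.group(2)))
            samples.append((match.group(1), labels, float(match.group(3))))
    return samples


# Hypothetical scrape payload using the metric and label names listed above
exposition = """\
# HELP drift_watchdog_psi PSI score per feature
# TYPE drift_watchdog_psi gauge
drift_watchdog_psi{feature="age",model="credit-model",baseline_version="production-v1"} 0.04
drift_watchdog_psi{feature="loan_amount",model="credit-model",baseline_version="production-v1"} 0.31
# HELP drift_watchdog_feature_drift 1 if drift detected, 0 if not
# TYPE drift_watchdog_feature_drift gauge
drift_watchdog_feature_drift{feature="age",model="credit-model",baseline_version="production-v1"} 0
drift_watchdog_feature_drift{feature="loan_amount",model="credit-model",baseline_version="production-v1"} 1
"""

samples = parse_samples(exposition)
drifting = [
    labels["feature"]
    for name, labels, value in samples
    if name == "drift_watchdog_feature_drift" and value == 1.0
]
print(drifting)
```

Against a running exporter, the payload could be fetched with `urllib.request.urlopen("http://localhost:9090/metrics")` and decoded before parsing.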


Kubernetes deployment

Run drift-watchdog as a sidecar alongside your model serving pod:

# drift-watchdog-sidecar.yaml
containers:
  - name: drift-watchdog
    image: ghcr.io/your-org/drift-watchdog:latest
    args:
      - serve
      - --baseline
      - /baselines/v1.json
      - --data-source
      - $(INFERENCE_LOG_PATH)   # expanded by Kubernetes only if INFERENCE_LOG_PATH is defined under env
      - --port
      - "9090"
    env:
      - name: SLACK_WEBHOOK_URL
        valueFrom:
          secretKeyRef:
            name: drift-watchdog-secrets
            key: slack-webhook
    ports:
      - containerPort: 9090
        name: metrics

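Add the pod annotations below. Prometheus scrapes the pod automatically only if its Kubernetes service discovery is configured with relabeling rules that honor the prometheus.io/* annotations; they are a common convention, not built-in behavior: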

annotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "9090"

Grafana dashboard

Import the pre-built dashboard from dashboards/drift-watchdog.json.

It includes panels for:

  • Per-feature PSI over time
  • Drift event timeline
  • Feature distribution histograms (current vs baseline)
  • Alert history

Detection methods

| Method | Best for | Threshold guidance |
|---|---|---|
| PSI (Population Stability Index) | Categorical and continuous features | < 0.1 stable, 0.1–0.2 monitor, > 0.2 alert |
| KS test | Continuous distributions | p-value < 0.05 signals drift |
| Jensen-Shannon divergence | Probability distributions | > 0.1 worth alerting |
| Wasserstein distance | Ordinal/numeric features | Domain-dependent |
| Chi-squared test | Categorical features | p-value < 0.05 |
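To make the PSI thresholds concrete, here is a minimal sketch of the standard PSI formula, the sum over bins of (actual - expected) * ln(actual / expected), applied to pre-binned fractions. It is not taken from drift-watchdog's source: the function names, the `eps` floor for empty bins, and the example fractions are assumptions for illustration.

```python
import math


def psi(expected_fractions, actual_fractions, eps=1e-6):
    """Population Stability Index between two binned distributions."""
    if len(expected_fractions) != len(actual_fractions):
        raise ValueError("both distributions must use the same bins")
    total = 0.0
    for expected, actual in zip(expected_fractions, actual_fractions):
        # Floor empty bins so the logarithm stays defined
        expected = max(expected, eps)
        actual = max(actual, eps)
        total += (actual - expected) * math.log(actual / expected)
    return total


def classify(score):
    """Map a PSI score onto the guidance in the table above."""
    if score < 0.1:
        return "stable"
    if score <= 0.2:
        return "monitor"
    return "alert"


baseline = [0.25, 0.25, 0.25, 0.25]  # reference batch, four equal bins
current = [0.10, 0.20, 0.30, 0.40]   # current batch skewed toward higher bins

score = psi(baseline, current)
print(f"PSI={score:.3f} -> {classify(score)}")
```

Each term is non-negative, so PSI is zero only when the two binned distributions match; the score grows as mass moves between bins.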

Roadmap

  • v1.0 — CLI, PSI + KS detection, local/S3/GCS baselines, Slack/PagerDuty/webhook alerts, Prometheus exporter, Grafana dashboard, Kubernetes sidecar example, watchdog.yaml config
  • v1.1 — Concept drift detection (output/label distribution monitoring)
  • v1.2 — GitHub Actions integration, CI drift gate
  • v1.3 — Multi-model support, drift report HTML exports

Contributing

Contributions are welcome. Please open an issue first to discuss what you'd like to change.

git clone https://github.com/your-username/drift-watchdog
cd drift-watchdog
pip install -e ".[dev]"
pytest tests/

See CONTRIBUTING.md for guidelines.


License

Apache 2.0 — see LICENSE.
