DriftPipe
An ML pipeline framework with data and model drift monitoring.
Installation
pip install driftpipe
# or, from a local source checkout:
pip install .
Quick Start
import numpy as np
from driftpipe import (
    IngestStage,
    EvaluateStage,
    PreprocessStage,
    TrainStage,
    BaselineStorage,
    Pipeline,
    Monitor,
    MonitorReport,
)


class Ingest(IngestStage):
    def run(self):
        # Synthetic data: 500 rows, 4 features; the label depends on f0 + f1
        raw_data = np.random.randn(500, 4)
        labels = (raw_data[:, 0] + raw_data[:, 1] > 0).astype(int)
        return {
            "raw_data": raw_data,
            "labels": labels,
            "feature_names": ["f0", "f1", "f2", "f3"],
        }


class Preprocess(PreprocessStage):
    def run(self, raw_data):
        # Pass-through: no transformation in this demo
        return {"processed_data": raw_data}


class Train(TrainStage):
    def run(self, processed_data, labels):
        # Toy model: a single threshold on the first feature
        threshold = float(np.mean(processed_data[:, 0]))
        return {
            "model": {"threshold": threshold},
            "labels": labels,
        }


class Evaluate(EvaluateStage):
    def run(self, model, raw_data, labels, feature_names, baseline_storage):
        predictions = (raw_data[:, 0] > model["threshold"]).astype(int)
        accuracy = float(np.mean(predictions == labels))
        metrics = {"accuracy": accuracy}
        # Save feature and metric baselines through the storage object
        baseline_storage.compute_and_save_features(raw_data, feature_names)
        baseline_storage.save_metrics(
            baseline_storage.metrics_baseline(metrics, n_samples=len(labels))
        )
        return {"metrics": metrics}


pipeline = Pipeline("weather_demo")
pipeline.baseline_storage = BaselineStorage("weather_demo")
pipeline.add_stage(Ingest)
pipeline.add_stage(Preprocess)
pipeline.add_stage(Train)
pipeline.add_stage(Evaluate)
# Run once to establish the baseline
result = pipeline.run()
assert result.success
pipeline.baseline_metrics = pipeline.baseline_storage.load_metrics()
# Push a new batch through the monitor
monitor = Monitor(pipeline)
new_raw_data = np.random.randn(500, 4) + 0.75
new_labels = (new_raw_data[:, 0] + new_raw_data[:, 1] > 0).astype(int)
monitor_context = {
    "raw_data": new_raw_data,
    "labels": new_labels,
    "feature_names": ["f0", "f1", "f2", "f3"],
}
monitor_result = monitor.run(monitor_context)
assert monitor_result.success
# Generate a drift report from the same monitoring batch
MonitorReport(monitor).generate(
    output_path="weather_drift_report.html",
    monitor_result=monitor_result,
)
# Save pipeline config
pipeline.to_config(
    path="pipeline_demo.json",
    metadata={"dataset": "demo"},
)
Monitoring And Reports
Monitor compares current metrics against pipeline.baseline_metrics and, when raw_data, feature_names, and baseline feature data are available, automatically:
- stores distributional_data in the pipeline run context
- stores distributional_metrics in the pipeline run context
- runs KS and PSI checks for each feature
- generates histograms comparing baseline and current distributions for each feature
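For readers unfamiliar with the two per-feature checks named above, here is a minimal, dependency-free sketch of a two-sample Kolmogorov-Smirnov (KS) statistic and a Population Stability Index (PSI). It is not DriftPipe's implementation, which this README does not show: the function names `ks_statistic` and `psi`, the equal-width binning, the bin count, and the `eps` floor are all assumptions made for illustration.

```python
import bisect
import math
import random


def ks_statistic(baseline, current):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(baseline), sorted(current)
    max_gap = 0.0
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


def psi(baseline, current, n_bins=10, eps=1e-6):
    """PSI over equal-width bins spanning the baseline range (assumed binning)."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant feature

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            # Values outside the baseline range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        # Floor each proportion at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    p, q = proportions(baseline), proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# One feature column: a baseline sample and a batch shifted by +0.75,
# mirroring the shift used in the Quick Start.
rng = random.Random(0)
baseline = [rng.gauss(0.0, 1.0) for _ in range(500)]
shifted = [rng.gauss(0.75, 1.0) for _ in range(500)]

print("KS  same / shifted:", ks_statistic(baseline, baseline), ks_statistic(baseline, shifted))
print("PSI same / shifted:", psi(baseline, baseline), psi(baseline, shifted))
```

Both statistics are zero when a sample is compared with itself and grow as the current batch moves away from the baseline. Which values count as drift is a policy choice and is not specified here.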
MonitorReport is the high-level reporting utility. It accepts a Monitor and writes an HTML report.
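To give a rough idea of what a baseline-versus-current HTML report involves, the sketch below renders a small metrics table using only the standard library. It is a hypothetical stand-in, not the output or API of MonitorReport: the function `render_metric_report`, its arguments, and the table layout are assumptions.

```python
import html


def render_metric_report(baseline, current, title="Drift report"):
    """Return an HTML page with one table row per metric found in both dicts."""
    rows = []
    for name in sorted(baseline.keys() & current.keys()):
        delta = current[name] - baseline[name]
        rows.append(
            "<tr><td>{}</td><td>{:.4f}</td><td>{:.4f}</td><td>{:+.4f}</td></tr>".format(
                html.escape(name), baseline[name], current[name], delta
            )
        )
    header = "<tr><th>Metric</th><th>Baseline</th><th>Current</th><th>Change</th></tr>"
    return (
        "<!DOCTYPE html><html><head><meta charset='utf-8'><title>{0}</title></head>"
        "<body><h1>{0}</h1><table>{1}{2}</table></body></html>"
    ).format(html.escape(title), header, "".join(rows))


# Example values are made up for illustration only.
report_html = render_metric_report({"accuracy": 0.91}, {"accuracy": 0.84})
print(report_html)
```

Escaping metric names with `html.escape` keeps arbitrary keys from breaking the markup.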
See examples/ for a full walkthrough.
File details
Details for the file driftpipe-0.1.0.tar.gz.
File metadata
- Download URL: driftpipe-0.1.0.tar.gz
- Upload date:
- Size: 25.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1530279faf3d86c8ea1967705b02177741b9bea3adb861188371a29e304c7b09 |
| MD5 | f0125b8517ca358584ce501c2dce9713 |
| BLAKE2b-256 | e96c065ba6937bc89f12c2f41d7b70f866541becf06acf26af75918d0f6fdf69 |
File details
Details for the file driftpipe-0.1.0-py3-none-any.whl.
File metadata
- Download URL: driftpipe-0.1.0-py3-none-any.whl
- Upload date:
- Size: 22.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 80d9247ad80f1cc4f59b8d1873be0453f2c3e78ec7f440cdb509c750a1af649b |
| MD5 | 72bf529bb393fa5f0178106198990cd1 |
| BLAKE2b-256 | 088381b269be0acdb671948bae7ca04c6a2f18752e9decd1f7ed046f42381c5d |