Learning Machines
A Python package for monitoring dataset drift in production ML pipelines.
Built to run in any environment without uploading your data to external services.
Background
More background on learning machines.
Getting started
Requirements
- Python 3.9
Install
To install the latest version, run the following:
pip install -U learning-machines-drift
Example usage
The following simple example saves a reference dataset, logs a new dataset, and measures drift between the two:
from learning_machines_drift import Dataset, Display, FileBackend, Monitor, Registry
from learning_machines_drift.datasets import example_dataset
# Make a registry to store datasets
registry = Registry(tag="tag", backend=FileBackend("backend"))
# Save example reference dataset of 100 samples
registry.save_reference_dataset(Dataset(*example_dataset(100, seed=0)))
# Log example dataset with 80 samples
with registry:
    registry.log_dataset(Dataset(*example_dataset(80, seed=1)))
# Monitor to interface with registry and load datasets
monitor = Monitor(tag="tag", backend=registry.backend).load_data()
# Measure drift and display results as a table
Display().table(monitor.metrics.scipy_kolmogorov_smirnov())
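The metric name in the last line suggests a two-sample Kolmogorov-Smirnov test. For intuition about what such a statistic measures, here is a minimal standard-library sketch. It is an illustration only, not the package's implementation, and the names `ks_statistic`, `same_distribution` and `shifted` are hypothetical. The statistic is the largest gap between the empirical cumulative distribution functions of the two samples, so it is small when both samples come from the same distribution and approaches 1 as they separate.

```python
import bisect
import random


def ks_statistic(reference, registered):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref_sorted = sorted(reference)
    reg_sorted = sorted(registered)
    largest_gap = 0.0
    # Both empirical CDFs are step functions, so the gap between them
    # can only change at an observed value; checking those is sufficient.
    for value in ref_sorted + reg_sorted:
        ref_cdf = bisect.bisect_right(ref_sorted, value) / len(ref_sorted)
        reg_cdf = bisect.bisect_right(reg_sorted, value) / len(reg_sorted)
        largest_gap = max(largest_gap, abs(ref_cdf - reg_cdf))
    return largest_gap


# Synthetic data (an assumption for illustration, not the package's example_dataset)
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(100)]
same_distribution = [rng.gauss(0.0, 1.0) for _ in range(80)]
shifted = [rng.gauss(2.0, 1.0) for _ in range(80)]

print("no drift:", ks_statistic(reference, same_distribution))
print("drifted: ", ks_statistic(reference, shifted))
```

The second printed value should be clearly larger than the first, which is the kind of signal a drift monitor looks for when comparing logged data against a reference dataset.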
Development
Install
For a local copy:
git clone git@github.com:alan-turing-institute/learning-machines-drift
cd learning-machines-drift
To install:
poetry install
To install with dev and docs dependencies:
poetry install --with dev,docs
Tests
Run:
poetry run pytest
pre-commit checks
Run:
poetry run pre-commit run --all-files
To run checks before every commit, install as a pre-commit hook:
poetry run pre-commit install
Other tools
An overview of what else exists and why we have made something different:
- Cloud based
- Python
- ML pipelines: End-to-end machine learning lifecycle
What LM does differently
- No vendor lock-in
- Run on any platform, in any environment (your local machine, cloud, on-premises)
- Work with existing Python frameworks (e.g. scikit-learn)
- Open source