A Python package for monitoring dataset drift in production ML pipelines.

Project description

Learning Machines

Built to run in any environment without uploading your data to external services.

Background

More background on learning machines.

Getting started

Requirements

  • Python 3.9

Install

To install the latest version, run the following:

pip install -U learning-machines-drift
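
To confirm that the install succeeded, the standard library can report which version of a distribution is present. This is a minimal sketch; the helper name `installed_version` is hypothetical and not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None if the package is not installed.
print(installed_version("learning-machines-drift"))
```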

Example usage

A simple example is shown below:

from learning_machines_drift import Dataset, Display, FileBackend, Monitor, Registry
from learning_machines_drift.datasets import example_dataset

# Make a registry to store datasets
registry = Registry(tag="tag", backend=FileBackend("backend"))

# Save example reference dataset of 100 samples
registry.save_reference_dataset(Dataset(*example_dataset(100, seed=0)))

# Log example dataset with 80 samples
with registry:
    registry.log_dataset(Dataset(*example_dataset(80, seed=1)))

# Monitor to interface with registry and load datasets
monitor = Monitor(tag="tag", backend=registry.backend).load_data()

# Measure drift and display results as a table
Display().table(monitor.metrics.scipy_kolmogorov_smirnov())
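
The metric name above suggests a two-sample Kolmogorov-Smirnov test. As a rough illustration of what that statistic measures, and not the package's own implementation, the sketch below computes it with the standard library only: the largest gap between the empirical CDFs of a reference sample and a newly logged sample. The function name `ks_statistic` and the synthetic Gaussian data are assumptions made for this example.

```python
import random
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample KS statistic: the largest gap between two empirical CDFs."""
    ref = sorted(reference)
    cur = sorted(current)
    if not ref or not cur:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # Both empirical CDFs are step functions that only change at observed
    # values, so the largest gap is found by checking each observed value.
    for x in ref + cur:
        ref_cdf = bisect_right(ref, x) / len(ref)
        cur_cdf = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(ref_cdf - cur_cdf))
    return gap


# Synthetic data for illustration: 100 reference samples and two batches of 80.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(100)]
same = [rng.gauss(0.0, 1.0) for _ in range(80)]      # same distribution
shifted = [rng.gauss(2.0, 1.0) for _ in range(80)]   # mean shifted by 2

print("no drift:  ", ks_statistic(reference, same))
print("with drift:", ks_statistic(reference, shifted))
```

A value near 0 means the two samples look alike; a value near 1 means their distributions barely overlap.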

Development

Install

For a local copy:

git clone git@github.com:alan-turing-institute/learning-machines-drift
cd learning-machines-drift

To install:

poetry install

To install with dev and docs dependencies:

poetry install --with dev,docs

Tests

Run:

poetry run pytest

pre-commit checks

Run:

poetry run pre-commit run --all-files

To run checks before every commit, install as a pre-commit hook:

poetry run pre-commit install

Other tools

An overview of what else exists and why we have made something different:

What LM does differently

  • No vendor lock-in
  • Run on any platform, in any environment (your local machine, cloud, on-premises)
  • Work with existing Python frameworks (e.g. scikit-learn)
  • Open source
