Learning Machines

A Python package for monitoring dataset drift in production ML pipelines.

Built to run in any environment without uploading your data to external services.

Background

More background on learning machines.

Getting started

Requirements

  • Python 3.9

Install

To install the latest version, run the following:

pip install -U learning-machines-drift
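
To confirm the install succeeded, one option is to query the installed distribution's metadata from Python. The sketch below uses only the standard library. The helper name installed_version is hypothetical and is not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "learning-machines-drift" is the distribution name used in the pip command above.
    found = installed_version("learning-machines-drift")
    if found is None:
        print("learning-machines-drift is not installed in this environment")
    else:
        print(f"learning-machines-drift {found} is installed")
```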

Example usage

A simple example:

from learning_machines_drift import Dataset, Display, FileBackend, Monitor, Registry
from learning_machines_drift.datasets import example_dataset

# Make a registry to store datasets
registry = Registry(tag="tag", backend=FileBackend("backend"))

# Save example reference dataset of 100 samples
registry.save_reference_dataset(Dataset(*example_dataset(100, seed=0)))

# Log example dataset with 80 samples
with registry:
    registry.log_dataset(Dataset(*example_dataset(80, seed=1)))

# Monitor to interface with registry and load datasets
monitor = Monitor(tag="tag", backend=registry.backend).load_data()

# Measure drift and display results as a table
Display().table(monitor.metrics.scipy_kolmogorov_smirnov())
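
The method name scipy_kolmogorov_smirnov suggests a two-sample Kolmogorov-Smirnov test, presumably delegated to SciPy. As a rough illustration of what that statistic measures, the sketch below computes it in pure Python on synthetic data. It is not the package's implementation: the helper ks_statistic and the Gaussian samples are assumptions made for this example, and how the package applies the test to each feature is not shown here.

```python
import random
from bisect import bisect_right
from typing import Sequence


def ks_statistic(reference: Sequence[float], logged: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of the two samples."""
    if len(reference) == 0 or len(logged) == 0:
        raise ValueError("both samples must be non-empty")
    ref_sorted = sorted(reference)
    log_sorted = sorted(logged)
    largest_gap = 0.0
    # Both empirical CDFs are step functions, so the gap only changes
    # at observed values; checking those points is sufficient.
    for value in ref_sorted + log_sorted:
        ref_cdf = bisect_right(ref_sorted, value) / len(ref_sorted)
        log_cdf = bisect_right(log_sorted, value) / len(log_sorted)
        largest_gap = max(largest_gap, abs(ref_cdf - log_cdf))
    return largest_gap


rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(100)]  # stands in for a reference feature
same_dist = [rng.gauss(0.0, 1.0) for _ in range(80)]   # logged data without drift
shifted = [rng.gauss(2.0, 1.0) for _ in range(80)]     # logged data with a shifted mean

print(f"no drift: {ks_statistic(reference, same_dist):.3f}")
print(f"shifted:  {ks_statistic(reference, shifted):.3f}")
```

A statistic near 0 means the two samples have similar distributions; a value near 1 means they barely overlap. A full test also reports a p-value, which this sketch omits.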

Development

Install

For a local copy:

git clone git@github.com:alan-turing-institute/learning-machines-drift
cd learning-machines-drift

To install:

poetry install

To install with dev and docs dependencies:

poetry install --with dev,docs

Tests

Run:

poetry run pytest

pre-commit checks

Run:

poetry run pre-commit run --all-files

To run checks before every commit, install as a pre-commit hook:

poetry run pre-commit install

Other tools

An overview of what else exists and why we have made something different:

What LM does differently

  • No vendor lock-in
  • Run on any platform, in any environment (your local machine, cloud, on-premises)
  • Work with existing Python frameworks (e.g. scikit-learn)
  • Open source

Download files

Source Distribution

learning_machines_drift-0.0.1.tar.gz (21.7 kB)

Uploaded Source

Built Distribution

learning_machines_drift-0.0.1-py3-none-any.whl (19.4 kB)

Uploaded Python 3

File details

Details for the file learning_machines_drift-0.0.1.tar.gz.

File metadata

  • Download URL: learning_machines_drift-0.0.1.tar.gz
  • Size: 21.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.6

File hashes

Hashes for learning_machines_drift-0.0.1.tar.gz
Algorithm Hash digest
SHA256 f31167fbdf2b29909f44cbd09e78ba4a566fd12b9f47eff0cead39728392d0c9
MD5 a0bd1b89c0537b6033f6dfe8499a6b02
BLAKE2b-256 4608ee9c36595054265a71f867328624b34cfb83107ece0d8d290e676ea43f18
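
One way to use the digests above is to hash a downloaded file locally and compare the result. The sketch below does this for the SHA256 value listed for the source distribution, using only the standard library. The helper names sha256_of and matches_expected, and any local file path passed to them, are hypothetical.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for learning_machines_drift-0.0.1.tar.gz.
EXPECTED_SHA256 = "f31167fbdf2b29909f44cbd09e78ba4a566fd12b9f47eff0cead39728392d0c9"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()
```

For example, matches_expected(Path("learning_machines_drift-0.0.1.tar.gz")) should return True for an intact download saved under that name in the current directory.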

File details

Details for the file learning_machines_drift-0.0.1-py3-none-any.whl.

File hashes

Hashes for learning_machines_drift-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 f123faeac28140e7ddfcb0a04f741ff3351defb3fb948134591f36fdef270e10
MD5 3b7e6d2e41da769e7a829f5d551b586e
BLAKE2b-256 7ea8854fe6f7a670a69627bd317c67b09e55db31780c7950975e019723cfc435
