Welcome to pydrift 0.2.7

How do we measure the degradation of a machine learning process? Why does the performance of our predictive models decrease? Perhaps a data source has changed (one or more variables), or perhaps the relationship between those variables and the target we want to predict has changed. pydrift aims to make this task easier for data scientists by running these kinds of checks and quantifying that degradation.

Install pydrift

With pip:

pip install pydrift

With conda:

conda install -c conda-forge pydrift

With poetry:

poetry add pydrift

Structure

pydrift is intended to be user-friendly. It is divided into two main classes, DataDriftChecker and ModelDriftChecker:

  • DataDriftChecker: searches for drift in the variables by checking whether their distributions have changed
  • ModelDriftChecker: searches for drift in the relationship between the variables and the target by checking that the model behaves the same way on both data sets
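
As a rough illustration of the first kind of check (not pydrift's own implementation), the sketch below compares one numeric variable across two data sets using a two-sample Kolmogorov-Smirnov statistic computed with NumPy. The function name `ks_statistic` and the synthetic data are assumptions made for this example only.

```python
import numpy as np


def ks_statistic(left, right):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of `left` and `right` (0 = identical, 1 = disjoint)."""
    left = np.sort(np.asarray(left, dtype=float))
    right = np.sort(np.asarray(right, dtype=float))
    # Evaluate both empirical CDFs at every observed value.
    grid = np.concatenate([left, right])
    cdf_left = np.searchsorted(left, grid, side="right") / left.size
    cdf_right = np.searchsorted(right, grid, side="right") / right.size
    return float(np.max(np.abs(cdf_left - cdf_right)))


# Synthetic data (illustrative only): one reference sample, one sample from
# the same distribution, and one sample whose mean has shifted.
rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, scale=1.0, size=2000)
same_source = rng.normal(loc=0.0, scale=1.0, size=2000)
shifted = rng.normal(loc=1.0, scale=1.0, size=2000)

ks_same = ks_statistic(reference, same_source)
ks_shifted = ks_statistic(reference, shifted)
print(f"same distribution:    {ks_same:.3f}")
print(f"shifted distribution: {ks_shifted:.3f}")
```

A statistic close to 0 suggests the variable looks the same in both sets, while a larger value suggests its distribution has moved. Which threshold counts as drift is a judgment call that depends on the sample size and the use case.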

Both checkers can use a discriminative model (defined by the parent class DriftChecker), whose binary target indicates which of the two sets a row belongs to: 1 if it comes from the left set and 0 otherwise. If the model cannot tell the two sets apart, there is no detectable difference between them!
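
The discriminative idea can be sketched with plain scikit-learn. This is a minimal illustration under stated assumptions, not pydrift's API: the helper name `discriminative_auc`, the choice of a random forest, and the synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict


def discriminative_auc(X_left, X_right, random_state=0):
    """Label rows 1 (left set) or 0 (right set), fit a classifier to tell
    them apart, and return the cross-validated ROC AUC."""
    X = np.vstack([X_left, X_right])
    y = np.concatenate([
        np.ones(len(X_left), dtype=int),
        np.zeros(len(X_right), dtype=int),
    ])
    model = RandomForestClassifier(n_estimators=100, random_state=random_state)
    # Out-of-fold probabilities, so the score is not inflated by overfitting.
    proba = cross_val_predict(model, X, y, cv=5, method="predict_proba")[:, 1]
    return roc_auc_score(y, proba)


# Synthetic data (illustrative only).
rng = np.random.default_rng(42)
left = rng.normal(size=(500, 3))
right_same = rng.normal(size=(500, 3))
right_drifted = rng.normal(size=(500, 3))
right_drifted[:, 0] += 2.0  # shift one variable to simulate drift

auc_same = discriminative_auc(left, right_same)
auc_drifted = discriminative_auc(left, right_drifted)
print(f"no drift: AUC = {auc_same:.2f}")
print(f"drift:    AUC = {auc_drifted:.2f}")
```

An AUC near 0.5 means the classifier cannot separate the two sets, which matches the "no difference" case described above. An AUC well above 0.5 means the sets are distinguishable, so some variable has likely drifted.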

[Figure: Class inheritance]

There are also InterpretableDrift and DriftCheckerEstimator:

  • InterpretableDrift: handles everything related to the interpretability of drift. It can show the feature distributions or the most important features when training either a discriminative model or our predictive one
  • DriftCheckerEstimator: allows pydrift to be used as a sklearn estimator; it works on its own or in a pipeline, like any sklearn estimator
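
To give a feel for what "drift checking as a sklearn estimator" can look like, here is a hypothetical wrapper written from scratch with scikit-learn. It is not DriftCheckerEstimator and does not reproduce its interface. The class name `DriftWarningClassifier`, the `auc_threshold` parameter, and the logistic-regression discriminator are all assumptions made for this sketch.

```python
import warnings

import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


class DriftWarningClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical wrapper (not part of pydrift): fits `estimator` and warns
    at prediction time if new rows are easy to tell apart from training rows."""

    def __init__(self, estimator, auc_threshold=0.7):
        self.estimator = estimator
        self.auc_threshold = auc_threshold

    def fit(self, X, y):
        # Keep the training rows so they can be compared with new data later.
        self.X_train_ = np.asarray(X, dtype=float)
        self.estimator_ = clone(self.estimator).fit(X, y)
        self.classes_ = self.estimator_.classes_
        return self

    def _drift_auc(self, X_new):
        # Discriminative check: 1 = training row, 0 = new row.
        X = np.vstack([self.X_train_, X_new])
        y = np.concatenate([
            np.ones(len(self.X_train_), dtype=int),
            np.zeros(len(X_new), dtype=int),
        ])
        discriminator = LogisticRegression(max_iter=1000)
        proba = cross_val_predict(
            discriminator, X, y, cv=5, method="predict_proba"
        )[:, 1]
        return roc_auc_score(y, proba)

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        self.drift_auc_ = self._drift_auc(X)
        if self.drift_auc_ > self.auc_threshold:
            warnings.warn(f"possible data drift (AUC={self.drift_auc_:.2f})")
        return self.estimator_.predict(X)


# Synthetic data (illustrative only): the new rows are shifted on purpose.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(400, 2))
y_train = (X_train[:, 0] > 0).astype(int)
X_new = rng.normal(size=(400, 2)) + 3.0  # simulated drift

pipe = make_pipeline(StandardScaler(), DriftWarningClassifier(LogisticRegression()))
pipe.fit(X_train, y_train)
predictions = pipe.predict(X_new)  # emits a UserWarning about possible drift
```

Because the wrapper follows the usual `fit`/`predict` conventions, it can be the last step of a pipeline like any other estimator. Storing the training rows on the fitted object is a simplification that would need rethinking for large data sets.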

Usage

You can take a look at the notebooks folder, where you will find one example for the generic DriftChecker, one for DataDriftChecker and another for ModelDriftChecker.

Rendering Notebooks Correctly

Because pydrift uses plotly and GitHub renders notebooks statically, the figures do not display correctly there. For a rich view of a notebook, visit nbviewer and paste the link to the notebook you want to view. For example, to render 1-Titanic-Drift-Demo.ipynb, paste https://github.com/sergiocalde94/pydrift/blob/master/notebooks/1-Titanic-Drift-Demo.ipynb into nbviewer.
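
If you prefer to build the nbviewer link directly instead of pasting it into the site, the small helper below shows one way to do it. It assumes nbviewer's usual URL scheme for GitHub notebooks, where the GitHub path is served under `https://nbviewer.org/github/...`. The function name `github_to_nbviewer` is made up for this example.

```python
from urllib.parse import urlparse


def github_to_nbviewer(github_url):
    """Convert a github.com notebook URL into the matching nbviewer URL."""
    parsed = urlparse(github_url)
    if parsed.netloc != "github.com":
        raise ValueError(f"not a github.com URL: {github_url}")
    # Assumption: nbviewer mirrors the GitHub path under a /github prefix.
    return f"https://nbviewer.org/github{parsed.path}"


notebook = (
    "https://github.com/sergiocalde94/pydrift/blob/master/"
    "notebooks/1-Titanic-Drift-Demo.ipynb"
)
nbviewer_url = github_to_nbviewer(notebook)
print(nbviewer_url)
```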

More Info

For more information, see the project documentation.

More demos and code improvements are coming. If you want to contribute, you can contact me (sergiocalde94@gmail.com); in the future I will upload a file explaining how contributions will work.
