A small toolbox for MLOps

TinyShift

TinyShift is a small experimental Python library for detecting data drift and performance drops in machine learning models over time. The main goal of the project is to provide quick, lightweight monitoring tools that help identify when data or model performance changes unexpectedly. For more robust solutions, I highly recommend NannyML.

Technologies Used

  • Python 3.x
  • Scikit-learn
  • Pandas
  • NumPy
  • Plotly
  • SciPy

Installation

To install TinyShift in your development environment, use pip:

pip install tinyshift

If you prefer to clone the repository and install manually:

git clone https://github.com/HeyLucasLeao/tinyshift.git
cd tinyshift    
pip install .
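
After either install route, a quick way to confirm that the package is visible to the current interpreter is to query its metadata. The snippet below is a minimal sketch that uses only the standard library; `installed_version` is a hypothetical helper name, not part of TinyShift.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed TinyShift version, or None if the install did not succeed.
print(installed_version("tinyshift"))
```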

Usage

Below are basic examples of how to use TinyShift's features.

1. Data Drift Detection

To detect data drift, score a new dataset against the reference data. The CategoricalDriftDetector calculates metrics to identify significant differences between the two.

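import pandas as pd
from tinyshift.detector import CategoricalDriftDetector

# Load the data and parse the timestamp column used for grouping
df = pd.read_csv("examples.csv", parse_dates=["datetime"])

# Split into a reference window and an analysis window
df_reference = df[(df["datetime"] < '2024-07-01')].copy()
df_analysis = df[(df["datetime"] >= '2024-07-01')].copy()

# Build the detector from the reference data, grouped weekly ("W")
detector = CategoricalDriftDetector(df_reference, 'discrete_1', "datetime", "W", drift_limit='mad')

# Score the analysis window against the reference
analysis_score = detector.score(df_analysis, "discrete_1", "datetime")

print(analysis_score)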
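
To give a feel for what a per-period categorical drift score can look like, here is a minimal, self-contained sketch. It is not TinyShift's implementation: the choice of Jensen-Shannon distance, the "median + k * MAD" limit, and the helper names `js_distance`, `weekly_category_drift` and `mad_upper_limit` are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def js_distance(p, q):
    """Jensen-Shannon distance (base 2) between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    # Clamp at zero to guard against tiny negative rounding errors.
    return float(np.sqrt(max(0.5 * kl(p, m) + 0.5 * kl(q, m), 0.0)))


def weekly_category_drift(reference, analysis, column, time_column):
    """Distance between each analysis week's category mix and the reference mix."""
    categories = sorted(set(reference[column]) | set(analysis[column]))
    ref_freq = reference[column].value_counts().reindex(categories, fill_value=0)
    scores = {}
    for week, chunk in analysis.groupby(pd.Grouper(key=time_column, freq="W")):
        if chunk.empty:
            continue  # skip weeks without observations
        freq = chunk[column].value_counts().reindex(categories, fill_value=0)
        scores[week] = js_distance(ref_freq.to_numpy(), freq.to_numpy())
    return pd.Series(scores, name="drift")


def mad_upper_limit(values, k=3.0):
    """Upper limit as median + k * median absolute deviation (assumed rule)."""
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    return float(median + k * mad)
```

In this sketch, a week would count as drifted when its score exceeds the limit computed from scores on the reference period. The time column must already have a datetime dtype for the weekly grouping to work.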

2. Performance Tracker

To track model performance over time, use the PerformanceTracker, which compares the model's performance on the reference data with its performance on new data.

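import pickle

import pandas as pd
from tinyshift.tracker import PerformanceTracker

df_reference = pd.read_csv('reference.csv', parse_dates=['datetime'])
df_analysis = pd.read_csv('analysis.csv', parse_dates=['datetime'])

# Load a previously trained model (only unpickle files you trust)
with open('model.pkl', 'rb') as f:
    model = pickle.load(f)

# Assuming a scikit-learn style estimator, which expects a 2D feature matrix
df_analysis['prediction'] = model.predict(df_analysis[["feature_0"]])

tracker = PerformanceTracker(df_reference, 'target', 'prediction', 'datetime', "W")

analysis_score = tracker.score(df_analysis, 'target', 'prediction', 'datetime')

print(analysis_score)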
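
The idea behind tracking performance per period can be sketched in a few lines of pandas. This is an illustration under stated assumptions, not TinyShift's code: it uses plain accuracy as the metric and a "reference mean minus k standard deviations" lower limit, and the helper names `weekly_accuracy` and `below_limit` are hypothetical.

```python
import pandas as pd


def weekly_accuracy(df, target, prediction, time_column, freq="W"):
    """Share of correct predictions in each period (periods without rows are dropped)."""
    correct = (df[target] == df[prediction]).astype(float)
    correct.index = pd.DatetimeIndex(df[time_column])
    return correct.resample(freq).mean().dropna().rename("accuracy")


def below_limit(reference_scores, analysis_scores, k=3.0):
    """Flag periods whose score falls below mean - k * std of the reference scores."""
    lower = reference_scores.mean() - k * reference_scores.std()
    return analysis_scores < lower
```

Calling `weekly_accuracy` on the reference and analysis frames and passing both results to `below_limit` yields a boolean series that marks the weeks where performance dropped under the assumed limit.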

3. Visualization

TinyShift also provides graphs to visualize the magnitude of drift and performance changes over time.

tracker.plot.scatterplot_over_time(analysis_score, fig_type="png")

tracker.plot.diverging_bar_over_time(analysis_score, fig_type="png")

Project Structure

The basic structure of the project is as follows:

tinyshift
├── LICENSE
├── README.md
├── example.ipynb
├── pyproject.toml
└── tinyshift
    ├── base
    │   ├── __init__.py
    │   └── model.py
    ├── detector
    │   ├── __init__.py
    │   ├── categorical.py
    │   └── continuous.py
    ├── plot
    │   ├── __init__.py
    │   └── plot.py
    └── tracker
        ├── __init__.py
        └── performance.py          

License

This project is licensed under the MIT License - see the LICENSE file for more details.
