A small toolbox for MLOps

TinyShift

TinyShift is a lightweight, sklearn-compatible Python library designed for data drift detection, outlier identification, and MLOps monitoring in production machine learning systems. The library provides modular, easy-to-use tools for detecting when data distributions or model performance change over time, with comprehensive visualization capabilities.

For enterprise-grade solutions, consider NannyML.

Features

Data Drift Detection: Categorical and continuous data drift monitoring with multiple distance metrics
Outlier Detection: HBOS, PCA-based and SPAD outlier detection algorithms
Time Series Analysis: Seasonality decomposition, trend analysis, forecasting diagnostics, and forecast stabilization
Forecast Stability: Metrics and interpolation methods for stable forecasting

Technologies Used

Python 3.10+
Scikit-learn 1.3.0+
Pandas 2.3.0+
NumPy
SciPy
Statsmodels 0.14.5+
Plotly 5.22.0+ (optional, for plotting)

📦 Installation

Install TinyShift using pip:

pip install tinyshift

Development Installation

Clone and install from source:

git clone https://github.com/HeyLucasLeao/tinyshift.git
cd tinyshift
pip install -e .

📖 Quick Start

1. Categorical Data Drift Detection

TinyShift provides sklearn-compatible drift detectors that follow the familiar fit() and predict() pattern:

import pandas as pd
from tinyshift.drift import CatDrift

# Load your data
df = pd.read_csv("data.csv")
reference_data = df[df["date"] < '2024-07-01']
analysis_data = df[df["date"] >= '2024-07-01'] 

# Initialize and fit the drift detector
detector = CatDrift(
    freq="D",                    # Daily frequency
    func="chebyshev",           # Distance metric
    drift_limit="auto",         # Automatic threshold detection
    method="expanding"          # Comparison method
)

# Fit on reference data
detector.fit(reference_data)

# Score new data for drift
drift_scores = detector.predict(analysis_data)
print(drift_scores)

Available distance metrics for categorical data:

"chebyshev": Maximum absolute difference between distributions
"jensenshannon": Jensen-Shannon divergence
"psi": Population Stability Index

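To make the three metrics concrete, here is a minimal pure-Python sketch that computes them for two categorical samples. The helper names (`category_probs`, `chebyshev`, `jensen_shannon_divergence`, `psi`) and the `eps` smoothing constant are illustrative assumptions, not TinyShift's implementation. In particular, the library may report the Jensen-Shannon distance (the square root of the divergence, as `scipy.spatial.distance.jensenshannon` does) rather than the divergence itself.

```python
import math
from collections import Counter


def category_probs(values, categories):
    """Relative frequency of each category, in a fixed order."""
    counts = Counter(values)
    return [counts.get(c, 0) / len(values) for c in categories]


def chebyshev(p, q):
    """Largest absolute difference between two probability vectors."""
    return max(abs(a - b) for a, b in zip(p, q))


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence in bits (base-2 logarithm)."""
    def kl(a, b):
        # Terms with zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def psi(p, q, eps=1e-6):
    """Population Stability Index; eps (hypothetical) guards empty categories."""
    return sum((b - a) * math.log((b + eps) / (a + eps)) for a, b in zip(p, q))


# Synthetic reference and analysis samples over three categories.
categories = ["a", "b", "c"]
reference = ["a"] * 50 + ["b"] * 30 + ["c"] * 20
analysis = ["a"] * 30 + ["b"] * 30 + ["c"] * 40

p = category_probs(reference, categories)
q = category_probs(analysis, categories)

print("chebyshev:", chebyshev(p, q))
print("jensen-shannon divergence:", jensen_shannon_divergence(p, q))
print("psi:", psi(p, q))
```

All three values are zero when the two distributions are identical and grow as the category shares diverge.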
2. Continuous Data Drift Detection

For numerical features, use the continuous drift detector:

from tinyshift.drift import ConDrift

# Initialize continuous drift detector
detector = ConDrift(
    freq="W",                   # Weekly frequency  
    func="ws",                  # Wasserstein distance
    drift_limit="auto",
    method="expanding"
)

# Fit and score
detector.fit(reference_data)
drift_predicts = detector.predict(analysis_data)
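
The `"ws"` option refers to the Wasserstein distance. As a rough illustration of what that metric measures, the sketch below computes the one-dimensional Wasserstein-1 distance between two equal-size samples by pairing their sorted values. The function name `wasserstein_1d` and the synthetic data are assumptions for this example. It does not reproduce how `ConDrift` windows or thresholds the data, and `scipy.stats.wasserstein_distance` covers the general case of unequal sample sizes.

```python
import random


def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch only handles samples of equal size")
    # For equal-size empirical distributions, the optimal transport plan
    # pairs the i-th smallest value of one sample with the i-th of the other.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(1000)]
shifted = [x + 0.5 for x in reference]  # same shape, mean moved by 0.5

print("no drift:", wasserstein_1d(reference, reference))
print("mean shift of 0.5:", wasserstein_1d(reference, shifted))
```

A pure location shift of 0.5 yields a distance of about 0.5, which is why the metric is easy to interpret in the units of the feature.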

3. Outlier Detection

TinyShift includes sklearn-compatible outlier detection algorithms:

from tinyshift.outlier import SPAD, HBOS, PCAReconstructionError

# SPAD (Simple Probabilistic Anomaly Detector)
spad = SPAD(plus=True)
spad.fit(X_train)

outlier_scores = spad.decision_function(X_test)
outlier_labels = spad.predict(X_test)

# HBOS (Histogram-Based Outlier Score)
hbos = HBOS(dynamic_bins=True)
hbos.fit(X_train, nbins="fd")
scores = hbos.predict(X_test)

# PCA-based outlier detection
pca_detector = PCAReconstructionError()
pca_detector.fit(X_train)
pca_scores = pca_detector.predict(X_test)
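
For intuition about histogram-based scoring, the following is a simplified sketch of the HBOS idea: build one histogram per feature and sum the negative log densities of the bins a point falls into. It uses fixed equal-width bins and relative frequencies, so it is not equivalent to TinyShift's `HBOS` (which, per the example above, supports dynamic bins and binning rules such as `"fd"`). The names `fit_hbos`, `hbos_score`, `n_bins`, and `eps` are hypothetical.

```python
import math
import random


def fit_hbos(rows, n_bins=10):
    """Build one equal-width histogram per feature from the training rows."""
    model = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        lo, hi = min(column), max(column)
        width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant features
        counts = [0] * n_bins
        for value in column:
            counts[min(int((value - lo) / width), n_bins - 1)] += 1
        densities = [count / len(column) for count in counts]
        model.append((lo, hi, width, densities))
    return model


def hbos_score(model, row, eps=1e-9):
    """Sum of log inverse densities; higher means more anomalous."""
    score = 0.0
    for value, (lo, hi, width, densities) in zip(row, model):
        if lo <= value <= hi:
            idx = min(int((value - lo) / width), len(densities) - 1)
            density = densities[idx]
        else:
            density = 0.0  # outside the range seen during training
        score += math.log(1.0 / (density + eps))
    return score


rng = random.Random(0)
train = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
model = fit_hbos(train)

inlier_score = hbos_score(model, [0.0, 0.0])
outlier_score = hbos_score(model, [6.0, 6.0])
print("inlier:", inlier_score, "outlier:", outlier_score)
```

Because each feature is scored independently, this approach is fast but cannot see outliers that are only unusual in combination.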

4. Time Series Analysis and Diagnostics

TinyShift provides comprehensive time series analysis capabilities:

from tinyshift.plot import seasonal_decompose
from tinyshift.series import (
    trend_significance, 
    foreca, 
    sample_entropy,
    permutation_entropy,
    theoretical_limit,
    hurst_exponent,
    hampel_filter,
    bollinger_bands
)

seasonal_decompose(
    time_series, 
    periods=[7, 365],  # Weekly and yearly patterns
    width=1200, 
    height=800
)

# Test for significant trends
r_squared, p_value = trend_significance(time_series)

# Assess forecastability
forecastability = foreca(time_series)
print(f"Forecastability (Omega): {forecastability}")

# Measure complexity and regularity
complexity = sample_entropy(time_series, m=2, tolerance=0.2)
print(f"Sample Entropy: {complexity}")

# Measure ordinal complexity
perm_entropy = permutation_entropy(time_series, m=3, delay=1, normalize=True)
print(f"Permutation Entropy: {perm_entropy}")

# Calculate theoretical predictability limit
theo_limit = theoretical_limit(time_series, m=3, delay=1)
print(f"Theoretical Limit (Πmax): {theo_limit}")

# Detect long-term memory
hurst, p_value = hurst_exponent(time_series)
print(f"Hurst Exponent: {hurst}, P-value: {p_value}")

# Outlier detection in time series
outliers = hampel_filter(time_series, window_size=5)
outliers = bollinger_bands(time_series, window_size=20)

# Plot lag analysis with PAMI (Permutation Auto-Mutual Information)
from tinyshift.plot import pami
pami(time_series, nlags=20, m=3, delay=1, normalize=False)
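
As an illustration of the kind of check a Hampel filter performs, here is a minimal pure-Python sketch: each point is compared with the median of a centered window, and flagged when it deviates by more than a multiple of the scaled median absolute deviation (MAD). The function name `hampel_outliers`, the `n_sigmas` parameter, and the interpretation of `window_size` as the full window length are assumptions; TinyShift's `hampel_filter` may differ in signature and return type.

```python
from statistics import median


def hampel_outliers(series, window_size=5, n_sigmas=3.0):
    """Return a list of booleans marking points flagged as outliers."""
    k = 1.4826  # scales the MAD to the standard deviation for normal data
    half = window_size // 2
    flags = []
    for i, value in enumerate(series):
        # Centered window, truncated at the edges of the series.
        window = series[max(0, i - half): i + half + 1]
        med = median(window)
        mad = median(abs(v - med) for v in window)
        # Note: if the MAD is zero, any nonzero deviation is flagged.
        flags.append(abs(value - med) > n_sigmas * k * mad)
    return flags


series = [1.0, 1.1, 0.9, 1.0, 10.0, 1.1, 0.9, 1.0, 1.05, 0.95]
flags = hampel_outliers(series, window_size=5)
print([i for i, flagged in enumerate(flags) if flagged])
```

Using the median and MAD instead of the mean and standard deviation keeps a single spike from inflating the threshold that is supposed to detect it.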

5. Forecast Stability and Interpolation

TinyShift includes forecast stability metrics and interpolation methods:

from tinyshift.series import (
    macv, mach,           # Mean Absolute Change metrics
    mascv, masch,         # Mean Absolute Scaled Change metrics
    rmsscv, rmssch,       # Root Mean Squared Scaled Change metrics
    vi, hpi, hfi          # Interpolation methods
)

# Calculate forecast stability metrics
vertical_stability = macv(y_hat, y_hat_t_minus_1)
horizontal_stability = mach(y_hat) 

# Scaled stability metrics
scaled_v_stability = mascv(y_train, y_hat, y_hat_t_minus_1, seasonality=12)
scaled_h_stability = masch(y_train, y_hat, seasonality=12)

# Apply forecast stabilization techniques
# Vertical Interpolation
stable_forecast = vi(y_hat, anchor, w_s=0.3)

# Horizontal Partial Interpolation
smooth_forecast = hpi(y_hat, w_s=0.4)

# Horizontal Full Interpolation
fully_stable_forecast = hfi(y_hat, w_s=0.5)
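
The sketch below illustrates the general idea behind these metrics under stated assumptions: vertical change compares forecasts for the same target periods made at two consecutive origins, horizontal change compares adjacent horizons within one forecast, and vertical interpolation blends the new forecast with the earlier one. The function names and the convention that `w_s` is the weight placed on the anchor (earlier) forecast are assumptions for illustration and may not match TinyShift's definitions.

```python
def mean_abs_change_vertical(current, previous):
    """Mean absolute difference between two forecasts of the same periods."""
    if len(current) != len(previous):
        raise ValueError("forecasts must cover the same periods")
    return sum(abs(c - p) for c, p in zip(current, previous)) / len(current)


def mean_abs_change_horizontal(forecast):
    """Mean absolute difference between adjacent horizons of one forecast."""
    steps = [abs(b - a) for a, b in zip(forecast, forecast[1:])]
    return sum(steps) / len(steps)


def blend_with_anchor(forecast, anchor, w_s):
    """Weighted average: w_s on the anchor, (1 - w_s) on the new forecast."""
    return [w_s * a + (1 - w_s) * f for f, a in zip(forecast, anchor)]


previous = [100.0, 104.0, 108.0]  # forecast made at the earlier origin
current = [110.0, 100.0, 114.0]   # forecast for the same periods, made later

raw_vertical = mean_abs_change_vertical(current, previous)
blended = blend_with_anchor(current, previous, w_s=0.5)
blended_vertical = mean_abs_change_vertical(blended, previous)

print("vertical change before blending:", raw_vertical)
print("vertical change after blending:", blended_vertical)
print("horizontal change:", mean_abs_change_horizontal(current))
```

Under this convention, a larger `w_s` pulls the stabilized forecast closer to the earlier one, trading responsiveness for stability.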

6. Advanced Modeling Tools

import numpy as np

# FeatureResidualizer's import path is assumed from modelling/residualizer.py
from tinyshift.modelling import FeatureResidualizer, filter_features_by_vif
from tinyshift.stats import bootstrap_bca_interval

# Fit the residualizer on the training columns only
residualizer = FeatureResidualizer()
residualizer.fit(X_train[preprocess_columns], corrcoef=0.70)

# Apply the fitted residualizer to the training set
X_train = X_train.astype({x: float for x in preprocess_columns})
X_train.loc[:, preprocess_columns] = residualizer.transform(X_train[preprocess_columns])

# Apply the same fitted residualizer to the test set
X_test = X_test.astype({x: float for x in preprocess_columns})
X_test.loc[:, preprocess_columns] = residualizer.transform(X_test[preprocess_columns])

# Detect multicollinearity on the training set, then keep the same columns in both sets
mask = filter_features_by_vif(X_train, threshold=5, verbose=True)
selected_columns = X_train.columns[mask]
X_train = X_train[selected_columns]
X_test = X_test[selected_columns]

# Bootstrap confidence intervals
confidence_interval = bootstrap_bca_interval(
    data, 
    statistic=np.mean, 
    alpha=0.05, 
    n_bootstrap=1000
)
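
For readers unfamiliar with the variance inflation factor (VIF) behind the multicollinearity filter, here is a minimal NumPy sketch of the standard definition: regress each feature on all the others and compute 1 / (1 - R²). The function name `variance_inflation_factors` and the synthetic data are assumptions; this is not TinyShift's `filter_features_by_vif`, whose selection procedure may differ.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF for each column of X, from a least-squares fit on the other columns."""
    X = np.asarray(X, dtype=float)
    n_rows, n_cols = X.shape
    vifs = np.empty(n_cols)
    for j in range(n_cols):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n_rows), others])  # add an intercept
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        residuals = y - design @ coef
        # 1 / (1 - R^2) simplifies to total variance over residual variance.
        vifs[j] = y.var() / residuals.var()
    return vifs


rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + 0.1 * rng.normal(size=500)  # nearly a copy of `a`

X = np.column_stack([a, b, c])
vifs = variance_inflation_factors(X)
print(dict(zip(["a", "b", "c"], vifs.round(2))))
```

With a threshold of 5, as in the example above, the near-duplicate pair `a` and `c` would be flagged while the independent feature `b` would not.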

📁 Project Structure

tinyshift/
├── association_mining/          # Market basket analysis tools
│   ├── README.md              # Module documentation
│   ├── analyzer.py             # Transaction pattern analysis
│   └── encoder.py              # Data encoder
├── drift/                      # Data drift detection 
│   ├── README.md              # Module documentation
│   ├── base.py                 # Base drift detection classes  
│   ├── categorical.py          # CatDrift for categorical features
│   └── continuous.py           # ConDrift for numerical features
├── examples/                   # Jupyter notebook examples
│   ├── decomp_mstl_ml.ipynb   # MSTL decomposition and ML examples
│   ├── drift.ipynb            # Drift detection examples
│   ├── outlier.ipynb          # Outlier detection demos
│   ├── series.ipynb           # Time series analysis
│   ├── transaction_analyzer.ipynb # Transaction analysis examples
│   └── ts_diagnostics.ipynb   # Time series diagnostics
├── modelling/                  # ML modeling utilities
│   ├── README.md              # Module documentation
│   ├── multicollinearity.py   # VIF-based multicollinearity detection
│   ├── residualizer.py        # Feature residualizer
│   └── scaler.py              # Custom scaling transformations
├── outlier/                    # Outlier detection algorithms
│   ├── README.md              # Module documentation
│   ├── base.py                 # Base outlier detection classes
│   ├── hbos.py                 # Histogram-Based Outlier Score
│   ├── pca.py                  # PCA-based outlier detection  
│   └── spad.py                 # Simple Probabilistic Anomaly Detector
├── plot/                       # Visualization capabilities  
│   ├── README.md              # Module documentation
│   ├── correlation.py          # Correlation analysis plots
│   └── diagnostic.py           # Time series diagnostics plots
├── series/                     # Time series analysis tools
│   ├── README.md              # Module documentation
│   ├── forecastability.py     # Forecast quality and complexity metrics
│   ├── interpolation.py       # Forecast stabilization methods
│   ├── outlier.py             # Time series outlier detection
│   ├── stability.py           # Forecast stability metrics
│   └── stats.py               # Statistical analysis functions
└── stats/                      # Statistical utilities
    ├── bootstrap_bca.py        # Bootstrap confidence intervals
    ├── statistical_interval.py # Statistical interval estimation
    └── utils.py               # General statistical utilities

Development Setup

git clone https://github.com/HeyLucasLeao/tinyshift.git
cd tinyshift
pip install -e ".[all]"

📋 Requirements

Python: 3.10+
Core Dependencies:
- pandas (>2.3.0)
- scikit-learn (>1.3.0)
- statsmodels (>=0.14.5)
Optional Dependencies:
- plotly (>5.22.0) - for visualization
- kaleido (<=0.2.1) - for static plot export
- nbformat (>=5.10.4) - for notebook support

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Inspired by NannyML.

Release history

1.2.3

Jan 9, 2026

1.2.2

Jan 5, 2026

1.2.1

Nov 29, 2025

This version

1.2.0

Nov 21, 2025

1.1.2

Nov 4, 2025

1.1.1

Nov 4, 2025

1.1.0

Nov 4, 2025

1.0.1

Nov 2, 2025

1.0.0

Nov 1, 2025

0.8.9

Oct 27, 2025

0.8.8

Oct 27, 2025

0.8.7

Oct 27, 2025

0.8.6

Oct 22, 2025

0.8.5

Oct 22, 2025

0.8.4

Oct 22, 2025

0.8.3

Oct 22, 2025

0.8.2

Oct 22, 2025

0.8.1

Oct 21, 2025

0.8.0

Oct 20, 2025

0.7.6

Oct 18, 2025

0.7.5

Oct 12, 2025

0.7.4

Oct 4, 2025

0.7.3

Oct 4, 2025

0.7.2

Oct 2, 2025

0.7.1

Oct 2, 2025

0.7.0

Oct 2, 2025

0.6.0

Sep 28, 2025

0.5.2

Sep 28, 2025

0.5.1

Aug 31, 2025

0.5.0

Aug 30, 2025

0.4.0

Aug 29, 2025

0.3.3

Aug 4, 2025

0.3.2

Aug 4, 2025

0.3.1

Aug 4, 2025

0.3.0

Aug 3, 2025

0.2.1

Aug 2, 2025

0.2.0

Jul 30, 2025

0.1.4

Jul 27, 2025

0.1.3

Jul 27, 2025

0.1.2

Jul 26, 2025

0.1.1

Jul 23, 2025

0.1.0

Jul 22, 2025

0.0.9

Apr 28, 2025

0.0.8

Apr 25, 2025

0.0.7

Apr 21, 2025

0.0.6

Jan 17, 2025

0.0.5

Jan 11, 2025

0.0.4

Jan 10, 2025

0.0.3

Jan 7, 2025

0.0.2

Jan 6, 2025

0.0.1

Jan 6, 2025

Download files

Source Distribution

tinyshift-1.2.0.tar.gz (55.7 kB)

Uploaded Nov 21, 2025 Source

Built Distribution

tinyshift-1.2.0-py3-none-any.whl (70.9 kB)

Uploaded Nov 21, 2025 Python 3

File details

Details for the file tinyshift-1.2.0.tar.gz.

File metadata

Download URL: tinyshift-1.2.0.tar.gz
Upload date: Nov 21, 2025
Size: 55.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for tinyshift-1.2.0.tar.gz
Algorithm	Hash digest
SHA256	`e703760fe265ea3bd445de937b753e879b75835227ed60e433f8fa22d7729b67`
MD5	`173feb2b04804ce16b31e27366f6eefc`
BLAKE2b-256	`82509085f76f27a657b97e6b51498e3cab74ece2829c86583a397fe1b0307f92`

File details

Details for the file tinyshift-1.2.0-py3-none-any.whl.

File metadata

Download URL: tinyshift-1.2.0-py3-none-any.whl
Upload date: Nov 21, 2025
Size: 70.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for tinyshift-1.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1ecd19cc479a80743c5dfed9f7961d10a530481d42d26186cb7928572c879760`
MD5	`f5101ba3bafd899232658dae85c6d1c8`
BLAKE2b-256	`e3536f0855fee303f80e3112da120c49c317f40863fd39b1b2a4787d6c8369f1`
