
A library for discovering underperforming segments in machine learning models.


PySliceKit 🍰


PySliceKit is a Python library that helps you discover where your machine learning models underperform.

Global metrics like 95% accuracy or a low RMSE can be misleading. Your model might perform well on the majority of your data while severely underperforming on specific subgroups (e.g., particular age groups, geographic regions, or intersections of minority segments).
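To make the arithmetic concrete, here is a small sketch with made-up numbers. The group names and counts are purely illustrative and do not come from any real dataset or from PySliceKit itself. It shows how a model that is about 95% accurate overall can be only 60% accurate on a small subgroup.

```python
# Hypothetical counts per subgroup: (correct predictions, total rows).
groups = {
    "majority": (921, 950),  # about 96.9% accurate
    "minority": (30, 50),    # 60% accurate
}

total_correct = sum(correct for correct, _ in groups.values())
total_rows = sum(n for _, n in groups.values())
global_accuracy = total_correct / total_rows

# Accuracy computed separately for each subgroup.
per_group = {name: correct / n for name, (correct, n) in groups.items()}

print(f"global: {global_accuracy:.3f}")
for name, acc in per_group.items():
    print(f"{name}: {acc:.3f} (gap {acc - global_accuracy:+.3f})")
```

Because the minority group contributes only 50 of 1,000 rows, its 40% error rate barely moves the global number.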

PySliceKit addresses this by slicing your feature dataset into subgroups, evaluating your model's performance on each one, testing each gap for statistical significance, and returning a heatmap and a bar chart of the worst-performing segments.


📚 Full Documentation

Read the full PySliceKit Documentation here for the Getting Started guide, complete User Guide, and comprehensive API Reference.

🚀 Quick Start

Installation

pip install pyslicekit

Usage

import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import pyslicekit

# 1. Load data and train your model
data = load_breast_cancer(as_frame=True)
df = data.frame
X = df.drop(columns=['target'])
y = df['target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
model = LogisticRegression(max_iter=5000)
model.fit(X_train, y_train)

# 2. Let PySliceKit find the blind spots!
results = pyslicekit.evaluate(
    model=model,
    df=X_test,
    y_true=y_test,
    y_pred=model.predict(X_test),
    slice_cols=["mean radius", "mean texture"],
    metric="f1"
)

What does this do?

When you call pyslicekit.evaluate(), the library splits your test dataset into subgroups along the columns in slice_cols (here, mean radius and mean texture), evaluates your model's F1 score on each subgroup independently, and applies a statistical test (such as a Z-test or Fisher's exact test) to check whether each performance drop is statistically significant.
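The general idea of slicing and testing can be sketched in plain Python. This is not PySliceKit's implementation: the function name slice_accuracy_report, the "region" column, and the data are all hypothetical, and the sketch uses accuracy with a two-proportion z-test (slice versus the rest of the data) rather than F1, because accuracy is a simple proportion.

```python
import math
from collections import defaultdict

def slice_accuracy_report(rows, y_true, y_pred, slice_col):
    """Compare each slice's accuracy with the accuracy on all other rows."""
    counts = defaultdict(int)
    hits = defaultdict(int)
    for row, truth, pred in zip(rows, y_true, y_pred):
        key = row[slice_col]
        counts[key] += 1
        hits[key] += int(truth == pred)

    total_n = sum(counts.values())
    total_hits = sum(hits.values())
    report = []
    for key, n1 in counts.items():
        x1 = hits[key]
        n2, x2 = total_n - n1, total_hits - x1
        if n2 == 0:
            continue  # only one slice exists, nothing to compare against
        p1, p2 = x1 / n1, x2 / n2
        pooled = (x1 + x2) / (n1 + n2)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
        z = (p1 - p2) / se if se > 0 else 0.0
        # Two-sided p-value under the standard normal distribution.
        p_value = math.erfc(abs(z) / math.sqrt(2))
        report.append({"slice": key, "n": n1, "accuracy": p1,
                       "gap": p1 - p2, "z": z, "p_value": p_value})
    # Worst-performing slices first.
    return sorted(report, key=lambda r: r["accuracy"])

# Hypothetical data: 40 rows in region A (38 correct), 20 in region B (10 correct).
rows = [{"region": "A"}] * 40 + [{"region": "B"}] * 20
y_true = [1] * 60
y_pred = [1] * 38 + [0] * 2 + [1] * 10 + [0] * 10

report = slice_accuracy_report(rows, y_true, y_pred, "region")
for r in report:
    print(r["slice"], f"acc={r['accuracy']:.2f}", f"p={r['p_value']:.2g}")
```

Continuous columns such as mean radius would first need to be binned (for example into quantile ranges) before they can serve as slice keys; how PySliceKit does this is not specified here.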

What does the output show?

You receive two visualizations. The Heatmap shows the raw gaps across all subgroups: the darker the red, the worse your model performs compared to its global average.

Heatmap Visualization

The Worst Segments Bar Chart extracts the worst-performing segments and sorts them. Statistically significant failures are solid red, while non-significant drops are faded, giving you a prioritized list of areas to fix.
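The prioritization behind such a chart can be approximated with a few lines of Python. The segment names, gaps, p-values, the rank_worst helper, and the 0.05 threshold below are all assumptions made for illustration; they are not PySliceKit output or API. Solid bars stand in for significant drops and light bars for non-significant ones.

```python
# Hypothetical per-segment results: gap = segment metric minus global metric.
segments = [
    {"segment": "mean radius: low", "gap": -0.12, "p_value": 0.003},
    {"segment": "mean radius: high", "gap": 0.04, "p_value": 0.410},
    {"segment": "mean texture: low", "gap": -0.05, "p_value": 0.210},
    {"segment": "mean texture: high", "gap": -0.20, "p_value": 0.0004},
]

def rank_worst(segments, alpha=0.05, top_k=3):
    """Keep segments below the global metric, worst first, and flag significance."""
    below = [s for s in segments if s["gap"] < 0]
    worst = sorted(below, key=lambda s: s["gap"])[:top_k]
    return [dict(s, significant=s["p_value"] < alpha) for s in worst]

ranked = rank_worst(segments)
for s in ranked:
    glyph = "█" if s["significant"] else "░"  # solid = significant, light = not
    bar = glyph * round(abs(s["gap"]) * 100)
    print(f"{s['segment']:<20} {bar} {s['gap']:+.2f}")
```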

Bar Chart Visualization

Features

  • Model Agnostic: Works with any model that exposes a .predict() method (scikit-learn, XGBoost's scikit-learn interface, a PyTorch model wrapped to provide .predict(), etc.).
  • Automatic Statistics: Chooses between Z-tests, Fisher's Exact Test, and Bootstrap CIs based on your task type and sample sizes.
  • Beautiful Visualizations: Instantly generates heatmaps and bar charts of your model's weak spots.
  • Exporters: Easily export findings to JSON or CSV for audits or dashboards.
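As a rough illustration of how a test might be chosen from the task type and sample sizes, the sketch below applies a common rule of thumb: use a z-test when every expected cell count in the 2x2 table (slice vs. rest, correct vs. wrong) is at least 5, fall back to Fisher's exact test otherwise, and use a bootstrap confidence interval for non-classification metrics. The function choose_test and the threshold are assumptions for this example; PySliceKit's actual selection logic may differ.

```python
def choose_test(task, slice_hits, slice_n, rest_hits, rest_n, min_expected=5):
    """Return the name of a suitable test (hypothetical heuristic, see lead-in)."""
    if task != "classification":
        # Metrics such as RMSE are not proportions, so resample instead.
        return "bootstrap_ci"

    total = slice_n + rest_n
    correct_total = slice_hits + rest_hits
    col_totals = (correct_total, total - correct_total)
    # Expected count of each cell if accuracy were identical in both groups.
    expected = [row * col / total
                for row in (slice_n, rest_n)
                for col in col_totals]
    return "z_test" if min(expected) >= min_expected else "fisher_exact"

print(choose_test("classification", 10, 20, 38, 40))     # small slice
print(choose_test("classification", 100, 200, 380, 400))  # large slice
print(choose_test("regression", 0, 0, 0, 0))
```

In the first call the smallest expected cell count is 4, so the normal approximation is considered unreliable and the exact test is preferred.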

License

Licensed under the MIT License (see GitHub for details).
