A lightweight package for computing confidence intervals for classification tasks using conformal prediction and Pearson residuals.
💡 Pearsonify
Probabilistic Classification with Conformalized Intervals
Pearsonify is a lightweight 🐍 Python package for generating classification intervals around predicted probabilities in binary classification tasks.
It uses Pearson residuals and principles of conformal prediction to quantify uncertainty without making strong distributional assumptions.
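To make the idea concrete, here is a minimal sketch of how Pearson residuals can be combined with split conformal prediction. It is an illustration under stated assumptions, not necessarily the package's exact implementation: the helper names (pearson_residual, conformal_quantile, interval), the finite-sample quantile rule, and the toy calibration data are all hypothetical. It uses only the standard library so it runs without extra dependencies.

```python
import math


def pearson_residual(y, p, eps=1e-12):
    """Pearson residual for a binary label y in {0, 1} and predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # guard against division by zero at p = 0 or 1
    return (y - p) / math.sqrt(p * (1.0 - p))


def conformal_quantile(scores, alpha):
    """Finite-sample (1 - alpha) quantile commonly used in split conformal prediction."""
    n = len(scores)
    k = min(math.ceil((n + 1) * (1.0 - alpha)), n)  # rank of the order statistic
    return sorted(scores)[k - 1]


def interval(p, q, eps=1e-12):
    """Interval p +/- q * sqrt(p * (1 - p)), clipped to [0, 1]."""
    p_c = min(max(p, eps), 1.0 - eps)
    half_width = q * math.sqrt(p_c * (1.0 - p_c))
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical calibration labels and predicted probabilities
cal_y = [0, 1, 1, 0, 1, 0, 1, 1, 0, 0]
cal_p = [0.2, 0.7, 0.9, 0.4, 0.6, 0.1, 0.8, 0.3, 0.5, 0.35]

# Nonconformity scores: absolute Pearson residuals on the calibration set
scores = [abs(pearson_residual(y, p)) for y, p in zip(cal_y, cal_p)]
q = conformal_quantile(scores, alpha=0.2)

# Interval around a new predicted probability of 0.6
lower, upper = interval(0.6, q)
print(f"q = {q:.3f}, interval = [{lower:.3f}, {upper:.3f}]")
```

Scaling the residual by sqrt(p * (1 - p)) means the interval width adapts to the predicted probability: it is widest near 0.5 and narrows as predictions approach 0 or 1.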
🚀 Why Pearsonify?
- 📊 Intuitive Classification Intervals: Get reliable intervals for binary classification predictions.
- 🧠 Statistically Grounded: Uses Pearson residuals, a well-established metric from classical statistics.
- ⚡ Model-Agnostic: Works with any model that provides probability estimates.
- 🛠️ Lightweight: Minimal dependencies, easy to integrate into existing projects.
📦 How to install?
Use pip to install the package from PyPI:
pip install pearsonify
# or from GitHub:
pip install git+https://github.com/xRiskLab/pearsonify.git
💻 How to use?
import numpy as np
from pearsonify import Pearsonify
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
# Generate synthetic classification data
np.random.seed(42)
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=10, n_classes=2, random_state=42
)
# Split data into train, calibration, and test sets
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.4, random_state=42)
X_cal, X_test, y_cal, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)
# Initialize Pearsonify with an SVC model
clf = SVC(probability=True, random_state=42)
model = Pearsonify(estimator=clf, alpha=0.05)
# Fit the model on training and calibration sets
model.fit(X_train, y_train, X_cal, y_cal)
# Generate prediction intervals for test set
y_test_pred_proba, lower_bounds, upper_bounds = model.predict_intervals(X_test)
# Calculate coverage
coverage = model.evaluate_coverage(y_test, lower_bounds, upper_bounds)
print(f"Coverage: {coverage:.2%}")
# Plot the intervals
model.plot_intervals(y_test_pred_proba, lower_bounds, upper_bounds)
Running example.py generates a plot of the predicted probabilities with their 95% intervals (alpha=0.05), sorted by prediction score.
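The coverage value printed in the example can be read as the share of test labels that fall inside their intervals. The sketch below shows one plausible way to compute such a figure. It is an assumption about the definition, not a copy of what evaluate_coverage does internally, and the function name empirical_coverage and the toy numbers are hypothetical.

```python
def empirical_coverage(y_true, lower, upper):
    """Fraction of labels y that satisfy lower <= y <= upper (assumed definition)."""
    covered = [lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper)]
    return sum(covered) / len(covered)


# Hypothetical labels and interval bounds (bounds assumed clipped to [0, 1])
y_true = [0, 1, 1, 0]
lower = [0.0, 0.3, 0.6, 0.1]
upper = [0.4, 1.0, 0.9, 0.7]

coverage = empirical_coverage(y_true, lower, upper)
print(f"Coverage: {coverage:.2%}")
```

Under this definition, a label of 1 is covered only when the upper bound reaches 1, and a label of 0 only when the lower bound reaches 0, so coverage close to 1 - alpha indicates that the calibration quantile transfers well to the test set.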
📖 References
Hosmer, D. W., Lemeshow, S., & Sturdivant, R. X. (2013). Applied Logistic Regression. John Wiley & Sons.
Tibshirani, R. (2023). Conformal Prediction. Advanced Topics in Statistical Learning, Spring 2023.
📝 License
This project is licensed under the MIT License - see the LICENSE file for details.