A strict experimental harness for reproducible, statistically valid model evaluation.
Project description
statbelt
statbelt is a Python package for reproducible, statistically aware model evaluation.
Current release status: Alpha.
Supported Python versions: 3.11+.
v0 Features
- ExperimentalHarness builder-style API for binary classification.
- Deterministic stratified k-fold evaluation with shared folds across models.
- Bootstrap confidence intervals over fold-level metrics.
- Lock artifact output (statbelt.lock.json) containing config and split indices.
- Strict staged workflow: configure -> fasten() -> evaluate().
- CLI smoke entry point (statbelt) for environment checks.
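To make "deterministic stratified folds shared across models" and "lock artifact" concrete, here is a minimal standard-library sketch. It is not statbelt's implementation: the function name stratified_folds, the round-robin dealing scheme, and the lock payload keys (config, test_folds) are all assumptions for illustration, and the real statbelt.lock.json layout may differ.

```python
import json
import random
from collections import defaultdict


def stratified_folds(y, n_splits, seed):
    """Assign every sample index to exactly one test fold, balancing classes.

    Hypothetical helper, not part of statbelt.
    """
    rng = random.Random(seed)  # local RNG: the same seed always yields the same folds
    by_class = defaultdict(list)
    for idx, label in enumerate(y):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_splits)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin so each fold gets a near-equal share of the class.
        for pos, idx in enumerate(indices):
            folds[(offset + pos) % n_splits].append(idx)
        offset += len(indices)
    return [sorted(fold) for fold in folds]


y = [0] * 12 + [1] * 8
folds = stratified_folds(y, n_splits=4, seed=42)

# Assumed lock payload: the config plus the split indices, serialized with
# sorted keys so the same inputs produce byte-identical output.
lock = {"config": {"cv": 4, "random_state": 42}, "test_folds": folds}
lock_text = json.dumps(lock, sort_keys=True, indent=2)
```

Because the folds are computed once and stored, every model in a comparison can be scored on identical test indices, which is what makes fold-level differences between models comparable.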
Supported v0 Task
binary_classification
Supported v0 Metrics
- accuracy
- precision
- recall
- f1
- roc_auc
- log_loss
Validation is fail-fast. For example, log_loss requires predict_proba, and roc_auc
requires predict_proba or decision_function.
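A fail-fast check of this kind can be sketched as a capability test that runs before any model is fitted. The function name validate_metric_support, the requirements table, and the error type are assumptions for illustration; statbelt's own validation code and error messages may look different.

```python
def validate_metric_support(name, model, metrics):
    """Raise before any fitting if a model cannot serve a requested metric.

    Hypothetical helper, not part of statbelt.
    """
    # Metric -> methods of which at least one must exist on the model.
    requirements = {
        "log_loss": ("predict_proba",),
        "roc_auc": ("predict_proba", "decision_function"),
    }
    for metric in metrics:
        accepted = requirements.get(metric)
        if accepted is None:
            continue  # label-based metrics such as accuracy only need predict()
        if not any(callable(getattr(model, attr, None)) for attr in accepted):
            raise TypeError(
                f"model {name!r} cannot compute {metric!r}: "
                f"needs one of {', '.join(accepted)}"
            )


class MarginOnlyModel:
    """Stand-in estimator that exposes scores but no probabilities."""

    def predict(self, X):
        return [0 for _ in X]

    def decision_function(self, X):
        return [0.0 for _ in X]


model = MarginOnlyModel()
validate_metric_support("margin", model, ["accuracy", "roc_auc"])  # passes

try:
    validate_metric_support("margin", model, ["log_loss"])
    rejected = False
except TypeError:
    rejected = True  # log_loss needs predict_proba, which this model lacks
```

Checking capabilities up front means a misconfigured comparison fails in milliseconds instead of after several folds have already been trained.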
Installation
Install from PyPI:
pip install statbelt
For local development:
uv sync --all-groups
Quick Start
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

from statbelt import ExperimentalHarness

X, y = make_classification(n_samples=120, random_state=21)

report = (
    ExperimentalHarness()
    .data(X, y)
    .task("binary_classification")
    .compare(
        ("logreg", LogisticRegression(max_iter=500)),
        ("rf", RandomForestClassifier(n_estimators=25, random_state=21)),
    )
    .metrics("accuracy", "roc_auc", "log_loss")
    .design(cv=5, random_state=42)
    .inference(alpha=0.05, bootstrap_resamples=2000)
    .fasten("statbelt.lock.json")
    .evaluate()
)

print(report.summary())
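The alpha and bootstrap_resamples arguments above suggest a bootstrap interval over the per-fold scores. The sketch below shows one common variant, a percentile bootstrap of the mean, using only the standard library. Whether statbelt uses this exact scheme is an assumption; the function name bootstrap_ci, the seed parameter, and the fold scores are made up for illustration.

```python
import random
import statistics


def bootstrap_ci(fold_scores, alpha=0.05, resamples=2000, seed=0):
    """Percentile bootstrap interval for the mean of fold-level scores.

    Hypothetical helper, not part of statbelt.
    """
    rng = random.Random(seed)  # seeded so the interval is reproducible
    n = len(fold_scores)
    # Resample the folds with replacement and record the mean each time.
    means = sorted(
        statistics.fmean(rng.choices(fold_scores, k=n)) for _ in range(resamples)
    )
    lower_idx = int((alpha / 2) * resamples)
    upper_idx = min(resamples - 1, int((1 - alpha / 2) * resamples))
    return means[lower_idx], means[upper_idx]


# Illustrative accuracy values for five folds (not real results).
fold_scores = [0.81, 0.78, 0.84, 0.80, 0.79]
lower, upper = bootstrap_ci(fold_scores, alpha=0.05, resamples=2000, seed=42)
print(f"mean={statistics.fmean(fold_scores):.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

With only a handful of folds the resampled means take few distinct values, so an interval like this is coarse; it describes variability across folds rather than a guarantee about unseen data.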
CLI Smoke Check
statbelt
Expected output:
Hello from statbelt!
Development
uv sync --all-groups
uv run ruff check .
uv run pytest
uv run pytest includes a terminal coverage report (missing lines included) via
pytest-cov.
Current Limits
- Binary classification only.
- Confidence intervals only (no pairwise hypothesis tests yet).
- Python API is the primary surface in v0; CLI remains minimal.
License
This project is licensed under the GNU Affero General Public License, version 3
or later (AGPL-3.0-or-later). See LICENSE.
Download files
File details
Details for the file statbelt-0.1.0.tar.gz.
File metadata
- Download URL: statbelt-0.1.0.tar.gz
- Upload date:
- Size: 33.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3d1bff87b036a5b595e889a1b7e7e3fa28a3d5688700b2541cffc3d435e624ea |
| MD5 | 88ba76b4a3d61b5b47b769607f4e11b4 |
| BLAKE2b-256 | aba56c5c27684eab89a5f20b60fd9794577102432eaf830eff3234581f2f9de4 |
File details
Details for the file statbelt-0.1.0-py3-none-any.whl.
File metadata
- Download URL: statbelt-0.1.0-py3-none-any.whl
- Upload date:
- Size: 23.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d44d984a3847e7699016c9abc294553721bf634063b9d3f9960f84a43314e598 |
| MD5 | 260cb4f8286f1e1ccdf8fde153deb3d2 |
| BLAKE2b-256 | 0cb8bab86b6219d7b998f01f5773e11ce2510e3c0d8d011478cbdb61d583d0db |