imbeval

Honest production-readiness evaluation for imbalanced classification models.

Standard metric libraries hand you precision/recall/F1 and stop there. On imbalanced data (fraud, churn, medical diagnosis, anomaly detection, rare-event prediction) that's not enough to know if a model is actually safe to ship. imbeval answers the real question: is this model usable in production, and at what threshold?

It combines three things most teams check manually and inconsistently:

  1. Minority-class performance: reported on its own, not buried inside macro-averages.
  2. Calibration quality: is the model's confidence trustworthy, or just confidently wrong?
  3. Threshold tuning: the default 0.5 threshold is rarely the best choice on imbalanced data; imbeval searches for a better one, optionally weighted by real business cost (the cost of a false positive versus a false negative).
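
To make the first and third points concrete, here is a minimal sketch in plain Python. It is not imbeval's implementation: the helper name `minority_metrics` and the toy data are invented for illustration. It shows how a model can reach 90% accuracy at the default threshold while catching none of the minority class.

```python
def minority_metrics(y_true, y_pred_proba, threshold=0.5):
    """Precision, recall, and F1 for the positive (minority) class, plus accuracy.

    Hypothetical helper for illustration; not part of imbeval.
    """
    tp = fp = fn = tn = 0
    for label, proba in zip(y_true, y_pred_proba):
        predicted = 1 if proba >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 1:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Toy data (made up): 18 negatives, 2 positives.
# The model scores both positives below 0.5.
y_true = [0] * 18 + [1] * 2
y_pred_proba = [0.05] * 18 + [0.30, 0.45]

at_default = minority_metrics(y_true, y_pred_proba, threshold=0.5)
at_lower = minority_metrics(y_true, y_pred_proba, threshold=0.25)

print(at_default)  # accuracy is 0.9, yet minority recall is 0.0
print(at_lower)    # both positives are caught once the threshold drops to 0.25
```

The accuracy figure alone would suggest a healthy model; the minority-class recall shows that it misses every positive at 0.5.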

Install

pip install imbeval

(Once published — see the publishing guide if you're building this from source.)

Quickstart

from imbeval import evaluation_report

# y_true: ground truth labels (0/1)
# y_pred_proba: predicted probability of the positive class, from model.predict_proba(X)[:, 1]
report = evaluation_report(
    y_true,
    y_pred_proba,
    cost_fp=1,     # cost of a false alarm
    cost_fn=25,    # cost of missing a true positive (e.g. missed fraud)
)

print(report["verdict"])
print(report["minority_class"])
print(report["optimal_f1_threshold"])
print(report["cost_sensitive_threshold"])

Example output (the verdict):

Not yet production-ready: minority-class recall is below 50% at the default 0.5 threshold;
default 0.5 threshold is far from optimal; consider using optimal_f1_threshold.
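
The two thresholds in the report can disagree. The sketch below shows one way such a search could work, assuming a simple sweep over the observed scores. The function names and toy data are invented for illustration, and imbeval's actual search may differ. With `cost_fn=25`, the cheapest threshold sits well below the F1-optimal one, because missing a positive costs far more than a false alarm.

```python
def confusion_counts(y_true, y_pred_proba, threshold):
    """Count true positives, false positives, and false negatives at a threshold."""
    tp = fp = fn = 0
    for label, proba in zip(y_true, y_pred_proba):
        predicted_positive = proba >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif label == 1:
            fn += 1
    return tp, fp, fn


def sweep_thresholds(y_true, y_pred_proba, cost_fp=1, cost_fn=25):
    """Try every observed score as a threshold; keep the best by F1 and by cost.

    Hypothetical helper for illustration; not part of imbeval.
    """
    best_f1, f1_threshold = -1.0, None
    min_cost, cost_threshold = float("inf"), None
    for threshold in sorted(set(y_pred_proba)):
        tp, fp, fn = confusion_counts(y_true, y_pred_proba, threshold)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        cost = cost_fp * fp + cost_fn * fn
        if f1 > best_f1:
            best_f1, f1_threshold = f1, threshold
        if cost < min_cost:
            min_cost, cost_threshold = cost, threshold
    return {
        "f1_threshold": f1_threshold,
        "best_f1": best_f1,
        "cost_threshold": cost_threshold,
        "min_cost": min_cost,
    }


# Toy data (made up): 9 negatives and 3 positives, one positive scored low.
y_true = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1]
y_pred_proba = [0.02, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.60, 0.65, 0.80]

result = sweep_thresholds(y_true, y_pred_proba, cost_fp=1, cost_fn=25)
print(result["f1_threshold"])    # 0.6: best balance of precision and recall
print(result["cost_threshold"])  # 0.2: accepts 5 false alarms to avoid any miss
```

Raising `cost_fp` or lowering `cost_fn` moves the cost-based threshold back toward the F1-based one.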

What's inside

| Function | What it does |
| --- | --- |
| evaluation_report(y_true, y_pred_proba, ...) | One combined report + plain-English verdict |
| minority_class_report(y_true, y_pred) | Precision/recall/F1 focused on the minority class |
| per_class_confidence(y_true, y_pred_proba) | Mean model confidence per true class |
| calibration_score(y_true, y_pred_proba) | Expected Calibration Error (ECE) |
| reliability_curve(y_true, y_pred_proba) | Data for plotting a reliability diagram |
| optimal_threshold(y_true, y_pred_proba) | Best decision threshold by F1 |
| cost_sensitive_threshold(y_true, y_pred_proba, cost_fp, cost_fn) | Best threshold by real business cost |
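
For readers unfamiliar with Expected Calibration Error, the sketch below computes one common variant: predictions are grouped into equal-width probability bins, and the gap between mean predicted probability and observed positive rate is averaged across bins, weighted by bin size. The function names, the 10-bin default, and the toy data are assumptions for illustration. The binning scheme imbeval uses is not stated here and may differ.

```python
def reliability_bins(y_true, y_pred_proba, n_bins=10):
    """Group predictions into equal-width bins.

    Returns one (mean predicted probability, fraction of positives, count)
    tuple per non-empty bin. Hypothetical helper; not part of imbeval.
    """
    bins = [[] for _ in range(n_bins)]
    for label, proba in zip(y_true, y_pred_proba):
        index = min(int(proba * n_bins), n_bins - 1)  # proba == 1.0 goes in the last bin
        bins[index].append((label, proba))
    rows = []
    for members in bins:
        if not members:
            continue
        mean_pred = sum(proba for _, proba in members) / len(members)
        frac_pos = sum(label for label, _ in members) / len(members)
        rows.append((mean_pred, frac_pos, len(members)))
    return rows


def expected_calibration_error(y_true, y_pred_proba, n_bins=10):
    """Size-weighted average of |observed positive rate - mean prediction| per bin."""
    total = len(y_true)
    return sum(
        count / total * abs(frac_pos - mean_pred)
        for mean_pred, frac_pos, count in reliability_bins(y_true, y_pred_proba, n_bins)
    )


# Toy data (made up): the same scores with two different sets of outcomes.
scores = [0.2] * 5 + [0.8] * 5
calibrated_labels = [0, 0, 0, 0, 1] + [1, 1, 1, 1, 0]     # 20% and 80% positives
overconfident_labels = [0, 0, 0, 0, 1] + [1, 1, 0, 0, 0]  # 20% and 40% positives

ece_calibrated = expected_calibration_error(calibrated_labels, scores)
ece_overconfident = expected_calibration_error(overconfident_labels, scores)
print(ece_calibrated)     # close to 0: predicted rates match observed rates
print(ece_overconfident)  # close to 0.2: the 0.8 bin is only 40% positive
```

The rows returned by `reliability_bins` are also the kind of data a reliability diagram plots: mean prediction on one axis, observed positive rate on the other.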

  • Full API reference: docs/api.md
  • Usage guide and recipes: docs/usage.md
  • Publishing this package yourself: docs/publishing.md

Why this exists

Most "imbalanced learning" tools (e.g. imbalanced-learn) focus on fixing the data (SMOTE and friends). imbeval focuses on the other end of the pipeline: telling you honestly whether the model you already trained is good enough, and at what threshold, once class imbalance is in play. It's meant to sit right before you ship.

Status

Early (v0.1.0). The core API (evaluation_report, threshold tools, calibration tools) is stable for binary classification. Multi-class support is on the roadmap — see CHANGELOG.md.

Contributing

Issues and PRs welcome once the repo is public. See docs/usage.md for how the modules fit together if you want to extend it.

License

MIT — see LICENSE.
