MLReview: model evaluation, inspection, diagnostics, and reporting for scikit-learn estimators

MLReview

MLReview provides model evaluation, inspection, diagnostics, and reporting for scikit-learn estimators. It replaces the extended-sklearn-metrics distribution with a smaller canonical API and a review-oriented result model.

Install

pip install ml-review

SHAP inspection is optional:

pip install "ml-review[shap]"

MLReview uses three spellings deliberately:

Surface                Name
Product name           MLReview
PyPI distribution      ml-review
Python import package  ml_review
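
Because the distribution name and the import name differ, it can help to check both when debugging an install. The following is a minimal sketch using only the standard library; the helper name describe_install is hypothetical and is not part of MLReview.

```python
from importlib import metadata
from importlib.util import find_spec


def describe_install(distribution: str, import_name: str) -> dict:
    """Report whether a distribution is installed and its package importable."""
    try:
        # Looks up the PyPI distribution name, e.g. "ml-review".
        version = metadata.version(distribution)
    except metadata.PackageNotFoundError:
        version = None
    return {
        "distribution": distribution,
        "version": version,
        # Looks up the Python import name, e.g. "ml_review".
        "importable": find_spec(import_name) is not None,
    }


print(describe_install("ml-review", "ml_review"))
```

A version of None together with importable False suggests the package is not installed in the active environment.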

Quick Start

from ml_review import evaluate
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = RandomForestClassifier(random_state=42).fit(X_train, y_train)
review = evaluate(model, X_train, y_train, X_test, y_test)

summary = review.report()
importance = review.importance.report()

evaluate(...) returns a ReviewResult. User-facing reports are DataFrames, while the rich result state remains available through mapping-style reads and raw export:

review["performance"]
review.get("feature_importance")
raw = review.to_dict()
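
To make the mapping-style access pattern concrete, here is a small stand-in built on collections.abc.Mapping. TinyResult, its stored keys, and the accuracy value are all made up for illustration; this is not how ReviewResult is implemented, and whether ReviewResult.get returns None for a missing key is an assumption.

```python
from collections.abc import Mapping
from copy import deepcopy


class TinyResult(Mapping):
    """Hypothetical stand-in for a result object; not MLReview's ReviewResult."""

    def __init__(self, data: dict):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def to_dict(self) -> dict:
        # Return a deep copy so callers cannot mutate the stored state.
        return deepcopy(self._data)


result = TinyResult({"performance": {"accuracy": 0.91}})  # made-up value
performance = result["performance"]          # raises KeyError if the key is absent
missing = result.get("feature_importance")   # Mapping.get returns None if absent
raw = result.to_dict()
```

Subclassing Mapping supplies get, keys, items, and membership tests from the three methods defined above, which is one common way to offer read-only dictionary access.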

Canonical API

Keep the top-level import small:

from ml_review import evaluate
from ml_review.evaluation import Thresholds, classification_cv, regression_cv

Use namespace imports for grouped workflows:

Area                  Import
Evaluation reports    from ml_review.reporting import evaluation, fairness
Importance and SHAP   from ml_review.inspection import importance, shap
ROC and PR review     from ml_review.metrics import roc
Residual diagnostics  from ml_review.diagnostics import residual
Performance plots     from ml_review.plotting import performance
ROC plots             from ml_review.plotting import roc_plot

Result Workflows

review.report()
review.print_summary()

review.importance.report()
review.importance.plot()

review.fairness.report()
review.fairness.plot()

Methods are thin wrappers around shared functions, so the functional form also works:

from ml_review.inspection import importance
from ml_review.reporting import evaluation

evaluation.report(review)
importance.report(review)
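
The thin-wrapper relationship described above can be sketched as follows. The accessor class, the dictionary layout, and the scores are hypothetical; the only point is that the method forwards to a shared module-level function, so both call styles return the same thing.

```python
def report(result: dict) -> list:
    """Shared function: turn stored importance scores into sorted rows."""
    scores = result["feature_importance"]
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


class ImportanceAccessor:
    """Hypothetical accessor whose method only forwards to the shared function."""

    def __init__(self, result: dict):
        self._result = result

    def report(self) -> list:
        # Inside the method, the bare name `report` resolves to the module-level function.
        return report(self._result)


data = {"feature_importance": {"age": 0.2, "income": 0.7}}  # made-up scores
method_rows = ImportanceAccessor(data).report()
function_rows = report(data)
```

Keeping the logic in one function means the method and functional forms cannot drift apart.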

Cross-validation summary APIs still return DataFrames directly:

from ml_review.evaluation import classification_cv, regression_cv

classification_table = classification_cv(classifier, X, y, cv=5)
regression_table = regression_cv(regressor, X, y, cv=5)
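
For comparison, a per-metric cross-validation table can be built with plain scikit-learn and pandas. This is a sketch of the general idea only; it makes no claim about the columns or metrics that classification_cv actually returns, and the estimator and scoring list are arbitrary choices.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

X, y = make_classification(n_samples=200, n_features=6, random_state=0)
classifier = LogisticRegression(max_iter=1000)

# cross_validate returns one array per metric, with one entry per fold.
scores = cross_validate(classifier, X, y, cv=5, scoring=["accuracy", "f1", "roc_auc"])

# Keep only the test metrics: one row per fold, one column per metric.
folds = pd.DataFrame({k: v for k, v in scores.items() if k.startswith("test_")})

# Collapse the folds into a mean/std summary, one row per metric.
cv_table = folds.agg(["mean", "std"]).T
print(cv_table)
```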

Optional SHAP Inspection

evaluate(...) prefers SHAP when it is installed. In the default shap_mode="auto", a missing SHAP install is not an error: MLReview falls back to built-in and permutation importance. Use shap_mode="on" to require SHAP or shap_mode="off" to skip it.

review = evaluate(
    model,
    X_train,
    y_train,
    X_test,
    y_test,
    shap_mode="auto",
    shap_background_size=100,
    shap_sample_size=200,
)

from ml_review.inspection import shap

shap_table = review.shap.report()
review.shap.plot_importance()
review.shap.plot_explanation(sample_index=0)

# Equivalent functional calls
shap.report(review)
shap.plot_importance(review)
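
The three shap_mode values amount to a small decision rule. The sketch below restates that rule with the standard library; the function name should_run_shap is hypothetical, and the choice of ImportError for the "on" case is an assumption rather than MLReview's documented behavior.

```python
from importlib.util import find_spec
from typing import Optional


def should_run_shap(shap_mode: str = "auto", shap_installed: Optional[bool] = None) -> bool:
    """Decide whether SHAP runs under the "auto", "on", and "off" modes."""
    if shap_installed is None:
        # Detect the optional dependency without importing it.
        shap_installed = find_spec("shap") is not None
    if shap_mode == "off":
        return False
    if shap_mode == "on":
        if not shap_installed:
            # Assumption: a required but missing dependency is reported as ImportError.
            raise ImportError('shap_mode="on" requires SHAP: pip install "ml-review[shap]"')
        return True
    if shap_mode == "auto":
        return shap_installed
    raise ValueError(f"unknown shap_mode: {shap_mode!r}")
```

Passing shap_installed explicitly makes the rule easy to test without changing the environment.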

SHAP guidance in the report:

  • the baseline is the model output before feature contributions are added
  • positive SHAP values raise the explained output and negative values lower it
  • global SHAP importance averages absolute contributions across sampled rows
  • local SHAP plots explain one stored sampled prediction
  • classifier local plots default to the predicted class unless an output index is provided
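
The first three points can be checked by hand on a linear model, where SHAP values have a closed form when features are treated as independent: the contribution of feature i is w_i * (x_i - mean_i), and the baseline is the prediction at the feature means. The weights and rows below are made up, and the sketch does not use the shap package or MLReview.

```python
weights = [2.0, -1.0, 0.5]
bias = 1.0
rows = [
    [1.0, 2.0, 0.0],
    [3.0, 0.0, 4.0],
    [2.0, 1.0, 2.0],
]


def predict(x):
    return bias + sum(w * v for w, v in zip(weights, x))


n = len(rows)
means = [sum(column) / n for column in zip(*rows)]

# Baseline: the model output before any feature contribution is added.
baseline = predict(means)


def contributions(x):
    # Positive values raise the output above the baseline, negative values lower it.
    return [w * (v - m) for w, v, m in zip(weights, x, means)]


# Additivity: baseline plus contributions reproduces each prediction.
for x in rows:
    assert abs(baseline + sum(contributions(x)) - predict(x)) < 1e-9

# Global importance: mean absolute contribution per feature across rows.
global_importance = [
    sum(abs(contributions(x)[j]) for x in rows) / n for j in range(len(weights))
]
```

For nonlinear models the contributions are estimated rather than computed in closed form, but the same additivity and averaging apply.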

MLReview stores bounded serialized SHAP payloads in review data rather than native SHAP explanation objects.

ROC and Residual Review

ROC functions receive an estimator and data, matching the existing scikit-learn-oriented implementation:

from ml_review.metrics import roc

roc_result = roc.binary(classifier, X, y, cv=5)
roc_table = roc_result.report()
thresholds = roc_result.thresholds()

pr_result = roc.precision_recall(classifier, X, y, cv=5)
pr_table = pr_result.report()
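
As background for threshold review, the points of a ROC curve can be computed directly from labels and scores. This is a generic illustration in plain Python with made-up data; it does not describe what roc_result.thresholds() returns, and it assumes both classes are present in y_true.

```python
def roc_points(y_true, scores):
    """Return (threshold, fpr, tpr) rows, predicting positive when score >= threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    rows = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        rows.append((threshold, fp / negatives, tp / positives))
    return rows


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
points = roc_points(y_true, scores)

# One common selection rule: maximize Youden's J = TPR - FPR.
best = max(points, key=lambda row: row[2] - row[1])
```

Lowering the threshold can only add predicted positives, so both rates are non-decreasing as the rows proceed.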

Residual diagnostics also use a result object with DataFrame reports:

from ml_review.diagnostics import residual

residual_result = residual.calculate(regressor, X, y, cv=5)
residual_table = residual_result.report()
residual_result.plot()
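
The quantities a residual review usually starts from are simple to compute by hand. The sketch below uses only the standard library and made-up values; the function name and the chosen statistics are illustrative and are not the contents of residual_result.report().

```python
from math import sqrt
from statistics import mean


def residual_summary(y_true, y_pred):
    """Summarize residuals, defined here as observed minus predicted."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return {
        "mean_residual": mean(residuals),               # far from zero suggests bias
        "mae": mean(abs(r) for r in residuals),         # typical error size
        "rmse": sqrt(mean(r * r for r in residuals)),   # penalizes large errors more
    }


summary = residual_summary([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
```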

Migration

extended_sklearn_metrics remains importable from the new distribution as a compatibility path that emits warnings. New releases are published as ml-review.

Legacy API                                                Canonical MLReview API
final_model_evaluation(...)                               evaluate(...)
evaluate_model_with_cross_validation(...)                 regression_cv(...)
evaluate_classification_model_with_cross_validation(...)  classification_cv(...)
CustomThresholds(...)                                     Thresholds(...)
create_evaluation_report(results)                         review.report()
create_feature_importance_report(results)                 review.importance.report()
create_feature_importance_plot(results)                   review.importance.plot()
create_fairness_report(results)                           review.fairness.report()
calculate_roc_metrics(...)                                roc.binary(...)
calculate_residual_diagnostics(...)                       residual.calculate(...)

See MIGRATION.md for import examples and compatibility notes.
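
A compatibility path of this kind is often built as an alias that warns and then forwards to the new function. The sketch below shows that general pattern with the standard library; the helper deprecated_alias, the placeholder evaluate, and the use of DeprecationWarning are assumptions, not MLReview's actual shim.

```python
import functools
import warnings


def deprecated_alias(new_func, old_name: str):
    """Wrap new_func so that calling it under old_name emits a warning first."""

    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name} is a legacy name; use {new_func.__name__} instead.",
            DeprecationWarning,  # assumption: the real warning category may differ
            stacklevel=2,        # point the warning at the caller, not the wrapper
        )
        return new_func(*args, **kwargs)

    return wrapper


def evaluate(*args, **kwargs):
    """Placeholder standing in for the canonical function."""
    return "review"


final_model_evaluation = deprecated_alias(evaluate, "final_model_evaluation")
```

During a migration, running tests with warnings.simplefilter("error", DeprecationWarning) turns any remaining legacy call into a failure, which makes leftover call sites easy to find.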

Roadmap

The MLReview foundation release covers the package rename, namespace cleanup, result objects, DataFrame-first reports, compatibility shims, and optional SHAP inspection. Planned later feature releases cover:

  1. calibration and probability reliability review
  2. unsupervised review for clustering, PCA, and embeddings
  3. deeper inspection, fairness checks, and report artifacts

See ROADMAP.md for the feature pipeline and guardrails.

License

MIT

Download files

Source Distribution

ml_review-0.4.1.tar.gz (63.0 kB)

Uploaded Source

Built Distribution

ml_review-0.4.1-py3-none-any.whl (81.9 kB)

Uploaded Python 3

File details

Details for the file ml_review-0.4.1.tar.gz.

File metadata

  • Download URL: ml_review-0.4.1.tar.gz
  • Size: 63.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ml_review-0.4.1.tar.gz
Algorithm Hash digest
SHA256 88b10bc03fdf976e3b4303eddff31f49a2fedd311fa9ffae2a3441bde3d28364
MD5 25136a91c9069f85cb549f812311b27b
BLAKE2b-256 db873fb5b46514dc42ab32511742e1fdde6fd9841b8c3a59639d111ce4395eb5

Provenance

The following attestation bundles were made for ml_review-0.4.1.tar.gz:

Publisher: publish.yml on SubaashNair/ml-review

File details

Details for the file ml_review-0.4.1-py3-none-any.whl.

File metadata

  • Download URL: ml_review-0.4.1-py3-none-any.whl
  • Size: 81.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ml_review-0.4.1-py3-none-any.whl
Algorithm Hash digest
SHA256 5066b2a3dfc77a4ca370de9351a2b01580719a792fee3a30c4433ef921eb79f9
MD5 71f8d41088b7c43ee8d9b06fa7ae54ad
BLAKE2b-256 230d0fd9c7b953a0f7310c5addb9dbea1f84a15213847adedab0583e454a212d

Provenance

The following attestation bundles were made for ml_review-0.4.1-py3-none-any.whl:

Publisher: publish.yml on SubaashNair/ml-review
