A Python package for extended scikit-learn model evaluation metrics

extended-sklearn-metrics

A Python library for evaluating scikit-learn regression models with comprehensive metrics and interpretable results.

Features

  • Cross-validation based model evaluation
  • Automatic calculation of RMSE, MAE, R², and Explained Variance
  • Error percentage calculations relative to target variable range
  • Performance classification (Excellent, Good, Moderate, Poor)
  • Easy-to-read summary tables

Installation

From PyPI

pip install extended-sklearn-metrics

From Source

  1. Clone this repository:
git clone https://github.com/SubaashNair/extended-sklearn-metrics.git
cd extended-sklearn-metrics
  2. Install dependencies:
pip install -r requirements.txt

Usage

Here's a simple example using the California Housing dataset:

from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from extended_sklearn_metrics import evaluate_model_with_cross_validation

# Load and prepare data
housing = fetch_california_housing(as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(
    housing.data, housing.target, test_size=0.2, random_state=42
)

# Scale features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)

# Create and evaluate model
model = LinearRegression()
target_range = y_train.max() - y_train.min()

# Get performance metrics
performance_table = evaluate_model_with_cross_validation(
    model=model,
    X=X_train_scaled,
    y=y_train,
    cv=5,
    target_range=target_range
)

print(performance_table)
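
For readers curious how figures of this kind can be obtained, the following is a minimal sketch that uses only scikit-learn's `cross_validate`. It is not this library's implementation. The synthetic dataset, the pipeline, and every variable name are assumptions chosen for illustration. The scaler is placed inside a pipeline so that it is refit on each training fold.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real dataset (assumption for illustration).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=42)

# Scaling lives inside the pipeline, so each fold fits its own scaler.
pipeline = make_pipeline(StandardScaler(), LinearRegression())

# scikit-learn reports error metrics as negated scores (higher is better).
scores = cross_validate(
    pipeline,
    X,
    y,
    cv=5,
    scoring={
        "rmse": "neg_root_mean_squared_error",
        "mae": "neg_mean_absolute_error",
        "r2": "r2",
        "explained_variance": "explained_variance",
    },
)

# Average across folds and flip the sign of the error metrics.
rmse = -scores["test_rmse"].mean()
mae = -scores["test_mae"].mean()
r2 = scores["test_r2"].mean()
explained_variance = scores["test_explained_variance"].mean()

# Express the errors as a percentage of the target variable's range.
target_range = y.max() - y.min()
rmse_pct = 100 * rmse / target_range
mae_pct = 100 * mae / target_range

print(f"RMSE: {rmse:.3f} ({rmse_pct:.1f}% of range)")
print(f"MAE:  {mae:.3f} ({mae_pct:.1f}% of range)")
print(f"R²:   {r2:.3f}")
print(f"Explained variance: {explained_variance:.3f}")
```

The exact aggregation the library applies across folds may differ from the simple mean used here.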

Output Format

The library generates a DataFrame with the following columns:

Column       Description
Metric       Name of the metric (RMSE, MAE, R², etc.)
Value        Computed value of the metric
Threshold    Thresholds used for performance classification
Calculation  Formula/method used to compute the metric
Performance  Classification (Excellent, Good, Moderate, Poor)

Performance Thresholds

RMSE and MAE

  • < 10% of range: Excellent
  • 10%–20% of range: Good
  • 20%–30% of range: Moderate
  • > 30% of range: Poor
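
The RMSE and MAE bands can be sketched as a small function. This is an illustrative helper, not part of the library: the name `classify_error` is hypothetical, and the handling of values that fall exactly on a boundary (10%, 20%, 30%) is an assumption, since the list does not say which band they belong to. Here each boundary value goes to the worse band.

```python
def classify_error(error: float, target_range: float) -> str:
    """Label an RMSE or MAE value by its size relative to the target range.

    Hypothetical helper; boundary values are assigned to the worse band.
    """
    if target_range <= 0:
        raise ValueError("target_range must be positive")

    # Error as a percentage of the target variable's range.
    pct = 100.0 * error / target_range

    if pct < 10:
        return "Excellent"
    if pct < 20:
        return "Good"
    if pct < 30:
        return "Moderate"
    return "Poor"


# An error of 0.5 on a target spanning 4.0 units is 12.5% of the range.
print(classify_error(0.5, 4.0))  # Good
```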

R² and Explained Variance

  • > 0.7: Good
  • 0.5–0.7: Acceptable
  • < 0.5: Poor
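
The same idea applies to the R² and Explained Variance bands. Again, this is only a sketch: `classify_r2` is a hypothetical name, and treating both 0.5 and 0.7 as "Acceptable" is an assumption about the boundaries.

```python
def classify_r2(score: float) -> str:
    """Label an R² or explained-variance score (hypothetical helper)."""
    if score > 0.7:
        return "Good"
    if score >= 0.5:
        return "Acceptable"
    # Covers everything below 0.5, including negative R² values.
    return "Poor"


print(classify_r2(0.85))  # Good
print(classify_r2(0.6))   # Acceptable
print(classify_r2(-0.2))  # Poor
```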

Requirements

  • Python 3.9+
  • pandas >= 2.0.0
  • scikit-learn >= 1.0.0
  • numpy >= 1.20.0

License

This project is licensed under the MIT License - see the LICENSE file for details.
