Spectra

Requires Python 3.11+. License: Apache 2.0.

Scenario-first ML evaluation engine. Stress-test your models to find where metrics lie.

Spectra runs your model through realistic failure scenarios (label noise, score noise, class imbalance, threshold gaming) and shows you exactly where your metrics break down. Instead of a single accuracy number, you get a transparent stress-test report.
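To make the idea of a label-noise scenario concrete, here is a minimal sketch in plain Python. It does not use Spectra's API: the helper names `auc` and `flip_labels`, the synthetic data, and the flip rates are all assumptions chosen for illustration. It flips a fraction of the labels at random, recomputes AUC over several trials, and reports how far the metric drifts from its clean baseline.

```python
import random


def auc(labels, scores):
    """ROC AUC via the rank-sum formulation, using average ranks for ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def flip_labels(labels, rate, rng):
    """Hypothetical label-noise perturbation: flip each label with probability `rate`."""
    return [1 - y if rng.random() < rate else y for y in labels]


rng = random.Random(0)
# Synthetic data (an assumption): positives score higher on average than negatives.
labels = [rng.randint(0, 1) for _ in range(2000)]
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

baseline = auc(labels, scores)
print(f"baseline AUC: {baseline:.3f}")

noisy_auc = {}
for rate in (0.05, 0.10, 0.20):
    trials = [auc(flip_labels(labels, rate, rng), scores) for _ in range(20)]
    noisy_auc[rate] = sum(trials) / len(trials)
    print(f"flip rate {rate:.2f}: mean AUC {noisy_auc[rate]:.3f}")
```

The model's scores never change in this sketch; only the labels do. The drop in AUC therefore measures how sensitive the reported metric is to annotation errors rather than how good the model is.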

Install

pip install spectra-ml

With web UI support:

pip install "spectra-ml[web]"

Quick Start

Python SDK

import metrics_lie as spectra

result = spectra.evaluate(
    name="my-model-audit",
    dataset="data.csv",
    model="model.pkl",
    metric="auc",
    trust_pickle=True,
)

spectra.display(result)
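The `trust_pickle=True` argument presumably acknowledges a general property of Python's pickle format: loading a pickle can run arbitrary code. The sketch below uses only the standard library and is not Spectra code. The `Payload` class is hypothetical and its payload is a harmless arithmetic expression.

```python
import pickle


class Payload:
    """Hypothetical object whose pickle runs code when it is loaded."""

    def __reduce__(self):
        # pickle stores "call eval('6 * 7')" instead of the object's state,
        # so the expression is evaluated during pickle.loads().
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())
loaded = pickle.loads(blob)  # evaluates the expression; no Payload is rebuilt
print(loaded)  # 42
```

A real attack would substitute something destructive for the expression, so only enable pickle loading for model files from a source you trust.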

CLI

# Run from spec file
spectra run experiment.json

# Quick evaluation
spectra evaluate model.pkl --dataset data.csv --metric auc --trust-pickle

# Launch web UI
spectra serve

Web UI (Quick Test)

pip install "spectra-ml[web]"
spectra serve

Upload your model and a dataset CSV. Spectra auto-detects the columns, the task type, and the best metric, then runs a full stress test with one click.

What It Does

  1. Stress-tests metrics across scenarios: label noise, score noise, class imbalance, threshold gaming
  2. Detects metric disagreement, for example when accuracy says "great" but calibration says "broken"
  3. Runs diagnostics: calibration analysis, subgroup gaps, sensitivity ranking, threshold sweeps
  4. Produces decision scorecards with weighted components and transparent reasoning
  5. Compares models with regression detection and structured comparison reports
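Metric disagreement (item 2 above) can be illustrated with a small self-contained sketch. It is not Spectra's implementation: the functions `accuracy`, `expected_calibration_error`, and `sharpen` and the synthetic data are assumptions made for this example. Two sets of predictions fall on the same side of the 0.5 threshold for every sample, so their accuracy matches, yet one of them is badly overconfident and its expected calibration error (ECE) is much higher.

```python
import random


def accuracy(labels, probs, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(labels, probs))
    return hits / len(labels)


def expected_calibration_error(labels, probs, n_bins=10):
    """ECE with equal-width bins: weighted mean |observed rate - mean confidence|."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        bins[min(int(p * n_bins), n_bins - 1)].append((y, p))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        frac_pos = sum(y for y, _ in members) / len(members)
        mean_conf = sum(p for _, p in members) / len(members)
        ece += len(members) / len(labels) * abs(frac_pos - mean_conf)
    return ece


def sharpen(p, k=4.0):
    """Push a probability toward 0 or 1 without crossing 0.5 (overconfidence)."""
    return p**k / (p**k + (1 - p) ** k)


rng = random.Random(0)
true_probs = [rng.random() for _ in range(5000)]
labels = [1 if rng.random() < p else 0 for p in true_probs]

calibrated = true_probs                            # honest probabilities
overconfident = [sharpen(p) for p in true_probs]   # same ranking, extreme values

acc_cal = accuracy(labels, calibrated)
acc_over = accuracy(labels, overconfident)
ece_cal = expected_calibration_error(labels, calibrated)
ece_over = expected_calibration_error(labels, overconfident)

print(f"calibrated:    accuracy={acc_cal:.3f}  ECE={ece_cal:.3f}")
print(f"overconfident: accuracy={acc_over:.3f}  ECE={ece_over:.3f}")
```

Looking at accuracy alone, the two models are indistinguishable. Only the calibration metric shows that the second model's probabilities cannot be taken at face value.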

Supported

Category        Options
Task Types      Binary classification, multiclass, regression, ranking
Metrics         27 metrics: AUC, F1, precision, recall, Brier, ECE, MAE, RMSE, R2, NDCG, and more
Model Formats   sklearn pickle, ONNX, PyTorch, TensorFlow, XGBoost, LightGBM, CatBoost, MLflow
Scenarios       Label noise, score noise, class imbalance, threshold gaming
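One plausible reading of the "threshold gaming" scenario is that a reported score can be inflated by tuning the decision threshold on the evaluation data itself. The sketch below follows that assumption; how Spectra defines the scenario may differ. The function `f1_at`, the synthetic imbalanced data, and the threshold grid are hypothetical. It sweeps thresholds and compares the best F1 found against F1 at the conventional 0.5 cutoff.

```python
import random


def f1_at(labels, scores, threshold):
    """F1 score when predicting positive for scores >= threshold."""
    tp = fp = fn = 0
    for y, s in zip(labels, scores):
        predicted = s >= threshold
        if predicted and y == 1:
            tp += 1
        elif predicted and y == 0:
            fp += 1
        elif not predicted and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


rng = random.Random(0)
# Synthetic imbalanced data (an assumption): about 20% positives, overlapping scores.
labels = [1 if rng.random() < 0.2 else 0 for _ in range(3000)]
scores = [min(1.0, max(0.0, rng.gauss(0.45 if y else 0.25, 0.15))) for y in labels]

thresholds = [t / 100 for t in range(5, 100, 5)]
sweep = {t: f1_at(labels, scores, t) for t in thresholds}

best_t = max(sweep, key=sweep.get)
default_f1 = sweep[0.5]
print(f"F1 at threshold 0.50: {default_f1:.3f}")
print(f"best F1 {sweep[best_t]:.3f} at threshold {best_t:.2f}")
print(f"gain from tuning on the evaluation set: {sweep[best_t] - default_f1:.3f}")
```

The gap between the two numbers is the amount of headroom available to anyone who picks the threshold after seeing the evaluation labels, which is why a single reported F1 without its threshold is hard to interpret.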

Architecture

spectra run / evaluate / serve
        |
  Core Engine (metrics_lie)
    |- Dataset Loading (CSV)
    |- Model Adapter (pickle, ONNX, PyTorch, ...)
    |- Scenario Runner (Monte Carlo trials)
    |- Metrics (27 metrics across 4 task types)
    |- Diagnostics (calibration, gaming, subgroups)
    |- Analysis (dashboard, disagreement, sensitivity)
    |- Decision Framework (scorecard, components)
    '- Artifacts (plots, reports)
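The "Scenario Runner (Monte Carlo trials)" stage can be pictured with a short sketch: apply a random perturbation many times, recompute the metric each time, and summarise the spread. This is an illustration under stated assumptions, not the engine's code. The names `run_scenario`, `score_noise`, and `brier`, the noise level, and the trial count are hypothetical.

```python
import random
import statistics


def brier(labels, probs):
    """Brier score: mean squared error of predicted probabilities (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def score_noise(probs, sigma, rng):
    """Hypothetical score-noise perturbation: Gaussian jitter clipped to [0, 1]."""
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in probs]


def run_scenario(metric, perturb, labels, probs, n_trials=30, seed=0):
    """Run `n_trials` perturbed evaluations and summarise the metric values."""
    rng = random.Random(seed)
    values = [metric(labels, perturb(probs, rng)) for _ in range(n_trials)]
    return {
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),
        "worst": max(values),  # for Brier, a higher value is worse
    }


data_rng = random.Random(1)
probs = [data_rng.random() for _ in range(2000)]
labels = [1 if data_rng.random() < p else 0 for p in probs]

baseline = brier(labels, probs)
summary = run_scenario(
    brier,
    lambda p, rng: score_noise(p, 0.2, rng),
    labels,
    probs,
)

print(f"baseline Brier: {baseline:.4f}")
print(f"score noise (sigma=0.2): mean={summary['mean']:.4f} "
      f"std={summary['std']:.4f} worst={summary['worst']:.4f}")
```

Passing the metric and the perturbation as plain functions keeps the runner independent of both, so the same loop could in principle serve any scenario and metric pairing. Fixing the seed makes a run reproducible.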

Development

git clone https://github.com/StrangeStorm243-bit/when-metrics-lie.git
cd when-metrics-lie
python -m venv .venv && source .venv/bin/activate  # or .venv\Scripts\activate on Windows
pip install -e ".[dev,web]"
pytest

Documentation

Full docs: https://strangestorm243-bit.github.io/when-metrics-lie/

License

Apache 2.0. See LICENSE for details.
