Multiclass Metrics

multiclass_metrics provides scikit-learn-style metrics for probabilistic binary and multiclass classifiers. Its main purpose is to make ROC-AUC and area under the precision-recall curve usable when the classes present in y_true and the columns returned by predict_proba do not line up exactly.

What It Provides

The package exposes two public metric functions:

  • multiclass_metrics.roc_auc_score(...)
  • multiclass_metrics.auprc(...)

Both accept y_true, y_score, and the optional labels, average, sample_weight, and multi_class arguments, mirroring the general shape of the corresponding scikit-learn metrics. They support binary and multiclass inputs; multiclass scores are averaged one-vs-one by default.

Why It Exists

Scikit-learn's probabilistic multiclass metrics expect the score matrix and label set to agree. That can be awkward when, for example:

  • a classifier never learned to predict a class that appears in the test set
  • a score matrix includes classes that are absent from y_true
  • score columns are not naturally inferable from np.unique(y_true)

This package aligns those cases before scoring, so evaluation can proceed without manually rebuilding the probability matrix.
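To make the mismatch concrete, here is a small sketch that calls scikit-learn's own roc_auc_score on a test set with four classes and a three-column score matrix. The data and the helper name sklearn_rejects_mismatch are invented for illustration, and the exact error message depends on the scikit-learn version.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Four classes appear in y_true, but the classifier only produced
# three score columns (Covid, HIV, Healthy).
y_true = ["Healthy", "Ebola", "HIV", "Healthy", "Covid"]
y_score = np.array(
    [
        [0.10, 0.10, 0.80],
        [0.33, 0.33, 0.34],
        [0.10, 0.80, 0.10],
        [0.05, 0.05, 0.90],
        [0.80, 0.10, 0.10],
    ]
)


def sklearn_rejects_mismatch(y_true, y_score):
    """Hypothetical helper: True if scikit-learn raises ValueError."""
    try:
        roc_auc_score(y_true, y_score, multi_class="ovo")
    except ValueError:
        return True
    return False


print(sklearn_rejects_mismatch(y_true, y_score))
```

Without alignment, the usual workaround is to rebuild the score matrix by hand so that it has one column per class in y_true.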

How It Works

When you pass labels explicitly and the number of labels does not match the number of unique values in y_true, the shared multiclass wrapper aligns the score matrix before calling the underlying scikit-learn metric logic:

  • if y_true contains a class missing from labels/y_score, it inserts a score column filled with 0.0
  • if labels/y_score contains a class absent from y_true, it removes that score column
  • after class alignment, labels are sorted and score columns are reordered to match
  • binary inputs fall back to the standard binary scikit-learn metric behavior
  • multiclass inputs use multi_class="ovo" by default and also accept multi_class="ovr"
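The alignment steps listed above can be sketched in plain NumPy. This is an illustration of the described behavior under stated assumptions, not the package's actual implementation; the function name align_scores_sketch and the sample data are hypothetical.

```python
import numpy as np


def align_scores_sketch(y_true, y_score, labels):
    """Hypothetical sketch: align score columns with the classes in y_true."""
    y_score = np.asarray(y_score, dtype=float)
    labels = list(labels)
    present = set(np.unique(y_true).tolist())

    # Drop score columns whose label never occurs in y_true.
    keep = [i for i, label in enumerate(labels) if label in present]
    labels = [labels[i] for i in keep]
    y_score = y_score[:, keep]

    # Append a 0.0 column for every class in y_true that has no score column.
    missing = sorted(present - set(labels))
    if missing:
        zeros = np.zeros((y_score.shape[0], len(missing)))
        y_score = np.hstack([y_score, zeros])
        labels = labels + missing

    # Sort the labels and reorder the columns to match.
    order = np.argsort(labels)
    return [labels[i] for i in order], y_score[:, order]


y_true = ["Healthy", "Ebola", "HIV", "Healthy"]
labels = ["Healthy", "Covid", "HIV"]  # no Ebola column, unused Covid column
y_score = np.array(
    [
        [0.8, 0.1, 0.1],
        [0.4, 0.3, 0.3],
        [0.1, 0.1, 0.8],
        [0.9, 0.05, 0.05],
    ]
)

aligned_labels, aligned_scores = align_scores_sketch(y_true, y_score, labels)
```

In this example the Covid column is dropped, an all-zero Ebola column is added, and the remaining columns are reordered to follow the sorted labels.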

Unlike scikit-learn's multiclass ROC-AUC entry point, roc_auc_score here does not require each row of y_score to sum to 1.
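One way to see why row normalization is not essential: a macro one-vs-rest ROC-AUC can be assembled from per-class binary AUCs, each of which depends only on how a single column ranks the samples. The sketch below uses scikit-learn's binary roc_auc_score for that; the helper ovr_macro_auc and the data are hypothetical, and this is not necessarily how the package computes its result.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def ovr_macro_auc(y_true, y_score, labels):
    """Hypothetical helper: mean of per-class binary ROC-AUCs (OvR, macro)."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    aucs = []
    for col, label in enumerate(labels):
        # Treat the current class as positive and every other class as negative.
        is_positive = (y_true == label).astype(int)
        aucs.append(roc_auc_score(is_positive, y_score[:, col]))
    return float(np.mean(aucs))


labels = ["a", "b", "c"]
y_true = ["a", "b", "c", "a", "b", "c"]

# Raw scores whose rows do not sum to 1.
raw = np.array(
    [
        [2.0, 0.5, 0.1],
        [0.3, 1.5, 0.4],
        [0.2, 0.9, 3.0],
        [1.2, 1.0, 0.3],
        [0.8, 0.7, 0.6],
        [0.1, 0.2, 0.9],
    ]
)

auc_raw = ovr_macro_auc(y_true, raw, labels)
print(auc_raw)
```

On row-normalized scores this helper agrees with scikit-learn's roc_auc_score(..., multi_class="ovr", average="macro"), while the raw matrix above would be rejected by scikit-learn's multiclass entry point.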

Installation

pip install multiclass_metrics

The package requires Python 3.8 or newer and depends on NumPy, pandas, and scikit-learn.

Usage

import numpy as np
import multiclass_metrics

y_true = ["Healthy", "Ebola", "HIV", "Healthy", "Covid"]

# Columns are ordered as: Covid, HIV, Healthy.
# There is no Ebola column, so multiclass_metrics inserts it with 0.0 scores.
y_score = np.array(
    [
        [0.10, 0.10, 0.80],
        [0.33, 0.33, 0.34],
        [0.10, 0.80, 0.10],
        [0.05, 0.05, 0.90],
        [0.80, 0.10, 0.10],
    ]
)

auc = multiclass_metrics.roc_auc_score(
    y_true=y_true,
    y_score=y_score,
    labels=["Covid", "HIV", "Healthy"],
    multi_class="ovo",
    average="macro",
)

pr_auc = multiclass_metrics.auprc(
    y_true=y_true,
    y_score=y_score,
    labels=["Covid", "HIV", "Healthy"],
)

When labels are provided, y_score columns must be in the same order as labels. If labels are omitted, they are inferred from np.unique(y_true); for multiclass problems where y_score has columns that cannot be inferred from y_true, pass labels explicitly.
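In practice the labels argument often comes straight from a fitted estimator. The sketch below, with invented toy data, shows how y_score and labels might be obtained from a scikit-learn classifier whose training set lacked a class that appears at test time. The column order of predict_proba follows clf.classes_, so passing that array as labels keeps the two in sync.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training data with three classes; "Ebola" is never seen in training.
X_train = np.array([[0.0], [0.2], [1.0], [1.2], [2.0], [2.2]])
y_train = ["Covid", "Covid", "HIV", "HIV", "Healthy", "Healthy"]

# The test set contains a class the classifier cannot predict.
X_test = np.array([[0.1], [1.1], [2.1], [3.0]])
y_true = ["Covid", "HIV", "Healthy", "Ebola"]

clf = LogisticRegression().fit(X_train, y_train)

# predict_proba returns one column per entry of clf.classes_, in that order.
y_score = clf.predict_proba(X_test)
labels = list(clf.classes_)
```

The resulting y_true, y_score, and labels have the same shape as the arguments in the usage example above and can be passed to roc_auc_score or auprc in the same way.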

Important Behavior and Limitations

  • average is intended for "macro" or "weighted" averaging.
  • multi_class must be "ovo" or "ovr".
  • sample_weight is passed through for binary and OvR scoring, but is ignored for OvO multiclass scoring.
  • In binary mode, a 1D score vector is accepted; with two labels, the second label is treated as the positive class for auprc.
  • If class alignment leaves only one class in y_true, probability-based scoring is undefined and the function raises ValueError.
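The binary positive-class convention can be illustrated with scikit-learn's average_precision_score, used here as a stand-in. Whether auprc relies on this exact estimator internally is an assumption, and the label names and scores are invented.

```python
import numpy as np
from sklearn.metrics import average_precision_score

labels = ["control", "disease"]
y_true = ["control", "disease", "disease", "control"]

# A 1D score vector: higher values mean "more likely to be labels[1]".
y_score_1d = np.array([0.1, 0.9, 0.8, 0.2])

# The second label is treated as the positive class.
ap = average_precision_score(y_true, y_score_1d, pos_label=labels[1])
print(ap)
```

If the scores instead expressed confidence in the first label, the result would describe the wrong class, so the label order is worth checking for binary problems.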

Development

pip install -r requirements_dev.txt
pip install -e .
make test
make lint
make docs

The repository includes pytest tests covering missing classes in y_score, missing classes in y_true, binary inputs, label ordering, non-normalized score rows, and auPRC behavior.

Changelog

0.0.1

  • First release on PyPI.
