Extension to sklearn.metrics to allow metrics with multiple predictions.

Project description

toppred

Extension to sklearn.metrics to allow metrics for classifiers that output a top n prediction. Some classifiers output confidence levels for each class. Oftentimes, you want to evaluate the performance of such classifiers under the assumption that the correct prediction is among the top n predictions with the highest confidence levels. This library extends the functions provided by sklearn.metrics to evaluate classifiers that output not a single prediction per sample, but a ranked list of top predictions per sample.

Installation

The most straightforward way of installing toppred is via pip:

pip3 install toppred

From source

To install this library from source, simply clone the repository:

git clone https://github.com/Thijsvanede/toppred.git

Next, ensure that all dependencies have been installed:

Using the requirements.txt file:

pip3 install -r /path/to/toppred/requirements.txt

Installing libraries independently:

pip3 install numpy pandas scikit-learn

Finally, install the library from source:

pip3 install -e /path/to/toppred/

Documentation

The main usage of this library is to compute metrics over the top-n predictions of a given classifier. In the normal case, a classifier gives a single prediction per sample, often in the form of an array:

import numpy as np

y_true = np.asarray([0, 1, 2, 1, 0]) # True labels
y_pred = np.asarray([0, 1, 1, 0, 0]) # Predicted labels

However, a classifier could also return the top n most likely predictions, e.g.:

import numpy as np

y_true = np.asarray([0, 1, 2, 1, 0]) # True labels
y_pred = np.asarray([                # Predicted labels
    [0, 1, 2],
    [1, 0, 2],
    [1, 2, 0],
    [0, 1, 2],
    [0, 1, 2],
])

In this case, we would like to be able to compute the performance when:

  1. The correct prediction is the most likely prediction (y_pred[:, 0])
  2. The correct prediction is in the top 2 most likely predictions (y_pred[:, :2])
  3. The correct prediction is in the top 3 most likely predictions (y_pred[:, :3])
  4. The correct prediction is in the top n most likely predictions (y_pred[:, :n])
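
For illustration, the top-n criterion can be checked directly with numpy. A minimal sketch using the example arrays above (this is not part of the library itself):

# A sample counts as correct if its true label appears among its n most likely predictions
for n in (1, 2, 3):
    hits = (y_pred[:, :n] == y_true[:, None]).any(axis=1)
    print(f"top-{n} accuracy: {hits.mean()}")  # -> 0.6, 1.0, 1.0 for the example above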

For this purpose, this library provides two functions: top_predictions() and top_classification_report() (see the API section below).

Probabilities

Some classifiers, including many neural networks, do not give direct top n results, but instead provide a probability (confidence level) for each class, producing an output such as:

import numpy as np

y_true = np.asarray([0, 1, 2, 1, 0]) # True labels
y_prob = np.asarray([ # Prediction probability
    [0.7, 0.2, 0.1],  # class 0 -> 0.7, class 1 -> 0.2, class 2 -> 0.1
    [0.2, 0.7, 0.1],  # etc.
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
])

In those cases, we can obtain a prediction for the top n most likely values:

# Get the indices of the top n most likely classes, most likely first
n = 3

# Example: y_prob is a numpy array
y_pred = np.argsort(-y_prob, axis=1)[:, :n]

# Example: y_prob is a pytorch Tensor (requires `import torch`)
y_pred = torch.topk(y_prob, n).indices.cpu().numpy()

This results in the prediction array:

array([[0, 1, 2],
       [1, 0, 2],
       [1, 2, 0],
       [0, 1, 2],
       [0, 1, 2]])
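
This derived y_pred array has the shape expected by the functions below, so it can be passed to them directly, for example:

from toppred.metrics import top_classification_report

print(top_classification_report(y_true=y_true, y_pred=y_pred))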

Usage examples

For all directly executable examples see the examples/ directory.

Top classification report

# Imports
import numpy as np
from toppred.metrics import top_classification_report

# Define inputs
y_true = np.asarray([1, 2, 3, 2, 1]) # Ground truth values
y_pred = np.asarray([                # Sample prediction values
    [1, 2, 3],                       # We have top 3 predictions for each
    [2, 1, 3],                       # input sample. I.e., 
    [1, 2, 3],                       # y_true.shape[0] == y_pred.shape[0].
    [3, 1, 2],
    [1, 2, 3],
])

# Compute and print top classification report
print(top_classification_report(
    y_true = y_true,
    y_pred = y_pred,
    labels = [0, 1, 2, 3],                 # Optional, defaults to None
    target_names = ['N/A', '1', '2', '3'], # Optional, defaults to None
    sample_weight = [1, 2, 3, 4, 5],       # Optional, defaults to None
    digits = 4,                            # Optional, int, defaults to 2
    output_dict = False,                   # Optional, if True, return as dict, defaults to False
    zero_division = "warn",                # Optional, defaults to "warn"
))

Metrics

# Imports
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from toppred.predictions import top_predictions

# Define inputs
y_true = np.asarray([1, 2, 3, 2, 1]) # Ground truth values
y_pred = np.asarray([                # Sample prediction values
    [1, 2, 3],                       # We have top 3 predictions for each
    [2, 1, 3],                       # input sample. I.e., 
    [1, 2, 3],                       # y_true.shape[0] == y_pred.shape[0].
    [3, 1, 2],
    [1, 2, 3],
])

# Use top_predictions to generate a y_pred value that is correct if the
# prediction is in the top n predictions
for top, prediction in top_predictions(y_true, y_pred):
    # Compute common metrics
    accuracy  = accuracy_score (y_true, prediction)
    precision = precision_score(y_true, prediction, average='macro')
    recall    = recall_score   (y_true, prediction, average='macro')
    f1        = f1_score       (y_true, prediction, average='macro')

    print(f"Metrics top {top+1} predictions:")
    print(f"    Accuracy : {accuracy}")
    print(f"    Precision: {precision}")
    print(f"    Recall   : {recall}")
    print(f"    F1_score : {f1}")
    print()

API

This library offers two main functions:

top_classification_report()

Create a classification report for a y_pred containing multiple top predictions. This function follows the same API as sklearn.metrics.classification_report with the exception that:

  1. y_pred should be given as a 2D array instead of a 1D array.
  2. If output_dict is True, the output is a dictionary whose keys are the 0-indexed top-prediction levels and whose values are the corresponding sklearn.metrics.classification_report output dictionaries.

Parameters

  • y_true : array_like_1d of shape=(n_samples,) Ground truth (correct) target values.
  • y_pred : array_like_2d of shape=(n_samples, n_predictions) Estimated targets as returned by a classifier. Each column y_pred[:, i] indicates the i-th most likely prediction (0-indexed) for the given sample.
  • labels : Optional[array_like_1d], default = None Optional list of label indices to include in the report.
  • target_names : Optional[List[str]], default = None Optional display names matching the labels (same order).
  • sample_weight : Optional[array_like_1d], default = None Sample weights.
  • digits : int, default = 2 Number of digits for formatting output floating point values. When output_dict is True, this will be ignored and the returned values will not be rounded.
  • output_dict : bool, default = False If True, return output as dict.
  • zero_division : Literal["warn", 0, 1], default = "warn" Sets the value to return when there is a zero division. If set to "warn", this acts as 0, but a warning is also raised.

Returns

  • report : Union[str, dict] Text summary of the precision, recall, F1 score for each class. Dictionary returned if output_dict is True.

    The reported averages include macro average (averaging the unweighted mean per label), weighted average (averaging the support-weighted mean per label), and sample average (only for multilabel classification). Micro average (averaging the total true positives, false negatives and false positives) is only shown for multi-label or multi-class with a subset of classes, because it corresponds to accuracy otherwise and would be the same for all metrics. See also precision_recall_fscore_support for more details on averages.

    Note that in binary classification, recall of the positive class is also known as “sensitivity”; recall of the negative class is “specificity”.
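
As a brief illustration of the output_dict behaviour described above, a hedged sketch (assuming each per-top dictionary follows the regular sklearn classification_report structure, including its "accuracy" key):

import numpy as np
from toppred.metrics import top_classification_report

y_true = np.asarray([1, 2, 3, 2, 1])
y_pred = np.asarray([[1, 2, 3], [2, 1, 3], [1, 2, 3], [3, 1, 2], [1, 2, 3]])

# With output_dict=True the result is keyed by the 0-indexed top-prediction level
report = top_classification_report(y_true=y_true, y_pred=y_pred, output_dict=True)
print(report[0]["accuracy"])  # scores when only the most likely prediction counts
print(report[2]["accuracy"])  # scores when a hit anywhere in the top 3 counts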

top_predictions()

Iterates over the top predictions.

Parameters

  • y_true : array_like_1d of shape=(n_samples,) True labels corresponding to samples.

  • y_pred : array_like_2d of shape=(n_samples, n_predictions) Predicted labels for samples. Each column y_pred[:, i] indicates the i-th most likely prediction (0-indexed) for the given sample.

Yields

  • i : int The current top-prediction level (0-indexed); iteration i considers the top i+1 most likely predictions.

  • y_pred : np.array of shape=(n_samples,) Prediction per sample under the assumption that a sample is correct when its true label appears among the top i+1 most likely predictions.
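
For intuition, the yielded predictions behave roughly like the following sketch. This is an assumption about the internals made for illustration only: when the true label is not among the top i+1 predictions, the sketch falls back to the most likely prediction; the actual implementation may differ.

import numpy as np

def top_predictions_sketch(y_true, y_pred):
    """Roughly equivalent logic, for illustration only (not the library's code)."""
    for i in range(y_pred.shape[1]):
        # A sample is a hit if its true label is among the top i+1 predictions
        hit = (y_pred[:, :i + 1] == y_true[:, None]).any(axis=1)
        # Hits yield the true label; misses fall back to the most likely prediction
        yield i, np.where(hit, y_true, y_pred[:, 0])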
