
classifier-calibration

Measure the calibration of probabilistic classifiers.

For some forecasting applications, we are more interested in predicting the probability of an event occurring than in predicting which event will occur (the label). In such cases we use probabilistic classifiers, which assign to each data point a probability of belonging to each class. Examples of such applications include weather forecasting (e.g. the percentage chance of rain) and medical diagnosis of disease (e.g. the percentage risk of cancer).

The calibration_error module provides the function classwise_ece, which calculates the classwise expected calibration error (as defined in Kull et al. (2019)) of a set of predictions, given the predicted probabilities and the true labels.

The error can be calculated for any number of classes in a multi-class classification problem. The input is an $n \times k$ array whose $i^{th}$ column contains, for each of the $n$ data points, the predicted probability of the point belonging to class $i$, for $i = 1,2,...,k$. This is the format in which sklearn's predict_proba() method returns the array of predicted probabilities. Labels should be numerical and should correspond to the classes $i = 1,2,...,k$.
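For example, a valid input for a two-class problem with three data points might look as follows (a hypothetical hand-built array for illustration; in practice it would come from predict_proba()):

import numpy as np

# Hypothetical predictions for 3 data points and 2 classes.
# Column i holds the predicted probability of class i; each row sums to 1.
pred_probs = np.array([
    [0.9, 0.1],
    [0.3, 0.7],
    [0.5, 0.5],
])

# One numerical label per data point, drawn from {0, 1}.
labels = np.array([0, 1, 0])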

classwise_ece() returns the classwise expected calibration error, which is a loss bounded between 0 and 1. This loss is the average, across all classes, of the weighted average deviation of the predicted probability from the observed rate of occurrence of the given class. If we multiply this loss by 100, we can think of the result as the percentage by which the forecasting model's predicted probability deviates from the true probability, on average.
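In the notation of Kull et al. (2019), with the interval $[0,1]$ split into $m$ equal-width bins, this loss over $n$ data points and $k$ classes can be written as

$$\text{classwise-ECE} = \frac{1}{k} \sum_{j=1}^{k} \sum_{i=1}^{m} \frac{|B_{i,j}|}{n} \left| \bar{y}_j(B_{i,j}) - \bar{p}_j(B_{i,j}) \right|$$

where $B_{i,j}$ is the set of data points whose predicted probability for class $j$ falls in bin $i$, $\bar{y}_j(B_{i,j})$ is the observed proportion of those points that actually belong to class $j$, and $\bar{p}_j(B_{i,j})$ is their mean predicted probability for class $j$.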

Example Usage

We demonstrate the calculation for binary and multi-class problems, using common sklearn classifiers.

import sklearn.datasets
import sklearn.linear_model
import sklearn.ensemble
import pandas as pd
import numpy as np
from classifier_calibration.calibration_error import classwise_ece  

Binary classification

# Make classification data
X, Y = sklearn.datasets.make_classification(n_samples=100000, n_informative=6, n_classes=2)
clf = sklearn.linear_model.LogisticRegression().fit(X, Y)
predicted_probabilities = clf.predict_proba(X)

# Calculate the classwise expected calibration error
classwise_expected_calibration_error = classwise_ece(pred_probs=predicted_probabilities, labels=Y)
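Multiplying the returned loss by 100 expresses it as a percentage deviation, for example:

print(f"Classwise ECE: {100 * classwise_expected_calibration_error:.2f}%")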

Multi-class classification

We compare the calibration of random forest and logistic regression classifiers on a 3-class classification problem. We also return the distribution of weights across bins for each class. That is, we split the interval [0,1] into bins of equal length (num_bins=20 by default, but this can be specified in classwise_ece()) and group the predicted probabilities into these bins. The proportion of data points that fall into each bin (for a given class) is called the weight of the bin.

# Make classification data
num_classes = 3
X, Y = sklearn.datasets.make_classification(n_samples=100000, n_informative=6, n_classes=num_classes)
lr = sklearn.linear_model.LogisticRegression().fit(X, Y)
rf = sklearn.ensemble.RandomForestClassifier().fit(X, Y)

lr_predicted_probabilities = lr.predict_proba(X)
rf_predicted_probabilities = rf.predict_proba(X)

# Calculate the classwise expected calibration error
lr_classwise_expected_calibration_error, lr_bin_weights = classwise_ece(pred_probs=lr_predicted_probabilities, labels=Y, return_weights=True)
rf_classwise_expected_calibration_error, rf_bin_weights = classwise_ece(pred_probs=rf_predicted_probabilities, labels=Y, return_weights=True)

print(round(100 * lr_classwise_expected_calibration_error, 2), '%', round(100 * rf_classwise_expected_calibration_error, 2), '%')

# Print the distribution of weights across bins for each class's predictions
for k in range(num_classes):
    print('Logistic regression bin weights for class',k,'predictions:')
    print(lr_bin_weights[k])
    print('Random forest bin weights for class',k,'predictions:')
    print(rf_bin_weights[k])
    print()
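As a sanity check, the bin weights for a class can be reproduced independently with a normalised histogram of that class's predicted probabilities. A minimal sketch, assuming the default 20 equal-width bins on [0,1] (the package's exact bin-edge conventions may differ):

# Reproduce the logistic regression bin weights with a normalised histogram
for k in range(num_classes):
    counts, _ = np.histogram(lr_predicted_probabilities[:, k], bins=20, range=(0.0, 1.0))
    weights = counts / counts.sum()
    print('Class', k, 'histogram weights (should roughly match lr_bin_weights[k]):')
    print(weights)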
