Determine (optimal) baselines for binary classification

DutchDraw

DutchDraw is a Python package for constructing (optimal) baselines for binary classification.

Paper

This package is an implementation of the ideas from [paper reference to be inserted], where [summary of the paper to be inserted].

Citation

If you have used the DutchDraw package, please also cite: [BibTeX entry to be inserted]

Installation

Use the package manager pip to install the package:

pip install DutchDraw

Windows users

python -m pip install --upgrade  --index-url https://test.pypi.org/simple/ DutchDraw

or

py -m pip install --upgrade  --index-url https://test.pypi.org/simple/ DutchDraw

Method

To properly assess the performance of a binary classification model, the score of a chosen measure should be compared with the score of a 'simple' baseline. For example, an accuracy of 0.9 is not that impressive if a model without any knowledge of the data already attains an accuracy of 0.88.
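The accuracy example can be made concrete with a minimal sketch in plain Python. It does not use DutchDraw; the data set and the helper name `accuracy` are made up for illustration. It shows that always predicting the majority class already reaches an accuracy of 0.88 when 88% of the samples are negative.

```python
# Illustrative data (an assumption): 1000 samples, 88% negative (0), 12% positive (1).
y_true = [0] * 880 + [1] * 120

# A "model" without any knowledge of the data: always predict the majority class.
y_pred_majority = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


baseline_accuracy = accuracy(y_true, y_pred_majority)
print(baseline_accuracy)  # 0.88
```

A model scoring 0.9 on this data improves on the knowledge-free baseline by only 0.02.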

Basic baseline

Let M be the total number of samples, of which P are positive and N are negative. For θ in [0,1], let θ_star = round(θ * M) / M. Randomly shuffle the samples, then label the first θ_star * M samples as 1 and the rest as 0. This gives a baseline for each θ in [0,1]. The package can optimize (maximize and minimize) the expected score of this baseline over θ.
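The shuffle procedure described above can be sketched in a few lines of plain Python. This is an illustration only, not the package's implementation; the function name `shuffle_baseline_predictions` is hypothetical. Note also that Python's built-in `round` rounds ties to the nearest even integer, which may differ from the rounding used inside DutchDraw.

```python
import random


def shuffle_baseline_predictions(M, theta, rng):
    """Return M labels, round(theta * M) of which are 1, in random order."""
    n_ones = round(theta * M)  # equals theta_star * M
    labels = [1] * n_ones + [0] * (M - n_ones)
    # Shuffling the labels is equivalent to shuffling the samples and
    # labelling the first n_ones of them as 1.
    rng.shuffle(labels)
    return labels


rng = random.Random(123)  # seeded generator for reproducibility
y_baseline = shuffle_baseline_predictions(M=10, theta=0.33, rng=rng)
theta_star = sum(y_baseline) / len(y_baseline)
print(y_baseline)
print(theta_star)  # 0.3, since round(0.33 * 10) = 3
```

The predictions never look at the true labels, which is what makes this a baseline "without knowledge".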

Reasons to use

This package contains multiple functions. Let y_true be the actual labels and y_pred be the labels predicted by a model.

Choose the function that matches your goal:

  • To determine the score of an included measure, use measure_score(y_true, y_pred, measure)
  • To get statistics of a baseline for a given theta, use baseline_functions_given_theta(theta, y_true, measure)
  • To get statistics of the optimal baseline, use optimized_baseline_statistics(y_true, measure)
  • To get the baseline without specifying theta, use baseline_functions(y_true, measure)

List of all included measures

Measure | Definition
TP | TP
TN | TN
FP | FP
FN | FN
TPR | TP / P
TNR | TN / N
FPR | FP / N
FNR | FN / P
PPV | TP / (TP + FP)
NPV | TN / (TN + FN)
FDR | FP / (TP + FP)
FOR | FN / (TN + FN)
ACC, ACCURACY | (TP + TN) / M
BACC, BALANCED ACCURACY | (TPR + TNR) / 2
FBETA, FSCORE, F, F BETA, F BETA SCORE, FBETA SCORE | ((1 + β^2) * TP) / ((1 + β^2) * TP + β^2 * FN + FP)
MCC, MATTHEW, MATTHEWS CORRELATION COEFFICIENT | (TP * TN - FP * FN) / sqrt((TP + FP) * (TN + FN) * P * N)
BM, BOOKMAKER INFORMEDNESS, INFORMEDNESS | TPR + TNR - 1
MK | PPV + NPV - 1
COHEN, COHENS KAPPA, KAPPA | (Po - Pe) / (1 - Pe), with Po = (TP + TN) / M and Pe = ((TP + FP) / M) * (P / M) + ((TN + FN) / M) * (N / M)
G1, GMEAN1, G MEAN 1, FOWLKES-MALLOWS, FOWLKES MALLOWS, FOWLKES, MALLOWS | sqrt(TPR * PPV)
G2, GMEAN2, G MEAN 2 | sqrt(TPR * TNR)
TS, THREAT SCORE, CRITICAL SUCCES INDEX, CSI | TP / (TP + FN + FP)
PT, PREVALENCE THRESHOLD | (sqrt(TPR * FPR) - FPR) / (TPR - FPR)
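To show how the definitions in this list translate into code, here is a minimal sketch that computes three of the measures (MK, FBETA and MCC) directly from the confusion counts. The helper names are hypothetical and are not part of DutchDraw. The sketch does not guard against division by zero, which occurs for example when no sample is predicted positive.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, tn, fp, fn


def markedness(tp, tn, fp, fn):
    """MK = PPV + NPV - 1."""
    return tp / (tp + fp) + tn / (tn + fn) - 1


def f_beta(tp, tn, fp, fn, beta=1.0):
    """FBETA = (1 + beta^2) TP / ((1 + beta^2) TP + beta^2 FN + FP)."""
    b2 = beta ** 2
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)


def mcc(tp, tn, fp, fn):
    """MCC with P = TP + FN and N = TN + FP."""
    p, n = tp + fn, tn + fp
    return (tp * tn - fp * fn) / sqrt((tp + fp) * (tn + fn) * p * n)


# Small hand-checkable example: TP = 2, TN = 4, FP = 1, FN = 1.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
counts = confusion_counts(y_true, y_pred)
print(counts)                    # (2, 4, 1, 1)
print(markedness(*counts))       # 2/3 + 4/5 - 1
print(f_beta(*counts, beta=2))   # 10 / 15
print(mcc(*counts))              # 7 / 15
```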

Usage

As an example, we first generate the true and predicted labels.

import random
random.seed(123) # To ensure similar outputs

y_pred = random.choices((0,1), k = 10000, weights = (0.9, 0.1))
y_true = random.choices((0,1), k = 10000, weights = (0.9, 0.1))

Measure performance

In general, to determine the score of a measure, use measure_score(y_true, y_pred, measure, beta = 1).

Input

  • y_true (list or numpy.ndarray): 1-dimensional boolean list/numpy.ndarray containing the true labels.

  • y_pred (list or numpy.ndarray): 1-dimensional boolean list/numpy.ndarray containing the predicted labels.

  • measure (string): Measure name, see all_names_except(['']) for possible measure names.

  • beta (float): Default is 1. Parameter for the F-beta score.

Output

  • float: The score of the given measure evaluated with the predicted and true labels.

Example

To examine the performance of the predicted labels, we measure the markedness (MK) and F2 score (FBETA).

import DutchDraw as bbl

# Measuring markedness (MK):
print('Markedness: {:06.4f}'.format(bbl.measure_score(y_true, y_pred, measure = 'MK')))

# Measuring FBETA for beta = 2:
print('F2 Score: {:06.4f}'.format(bbl.measure_score(y_true, y_pred, measure = 'FBETA', beta = 2)))

This returns as output

Markedness: 0.0061
F2 Score: 0.1053

Note that FBETA is the only measure that requires an additional parameter value.


Get basic baseline given theta

To obtain the basic baseline given theta, use baseline_functions_given_theta(theta, y_true, measure, beta = 1).

Input

  • theta (float): Parameter for the shuffle baseline.

  • y_true (list or numpy.ndarray): 1-dimensional boolean list/numpy.ndarray containing the true labels.

  • measure (string): Measure name, see all_names_except(['']) for possible measure names.

  • beta (float): Default is 1. Parameter for the F-beta score.

Output

The function baseline_functions_given_theta gives the following output:

  • dict: Containing Mean and Variance
    • Mean (float): Expected baseline given theta.
    • Variance (float): Variance of the baseline given theta.

Example

To evaluate the performance of a model, we want to obtain a baseline for the F2 score (FBETA).

results_baseline = bbl.baseline_functions_given_theta(theta = 0.5, y_true = y_true, measure = 'FBETA', beta = 2)

This gives us the mean and variance of the baseline.

print('Mean: {:06.4f}'.format(results_baseline['Mean']))
print('Variance: {:06.4f}'.format(results_baseline['Variance']))

with output

Mean: 0.2829
Variance: 0.0001
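The mean and variance for FBETA can also be derived by hand, which the sketch below illustrates. It relies on one observation: under the shuffle baseline, exactly k = round(theta * M) samples are labelled 1, so TP follows a hypergeometric distribution, and because TP + FN = P and TP + FP = k, the F-beta score simplifies to (1 + β^2) * TP / (β^2 * P + k), which is linear in TP. The function name is hypothetical, and this is not necessarily how DutchDraw computes these values internally.

```python
def fbeta_baseline_mean_variance(M, P, theta, beta=1.0):
    """Mean and variance of FBETA under the shuffle baseline.

    Assumes M > 1 and that P and round(theta * M) are not both zero.
    """
    N = M - P
    k = round(theta * M)  # number of samples labelled 1
    b2 = beta ** 2
    # FBETA = scale * TP, because the denominator reduces to b2 * P + k.
    scale = (1 + b2) / (b2 * P + k)
    # Moments of the hypergeometric distribution of TP.
    mean_tp = k * P / M
    var_tp = k * (P / M) * (N / M) * (M - k) / (M - 1)
    return {'Mean': scale * mean_tp, 'Variance': scale ** 2 * var_tp}


# Illustrative numbers (an assumption): 10000 samples, 1000 of them positive.
result = fbeta_baseline_mean_variance(M=10000, P=1000, theta=0.5, beta=2)
print('Mean: {:06.4f}'.format(result['Mean']))  # 2500 / 9000, about 0.2778
print('Variance: {:.2e}'.format(result['Variance']))
```

The number of positives in this sketch differs from the randomly generated y_true above, so the mean is close to, but not the same as, the 0.2829 reported in the example.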

Get basic baseline

To obtain the basic baseline without specifying theta, use baseline_functions(y_true, measure, beta = 1).

Input

  • y_true (list or numpy.ndarray): 1-dimensional boolean list/numpy.ndarray containing the true labels.

  • measure (string): Measure name, see all_names_except(['']) for possible measure names.

  • beta (float): Default is 1. Parameter for the F-beta score.

Output

The function baseline_functions gives the following output:

  • dict: Containing Distribution, Domain, (Fast) Expectation Function and Variance Function.

    • Distribution (function): Pmf of the measure, given by: pmf_Y(y, theta), where y is a measure score and theta is the parameter of the shuffle baseline.

    • Domain (function): Function that returns attainable measure scores with argument theta.

    • (Fast) Expectation Function (function): Expectation function of the baseline with theta as argument. If Fast Expectation Function is returned, there exists a theoretical expectation that can be used for fast computation.

    • Variance Function (function): Variance function for all values of theta.
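To give an idea of what the distribution and domain of a baseline look like, the following sketch builds them for G2 from first principles. It assumes that TP is hypergeometrically distributed under the shuffle baseline and maps every attainable TP to its G2 score. The function name and the dictionary return type are choices made for this illustration; the package itself returns separate Distribution and Domain functions.

```python
from math import comb, sqrt


def g2_baseline_distribution(M, P, theta):
    """Map each attainable G2 score to its probability (requires 0 < P < M)."""
    N = M - P
    k = round(theta * M)  # number of samples labelled 1
    distribution = {}
    # TP can range from max(0, k - N) to min(k, P).
    for tp in range(max(0, k - N), min(k, P) + 1):
        # Hypergeometric probability of drawing tp positives among k samples.
        prob = comb(P, tp) * comb(N, k - tp) / comb(M, k)
        tn = N - (k - tp)  # FP = k - TP, so TN = N - FP
        score = sqrt((tp / P) * (tn / N))  # G2 = sqrt(TPR * TNR)
        distribution[score] = distribution.get(score, 0.0) + prob
    return distribution


pmf = g2_baseline_distribution(M=10, P=3, theta=0.5)
for score in sorted(pmf):
    print('{:.4f}: {:.4f}'.format(score, pmf[score]))
print(sum(pmf.values()))  # the probabilities sum to 1 (up to rounding)
```

The keys of the dictionary play the role of the domain, and the values play the role of the probability mass function evaluated on that domain.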

Example

Next, we determine the baseline without specifying theta. This returns a number of functions that can be used for different values of theta.

baseline = bbl.baseline_functions(y_true = y_true, measure = 'G2')
print(baseline.keys())

with output

dict_keys(['Distribution', 'Domain', 'Fast Expectation Function', 'Variance Function', 'Expectation Function'])

To inspect the expected value of G2 for different theta values, we do:

import numpy as np
import matplotlib.pyplot as plt

# Evaluate the expectation on a grid of theta values in [0, 1]
theta_values = np.arange(0, 1 + 0.01, 0.01)
expected_value_plot = [baseline['Expectation Function'](theta) for theta in theta_values]
plt.plot(theta_values, expected_value_plot)
plt.xlabel('Theta')
plt.ylabel('Expected value')
plt.show()

with output:

[Figure: expected value of the G2 baseline as a function of theta]

The variance can be determined similarly:

theta_values = np.arange(0, 1 + 0.01, 0.01)
variance_plot = [baseline['Variance Function'](theta) for theta in theta_values]
plt.plot(theta_values, variance_plot)
plt.xlabel('Theta')
plt.ylabel('Variance')
plt.show()

with output:

[Figure: variance of the G2 baseline as a function of theta]

Distribution is a function with two arguments: y and theta. Let's investigate the distribution for theta = 0.5 using Domain.

theta = 0.5
pmf_values = [baseline['Distribution'](y, theta) for y in baseline['Domain'](theta)]
plt.plot(baseline['Domain'](theta), pmf_values)
plt.xlabel('Measure score')
plt.ylabel('Probability mass')
plt.show()

with output:

[Figure: probability mass function of the G2 baseline for theta = 0.5]


Get optimal baseline

To obtain the optimal baseline, use optimized_baseline_statistics(y_true, measure = possible_names, beta = 1).

Input

  • y_true (list or numpy.ndarray): 1-dimensional boolean list/numpy.ndarray containing the true labels.

  • measure (string): Measure name, see all_names_except(['']) for possible measure names.

  • beta (float): Default is 1. Parameter for the F-beta score.

Output

The function optimized_baseline_statistics gives the following output:

  • dict: Containing Max Expected Value, Argmax Expected Value, Min Expected Value and Argmin Expected Value.
    • Max Expected Value (float): Maximum of the expected values for all theta.
    • Argmax Expected Value (list): List of all theta_star values that maximize the expected value.
    • Min Expected Value (float): Minimum of the expected values for all theta.
    • Argmin Expected Value (list): List of all theta_star values that minimize the expected value.

Note that theta_star = round(theta * M) / M.

Example

To evaluate the performance of a model, we want to obtain the optimal baseline for the F1 score (FBETA with beta = 1).

optimal_baseline = bbl.optimized_baseline_statistics(y_true, measure = 'FBETA', beta = 1)

print('Max Expected Value: {:06.4f}'.format(optimal_baseline['Max Expected Value']))
print('Argmax Expected Value: {:06.4f}'.format(*optimal_baseline['Argmax Expected Value']))
print('Min Expected Value: {:06.4f}'.format(optimal_baseline['Min Expected Value']))
print('Argmin Expected Value: {:06.4f}'.format(*optimal_baseline['Argmin Expected Value']))

with output

Max Expected Value: 0.1874
Argmax Expected Value: 1.0000
Min Expected Value: 0.0000
Argmin Expected Value: 0.0000
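The optimization can be illustrated with a brute-force sketch for FBETA. It assumes, as in the shuffle baseline, that TP is hypergeometric with mean k * P / M when k = theta_star * M samples are labelled 1, so the expected score is (1 + β^2) * (k * P / M) / (β^2 * P + k). The function names are hypothetical, and DutchDraw may determine the optimum differently, for example analytically.

```python
def expected_fbeta(M, P, k, beta=1.0):
    """Expected FBETA when exactly k of the M samples are labelled 1 (P > 0)."""
    b2 = beta ** 2
    return (1 + b2) * (k * P / M) / (b2 * P + k)


def optimize_fbeta_baseline(M, P, beta=1.0):
    """Search all attainable theta_star = k / M for the extreme expectations."""
    expectations = [expected_fbeta(M, P, k, beta) for k in range(M + 1)]
    best, worst = max(expectations), min(expectations)
    return {
        'Max Expected Value': best,
        'Argmax Expected Value': [k / M for k, e in enumerate(expectations) if e == best],
        'Min Expected Value': worst,
        'Argmin Expected Value': [k / M for k, e in enumerate(expectations) if e == worst],
    }


# Illustrative numbers (an assumption): 10 samples, 3 of them positive.
optimum = optimize_fbeta_baseline(M=10, P=3, beta=1)
print(optimum['Max Expected Value'])     # 6 / 13, attained at theta_star = 1
print(optimum['Argmax Expected Value'])  # [1.0]
print(optimum['Min Expected Value'])     # 0.0, attained at theta_star = 0
print(optimum['Argmin Expected Value'])  # [0.0]
```

This agrees with the output above: for the F1 score, labelling every sample as positive maximizes the expected baseline score, and labelling none as positive minimizes it.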

All example code

import DutchDraw as bbl
import random
import numpy as np

random.seed(123) # To ensure similar outputs

# Generate true and predicted labels
y_pred = random.choices((0,1), k = 10000, weights = (0.9, 0.1))
y_true = random.choices((0,1), k = 10000, weights = (0.9, 0.1))

######################################################
# Example function: measure_score
print('\033[94mExample function: `measure_score`\033[0m')
# Measuring markedness (MK):
print('Markedness: {:06.4f}'.format(bbl.measure_score(y_true, y_pred, measure = 'MK')))

# Measuring FBETA for beta = 2:
print('F2 Score: {:06.4f}'.format(bbl.measure_score(y_true, y_pred, measure= 'FBETA', beta = 2)))

print('')
######################################################
# Example function: baseline_functions_given_theta
print('\033[94mExample function: `baseline_functions_given_theta`\033[0m')
results_baseline = bbl.baseline_functions_given_theta(theta = 0.5, y_true = y_true, measure = 'FBETA', beta = 2)

print('Mean: {:06.4f}'.format(results_baseline['Mean']))
print('Variance: {:06.4f}'.format(results_baseline['Variance']))

print('')
######################################################
# Example function: baseline_functions
print('\033[94mExample function: `baseline_functions`\033[0m')
baseline = bbl.baseline_functions(y_true = y_true, measure = 'G2')
print(baseline.keys())


# Expected Value
import matplotlib.pyplot as plt
theta_values = np.arange(0, 1 + 0.01, 0.01)
expected_value_plot = [baseline['Expectation Function'](theta) for theta in theta_values]
plt.plot(theta_values, expected_value_plot)
plt.xlabel('Theta')
plt.ylabel('Expected value')
#plt.savefig('expected_value_function_example.png', dpi= 600)
plt.show()

# Variance
theta_values = np.arange(0, 1 + 0.01, 0.01)
variance_plot = [baseline['Variance Function'](theta) for theta in theta_values]
plt.plot(theta_values, variance_plot)
plt.xlabel('Theta')
plt.ylabel('Variance')
#plt.savefig('variance_function_example.png', dpi= 600)
plt.show()

# Distribution and Domain
theta = 0.5
pmf_values = [baseline['Distribution'](y, theta) for y in baseline['Domain'](theta)]
plt.plot(baseline['Domain'](theta), pmf_values)
plt.xlabel('Measure score')
plt.ylabel('Probability mass')
#plt.savefig('pmf_example.png', dpi= 600)
plt.show()

print('')
######################################################
# Example function: optimized_baseline_statistics
print('\033[94mExample function: `optimized_baseline_statistics`\033[0m')
optimal_baseline = bbl.optimized_baseline_statistics(y_true, measure = 'FBETA', beta = 1)

print('Max Expected Value: {:06.4f}'.format(optimal_baseline['Max Expected Value']))
print('Argmax Expected Value: {:06.4f}'.format(*optimal_baseline['Argmax Expected Value']))
print('Min Expected Value: {:06.4f}'.format(optimal_baseline['Min Expected Value']))
print('Argmin Expected Value: {:06.4f}'.format(*optimal_baseline['Argmin Expected Value']))

License

MIT

Download files

Source Distribution

DutchDraw-0.0.2.tar.gz (12.8 kB)

Uploaded Source

Built Distribution

DutchDraw-0.0.2-py3-none-any.whl (12.8 kB)

Uploaded Python 3
