Methods for computing the expected cost metric for evaluation of classification systems

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

Expected cost

Methods for computing the expected cost (EC) on an evaluation dataset, as defined in statistical learning textbooks (e.g., Bishop's "Pattern Recognition and Machine Learning" and Hastie et al.'s "The Elements of Statistical Learning"). Given a matrix of user-defined costs with elements $c_{y,d}$, where $y$ is the true class of a sample and $d$ is the decision made by the system for that sample, this metric is estimated as the average of the costs across all samples in the dataset. That is:

$\mathrm{EC} = \frac{1}{N} \sum_i c_{y_i,d_i}$

where the sum runs over the $N$ samples in the evaluation set and $c_{y_i,d_i}$ is the cost incurred at sample $i$.
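As a minimal sketch of the formula above, the average can be computed in a few lines of plain Python. The function name `expected_cost` and the toy cost matrix, targets, and decisions are illustrative assumptions; this is not the package's own API.

```python
def expected_cost(costs, targets, decisions):
    """Average the cost costs[y_i][d_i] over all samples.

    costs[y][d] is the cost of making decision d when the true class is y.
    (Hypothetical helper written for this illustration.)
    """
    if len(targets) != len(decisions):
        raise ValueError("targets and decisions must have the same length")
    total = sum(costs[y][d] for y, d in zip(targets, decisions))
    return total / len(targets)


# Toy cost matrix: rows are true classes, columns are decisions.
# Deciding 0 when the true class is 1 costs 5; the opposite error costs 1.
costs = [[0.0, 1.0],
         [5.0, 0.0]]
targets = [0, 0, 1, 1, 1]
decisions = [0, 1, 1, 0, 1]

ec_value = expected_cost(costs, targets, decisions)
print(ec_value)  # (0 + 1 + 0 + 5 + 0) / 5 = 1.2
```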

The EC generalizes the total error (which is 1 minus the accuracy) and the balanced total error (which is 1 minus the balanced accuracy) in two ways: (1) it allows a different cost for each type of error, and (2) it allows decisions that do not correspond one-to-one to the classes (e.g., it allows for an "abstain" decision). The EC comes with an elegant theory of how to make optimal decisions given a set of costs, and it enables analysis of calibration. For these reasons, we believe it is superior to other commonly used classification metrics, such as the F-beta score or the Matthews correlation coefficient. All these issues are discussed in detail in:

L. Ferrer, "Analysis and Comparison of Classification Metrics", arXiv:2209.05355

The results in the paper can be replicated with the code in the examples directory in this repository.
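The relation between the EC and the total and balanced total errors described above can be checked numerically. The sketch below is plain Python and does not use the package's API. It assumes that the balanced total error corresponds to error costs of $1/(K P_y)$, where $K$ is the number of classes and $P_y$ is the empirical prior of class $y$; the abstain cost of 0.3 is an arbitrary value chosen for illustration.

```python
from collections import Counter


def expected_cost(costs, targets, decisions):
    """Average of costs[y][d] over all (target, decision) pairs."""
    return sum(costs[y][d] for y, d in zip(targets, decisions)) / len(targets)


targets = [0, 0, 0, 0, 0, 0, 1, 1]
decisions = [0, 0, 0, 0, 0, 1, 0, 1]
num_classes = 2
counts = Counter(targets)

# (a) 0-1 costs: the EC equals the total error, i.e. 1 minus the accuracy.
zero_one = [[0.0 if y == d else 1.0 for d in range(num_classes)]
            for y in range(num_classes)]
total_error = expected_cost(zero_one, targets, decisions)
accuracy = sum(y == d for y, d in zip(targets, decisions)) / len(targets)

# (b) Error costs of 1 / (K * P_y): the EC equals the balanced total error,
# i.e. the mean of the per-class error rates.
priors = [counts[y] / len(targets) for y in range(num_classes)]
balanced = [[0.0 if y == d else 1.0 / (num_classes * priors[y])
             for d in range(num_classes)]
            for y in range(num_classes)]
balanced_error = expected_cost(balanced, targets, decisions)
per_class_error = [
    sum(1 for y, d in zip(targets, decisions) if y == c and d != c) / counts[c]
    for c in range(num_classes)
]

# (c) A non-square cost matrix: a third decision column for "abstain".
ABSTAIN = 2
with_abstain = [[0.0, 1.0, 0.3],
                [1.0, 0.0, 0.3]]
decisions_abstain = [0, 0, 0, 0, 0, ABSTAIN, ABSTAIN, 1]
ec_abstain = expected_cost(with_abstain, targets, decisions_abstain)

print(total_error, 1 - accuracy)
print(balanced_error, sum(per_class_error) / num_classes)
print(ec_abstain)
```

In this toy set, abstaining on the two samples the system previously got wrong replaces two unit-cost errors with two abstentions at cost 0.3 each, so the EC drops.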

The code provides methods for computing the EC when decisions are given by:

- hard decisions obtained with some external method,
- Bayes decisions made by optimizing the cost given the scores from a system assuming they can be used to obtain well-calibrated posteriors, or
- optimal decisions made by choosing the decision threshold that minimizes the cost. This last option is only applicable for the binary case and the standard square cost function.
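The last two options can be sketched in plain Python as follows. Both helper functions are hypothetical illustrations and not the package's API. The first assumes the posteriors are well calibrated and picks, for each sample, the decision with the lowest expected cost under them. The second is a naive sweep over candidate thresholds for a binary problem with a 2x2 cost matrix, deciding class 1 when the score exceeds the threshold.

```python
def bayes_decisions(costs, posteriors):
    """Pick, per sample, the decision d minimizing sum_y costs[y][d] * P(y | x)."""
    num_decisions = len(costs[0])
    decisions = []
    for post in posteriors:
        expected = [sum(costs[y][d] * post[y] for y in range(len(post)))
                    for d in range(num_decisions)]
        decisions.append(min(range(num_decisions), key=expected.__getitem__))
    return decisions


def min_cost_threshold(costs, targets, scores):
    """Return (lowest EC, threshold) over thresholds placed between sorted scores."""
    sorted_scores = sorted(set(scores))
    candidates = [sorted_scores[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(sorted_scores, sorted_scores[1:])]
    candidates.append(sorted_scores[-1] + 1.0)
    best = None
    for thr in candidates:
        decisions = [1 if s > thr else 0 for s in scores]
        ec = sum(costs[y][d] for y, d in zip(targets, decisions)) / len(targets)
        if best is None or ec < best[0]:  # ties keep the lowest threshold
            best = (ec, thr)
    return best


# Bayes decisions: missing class 1 is five times as costly as a false alarm,
# so the second sample is assigned to class 1 despite a posterior of only 0.2.
asymmetric = [[0.0, 1.0],
              [5.0, 0.0]]
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]]
bayes = bayes_decisions(asymmetric, posteriors)
print(bayes)  # [0, 1, 1]

# Threshold sweep with 0-1 costs on toy scores.
zero_one = [[0.0, 1.0],
            [1.0, 0.0]]
targets = [0, 0, 1, 0, 1, 1]
scores = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
best_ec, best_thr = min_cost_threshold(zero_one, targets, scores)
print(best_ec, best_thr)
```

The sweep recomputes the cost for every candidate threshold, which is quadratic in the number of samples; it is meant to show the idea, not to be efficient.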

The scripts in the examples directory can be used with any dataset of scores and targets. See the examples/data.py file for examples of how to load your own data in the format required by the examples.

How to install

You can install this package as:

pip install expected_cost

which will also install all the dependencies. Some of the notebooks in this repository also require the psrcal package, which you can install as:

pip install psrcal

This is not included in the requirements of expected_cost because its installation requires pytorch, which takes a while. If you only need to compute expected cost or make Bayes decisions, and do not want to do or evaluate calibration, you do not need psrcal (or pytorch).

If you want the latest version of the code, along with all the notebooks, you can do the following:

Clone this repository:

git clone https://github.com/luferrer/expected_cost.git
Install the requirements:

pip install -r requirements.txt

(You can delete the psrcal line if you do not need calibration capabilities.)
Add the resulting top directory to your PYTHONPATH. In bash this would be:

export PYTHONPATH=ROOT_DIR/expected_cost:$PYTHONPATH

where ROOT_DIR is the absolute path (or the path relative to the directory containing the scripts or notebooks you want to run) of the directory in which you ran the clone command above.

You can now run any notebook in the notebooks directory.
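Before opening a notebook, a quick check like the one below can confirm that Python can locate the packages. It is a small sketch that assumes the import names match the pip names (`expected_cost` and `psrcal`); the helper `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


# psrcal is optional: it is only needed for the calibration notebooks.
for name in ("expected_cost", "psrcal"):
    status = "found" if is_installed(name) else "not found"
    print(f"{name}: {status}")
```

If `expected_cost` is reported as not found after cloning, the PYTHONPATH setting is the first thing to re-check.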

Release history

- 1.0 (this version): Jan 15, 2025
- 0.0.6: Apr 17, 2024
- 0.0.5: Sep 1, 2023
- 0.0.4: Aug 31, 2023
- 0.0.3: Jul 6, 2023
- 0.0.2: Jun 29, 2023
- 0.0.1: Jun 28, 2023

Download files

Source Distribution

expected_cost-1.0.tar.gz (26.6 kB)

Uploaded Jan 15, 2025 Source

Built Distribution

expected_cost-1.0-py3-none-any.whl (28.7 kB)

Uploaded Jan 15, 2025 Python 3

File details

Details for the file expected_cost-1.0.tar.gz.

File metadata

Download URL: expected_cost-1.0.tar.gz
Upload date: Jan 15, 2025
Size: 26.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.13

File hashes

Hashes for expected_cost-1.0.tar.gz
Algorithm	Hash digest
SHA256	`4be2c301d41f7cf21049451fdd91bb837c083caff8fb69ac5e437f34d28cf0c5`
MD5	`22277dcc4a20304e72750393cd4b8067`
BLAKE2b-256	`5d962f533258f3693daacb4546dc218519f21706209d9a955e03320c4dc30713`
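A downloaded file can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. This is a minimal sketch; the helper names and the local file path in the usage comment are assumptions.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage, assuming the source distribution was saved in the current directory:
# matches_sha256(
#     "expected_cost-1.0.tar.gz",
#     "4be2c301d41f7cf21049451fdd91bb837c083caff8fb69ac5e437f34d28cf0c5",
# )
```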

File details

Details for the file expected_cost-1.0-py3-none-any.whl.

File metadata

Download URL: expected_cost-1.0-py3-none-any.whl
Upload date: Jan 15, 2025
Size: 28.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.13

File hashes

Hashes for expected_cost-1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8753a6df696cbe323bd8d18ae310a3f63f85797b6a41aee58ed28f1b4464ea35`
MD5	`3dfb5724d181f24ca1cd405a86f033b6`
BLAKE2b-256	`68c2bcbc654b584f2bc1e6c80fa41caa1319262c22f61718a487b68f691f5b4a`
