A package for computing informedness, markedness and phi score

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

metrify

Installation

To install metrify, simply run `pip install metrify`.

Purpose

The purpose of metrify is to add a few simple utilities on top of scikit-learn. In particular, it introduces the informedness, markedness and $\phi_\beta$ scores, along with some utilities for more traditional metrics (e.g. $F_\beta$). The motivation for these additions stems from the following observations.

The main issue with precision, recall and F-score is that none of them uses True Negatives (TNs) to assess model performance. Whilst we are usually interested in the positive class, getting the negative class right is a sanity check for the model. For example, with a dataset that is highly imbalanced towards the negative class, we definitely want the negative class to be predicted reliably, while our main job remains, in theory, to ensure that the model performs well on the minority, i.e. the positive, class.

In practice, suppose we have a model whose predictions yield the following True Positives, False Positives and False Negatives: $$TP = 50, FP = 1000, FN = 10$$ We can compute e.g. the F-score from this information, but that metric completely ignores whether we had, for example, $TN = 100000$ or $TN = 100$, which makes a very big difference when assessing the model. This signals that we are not capturing all the information about the model. Whilst the model is clearly unsatisfactory in either case, given the abundance of $FP$s, the latter situation is much more worrisome: although the negative class is the majority, we get it almost entirely wrong by predicting the positive class nearly every time. This likely indicates an inherent issue in the model's construction, e.g. a decision threshold set too low.
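As a quick check of the point above, the following sketch computes precision, recall and $F_1$ from the counts in the example. The helper name `precision_recall_fbeta` is made up for this illustration and is not part of metrify. Note that $TN$ does not appear anywhere in it, so the result is the same whether $TN = 100000$ or $TN = 100$.

```python
def precision_recall_fbeta(tp, fp, fn, beta=1.0):
    """Hypothetical helper: precision, recall and F-beta from raw counts.

    There is deliberately no `tn` argument: none of these three metrics
    depends on the number of true negatives.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_beta


# Counts from the example in the text
tp, fp, fn = 50, 1000, 10

for tn in (100_000, 100):
    # tn is only printed; it cannot influence the three metrics
    p, r, f1 = precision_recall_fbeta(tp, fp, fn)
    print(f"TN={tn}: precision={p:.4f}, recall={r:.4f}, F1={f1:.4f}")
```

Both iterations print identical metric values, which is exactly the blind spot described above.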

To make sure our metrics take this case into account, we need an alternative to precision and recall. One option is to use markedness $M$ and informedness $I$, defined as: $$M = \frac{TP}{TP + FP} - \frac{FN}{TN+FN} \qquad I = \frac{TP}{TP + FN} - \frac{FP}{TN+FP}$$ Markedness plays the role of precision and is informative about the role of $FP$s in our model. Conversely, informedness plays the role of recall and gauges the importance of $FN$s. In either case, however, the $TN$s enter the formula, and a lack of them brings the metric down. Both metrics are bounded between -1 and +1, with -1 corresponding to the worst scenario (no $TP$ and no $TN$).

It is also possible to combine them into something reminiscent of the F-score, which I call the $\phi$-score. Like the F-score, it can be weighted by a real parameter $\beta$ to give more or less importance to either $M$ or $I$: $$\phi_\beta = (1 + \beta^2) \frac{I \cdot M}{\beta^2 M + I}$$
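To see how these formulas react to the number of true negatives, here is a minimal sketch that evaluates them directly from confusion-matrix counts. The `*_from_counts` helpers are hypothetical names used only for this illustration (metrify's own functions take arrays of labels, not counts), and the sketch does not guard against zero denominators.

```python
def markedness_from_counts(tp, fp, fn, tn):
    """M = TP / (TP + FP) - FN / (TN + FN)."""
    return tp / (tp + fp) - fn / (tn + fn)


def informedness_from_counts(tp, fp, fn, tn):
    """I = TP / (TP + FN) - FP / (TN + FP)."""
    return tp / (tp + fn) - fp / (tn + fp)


def phi_beta_from_counts(tp, fp, fn, tn, beta=1.0):
    """phi_beta = (1 + beta^2) * I * M / (beta^2 * M + I)."""
    m = markedness_from_counts(tp, fp, fn, tn)
    i = informedness_from_counts(tp, fp, fn, tn)
    b2 = beta ** 2
    # No protection against a zero denominator in this sketch
    return (1 + b2) * i * m / (b2 * m + i)


# Same TP, FP and FN as in the text; only TN changes
tp, fp, fn = 50, 1000, 10

for tn in (100_000, 100):
    m = markedness_from_counts(tp, fp, fn, tn)
    i = informedness_from_counts(tp, fp, fn, tn)
    phi = phi_beta_from_counts(tp, fp, fn, tn, beta=1.0)
    print(f"TN={tn}: M={m:.4f}, I={i:.4f}, phi_1={phi:.4f}")
```

With $TN = 100000$ both $M$ and $I$ are positive, whereas with $TN = 100$ both turn negative, so the two situations are no longer indistinguishable.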

Usage

Informedness, Markedness, $\phi$

Currently, metrify only supports binary classification problems. A sample usage follows:

import numpy as np
from metrify import informedness, markedness, phi_beta

# Define a random numpy array of 100 true values
t = np.random.randint(0, 2, 100)

# Define a random numpy array of 100 predicted values
p = np.random.randint(0, 2, 100)

# Compute the metrics
i = informedness(t, p)
m = markedness(t, p)
phi_2 = phi_beta(t, p, beta=2)

$F_\beta$

Given a set of binary ground truths and predictions as probabilities, find the best $F_\beta$ and corresponding threshold:

import numpy as np
from metrify import find_best_fbeta

# Define a random numpy array of 100 true values
t = np.random.randint(0, 2, 100)

# Define a random numpy array of 100 predicted probabilities
p = np.random.rand(100)

# Get the best F0.5 and the corresponding threshold
f_beta, threshold = find_best_fbeta(t, p, beta=0.5)
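How `find_best_fbeta` searches for the threshold is not described here. As a rough illustration of the idea only, one can sweep every distinct predicted probability as a candidate threshold and keep the one that maximizes $F_\beta$. The function `best_fbeta_threshold` below is a hypothetical NumPy-only sketch, not metrify's implementation, and it assumes a sample is predicted positive when its probability is greater than or equal to the threshold.

```python
import numpy as np


def best_fbeta_threshold(y_true, y_prob, beta=1.0):
    """Hypothetical sketch: brute-force search for the F-beta-optimal threshold."""
    y_true = np.asarray(y_true).astype(bool)
    y_prob = np.asarray(y_prob, dtype=float)
    b2 = beta ** 2

    best_score, best_threshold = -1.0, None
    # Every distinct probability is a candidate threshold
    for thr in np.unique(y_prob):
        pred = y_prob >= thr  # assumption: ">=" means "predict positive"
        tp = np.sum(pred & y_true)
        fp = np.sum(pred & ~y_true)
        fn = np.sum(~pred & y_true)
        # F-beta written in terms of counts; the denominator is never zero
        # because at least one sample is predicted positive at each candidate
        score = (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)
        if score > best_score:
            best_score, best_threshold = float(score), float(thr)
    return best_score, best_threshold


# Tiny hand-checkable example
score, thr = best_fbeta_threshold([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8], beta=1.0)
print(score, thr)
```

The loop is quadratic in the worst case, which is fine for small arrays; a sorted, cumulative-sum approach would scale better for large ones.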

New Versions

To create and deploy a new version, the recipe is:

1. Modify the code as needed;
2. Update the version, and possibly the dependencies, in `pyproject.toml`;
3. Build the package: `python -m build`;
4. Upload the package: `python -m twine upload dist/metrify-<VERSION>*`.

Release history

- 0.1.3.2: Apr 19, 2024
- 0.1.3.1: Nov 13, 2023 (this version)
- 0.1.3: Nov 13, 2023
- 0.1.2: Oct 26, 2023
- 0.1.1: Oct 26, 2023
- 0.1.0: Oct 26, 2023

Download files

Source Distribution

metrify-0.1.3.1.tar.gz (5.8 kB)

Uploaded Nov 13, 2023 Source

Built Distribution

metrify-0.1.3.1-py3-none-any.whl (5.0 kB)

Uploaded Nov 13, 2023 Python 3

File details

Details for the file metrify-0.1.3.1.tar.gz.

File metadata

Download URL: metrify-0.1.3.1.tar.gz
Upload date: Nov 13, 2023
Size: 5.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for metrify-0.1.3.1.tar.gz
Algorithm	Hash digest
SHA256	`f653d3f2312da803fca03d1ce419f5f73dc9f024cc0008ad7c535431a7c68b82`
MD5	`92305f73e904b08d4fe7ae4319ca12c5`
BLAKE2b-256	`eb6946a6bbfc5bcfadfc824636c9066bb2731ba7ad84ccb1de6ffd6ed975751c`

File details

Details for the file metrify-0.1.3.1-py3-none-any.whl.

File metadata

Download URL: metrify-0.1.3.1-py3-none-any.whl
Upload date: Nov 13, 2023
Size: 5.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for metrify-0.1.3.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`938d64096e24f3308e4ccc6ee8c135ea77160a25450eaead3eddab88e54de237`
MD5	`9811dbd4ad4005d6791a15db360ffdbd`
BLAKE2b-256	`42ffd818bdfe917fe7653308fd8ef3c1a0787f5a1fe9cf4b7abe54a06d42564a`
