Find a threshold for fine-tuning the output of predict_proba
Thresher - THRESHold EvaluatoR
A Python tool for finding a threshold that maximizes the accuracy of predict_proba-like outputs with respect to the ground truth.
The method of interest for the user is optimize_threshold(scores, actual_classes), which is available from the Thresher class. For the given scores and actual classes, it returns the threshold that yields the highest fraction of correctly classified samples.
optimize_threshold parameters:

scores: list
    The list of scores.

actual_classes: list
    The list of ground-truth (correct) classes. Classes are represented as -1 and 1.

Returns:

threshold: float
    The threshold value that yields the highest fraction of correctly classified samples. If multiple thresholds give the optimal fraction, any one of them may be returned.
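For intuition, here is a minimal brute-force sketch of the kind of search optimize_threshold performs. This is an illustrative reimplementation, not Thresher's actual algorithm; the name find_best_threshold is hypothetical, and it assumes scores at or above the threshold are predicted as class 1.

def find_best_threshold(scores, actual_classes):
    # Illustrative brute force: try every observed score as a candidate
    # threshold and keep the one with the highest accuracy.
    # Assumption: scores >= threshold map to class 1, the rest to class -1.
    best_threshold, best_accuracy = 0.0, -1.0
    for candidate in sorted(set(scores)):
        predicted = [1 if s >= candidate else -1 for s in scores]
        accuracy = sum(p == a for p, a in zip(predicted, actual_classes)) / len(scores)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = candidate, accuracy
    return best_threshold

For example, with scores [0.1, 0.3, 0.4, 0.7] and classes [-1, -1, 1, 1], this sketch would return 0.4, which classifies all four samples correctly.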
How to set up?
The process is rather straightforward: you just need to decide whether to install from the sources (latest revision) or from the PyPI repository (stable release).
Requirements
Tested with Python 3.7+ on a standard Unix environment.
Installation
Installation from source:
pip install git+https://github.com/oskar-j/thresher.git
Stable release using the pip tool:
pip install thresher-py
Sample usage
import thresher

t = thresher.Thresher()  # create a Thresher instance with default settings

print('Currently supported algorithms:')
print(t.get_supported_algorithms())

# Scores (e.g. predict_proba-like outputs) and their ground-truth labels (-1 or 1)
cases = [0.1, 0.3, 0.4, 0.7]
actual_labels = [-1, -1, 1, 1]

print(f'Optimization result: {t.optimize_threshold(cases, actual_labels)}')
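The returned threshold can then be used to turn new scores into -1/1 labels. A minimal follow-up sketch, assuming that scores at or above the threshold map to class 1 (new_scores is made-up data, not part of the Thresher API):

threshold = t.optimize_threshold(cases, actual_labels)

# Hypothetical new scores to classify with the tuned threshold
new_scores = [0.2, 0.5, 0.9]
predicted = [1 if s >= threshold else -1 for s in new_scores]
print(f'Threshold: {threshold}, predictions: {predicted}')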
Future work
- adding more algorithms
- publishing on conda
- more extensive load testing
- Python docs
- CI pipeline for automated tests