Receiver Operating Characteristic (ROC) functions package.
ROCFunctions basic usage
This repository has the code of a Python package for Receiver Operating Characteristic (ROC) functions.
The ROC framework is used for analysis and tuning of binary classifiers, [Wk1]. (The classifiers are assumed to classify into a positive/true label or a negative/false label.)
For a computational introduction to ROC utilization (in Mathematica), see the article "Basic example of using ROC with Linear regression", [AA1].
The examples below use the package "RandomDataGenerators", [AA2].
Installation
From PyPI.org:
python3 -m pip install ROCFunctions
Usage examples
Properties
The function roc_functions retrieves package properties and ROC functions by name:
import pandas
from ROCFunctions import *
print(roc_functions("properties"))
['FunctionInterpretations', 'FunctionNames', 'Functions', 'Methods', 'Properties']
print(roc_functions("FunctionInterpretations"))
{'TPR': 'true positive rate', 'TNR': 'true negative rate', 'SPC': 'specificity', 'PPV': 'positive predictive value', 'NPV': 'negative predictive value', 'FPR': 'false positive rate', 'FDR': 'false discovery rate', 'FNR': 'false negative rate', 'ACC': 'accuracy', 'AUROC': 'area under the ROC curve', 'FOR': 'false omission rate', 'F1': 'F1 score', 'MCC': 'Matthews correlation coefficient', 'Recall': 'same as TPR', 'Precision': 'same as PPV', 'Accuracy': 'same as ACC', 'Sensitivity': 'same as TPR'}
print(roc_functions("FPR"))
<function FPR at 0x7f7612f48050>
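The interpretations above mark some names as aliases ("same as TPR", and so on). Here is a minimal sketch of resolving an alias to its canonical name from such a dictionary. The helper name canonical_name is hypothetical and not part of the package; the dictionary is a subset copied from the output above.

```python
# A subset of the "FunctionInterpretations" output shown above.
interpretations = {
    "TPR": "true positive rate",
    "PPV": "positive predictive value",
    "FPR": "false positive rate",
    "ACC": "accuracy",
    "Recall": "same as TPR",
    "Precision": "same as PPV",
    "Accuracy": "same as ACC",
    "Sensitivity": "same as TPR",
}

def canonical_name(name, table=interpretations):
    """Hypothetical helper: map an alias to the name it refers to.

    Names whose interpretation does not start with "same as " are
    returned unchanged.
    """
    prefix = "same as "
    text = table[name]
    return text[len(prefix):] if text.startswith(prefix) else name

for alias in ["Recall", "Precision", "FPR"]:
    print(alias, "->", canonical_name(alias))
```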
Single ROC record
Definition: A ROC record (ROC-dictionary, or ROC-hash, or ROC-hash-map) is an associative object that has the keys: "FalseNegative", "FalsePositive", "TrueNegative", "TruePositive".
Here is an example:
{"FalseNegative": 50, "FalsePositive": 51, "TrueNegative": 60, "TruePositive": 39}
{'FalseNegative': 50,
'FalsePositive': 51,
'TrueNegative': 60,
'TruePositive': 39}
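Here is a minimal sketch of checking that an object satisfies the definition above. The names REQUIRED_KEYS, is_roc_record, and record_total are hypothetical helpers written for illustration; they are not claimed to be part of the package.

```python
# The four keys named in the definition of a ROC record.
REQUIRED_KEYS = ("TruePositive", "FalsePositive", "TrueNegative", "FalseNegative")

def is_roc_record(obj):
    """Return True if obj is a dictionary that has all four ROC keys."""
    return isinstance(obj, dict) and all(key in obj for key in REQUIRED_KEYS)

def record_total(record):
    """Total number of classified items counted in a ROC record."""
    return sum(record[key] for key in REQUIRED_KEYS)

example = {"FalseNegative": 50, "FalsePositive": 51,
           "TrueNegative": 60, "TruePositive": 39}

print(is_roc_record(example))
print(record_total(example))
```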
Here we generate a random "dataset" with columns "Actual" and "Predicted" that have the values "true" and "false", and show its dimensions:
from RandomDataGenerators import *
dfRandomLabels = random_data_frame(200, ["Actual", "Predicted"],
                                   generators={"Actual": ["true", "false"],
                                               "Predicted": ["true", "false"]})
dfRandomLabels.shape
(200, 2)
Here is a sample of the dataset:
print(dfRandomLabels[:4])
  Actual Predicted
0  false     false
1  false     false
2  false     false
3   true     false
Here we make the corresponding ROC dictionary:
to_roc_dict('true', 'false',
            list(dfRandomLabels.Actual.values),
            list(dfRandomLabels.Predicted.values))
{'TruePositive': 52,
'FalsePositive': 48,
'TrueNegative': 50,
'FalseNegative': 50}
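Here is a minimal pure-Python sketch of the counting that produces such a ROC dictionary from two label lists. The function roc_counts is hypothetical; its argument order mirrors the to_roc_dict call above, but the package's own implementation is not shown here and may differ (for example, in how it treats labels other than the two given).

```python
def roc_counts(true_label, false_label, actual, predicted):
    """Count the four confusion outcomes for two equal-length label lists."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = {"TruePositive": 0, "FalsePositive": 0,
              "TrueNegative": 0, "FalseNegative": 0}
    for a, p in zip(actual, predicted):
        if a == true_label and p == true_label:
            counts["TruePositive"] += 1
        elif a == false_label and p == true_label:
            counts["FalsePositive"] += 1
        elif a == false_label and p == false_label:
            counts["TrueNegative"] += 1
        elif a == true_label and p == false_label:
            counts["FalseNegative"] += 1
        # Assumption: pairs with any other label are ignored.
    return counts

# Small hand-made example (not the random data used above).
actual_labels = ["true", "true", "false", "false", "true"]
predicted_labels = ["true", "false", "false", "true", "true"]
print(roc_counts("true", "false", actual_labels, predicted_labels))
```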
Multiple ROC records
Here we make a random dataset whose entries are associated with a threshold parameter that has three unique values:
dfRandomLabels2 = random_data_frame(200, ["Threshold", "Actual", "Predicted"],
                                    generators={"Threshold": [0.2, 0.4, 0.6],
                                                "Actual": ["true", "false"],
                                                "Predicted": ["true", "false"]})
Remark: Threshold parameters are typically used while tuning Machine Learning (ML) classifiers.
Here we find and print the ROC records (dictionaries) for each unique threshold value:
thresholds = list(dfRandomLabels2.Threshold.drop_duplicates())
rocGroups = {}
for x in thresholds:
    dfLocal = dfRandomLabels2[dfRandomLabels2["Threshold"] == x]
    rocGroups[x] = to_roc_dict('true', 'false',
                               list(dfLocal.Actual.values),
                               list(dfLocal.Predicted.values))
rocGroups
{0.4: {'TruePositive': 13,
'FalsePositive': 23,
'TrueNegative': 24,
'FalseNegative': 12},
0.2: {'TruePositive': 18,
'FalsePositive': 11,
'TrueNegative': 19,
'FalseNegative': 18},
0.6: {'TruePositive': 23,
'FalsePositive': 9,
'TrueNegative': 16,
'FalseNegative': 14}}
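In the example above the thresholds are random labels attached to the rows. Here is a minimal sketch of how per-threshold ROC records typically arise when tuning a classifier that outputs scores. Everything in it is an assumption made for illustration: the scores, the rule "predict true when score >= threshold", and the function name roc_records_by_threshold, which is not part of the package.

```python
def roc_records_by_threshold(scores, actual, thresholds, true_label="true"):
    """Build one ROC record per threshold from classifier scores.

    Assumed rule: an item is predicted positive when score >= threshold.
    """
    records = {}
    for t in thresholds:
        rec = {"TruePositive": 0, "FalsePositive": 0,
               "TrueNegative": 0, "FalseNegative": 0}
        for score, label in zip(scores, actual):
            predicted_positive = score >= t
            actual_positive = label == true_label
            if predicted_positive and actual_positive:
                rec["TruePositive"] += 1
            elif predicted_positive:
                rec["FalsePositive"] += 1
            elif actual_positive:
                rec["FalseNegative"] += 1
            else:
                rec["TrueNegative"] += 1
        records[t] = rec
    return records

# Hypothetical scores and labels, not taken from the datasets above.
scores = [0.1, 0.3, 0.5, 0.7, 0.9]
labels = ["false", "true", "false", "true", "true"]
for t, rec in roc_records_by_threshold(scores, labels, [0.2, 0.4, 0.6]).items():
    print(t, rec)
```

Raising the threshold can only move items from predicted positive to predicted negative, so the true positive and false positive counts never increase as the threshold grows.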
Application of ROC functions
Here we define a list of ROC functions:
funcs = ["PPV", "NPV", "TPR", "ACC", "SPC", "MCC"]
Here we apply each ROC function to each of the ROC records obtained above:
import pandas
rocRes = { k : {f: roc_functions(f)(v) for f in funcs} for (k, v) in rocGroups.items()}
print(pandas.DataFrame(rocRes))
          0.4       0.2       0.6
PPV  0.361111  0.620690  0.718750
NPV  0.666667  0.513514  0.533333
TPR  0.520000  0.500000  0.621622
ACC  0.513889  0.560606  0.629032
SPC  0.510638  0.633333  0.640000
MCC  0.030640  0.134535  0.261666
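Here is a minimal sketch of how five of these values follow from a ROC record, using the standard textbook definitions of PPV, NPV, TPR, ACC, and SPC. The lowercase function names are hypothetical and written for illustration; the package's own implementations are not shown here. MCC is left out of the sketch. The record is the one listed above for threshold 0.4.

```python
def ppv(r):
    """Positive predictive value: TP / (TP + FP)."""
    return r["TruePositive"] / (r["TruePositive"] + r["FalsePositive"])

def npv(r):
    """Negative predictive value: TN / (TN + FN)."""
    return r["TrueNegative"] / (r["TrueNegative"] + r["FalseNegative"])

def tpr(r):
    """True positive rate: TP / (TP + FN)."""
    return r["TruePositive"] / (r["TruePositive"] + r["FalseNegative"])

def spc(r):
    """Specificity: TN / (TN + FP)."""
    return r["TrueNegative"] / (r["TrueNegative"] + r["FalsePositive"])

def acc(r):
    """Accuracy: (TP + TN) / (all four counts)."""
    total = sum(r[k] for k in ("TruePositive", "FalsePositive",
                               "TrueNegative", "FalseNegative"))
    return (r["TruePositive"] + r["TrueNegative"]) / total

# ROC record for threshold 0.4, copied from the records shown earlier.
rec04 = {"TruePositive": 13, "FalsePositive": 23,
         "TrueNegative": 24, "FalseNegative": 12}

for name, f in [("PPV", ppv), ("NPV", npv), ("TPR", tpr), ("ACC", acc), ("SPC", spc)]:
    print(name, round(f(rec04), 6))
```

Rounded to six decimals, these agree with the 0.4 column of the table for the five functions covered.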
References
Articles
[Wk1] Wikipedia entry, "Receiver operating characteristic".
[AA1] Anton Antonov, "Basic example of using ROC with Linear regression", (2016), MathematicaForPrediction at WordPress.
[AA2] Anton Antonov, "Introduction to data wrangling with Raku", (2021), RakuForPrediction at WordPress.
Packages
[AAp1] Anton Antonov, ROCFunctions Mathematica package, (2016-2022), MathematicaForPrediction at GitHub/antononcube.
[AAp2] Anton Antonov, ROCFunctions R package, (2021), R-packages at GitHub/antononcube.
[AAp3] Anton Antonov, ML::ROCFunctions Raku package, (2022), GitHub/antononcube.