faster-eTaPR

Faster implementation (~200x) of the enhanced time-aware precision and recall (eTaPR) from Hwang et al. The original implementation is saurf4ng/eTaPR and this implementation is fully tested against it.

Motivation

The motivation behind eTaPR is that it is enough for a detection method to partially detect an anomaly segment, as long as a human expert can find the anomaly around this prediction. The following illustration (a recreation from the paper) highlights the four cases which are considered by eTaPR:

1. A successful detection: A human expert can likely find the anomaly \(A_1\) based on the prediction \(P_1\).
2. A failed detection: Only a small portion of the prediction \(P_2\) overlaps with the anomaly \(A_2\).
3. A failed detection: Most of the prediction \(P_3\) lies in the range of non-anomalous behavior (the prediction starts too early). A human expert will likely regard the prediction \(P_3\) as incorrect or as a false alarm. The prediction \(P_3\) is too imprecise and the anomaly \(A_3\) is likely to be missed.
4. A failed detection: The prediction \(P_4\) mostly overlaps with the anomaly \(A_4\), but covers only a small portion of the actual anomaly segment. Thus, a human expert is likely to dismiss the prediction \(P_4\) as incorrect because the full extent of the anomaly remains hidden. The prediction \(P_4\) contains insufficient information about the anomaly.

Note that for case 4, we could still mark the anomaly as detected if there were more predictions that overlap with the anomaly \(A_4\). Specifically, the handling of cases 3 and 4 is what sets eTaPR apart from other scoring methods.
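To make the cases more concrete, here is a deliberately simplified sketch that looks at a single prediction and a single anomaly and computes the two overlap ratios the cases talk about: how much of the prediction lies inside the anomaly, and how much of the anomaly the prediction covers. This is not the eTaPR calculation itself (see the documentation for that). The helper names, the half-open (start, end) ranges, and the two threshold constants are assumptions made only for this illustration.

```python
# Hypothetical thresholds, chosen only for this illustration.
MIN_PRED_SHARE = 0.5      # share of the prediction that must lie inside the anomaly
MIN_ANOMALY_SHARE = 0.1   # share of the anomaly that must be covered


def overlap_ratios(pred, anomaly):
    """Return (share of prediction inside anomaly, share of anomaly covered).

    Both arguments are half-open index ranges (start, end).
    """
    overlap = max(0, min(pred[1], anomaly[1]) - max(pred[0], anomaly[0]))
    return overlap / (pred[1] - pred[0]), overlap / (anomaly[1] - anomaly[0])


def looks_detected(pred, anomaly):
    """Simplified check: both overlap ratios must reach their threshold."""
    pred_share, anomaly_share = overlap_ratios(pred, anomaly)
    return pred_share >= MIN_PRED_SHARE and anomaly_share >= MIN_ANOMALY_SHARE


# Case 1: the prediction sits inside the anomaly and covers most of it.
print(looks_detected(pred=(12, 20), anomaly=(10, 20)))  # True
# Case 3: the prediction starts too early, most of it is non-anomalous.
print(looks_detected(pred=(0, 12), anomaly=(10, 20)))   # False
# Case 4: the prediction is precise but covers only a sliver of the anomaly.
print(looks_detected(pred=(10, 11), anomaly=(10, 40)))  # False
```

The sketch ignores the situation mentioned for case 4, where several predictions together cover enough of one anomaly.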

If you want an in-depth explanation of the calculation, check out the documentation.

Getting Started

Install this package from PyPI using pip or uv:

pip install faster-etapr

uv pip install faster-etapr

Now you can run your evaluation in Python:

import faster_etapr
faster_etapr.evaluate_from_preds(
    y_hat=[0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0],
    y=    [0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1],
    theta_p=0.5,
    theta_r=0.1,
)
{
    'eta/recall': 0.3875,
    'eta/recall_detection': 0.5,
    'eta/recall_portion': 0.275,
    'eta/detected_anomalies': 2.0,
    'eta/precision': 0.46476766302377037,
    'eta/precision_detection': 0.46476766302377037,
    'eta/precision_portion': 0.46476766302377037,
    'eta/correct_predictions': 2.0,
    'eta/f1': 0.4226312395393011,
    'eta/TP': 4,
    'eta/FP': 5,
    'eta/FN': 7,
    'eta/wrong_predictions': 2,
    'eta/missed_anomalies': 2,
    'eta/anomalies': 4,
    'eta/segments': 0.499999999999875,
    'point/recall': 0.45454545454541323,
    'point/precision': 0.5555555555554939,
    'point/f1': 0.49999999999945494,
    'point/TP': 5,
    'point/FP': 4,
    'point/FN': 6,
    'point/anomalies': 4,
    'point/detected_anomalies': 3.0,
    'point/segments': 0.75,
    'point_adjust/recall': 0.9090909090909091,
    'point_adjust/precision': 0.7142857142857143,
    'point_adjust/f1': 0.7999999999995071
}

We calculate three types of metrics:

the enhanced time-aware (eTa) metrics under eta/
the (traditional) point-wise metrics under point/
the point-adjusted metrics under point_adjust/
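For orientation, the point-wise and point-adjusted counts for the example above can be reproduced by hand. The following is a minimal sketch in plain Python, not the package's implementation; the helper names segments, point_counts and point_adjust are made up for this illustration. It assumes the usual point-adjust convention: if any point of an anomaly segment is predicted, the whole segment counts as predicted.

```python
def segments(labels):
    """Return half-open (start, end) ranges of consecutive 1s."""
    result, start = [], None
    for i, value in enumerate(labels):
        if value and start is None:
            start = i
        elif not value and start is not None:
            result.append((start, i))
            start = None
    if start is not None:
        result.append((start, len(labels)))
    return result


def point_counts(y_hat, y):
    """Return point-wise (TP, FP, FN)."""
    tp = sum(1 for p, t in zip(y_hat, y) if p and t)
    fp = sum(1 for p, t in zip(y_hat, y) if p and not t)
    fn = sum(1 for p, t in zip(y_hat, y) if not p and t)
    return tp, fp, fn


def point_adjust(y_hat, y):
    """Mark every anomaly segment that contains a predicted point as fully predicted."""
    adjusted = list(y_hat)
    for start, end in segments(y):
        if any(y_hat[start:end]):
            adjusted[start:end] = [1] * (end - start)
    return adjusted


y_hat = [0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0]
y = [0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1]

print(len(segments(y)))                          # 4 anomalies
print(point_counts(y_hat, y))                    # (5, 4, 6)
print(point_counts(point_adjust(y_hat, y), y))   # (10, 4, 1)
```

With precision = TP / (TP + FP) and recall = TP / (TP + FN), these counts give 5/9 and 5/11 for point/ and 10/14 and 10/11 for point_adjust/, in line with the output above.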

Benchmark

A little benchmark with randomly generated inputs (np.random.randint(0, 2, size=size)):

size	eTaPR_pkg	faster_etapr	factor
1 000	0.4090	0.0032	~125x
10 000	35.8264	0.1810	~198x
20 000	148.2670	0.6547	~226x
100 000	too long	55.04712	n/a
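A comparable measurement can be sketched as follows. This is not the script behind the table above; time_metric is a hypothetical helper, and it draws the random labels with the standard library instead of np.random.randint so that the snippet has no dependencies.

```python
import random
import time


def time_metric(metric_fn, size, seed=0):
    """Time one call of metric_fn(y_hat, y) on random 0/1 labels, in seconds."""
    rng = random.Random(seed)
    y_hat = [rng.randint(0, 1) for _ in range(size)]
    y = [rng.randint(0, 1) for _ in range(size)]
    start = time.perf_counter()
    metric_fn(y_hat, y)
    return time.perf_counter() - start


# Example usage (requires faster-etapr to be installed):
#
# import faster_etapr
# for size in (1_000, 10_000, 20_000):
#     seconds = time_metric(
#         lambda y_hat, y: faster_etapr.evaluate_from_preds(
#             y_hat=y_hat, y=y, theta_p=0.5, theta_r=0.1
#         ),
#         size=size,
#     )
#     print(size, seconds)
```

A single call per size is noisy; repeating each measurement a few times and taking the minimum usually gives more stable numbers.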

Citation

If you use eTaPR, please cite the original author/paper:

@inproceedings{10.1145/3477314.3507024,
author = {Hwang, Won-Seok and Yun, Jeong-Han and Kim, Jonguk and Min, Byung Gil},
title = {"Do You Know Existing Accuracy Metrics Overrate Time-Series Anomaly Detections?"},
year = {2022},
isbn = {9781450387132},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3477314.3507024},
doi = {10.1145/3477314.3507024},
booktitle = {Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing},
pages = {403–412},
numpages = {10},
keywords = {accuracy metric, anomaly detection, precision, recall, time-series},
location = {Virtual Event},
series = {SAC '22}
}

Release history

0.1.2

Apr 8, 2024

0.1.1

Mar 26, 2024

0.1.0

Mar 26, 2024

Download files

Source Distribution

faster_etapr-0.1.2.tar.gz (240.3 kB)

Uploaded Apr 8, 2024 Source

Built Distribution

faster_etapr-0.1.2-py3-none-any.whl (12.5 kB)

Uploaded Apr 8, 2024 Python 3

File details

Details for the file faster_etapr-0.1.2.tar.gz.

File metadata

Download URL: faster_etapr-0.1.2.tar.gz
Upload date: Apr 8, 2024
Size: 240.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/5.0.0 CPython/3.12.2

File hashes

Hashes for faster_etapr-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`3567fb8a65a417ab317e4d186d75908841ef8f8d20dd558948b9c88cf04298c5`
MD5	`f07c9b801eac6d2a157520e3c452c0f8`
BLAKE2b-256	`dc91264b1f1944959c4d64794d4800d5f469ddbfd03e59263ed819bc0a8b8e27`

File details

Details for the file faster_etapr-0.1.2-py3-none-any.whl.

File metadata

Download URL: faster_etapr-0.1.2-py3-none-any.whl
Upload date: Apr 8, 2024
Size: 12.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/5.0.0 CPython/3.12.2

File hashes

Hashes for faster_etapr-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`933b47796f13a975e680e5d9e76fcccfd9abe37ad8d59fcbedf75dd48f9cdc7c`
MD5	`7c384ebc4350ab072b6344e13352d951`
BLAKE2b-256	`5221e426a31ff3f39940e7e0dfa9501ead288c87345bfac0cead8ddb89d068d1`