PATE: Proximity-Aware Time series anomaly Evaluation metric

This repository contains the code for PATE (Proximity-Aware Time series anomaly Evaluation measure), a novel evaluation metric for assessing anomaly detection in time series data. PATE introduces proximity-based weighting and computes a weighted version of the area under the Precision-Recall curve, offering a more accurate and fair evaluation by considering the temporal relationship between predicted and actual anomalies. The methodology is detailed in our paper, showcasing its effectiveness through experiments with both synthetic and real-world datasets.

Installation Instructions

Set Up the Environment

To use PATE, start by creating and activating a new Conda environment using the following commands:

conda create --name pate_env python=3.8
conda activate pate_env

Install Dependencies

Install the required Python packages via:

pip install -r base_requirements.txt

How to use PATE?

Using PATE is straightforward. Navigate to the PATE code directory and import the PATE module in your Python script:

from PATE_metric import PATE

Prepare your inputs as an array of anomaly scores (continuous or binary) and an array of binary labels. PATE's parameters are configurable, and the binary_scores flag switches between PATE and PATE-F1 evaluation. Refer to the main code documentation for the full list of configurable options.

Example usage of PATE and PATE-F1:

pate = PATE(labels, anomaly_scores, binary_scores = False)
pate_f1 = PATE(labels, binary_anomaly_scores, binary_scores = True)
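
As a minimal sketch of preparing these inputs, the snippet below builds a toy label sequence with continuous scores and thresholds the scores to obtain the binary variant used with binary_scores = True. The toy data, the 0.5 threshold, and the helper name binarize_scores are illustrative assumptions and are not part of PATE; only the two PATE(...) calls above come from this README.

```python
# Toy inputs (illustrative only): 10 time steps, anomaly at indices 4-6.
labels = [0, 0, 0, 0, 1, 1, 1, 0, 0, 0]
anomaly_scores = [0.1, 0.2, 0.1, 0.6, 0.9, 0.8, 0.4, 0.3, 0.1, 0.0]


def binarize_scores(scores, threshold=0.5):
    """Hypothetical helper: map continuous scores to 0/1 predictions."""
    return [1 if s >= threshold else 0 for s in scores]


binary_anomaly_scores = binarize_scores(anomaly_scores)
print(binary_anomaly_scores)

# Scores and labels must be aligned, one value per time step.
assert len(labels) == len(anomaly_scores) == len(binary_anomaly_scores)
```

Since the README asks for arrays, converting these lists with numpy.asarray before passing them to PATE is likely needed.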

Conducting Experiments

With Synthetic Data

To run experiments on synthetic data, navigate to the experiments/Synthetic_Data_Experiments directory and execute the main Python script. This script allows for the modification of various scenarios, comparing PATE and PATE-F1 against other established metrics.

cd experiments/Synthetic_Data_Experiments
python main_synthetic_data.py

Example of using PATE on synthetic data (binary detector):

from utils_Synthetic_exp import evaluate_all_metrics, synthetic_generator


label_anomaly_ranges = [[40, 59]]  # You can select multiple anomaly ranges. Here we use a single range of 20 points (A_k).
predicted_ranges = [[30, 49]]  # You can select multiple prediction ranges. Here we use the same range as Scenario 2 in the original paper.
vus_zone_size = e_buffer = d_buffer = 20 

experiment_results = synthetic_generator(label_anomaly_ranges, predicted_ranges, vus_zone_size, e_buffer, d_buffer)
predicted_array = experiment_results["predicted_array"]
label_array = experiment_results["label_array"]


score_list_simple = evaluate_all_metrics(predicted_array, label_array, vus_zone_size, e_buffer, d_buffer)
print(score_list_simple)

Output:

{'original_F1Score': 0.5,
'pa_precision': 0.67,
'pa_recall': 1.0,
'pa_f_score': 0.8,
'Rbased_precision': 0.6,
'Rbased_recall': 0.6,
'Rbased_f1score': 0.6,
'eTaPR_precision': 0.75,
'eTaPR_recall': 0.75,
'eTaPR_f1_score': 0.75,
'Affiliation precision': 0.97,
'Affiliation recall': 0.99,
'Affliation F1score': 0.98,
'VUS_ROC': 0.79,
'VUS_PR': 0.72,
'AUC': 0.74,
'AUC_PR': 0.51,

'PATE': 0.76,
'PATE-F1': 0.75}
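
To make the first few numbers in this output easier to interpret, here is a hand-rolled sketch that recomputes the point-wise F1 and the point-adjusted (PA) precision, recall, and F1 for the same label and prediction ranges. It does not use the repository's implementation: the helper names and the series length of 100 are assumptions chosen for illustration, and the ranges are treated as inclusive because [40, 59] is described as 20 points.

```python
def ranges_to_binary(ranges, length):
    """Expand inclusive [start, end] index ranges into a 0/1 list."""
    out = [0] * length
    for start, end in ranges:
        for i in range(start, end + 1):
            out[i] = 1
    return out


def precision_recall_f1(pred, label):
    """Point-wise precision, recall and F1 from two aligned 0/1 lists."""
    tp = sum(1 for p, l in zip(pred, label) if p and l)
    fp = sum(1 for p, l in zip(pred, label) if p and not l)
    fn = sum(1 for p, l in zip(pred, label) if not p and l)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def point_adjust(pred, label_ranges):
    """Mark a whole labelled range as detected if any point in it is predicted."""
    adjusted = list(pred)
    for start, end in label_ranges:
        if any(pred[start:end + 1]):
            for i in range(start, end + 1):
                adjusted[i] = 1
    return adjusted


SERIES_LENGTH = 100  # assumed; the repository's generator may use another length
label_ranges = [[40, 59]]
pred_ranges = [[30, 49]]

label = ranges_to_binary(label_ranges, SERIES_LENGTH)
pred = ranges_to_binary(pred_ranges, SERIES_LENGTH)

plain = precision_recall_f1(pred, label)
adjusted = precision_recall_f1(point_adjust(pred, label_ranges), label)

print([round(v, 2) for v in plain])     # [0.5, 0.5, 0.5]
print([round(v, 2) for v in adjusted])  # [0.67, 1.0, 0.8]
```

The prediction overlaps half of the anomaly, so the point-wise F1 is 0.5, while point adjustment credits the whole anomaly and lifts recall to 1.0. These values agree with original_F1Score and the pa_* entries above.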

With Real-World Data

For real-world data experiments, ensure all additional required packages are installed.

pip install -r Real_exp_requirements.txt

Download the Dataset

The datasets for these experiments can be downloaded from the following link:

Dataset Link: https://www.thedatum.org/datasets/TSB-UAD-Public.zip

Ref: This dataset is made available through the GitHub page of the project "An End-to-End Benchmark Suite for Univariate Time-Series Anomaly Detection (TSB-UAD)": https://github.com/TheDatumOrg/TSB-UAD

Running the Experiments

After downloading, place the unzipped dataset in the same directory. If you store the data in a different location, ensure you update the directory paths in the code to match.
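
Before running an example script, it can help to confirm that the dataset folder is where the code expects it. The check below is only a sketch: the folder name TSB-UAD-Public is an assumption based on the archive name, and find_dataset_dir is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def find_dataset_dir(base_dir, folder_name="TSB-UAD-Public"):
    """Return the dataset folder under base_dir, or None if it is missing.

    folder_name is an assumption derived from the archive name.
    """
    candidate = Path(base_dir) / folder_name
    return candidate if candidate.is_dir() else None


dataset_dir = find_dataset_dir(".")
if dataset_dir is None:
    print("Dataset folder not found; update the directory paths in the code.")
else:
    print(f"Using dataset at {dataset_dir.resolve()}")
```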

Navigate to the experiments/RealWorld_Data_Experiments directory to run an experiment. Execute one of the example Python scripts by entering the following command:

cd experiments/RealWorld_Data_Experiments
python Example1.py

Two different examples are provided. These examples allow for modifications and customizations, enabling detailed exploration of various data aspects.


Setting Buffer Size in PATE

Evaluating time series anomaly detectors fairly requires choosing a buffer size. The buffer parameter of PATE can be set using one of the following strategies:

  • Expert Knowledge: Best suited for customized, specific, and real-world applications where expert knowledge is available, or when one has enough experience with the data at hand. Experts can directly specify buffer sizes that are optimized for the particular use case.

  • ACF Analysis: Automatically determines the optimal buffer size by analyzing the autocorrelation within the data. This function is available in PATE_utils.py.

  • Range of Buffer Sizes: PATE is flexible and can evaluate performance across all combinations of pre and post buffer sizes, allowing for a comprehensive assessment without expert input. One can start with a maximum buffer size, and PATE automatically divides it into a specified number of ranges (determined by the user).

  • Default Setting: Utilizes the input window size of the anomaly detector, a standard, practical buffer size that aligns with the general scale of the data being analyzed. This option is useful when no specific adjustments are needed or when minimal configuration is desired.
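
To illustrate the idea behind the ACF strategy, the sketch below estimates a buffer size as the first lag at which the sample autocorrelation drops to a threshold. This is an assumed heuristic written for illustration; it is not the function shipped in PATE_utils.py, and the names autocorrelation and acf_buffer_size, the threshold of 0.0, and the sine-wave test series are all hypothetical.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of a sequence at a non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return cov / var


def acf_buffer_size(series, threshold=0.0, max_lag=None):
    """Hypothetical heuristic: first lag whose autocorrelation is <= threshold."""
    if max_lag is None:
        max_lag = len(series) // 2
    for lag in range(1, max_lag + 1):
        if autocorrelation(series, lag) <= threshold:
            return lag
    return max_lag  # no lag met the threshold; fall back to the search limit


# Sine wave with a period of 40 samples: the autocorrelation decays to zero
# after roughly a quarter of a period.
series = [math.sin(2 * math.pi * t / 40) for t in range(400)]
print(acf_buffer_size(series))
```

A stricter threshold gives a larger buffer, so the value is best treated as a starting point rather than a definitive choice.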

This guidance ensures that you can effectively implement these buffer size selection strategies in PATE for optimal results.
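
The range-of-buffer-sizes strategy can be pictured as a grid of pre and post buffer pairs. The sketch below assumes the maximum buffer is split into evenly spaced sizes and that every pair is evaluated; how PATE actually divides the maximum is not specified here, so the spacing rule and the helper names buffer_sizes and buffer_grid are assumptions.

```python
from itertools import product


def buffer_sizes(max_buffer, num_splits):
    """Evenly spaced sizes from max_buffer / num_splits up to max_buffer."""
    return [round(max_buffer * k / num_splits) for k in range(1, num_splits + 1)]


def buffer_grid(max_buffer, num_splits):
    """All (pre_buffer, post_buffer) pairs built from the candidate sizes."""
    sizes = buffer_sizes(max_buffer, num_splits)
    return list(product(sizes, repeat=2))


print(buffer_sizes(20, 4))      # [5, 10, 15, 20]
print(len(buffer_grid(20, 4)))  # 16
```

Averaging a metric over such a grid reduces the dependence of the result on any single buffer choice.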
