PATE: Proximity-Aware Time series anomaly Evaluation metric

PATE: Proximity-Aware Time series anomaly Evaluation

ACM KDD 2024 Accepted Preprint Version

This repository contains the code for PATE (Proximity-Aware Time series anomaly Evaluation measure), a novel evaluation metric for assessing anomaly detection in time series data. PATE introduces proximity-based weighting and computes a weighted version of the area under the Precision-Recall curve, offering a more accurate and fair evaluation by considering the temporal relationship between predicted and actual anomalies. The methodology is detailed in our paper, showcasing its effectiveness through experiments with both synthetic and real-world datasets.

Quick Start

Installation

Install PATE for immediate use in your projects:

pip install PATE

How to use PATE?

Using PATE is straightforward. Begin by importing PATE in your Python script:

from pate.PATE_metric import PATE

Prepare your input as arrays of anomaly scores (continuous or binary) and binary labels. PATE exposes several configurable parameters, and the binary_scores flag switches between PATE and PATE-F1 evaluation. Please refer to the main code documentation for a full list of configurable options.

Example usage of PATE and PATE-F1:

pate = PATE(labels, anomaly_scores, binary_scores=False)
pate_f1 = PATE(labels, binary_anomaly_scores, binary_scores=True)
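If your detector only produces continuous scores, the binary input for PATE-F1 has to be derived from them first. The following is a minimal sketch of one way to do that with a fixed threshold. The helper name `binarize_scores`, the sample data, and the threshold value of 0.5 are assumptions made for illustration; they are not part of the PATE package.

```python
import numpy as np


def binarize_scores(scores, threshold):
    """Return 0/1 predictions: 1 where the score is at or above the threshold."""
    scores = np.asarray(scores, dtype=float)
    return (scores >= threshold).astype(int)


# Illustrative data only (not taken from the paper).
labels = np.array([0, 0, 1, 1, 0, 0])
anomaly_scores = np.array([0.05, 0.40, 0.90, 0.70, 0.30, 0.10])

# Hypothetical threshold; pick one that suits your detector and data.
binary_anomaly_scores = binarize_scores(anomaly_scores, threshold=0.5)
print(binary_anomaly_scores)  # [0 0 1 1 0 0]

# The arrays can then be passed on as in the usage lines above, e.g.
# pate_f1 = PATE(labels, binary_anomaly_scores, binary_scores=True)
```

The threshold strongly affects any F1-style score, so it is worth reporting alongside the result.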

Basic Example

import numpy as np
from pate.PATE_metric import PATE

# Example data setup
labels = np.array([0, 1, 0, 1, 0])
scores = np.array([0.1, 0.8, 0.1, 0.9, 0.2])

# Compute the PATE score from continuous anomaly scores
pate = PATE(labels, scores, binary_scores=False)
print(pate)

Advanced Setup and Experiments

For researchers interested in reproducing the experiments or exploring the evaluation metric further with various data sets:

Environment Setup

To use PATE, start by creating and activating a new Conda environment using the following commands:

conda create --name pate_env python=3.8
conda activate pate_env

Install Dependencies

Clone the repository and install the required Python packages:

git clone https://github.com/raminghorbanii/PATE
cd PATE
pip install -r synthetic_exp_requirements.txt

Conducting Experiments

with Synthetic Data

To run experiments on synthetic data, navigate to the experiments/Synthetic_Data_Experiments directory and execute the main Python script. This script allows for the modification of various scenarios, comparing PATE and PATE-F1 against other established metrics.

cd experiments/Synthetic_Data_Experiments
python main_synthetic_data.py

Example of using PATE on synthetic data (binary detector):

from utils_Synthetic_exp import evaluate_all_metrics, synthetic_generator

label_anomaly_ranges = [[40, 59]]  # Multiple anomaly ranges are allowed. Here, one range of 20 points (A_k).
predicted_ranges = [[30, 49]]  # Multiple predicted ranges are allowed. This range matches Scenario 2 in the original paper.
vus_zone_size = e_buffer = d_buffer = 20

experiment_results = synthetic_generator(label_anomaly_ranges, predicted_ranges, vus_zone_size, e_buffer, d_buffer)
predicted_array = experiment_results["predicted_array"]
label_array = experiment_results["label_array"]


score_list_simple = evaluate_all_metrics(predicted_array, label_array, vus_zone_size, e_buffer, d_buffer)
print(score_list_simple)

Output:

{'original_F1Score': 0.5,
'pa_precision': 0.67,
'pa_recall': 1.0,
'pa_f_score': 0.8,
'Rbased_precision': 0.6,
'Rbased_recall': 0.6,
'Rbased_f1score': 0.6,
'eTaPR_precision': 0.75,
'eTaPR_recall': 0.75,
'eTaPR_f1_score': 0.75,
'Affiliation precision': 0.97,
'Affiliation recall': 0.99,
'Affliation F1score': 0.98,
'VUS_ROC': 0.79,
'VUS_PR': 0.72,
'AUC': 0.74,
'AUC_PR': 0.51,

'PATE': 0.76,
'PATE-F1': 0.75}
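Two of the simplest rows in this output can be re-derived by hand, which helps when checking that the ranges are interpreted as inclusive. The sketch below rebuilds the binary arrays from the same ranges and computes the point-wise F1 and the point-adjusted scores. It is not the repository's `evaluate_all_metrics` and does not compute PATE itself; the helper names and the series length of 100 are assumptions (these particular scores do not depend on the length).

```python
def ranges_to_binary(ranges, length):
    """Turn inclusive [start, end] index ranges into a 0/1 list."""
    arr = [0] * length
    for start, end in ranges:
        for i in range(start, end + 1):
            arr[i] = 1
    return arr


def precision_recall_f1(pred, label):
    """Point-wise precision, recall, and F1 for two 0/1 sequences."""
    tp = sum(1 for p, y in zip(pred, label) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(pred, label) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(pred, label) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def point_adjust(pred, label_ranges):
    """If any point of a labelled range is flagged, flag the whole range."""
    adjusted = list(pred)
    for start, end in label_ranges:
        if any(pred[start:end + 1]):
            for i in range(start, end + 1):
                adjusted[i] = 1
    return adjusted


SERIES_LENGTH = 100  # assumed; the real generator may use another length
label_ranges = [[40, 59]]
predicted_ranges = [[30, 49]]

label = ranges_to_binary(label_ranges, SERIES_LENGTH)
pred = ranges_to_binary(predicted_ranges, SERIES_LENGTH)

p, r, f1 = precision_recall_f1(pred, label)
pa_p, pa_r, pa_f1 = precision_recall_f1(point_adjust(pred, label_ranges), label)

print(round(p, 2), round(r, 2), round(f1, 2))           # 0.5 0.5 0.5
print(round(pa_p, 2), round(pa_r, 2), round(pa_f1, 2))  # 0.67 1.0 0.8
```

Half of the predicted points overlap the anomaly, so the point-wise F1 is 0.5. Point adjustment credits the whole anomaly once any point is hit, which lifts recall to 1.0 and matches the `pa_*` rows above.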

with Real-World Data

For real-world data experiments, ensure all additional required packages are installed.

pip install -r Real_exp_requirements.txt

Download the Dataset

The datasets for these experiments can be downloaded from the following link:

Dataset Link: https://www.thedatum.org/datasets/TSB-UAD-Public.zip

Ref: This dataset is made available through the GitHub page of the project "An End-to-End Benchmark Suite for Univariate Time-Series Anomaly Detection (TSB-UAD)": https://github.com/TheDatumOrg/TSB-UAD

Running the Experiments

After downloading, place the unzipped dataset in the same directory. If you store the data in a different location, ensure you update the directory paths in the code to match.

Navigate to the experiments/RealWorld_Data_Experiments directory to run an experiment. Execute one of the example Python scripts by entering the following command:

cd experiments/RealWorld_Data_Experiments
python Example1.py

Two different examples are provided. These examples allow for modifications and customizations, enabling detailed exploration of various data aspects.


Setting Buffer Size in PATE

Given the context of time series data, selecting a buffer size for a fair evaluation of anomaly detectors' performance is unavoidable. The buffer parameter of PATE can be set using the following strategies:

  • Expert Knowledge: Best suited for customized, specific, and real-world applications where expert knowledge is available, or when one has enough experience with the data at hand. Experts can directly specify buffer sizes that are optimized for the particular use case.

  • ACF Analysis: Automatically determines the optimal buffer size by analyzing the autocorrelation within the data. This function is available in PATE_utils.py.

  • Range of Buffer Sizes: PATE is flexible and can evaluate performance across all combinations of pre and post buffer sizes, allowing for a comprehensive assessment without expert input. One can start with a maximum buffer size, and PATE automatically divides it into a specified number of ranges (determined by the user).

  • Default Setting: Utilizes the input window size of the anomaly detector, a standard, practical buffer size that aligns with the general scale of the data being analyzed. This option is useful when no specific adjustments are needed or when minimal configuration is desired.
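To make the ACF strategy more concrete, here is a minimal sketch of one common way to derive a buffer size from autocorrelation: take the first lag at which the sample autocorrelation drops below an approximate 95% significance bound of 1.96 / sqrt(n). This is not the function shipped in PATE_utils.py, whose exact rule may differ. The names `autocorrelation` and `acf_buffer_size`, the bound, and the toy sine series are all assumptions for illustration, and the code uses only the standard library.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0  # a constant series carries no correlation structure
    cov = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return cov / var


def acf_buffer_size(series, max_lag, threshold=None):
    """Smallest lag whose |ACF| falls below the threshold, capped at max_lag."""
    if threshold is None:
        # Approximate 95% bound for white noise (an assumed default).
        threshold = 1.96 / math.sqrt(len(series))
    for lag in range(1, max_lag + 1):
        if abs(autocorrelation(series, lag)) < threshold:
            return lag
    return max_lag


# Toy series: a sine wave with a period of 40 samples.
series = [math.sin(2 * math.pi * t / 40) for t in range(400)]
buffer_size = acf_buffer_size(series, max_lag=100)
print(buffer_size)
```

For this toy series the autocorrelation first becomes insignificant about a quarter period in, so the suggested buffer is on the order of 10 points.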

This guidance ensures that you can effectively implement these buffer size selection strategies in PATE for optimal results.
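The range-of-buffer-sizes strategy can be pictured as a small grid search. The sketch below splits a maximum buffer into evenly spaced sizes and enumerates every (pre, post) pair. It only illustrates the idea described above; `buffer_sizes` and `buffer_grid` are hypothetical helpers, and how PATE itself splits the range and aggregates the results is defined in its code and paper, not here.

```python
from itertools import product


def buffer_sizes(max_buffer, n_splits):
    """Evenly spaced buffer sizes up to and including max_buffer."""
    if max_buffer < 1 or n_splits < 1:
        raise ValueError("max_buffer and n_splits must be positive integers")
    step = max_buffer / n_splits
    # Round to whole time steps and drop duplicates from small ranges.
    sizes = {max(1, round(step * i)) for i in range(1, n_splits + 1)}
    return sorted(sizes)


def buffer_grid(max_buffer, n_splits):
    """All (pre_buffer, post_buffer) combinations of the candidate sizes."""
    sizes = buffer_sizes(max_buffer, n_splits)
    return list(product(sizes, sizes))


sizes = buffer_sizes(20, 4)
grid = buffer_grid(20, 4)
print(sizes)      # [5, 10, 15, 20]
print(len(grid))  # 16
print(grid[:3])   # [(5, 5), (5, 10), (5, 15)]
```

Each pair would then be used for one evaluation, so the cost grows with the square of the number of splits.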


Citation

If you use PATE in your research or in any project, we kindly request that you cite the PATE paper.
