This package identifies outlier(s) for a given time-series dataset in simple steps. It supports day, week, month and quarter level time-series data.
Project description
PyCatcher
Outlier Detection for Time-series Data
This package identifies outlier(s) for a given time-series dataset in simple steps. It supports day, week, month and quarter level time-series data.
Installation
pip install pycatcher
DataFrame Arguments
- First column in the dataframe must be a date column ('YYYY-MM-DD') and the last column a numeric column (sum or total count for the time period) to detect outliers using Seasonal Decomposition algorithms.
- Last column must be a numeric column to detect outliers using Moving Average and Z-score algorithm.
Summary of features
PyCatcher provides an efficient solution for detecting anomalies in time-series data using various statistical methods. Below are the available techniques for anomaly detection, each optimized for different data characteristics.
1. Seasonal-Decomposition Based Anomaly Detection
Detect Outliers Using Classical Seasonal Decomposition
For datasets with at least two years of data, PyCatcher automatically determines whether the data follows an additive or multiplicative model to detect anomalies.
- Method:
detect_outliers(df) - Output: DataFrame of detected anomalies or a message indicating no anomalies.
Detect Today's Outliers
Quickly identify if there are any anomalies specifically for the current date.
- Method:
detect_outliers_today(df) - Output: Anomaly details for today or a message indicating no outliers.
Detect the Latest Anomalies
Retrieve the most recent anomalies identified in your time-series data.
- Method:
detect_outliers_latest(df) - Output: Details of the latest detected anomalies.
Visualize Outliers with Seasonal Decomposition
Show outliers in your data through classical seasonal decomposition.
- Method:
build_classical_seasonal_outliers_plot(df) - Output: Outlier plot generated using classical seasonal decomposition.
Visualize Seasonal Patterns
Understand seasonality in your data by visualizing trends through decomposition.
- Method:
build_seasonal_plot(df) - Output: Seasonal plots displaying additive or multiplicative trends.
Visualize Monthly Patterns
Show month-wise box plot
- Method:
build_monthwise_plot(df) - Output: Month-wise box plots showing spread and skewness of data.
Detect Outliers Using Seasonal-Trend Decomposition using LOESS (STL)
Use the Seasonal-Trend Decomposition method (STL) to detect anomalies.
- Method:
detect_outliers_stl(df) - Output: Rows flagged as outliers using STL.
Visualize STL Outliers
Show outliers using the Seasonal-Trend Decomposition using LOESS (STL).
- Method:
build_stl_outliers_plot(df) - Output: Outlier plot generated using STL.
2. Detect Outliers Using Moving Average
Detect anomalies in time-series data using the Moving Average method.
- Method:
detect_outliers_moving_average(df) - Output: Rows flagged as outliers using Moving Average and Z-score algorithm.
Visualize Moving Average Outliers
Show outliers using the Moving Average and Z-score algorithm.
- Method:
build_moving_average_outliers_plot(df) - Output: Outlier plot generated using Moving Average method.
3. IQR-Based Anomaly Detection
Detect Outliers Using Interquartile Range (IQR)
For datasets spanning less than two years, the IQR method is employed.
- Method:
find_outliers_iqr(df) - Output: Rows flagged as outliers based on IQR.
Visualize IQR Plot
Build an IQR plot for a given dataframe (for less than 2 years of data).
- Method:
build_iqr_plot(df) - Output: IQR plot for the time-series data.
Summary of Package Functions
detect_outliers(df):Detect outliers in a time-series dataframe using seasonal trend decomposition when there is at least 2 years of data, otherwise we can use Inter Quartile Range (IQR) for smaller timeframe.detect_outliers_today(df):Detect outliers for the current date in a time-series dataframe.detect_outliers_latest(df):Detect latest outliers in a time-series dataframe.detect_outliers_iqr(df):Detect outliers in a time-series dataframe when there's less than 2 years of data.detect_outliers_moving_average(df):Detect outliers using moving average method.detect_outliers_stl(df):Detect outliers using Seasonal-Trend Decomposition using LOESS (STL).
Summary of Diagnostic Plots
build_seasonal_plot(df):Build seasonal plot (additive or multiplicative) for a given dataframe.build_iqr_plot(df):Build IQR plot for a given dataframe (for less than 2 years of data).build_monthwise_plot(df):Build month-wise plot for a given dataframe.build_decomposition_results(df):Get seasonal decomposition results for a given dataframe.build_classical_seasonal_outliers_plot(df):Show outliers using Classical Seasonal Decomposition algorithm.build_moving_average_outliers_plot(df):Show outliers using Moving Average and Z-score algorithm.build_stl_outliers_plot(df):Show outliers using Seasonal-Trend Decomposition using LOESS (STL).conduct_stationarity_check(df):Conduct stationarity checks for a feature (dataframe's count column).
Highlights
Unlike many open-source packages for outlier detection, PyCatcher provides several distinctive features:
- Automatic Model Selection: PyCatcher automatically detects whether to use an additive or multiplicative decomposition model, ensuring the most accurate detection of outliers based on the characteristics of your data.
- Dynamic Method Selection Based on Data Size: PyCatcher seamlessly switches between Seasonal Trend Decomposition (for datasets spanning at least two years) and Inter Quartile Range (IQR) for shorter time periods, offering flexibility without manual intervention.
- Wide Time Frequency Support: Supports multiple time-series frequencies — including daily, weekly, monthly, and quarterly data—without requiring users to pre-process or adjust their datasets.
- Choice for Different Seasonal Trend Algorithms: Support for outlier detection using both Classical Seasonal Trend Decomposition and Seasonal-Trend Decomposition using LOESS (STL) algorithms.
- Integrated Diagnostics: PyCatcher includes comprehensive diagnostic tools, enabling users to visualize outliers, trends and seasonal patterns, evaluate data stationarity, and analyze decomposition results.
- User Interface: Availability of a simple user interface for the users to upload file for outlier detection using IQR. Future versions will include advanced algorithms.
Example Usage
To see an example of how to use the pycatcher package for outlier detection in time-series data, check out the Example Notebook.
The notebook provides step-by-step guidance and demonstrates the key features of the library.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file pycatcher-0.0.41.tar.gz.
File metadata
- Download URL: pycatcher-0.0.41.tar.gz
- Upload date:
- Size: 20.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/23.2.0
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
175103c4568cc4f16618aed75b4b7edac0e40cf2a7369c09df6ce1b3aca66e2d
|
|
| MD5 |
6aa9b7eafd73586b7c5496f77b9b0220
|
|
| BLAKE2b-256 |
0a45156c98e33ad02c79e7f1f1ae89f0696a45e6485beb6333dcc83c2556b086
|
File details
Details for the file pycatcher-0.0.41-py3-none-any.whl.
File metadata
- Download URL: pycatcher-0.0.41-py3-none-any.whl
- Upload date:
- Size: 21.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.3 CPython/3.12.3 Darwin/23.2.0
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
5525f2dc6103a3063581cfb93eed2a7c60c4bb310140c902751e80e27963eeb0
|
|
| MD5 |
3a896c6080d021cb7b66d201f401d9f5
|
|
| BLAKE2b-256 |
f4d1217fded7da9f9cfae755f05b98c510a5587c089a6b50ff0d01c0a78c302c
|