Anomaly Detection Toolkit

A comprehensive Python library for detecting anomalies in time series and multivariate data using multiple detection methods including statistical, machine learning, and deep learning approaches.

Python 3.12+ | License: MIT

📚 Full Documentation

Features

  • Statistical Methods: Z-score, IQR, seasonal baseline detection
  • Machine Learning: Isolation Forest, Local Outlier Factor (LOF), Robust Covariance
  • Wavelet Methods: Wavelet decomposition and denoising for time series
  • Deep Learning: LSTM and PyTorch autoencoders for anomaly detection
  • Ensemble Methods: Voting and score combination ensembles
  • Easy to Use: Scikit-learn compatible API
  • Well Documented: Comprehensive docstrings and examples

Installation

Basic Installation

pip install anomaly-detection-toolkit

With Deep Learning Support

For LSTM and PyTorch autoencoders:

pip install "anomaly-detection-toolkit[deep]"

Development Installation

git clone https://github.com/kylejones200/anomaly-detection-toolkit.git
cd anomaly-detection-toolkit
pip install -e ".[deep]"

Building Documentation

pip install -e ".[docs]"
cd docs
make html

Quick Start

Statistical Methods

from anomaly_detection_toolkit import ZScoreDetector, IQROutlierDetector
import numpy as np

# Generate sample data
data = np.random.randn(1000)
data[100:105] += 5  # Inject anomalies

# Z-score detector (fit_predict fits and scores in one call, so no separate fit is needed)
detector = ZScoreDetector(n_std=3.0)
predictions, scores = detector.fit_predict(data)

print(f"Anomalies detected: {(predictions == -1).sum()}")
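For readers who want to see the rule itself, here is a minimal NumPy sketch of z-score flagging. It is not the library's implementation: the function name `zscore_flags` is hypothetical, and the -1/1 label convention is assumed from the `predictions == -1` check in the example above.

```python
import numpy as np

def zscore_flags(values, n_std=3.0):
    """Return (labels, scores); label is -1 where |z| > n_std, else 1."""
    values = np.asarray(values, dtype=float)
    std = values.std()
    if std == 0:
        # Constant input: nothing deviates from the mean
        scores = np.zeros_like(values)
    else:
        scores = np.abs(values - values.mean()) / std
    labels = np.where(scores > n_std, -1, 1)
    return labels, scores

data = np.zeros(100)
data[10] = 50.0  # one obvious outlier
labels, scores = zscore_flags(data)
print(np.flatnonzero(labels == -1))  # indices of flagged points
```

Because the mean and standard deviation are computed from the same data being scored, a very large outlier inflates both, which is one reason robust alternatives such as the IQR rule exist.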

Machine Learning Methods

from anomaly_detection_toolkit import IsolationForestDetector, LOFDetector
import pandas as pd

# Load your data
df = pd.read_csv('your_data.csv')
features = ['feature1', 'feature2', 'feature3']
X = df[features]

# Isolation Forest (fit_predict fits and scores in one call)
iso_detector = IsolationForestDetector(contamination=0.05, n_estimators=200)
predictions, scores = iso_detector.fit_predict(X)

# Local Outlier Factor
lof_detector = LOFDetector(contamination=0.05, n_neighbors=20)
predictions, scores = lof_detector.fit_predict(X)

Time Series Anomaly Detection

Wavelet-Based Detection

from anomaly_detection_toolkit import WaveletDetector
import pandas as pd

# Time series data
df = pd.read_csv('time_series.csv', parse_dates=['date'])
ts = df['value'].values

# Wavelet detector
wavelet_detector = WaveletDetector(wavelet='db4', threshold_factor=2.5, level=5)
predictions, scores = wavelet_detector.fit_predict(ts)
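The idea behind wavelet-based detection can be illustrated with a deliberately simplified sketch: a single-level Haar transform written in plain NumPy, with a threshold derived from the median absolute deviation (MAD) of the detail coefficients. This is an assumption-laden stand-in, not the library's method: the example above uses the `db4` wavelet at five levels, and the function name `haar_detail_flags` and the MAD-based noise estimate are illustrative choices.

```python
import numpy as np

def haar_detail_flags(signal, threshold_factor=2.5):
    """Flag sample pairs whose Haar detail coefficient is unusually large."""
    signal = np.asarray(signal, dtype=float)
    n = len(signal) - len(signal) % 2  # drop a trailing sample if length is odd
    pairs = signal[:n].reshape(-1, 2)
    # Single-level Haar detail coefficients: scaled difference of each pair
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)
    # Robust noise scale from the median absolute deviation
    mad = np.median(np.abs(detail - np.median(detail)))
    sigma = mad / 0.6745
    flagged_pairs = np.abs(detail) > threshold_factor * sigma
    # Each coefficient covers two samples, so expand back to sample resolution
    labels = np.where(np.repeat(flagged_pairs, 2), -1, 1)
    return labels, np.abs(detail)

signal = np.sin(np.linspace(0, 10, 200))
signal[100] += 2.0  # inject a sharp spike
labels, detail_magnitude = haar_detail_flags(signal)
print(labels[100], labels[101])
```

Detail coefficients respond to abrupt local changes and largely ignore slow trends, which is why a spike stands out even on top of a smooth oscillation.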

Seasonal Baseline Detection

from anomaly_detection_toolkit import SeasonalBaselineDetector
import numpy as np
import pandas as pd

# DataFrame with date and value columns
df = pd.DataFrame({
    'date': pd.date_range('2020-01-01', periods=365, freq='D'),
    'value': np.random.randn(365) * 10 + 50
})

# Seasonal baseline detector (weekly seasonality)
seasonal_detector = SeasonalBaselineDetector(seasonality='week', threshold_sigma=2.5)
seasonal_detector.fit(df, date_col='date', value_col='value')
predictions = seasonal_detector.predict(df, date_col='date', value_col='value')
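As a rough illustration of what a weekly seasonal baseline means, the sketch below groups values by day of week in pandas and flags points that sit more than `threshold_sigma` standard deviations from their weekday's mean. The function `weekly_baseline_flags` is hypothetical and the library may compute its baseline differently; the alternating ±1 term is a deterministic stand-in for noise so the result is reproducible.

```python
import numpy as np
import pandas as pd

def weekly_baseline_flags(df, date_col="date", value_col="value", threshold_sigma=2.5):
    """Label -1 where a value deviates strongly from its day-of-week baseline."""
    weekday = df[date_col].dt.dayofweek  # Monday=0 ... Sunday=6
    grouped = df.groupby(weekday)[value_col]
    baseline = grouped.transform("mean")  # per-weekday mean, aligned to rows
    spread = grouped.transform("std")     # per-weekday sample standard deviation
    z = (df[value_col] - baseline).abs() / spread
    return np.where(z > threshold_sigma, -1, 1)

dates = pd.date_range("2020-01-01", periods=140, freq="D")
week_index = np.arange(140) // 7
# Weekend uplift of 10 plus an alternating +/-1 wobble from week to week
values = 50.0 + np.where(dates.dayofweek >= 5, 10.0, 0.0) + (-1.0) ** week_index
values[30] += 20.0  # inject one anomaly
df = pd.DataFrame({"date": dates, "value": values})

labels = weekly_baseline_flags(df)
print(np.flatnonzero(labels == -1))  # positions of flagged rows
```

Comparing each point only against the same weekday keeps the regular weekend uplift from being mistaken for an anomaly.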

Deep Learning Methods

LSTM Autoencoder

from anomaly_detection_toolkit import LSTMAutoencoderDetector
import numpy as np

# Time series data
ts = np.sin(np.linspace(0, 50, 1000)) + np.random.randn(1000) * 0.1
ts[450:460] += 3  # Inject anomalies

# LSTM autoencoder (fit_predict trains the model and scores the series in one call)
lstm_detector = LSTMAutoencoderDetector(
    window_size=20,
    lstm_units=[32, 16],
    epochs=50,
    threshold_std=3.0
)
predictions, scores = lstm_detector.fit_predict(ts)

PyTorch Autoencoder

from anomaly_detection_toolkit import PyTorchAutoencoderDetector

# PyTorch autoencoder (reuses the `ts` array from the LSTM example above)
pytorch_detector = PyTorchAutoencoderDetector(
    window_size=24,
    hidden_dims=[64, 16, 4],
    epochs=200,
    threshold_std=3.0
)
predictions, scores = pytorch_detector.fit_predict(ts)
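Both autoencoder detectors take a `window_size` and a `threshold_std`, which suggests a sliding-window, reconstruction-error approach. The sketch below shows that general pattern in NumPy only, under loud assumptions: a rank-2 linear projection (fitted by SVD on an anomaly-free stretch) stands in for the neural network, and the threshold is taken as mean plus `threshold_std` standard deviations of the training error. None of this is confirmed to match the library's internals.

```python
import numpy as np

rng = np.random.default_rng(0)
ts = np.sin(np.linspace(0, 50, 1000)) + rng.normal(scale=0.1, size=1000)
ts[450:460] += 3  # inject anomalies, as in the examples above

window_size = 20
threshold_std = 3.0

# Overlapping windows: row i holds ts[i : i + window_size]
windows = np.lib.stride_tricks.sliding_window_view(ts, window_size)

# Stand-in for the autoencoder: a rank-2 linear "encoder/decoder"
# fitted on the first 400 windows, which contain no injected anomalies.
train = windows[:400]
_, _, vt = np.linalg.svd(train, full_matrices=False)
basis = vt[:2]  # shape (2, window_size)

def reconstruction_error(batch):
    """Mean squared error between each window and its low-rank reconstruction."""
    reconstructed = batch @ basis.T @ basis
    return np.mean((batch - reconstructed) ** 2, axis=1)

train_error = reconstruction_error(train)
threshold = train_error.mean() + threshold_std * train_error.std()

errors = reconstruction_error(windows)
labels = np.where(errors > threshold, -1, 1)  # one label per window
print(f"Windows flagged: {(labels == -1).sum()}")
```

Labels here are per window rather than per sample; mapping them back to individual time steps (for example, by flagging every sample in a flagged window) is a separate design choice.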

Ensemble Methods

from anomaly_detection_toolkit import (
    IsolationForestDetector,
    LOFDetector,
    RobustCovarianceDetector,
    VotingEnsemble
)

# Create multiple detectors
detectors = [
    IsolationForestDetector(contamination=0.05),
    LOFDetector(contamination=0.05),
    RobustCovarianceDetector(contamination=0.05)
]

# Voting ensemble (flags a point if 2+ detectors agree)
# X is the feature matrix from the Machine Learning Methods example
ensemble = VotingEnsemble(detectors, voting_threshold=2)
predictions, scores = ensemble.fit_predict(X)
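The voting rule described in the comment above (flag a point when at least `voting_threshold` detectors flag it) can be sketched in a few lines of NumPy. The function `majority_vote` is hypothetical and operates on precomputed -1/1 label arrays; it is meant to show the rule, not the library's code.

```python
import numpy as np

def majority_vote(label_sets, voting_threshold=2):
    """Combine per-detector labels (-1 = anomaly, 1 = normal) by vote count."""
    votes = np.sum(np.asarray(label_sets) == -1, axis=0)  # anomaly votes per point
    labels = np.where(votes >= voting_threshold, -1, 1)
    return labels, votes

detector_a = [1, -1, -1, 1, 1]
detector_b = [1, -1, 1, 1, -1]
detector_c = [1, -1, -1, 1, 1]

labels, votes = majority_vote([detector_a, detector_b, detector_c])
print(votes.tolist())   # [0, 3, 2, 0, 1]
print(labels.tolist())  # [1, -1, -1, 1, 1]
```

Raising `voting_threshold` trades recall for precision: requiring all three detectors to agree would keep only the second point.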

API Reference

Statistical Methods

  • ZScoreDetector: Z-score based anomaly detection
  • IQROutlierDetector: Interquartile Range (IQR) based outlier detection
  • SeasonalBaselineDetector: Seasonal baseline anomaly detection for time series
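For reference, the IQR rule that `IQROutlierDetector` is named after can be written directly in NumPy. This sketch assumes the conventional Tukey fences at 1.5 times the IQR; the function name `iqr_flags` and the `factor` parameter are illustrative, and the library's own parameter names and defaults may differ.

```python
import numpy as np

def iqr_flags(values, factor=1.5):
    """Label -1 for values outside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - factor * iqr, q3 + factor * iqr
    labels = np.where((values < lower) | (values > upper), -1, 1)
    return labels, (lower, upper)

values = np.array([10, 11, 12, 13, 14, 15, 16, 17, 18, 100], dtype=float)
labels, (lower, upper) = iqr_flags(values)
print(lower, upper)     # 5.5 23.5
print(labels.tolist())  # [1, 1, 1, 1, 1, 1, 1, 1, 1, -1]
```

Unlike the z-score, the quartiles barely move when a single extreme value is present, so the fences stay tight around the bulk of the data.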

Machine Learning Methods

  • IsolationForestDetector: Isolation Forest anomaly detection
  • LOFDetector: Local Outlier Factor (LOF) anomaly detection
  • RobustCovarianceDetector: Robust Covariance (Elliptic Envelope) anomaly detection
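The three detectors above share their names with scikit-learn estimators, and the Acknowledgments mention scikit-learn, so they are presumably thin wrappers around them. That is an assumption. For comparison, this is how the underlying scikit-learn estimators are used directly; note that their `fit_predict` returns only the label array, not a `(predictions, scores)` pair.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:5] += 8.0  # shift five rows far from the main cluster

estimators = {
    "isolation_forest": IsolationForest(
        contamination=0.05, n_estimators=200, random_state=0
    ),
    "lof": LocalOutlierFactor(contamination=0.05, n_neighbors=20),
    "robust_covariance": EllipticEnvelope(contamination=0.05, random_state=0),
}

# Each estimator labels inliers as 1 and outliers as -1
results = {name: est.fit_predict(X) for name, est in estimators.items()}

for name, labels in results.items():
    print(name, int((labels == -1).sum()))
```

The `contamination` value sets the expected fraction of outliers, so with 200 rows and 0.05 each estimator flags roughly ten points regardless of how many true anomalies exist.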

Wavelet Methods

  • WaveletDetector: Wavelet-based anomaly detection for time series
  • WaveletDenoiser: Wavelet-based signal denoising

Deep Learning Methods

  • LSTMAutoencoderDetector: LSTM autoencoder-based anomaly detection (requires TensorFlow/Keras)
  • PyTorchAutoencoderDetector: PyTorch autoencoder-based anomaly detection (requires PyTorch)

Ensemble Methods

  • VotingEnsemble: Voting ensemble that combines predictions from multiple detectors
  • EnsembleDetector: General ensemble detector with customizable combination methods
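One common way to combine scores rather than votes is to rescale each detector's scores to a shared range, average them, and flag the top fraction. The sketch below shows that approach under stated assumptions: higher scores mean more anomalous, min-max scaling is used, and the cutoff is the `1 - contamination` quantile. The function `combine_scores` is hypothetical; the combination methods that `EnsembleDetector` actually offers are not described here.

```python
import numpy as np

def combine_scores(score_sets, contamination=0.05):
    """Average min-max normalized scores and flag the highest fraction."""
    scores = np.asarray(score_sets, dtype=float)  # shape (n_detectors, n_points)
    low = scores.min(axis=1, keepdims=True)
    span = scores.max(axis=1, keepdims=True) - low
    span[span == 0] = 1.0  # avoid division by zero for constant scores
    normalized = (scores - low) / span
    combined = normalized.mean(axis=0)
    cutoff = np.quantile(combined, 1.0 - contamination)
    labels = np.where(combined > cutoff, -1, 1)
    return labels, combined

scores_a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 100]
scores_b = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 5.0]

labels, combined = combine_scores([scores_a, scores_b], contamination=0.1)
print(labels.tolist())  # [1, 1, 1, 1, 1, 1, 1, 1, 1, -1]
```

Normalizing first matters because raw scores from different detectors live on different scales, and an unscaled average would be dominated by whichever detector produces the largest numbers.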

Examples

See the examples/ directory for complete examples:

  • examples/statistical_example.py: Statistical methods
  • examples/ml_example.py: Machine learning methods
  • examples/time_series_example.py: Time series anomaly detection
  • examples/ensemble_example.py: Ensemble methods

Development

Setting Up Pre-commit Hooks

This project uses pre-commit hooks to ensure code quality before commits and pushes:

# Install pre-commit hooks
./setup-pre-commit.sh

# Or manually:
pip install pre-commit
pre-commit install
pre-commit install --hook-type pre-push

The pre-push hooks will automatically:

  • Check code formatting (Black)
  • Sort imports (isort)
  • Lint code (flake8)
  • Type check (mypy)
  • Security scan (bandit)
  • Run tests (pytest)

To run checks manually:

pre-commit run --all-files

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use this library in your research, please cite:

@software{anomaly_detection_toolkit,
  title={Anomaly Detection Toolkit},
  author={Kyle Jones},
  year={2025},
  url={https://github.com/kylejones200/anomaly-detection-toolkit}
}

Acknowledgments

  • Built with scikit-learn, PyWavelets, and other excellent open-source libraries
  • Inspired by various anomaly detection research and implementations

Support

For issues, questions, or contributions, please open an issue on GitHub.
