
Tools for anomaly detection in time series based on Topological Data Analysis

TDAAD – Topological Data Analysis for Anomaly Detection

Overview

TDAAD is a Python package for unsupervised anomaly detection in multivariate time series using Topological Data Analysis (TDA). Website and documentation: https://irt-systemx.github.io/tdaad/

It builds upon two powerful open-source libraries:

  • GUDHI for efficient and scalable computation of persistent homology and topological features,
  • scikit-learn for core machine learning utilities like Pipeline and objects like EllipticEnvelope.

TDAAD implements the methodology introduced in:

Chazal, F., Levrard, C., & Royer, M. (2024). Topological Analysis for Detecting Anomalies (TADA) in dependent sequences: application to Time Series. Journal of Machine Learning Research, 25(365), 1–49. https://www.jmlr.org/papers/v25/24-0853.html

🔍 Features

  • Unsupervised anomaly detection in multivariate time series
  • Topological embedding using persistent homology
  • Scikit-learn–style API (fit, transform, score_samples)
  • Configurable embedding dimension, window size, and topological parameters
  • Works with NumPy arrays or pandas DataFrames

🛠 Installation

Install from PyPI (recommended):

pip install tdaad

Or install from source:

git clone https://github.com/IRT-SystemX/tdaad.git
cd tdaad
pip install .

Requirements:

  • Python ≥ 3.7
  • See requirements.txt for full dependency list

🚀 Quickstart

Here’s a minimal example using TopologicalAnomalyDetector:

import numpy as np
from tdaad.anomaly_detectors import TopologicalAnomalyDetector

# Example multivariate time series with shape (n_samples, n_features)
X = np.random.randn(1000, 3)

# Initialize and fit the detector
detector = TopologicalAnomalyDetector(window_size=100, n_centers_by_dim=3)
detector.fit(X)

# Compute anomaly scores
scores = detector.score_samples(X)

You can also use pandas.DataFrame instead of a NumPy array — column names will be preserved in the output.

For more advanced usage (e.g. custom embeddings, parameter tuning), see the examples folder or the API documentation.
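
Once scores are computed, a common next step is to turn them into anomaly flags. The sketch below is not part of TDAAD: it uses synthetic scores in place of the output of detector.score_samples(X) so that it runs on its own, and it assumes the scikit-learn convention that lower scores mean more abnormal samples. The contamination value is a hypothetical choice that should be adapted to your data.

```python
import numpy as np

# Stand-in for detector.score_samples(X): one score per sample.
# Synthetic values are used so the snippet runs without tdaad installed.
rng = np.random.default_rng(0)
scores = rng.normal(loc=0.0, scale=1.0, size=1000)
scores[400:420] -= 8.0  # hypothetical anomalous stretch with very low scores

# Assumption: lower score = more abnormal (scikit-learn convention).
contamination = 0.02  # hypothetical expected fraction of anomalous samples
threshold = np.quantile(scores, contamination)
is_anomaly = scores <= threshold

# Indices of the samples flagged as anomalous
anomalous_idx = np.flatnonzero(is_anomaly)
print(f"threshold={threshold:.3f}, flagged={anomalous_idx.size} samples")
```

A quantile threshold keeps the flagged fraction fixed, which is convenient when no labels are available. If the score convention in your installed version differs, flip the comparison accordingly.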

📌 Usage Notes

  • TDAAD is designed for multivariate time series (2D inputs) — univariate data is not supported.
  • The core detection method relies on sliding-window embeddings and persistent homology to identify structural changes in the signal.
  • The key parameters that impact results and runtime are:
    • window_size controls the time resolution — larger windows capture slower anomalies, smaller ones detect more localized changes.
    • n_centers_by_dim controls the number of reference shapes used per homology dimension (e.g. connected components in H0, loops in H1, ...). Increasing this improves sensitivity but adds computation time.
    • tda_max_dim sets the maximum topological feature dimension computed (0 = connected components, 1 = loops, 2 = voids, ...). Higher values increase runtime and memory usage.
  • Inputs can be numpy.ndarray or pandas.DataFrame. Column names are preserved in the output when using DataFrames.
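
To make the role of window_size more concrete, here is a minimal NumPy sketch of how a multivariate series can be cut into overlapping windows, each of which can be viewed as a point cloud for persistent homology. It is an illustration only, not TDAAD's internal code: the helper make_windows and its step parameter are hypothetical, and the package's actual windowing and overlap may differ.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(X, window_size, step=1):
    """Return overlapping windows of shape (n_windows, window_size, n_features)."""
    X = np.asarray(X)
    if X.ndim != 2:
        raise ValueError("expected a 2D array of shape (n_samples, n_features)")
    # sliding_window_view puts the window axis last:
    # (n_samples - window_size + 1, n_features, window_size)
    views = sliding_window_view(X, window_shape=window_size, axis=0)
    # Keep every `step`-th window and move the time axis back to position 1.
    return views[::step].transpose(0, 2, 1)


X = np.random.randn(1000, 3)
windows = make_windows(X, window_size=100, step=50)  # step is a hypothetical choice
print(windows.shape)
```

Each windows[i] is a cloud of window_size points in a space of dimension n_features. A larger window_size gives fewer but larger clouds, which matches the trade-off between time resolution and cost described above.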

⚙️ You can typically handle ~100 sensors and a few hundred time steps per window on a modern machine.

🧮 Basic Complexity of Persistent Homology in TDAAD

  • Total complexity scales as $O(N \times (w \times p)^{d+2})$, where $w$ is the time resolution (window_size, the number of time steps per window), $p$ is the number of variables (features/sensors), $d$ is the maximum homology dimension (tda_max_dim), and $N$ is the total number of sliding windows.
  • Increasing the maximum homology dimension $d$ raises the exponent, so the cost grows exponentially in $d$. The number of centers n_centers_by_dim is only used after the persistent homology (PH) computation and does not significantly affect the overall complexity.
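
As a rough worked example, the scaling term quoted above can be evaluated directly to compare parameter choices. The function relative_ph_cost below is a hypothetical helper, not part of TDAAD. It ignores constant factors, so only ratios between configurations are meaningful, and the window count of 19 is an arbitrary illustrative value.

```python
def relative_ph_cost(n_windows, window_size, n_features, max_dim):
    """Evaluate N * (w * p) ** (d + 2), ignoring constant factors."""
    return n_windows * (window_size * n_features) ** (max_dim + 2)


# Compare homology dimensions for 19 windows of 100 time steps and 3 sensors.
base = relative_ph_cost(n_windows=19, window_size=100, n_features=3, max_dim=1)
for d in (0, 1, 2):
    cost = relative_ph_cost(19, 100, 3, d)
    print(f"tda_max_dim={d}: {cost / base:.4g}x the cost at tda_max_dim=1")
```

With these values, each extra homology dimension multiplies the estimate by w × p = 300, which is why tda_max_dim is usually the first parameter to reduce when runtime is a concern.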

📚 Documentation & Resources


Document generation

To regenerate the documentation, rerun the following commands from the project root, adapting if necessary:

pip install -r docs/docs_requirements.txt -r requirements.txt
sphinx-apidoc -o docs/source/generated tdaad
sphinx-build -M html docs/source docs/build -W --keep-going

Contributors and Support

This work has been supported by the French government under the "France 2030" program, as part of the SystemX Technological Research Institute within the Confiance.ai project.

TDAAD is developed by IRT SystemX and supported by the European Trustworthy AI Association.

