Tools for anomaly detection in time series based on Topological Data Analysis
TDAAD
TDAAD – Topological Data Analysis for Anomaly Detection
Overview
TDAAD is a Python package for unsupervised anomaly detection in multivariate time series using Topological Data Analysis (TDA). Website and documentation: https://irt-systemx.github.io/tdaad/
It builds upon two powerful open-source libraries:
- GUDHI for efficient and scalable computation of persistent homology and topological features,
- scikit-learn for core machine learning utilities like Pipeline and objects like EllipticEnvelope.
TDAAD implements the methodology introduced in:
Chazal, F., Levrard, C., & Royer, M. (2024). Topological Analysis for Detecting Anomalies (TADA) in dependent sequences: application to Time Series. Journal of Machine Learning Research, 25(365), 1–49. https://www.jmlr.org/papers/v25/24-0853.html
🔍 Features
- Unsupervised anomaly detection in multivariate time series
- Topological embedding using persistent homology
- Scikit-learn–style API (fit, transform, score_samples)
- Configurable embedding dimension, window size, and topological parameters
- Works with NumPy arrays or pandas DataFrames
🛠 Installation
Install from PyPI (recommended):
pip install tdaad
Or install from source:
git clone https://github.com/IRT-SystemX/tdaad.git
cd tdaad
pip install .
Requirements:
- Python ≥ 3.7
- See requirements.txt for the full dependency list
🚀 Quickstart
Here’s a minimal example using TopologicalAnomalyDetector:
import numpy as np
from tdaad.anomaly_detectors import TopologicalAnomalyDetector
# Example multivariate time series with shape (n_samples, n_features)
X = np.random.randn(1000, 3)
# Initialize and fit the detector
detector = TopologicalAnomalyDetector(window_size=100, n_centers_by_dim=3)
detector.fit(X)
# Compute anomaly scores
scores = detector.score_samples(X)
You can also use pandas.DataFrame instead of a NumPy array — column names will be preserved in the output.
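Once scores are available, a common follow-up is to turn them into anomaly flags. The sketch below is one possible way to do that, not part of the TDAAD API: it assumes that lower scores mean "more abnormal" (the scikit-learn convention for score_samples on outlier detectors; check the TDAAD documentation to confirm it applies here), and both the helper flag_lowest_scores and its contamination argument are hypothetical names introduced for illustration. It uses synthetic stand-in scores so that it runs without TDAAD installed.

```python
import random


def flag_lowest_scores(scores, contamination=0.05):
    """Mark the lowest-scoring fraction of samples as anomalous.

    Assumption: a lower score means a more abnormal sample.
    `contamination` is a user-chosen fraction, not a TDAAD parameter.
    """
    if not 0.0 < contamination < 1.0:
        raise ValueError("contamination must be in (0, 1)")
    n_flagged = max(1, int(len(scores) * contamination))
    # The threshold is the n_flagged-th smallest score.
    threshold = sorted(scores)[n_flagged - 1]
    return [s <= threshold for s in scores]


# Stand-in for list(detector.score_samples(X)) from the example above.
rng = random.Random(0)
scores = [rng.gauss(0.0, 1.0) for _ in range(1000)]
scores[123] = -10.0  # inject one clearly abnormal score

flags = flag_lowest_scores(scores, contamination=0.05)
print(sum(flags), "samples flagged; sample 123 flagged:", flags[123])
```

A fixed fraction is the simplest choice; a threshold calibrated on data known to be normal would be a reasonable alternative.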
For more advanced usage (e.g. custom embeddings, parameter tuning), see the examples folder or the API documentation.
📌 Usage Notes
- TDAAD is designed for multivariate time series (2D inputs) — univariate data is not supported.
- The core detection method relies on sliding-window embeddings and persistent homology to identify structural changes in the signal.
- The key parameters that impact results and runtime are:
  - window_size controls the time resolution: larger windows capture slower anomalies, smaller ones detect more localized changes.
  - n_centers_by_dim controls the number of reference shapes used per homology dimension (e.g. connected components in H0, loops in H1, ...). Increasing this improves sensitivity but adds computation time.
  - tda_max_dim sets the maximum topological feature dimension computed (0 = connected components, 1 = loops, 2 = voids, ...). Higher values increase runtime and memory usage.
- Inputs can be numpy.ndarray or pandas.DataFrame. Column names are preserved in the output when using DataFrames.
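To make the role of window_size concrete, here is a minimal sketch of how a series can be cut into sliding windows. It is an illustration only and does not reproduce TDAAD's internals: the helper sliding_windows and its step argument (the offset between consecutive windows) are hypothetical names, and the stride that TDAAD actually uses is not stated here.

```python
def sliding_windows(n_samples, window_size, step=1):
    """Yield (start, stop) index pairs for every full window in a series.

    `step` is a hypothetical stride between consecutive windows.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    # Only full windows are produced; a trailing partial window is dropped.
    for start in range(0, n_samples - window_size + 1, step):
        yield start, start + window_size


windows = list(sliding_windows(n_samples=1000, window_size=100, step=50))
print(len(windows), "windows, first:", windows[0], "last:", windows[-1])
```

Under these assumptions, each (start, stop) pair selects the rows X[start:stop], that is, a block of window_size time steps across all sensors.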
⚙️ You can typically handle ~100 sensors and a few hundred time steps per window on a modern machine.
🧮 Basic Complexity of Persistent Homology in TDAAD
- Total complexity scales with $O(N \times (w \times p)^{d+2})$, where $w$ is the time resolution (window_size, the number of time steps per window), $p$ is the number of variables (features/sensors), $d$ is the maximum homology dimension (tda_max_dim), and $N$ is the total number of sliding windows.
- Increasing the maximum homology dimension $d$ raises the exponent, so the cost grows exponentially in $d$. The number of centers n_centers_by_dim, which is used after the PH computation, does not significantly affect the overall complexity.
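The effect of the exponent can be checked with a few lines of arithmetic. The sketch below simply evaluates the scaling term stated above with constants ignored, so the numbers are relative growth factors rather than runtime predictions; the helper name relative_ph_cost and the example sizes are illustrative assumptions.

```python
def relative_ph_cost(n_windows, window_size, n_features, max_dim):
    """Evaluate N * (w * p) ** (d + 2), ignoring constant factors."""
    return n_windows * (window_size * n_features) ** (max_dim + 2)


# Example sizes (arbitrary): 19 windows of 100 time steps over 3 sensors.
base = relative_ph_cost(n_windows=19, window_size=100, n_features=3, max_dim=0)
for d in (0, 1, 2):
    cost = relative_ph_cost(n_windows=19, window_size=100, n_features=3, max_dim=d)
    print(f"tda_max_dim={d}: {cost // base}x the cost of tda_max_dim=0")
```

With $w \times p = 300$, each additional homology dimension multiplies the term by 300, which is why tda_max_dim is usually the first parameter to lower when runtime becomes a problem.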
📚 Documentation & Resources
Document generation
To regenerate the documentation, rerun the following commands from the project root, adapting if necessary:
pip install -r docs/docs_requirements.txt -r requirements.txt
sphinx-apidoc -o docs/source/generated tdaad
sphinx-build -M html docs/source docs/build -W --keep-going
Contributors and Support
This work has been supported by the French government under the "France 2030" program, as part of the SystemX Technological Research Institute within the Confiance.ai project.
TDAAD is developed by IRT SystemX and supported by the European Trustworthy AI Association.
Download files
Source Distribution
File details
Details for the file tdaad-1.6.0.tar.gz.
File metadata
- Download URL: tdaad-1.6.0.tar.gz
- Upload date:
- Size: 16.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1dfbc2e2a28d4cbba5f072b774f6b447a3164a67b24ce197291ed741caca2c04 |
| MD5 | cd48e443ab90600587724e2f4a2ee1a1 |
| BLAKE2b-256 | b31ec76a7ea41352348183b3b1211e0f0fc221280e1dbe744b4503510ca230ea |
Provenance
The following attestation bundles were made for tdaad-1.6.0.tar.gz:
Publisher: python_lib_publish.yml on IRT-SystemX/tdaad
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: tdaad-1.6.0.tar.gz
  - Subject digest: 1dfbc2e2a28d4cbba5f072b774f6b447a3164a67b24ce197291ed741caca2c04
- Sigstore transparency entry: 908737128
- Sigstore integration time:
- Permalink: IRT-SystemX/tdaad@d103a2cb5b48ec435afff9b7f75185ff2da6ca3f
- Branch / Tag: refs/heads/main
- Owner: https://github.com/IRT-SystemX
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python_lib_publish.yml@d103a2cb5b48ec435afff9b7f75185ff2da6ca3f
- Trigger Event: workflow_dispatch
-
Statement type: