Temporal Causal Discovery from Autocorrelated and Non-Stationary Time Series Data
Project description
CDANs
Temporal Causal Discovery from Autocorrelated and Non-Stationary Time Series Data.
[!NOTE] This repository previously contained a MATLAB implementation that accompanied the 2023 paper. That code had reproducibility issues and has been superseded by this Python library, which implements the same algorithm cleanly with full tests and documentation. The original MATLAB code is preserved at the
v0.1-matlab-archivetag.
A Python implementation of the CDANs algorithm described in:
Ferdous, M. H., Hasan, U., & Gani, M. O. (2023). CDANs: Temporal Causal Discovery from Autocorrelated and Non-Stationary Time Series Data. Proceedings of the 8th Machine Learning for Healthcare Conference, PMLR 219. paper · arXiv
CDANs is a constraint-based causal discovery algorithm designed for time series that are both autocorrelated (today depends on yesterday) and non-stationary (the dependence pattern itself changes over time). It addresses three limitations of previous methods:
- High dimensionality — by using lagged parents as the conditioning set for contemporaneous CI tests, CDANs avoids conditioning on the full past.
- No lagged edges — most CD-NOD-style methods only recover contemporaneous structure; CDANs recovers both lagged and contemporaneous.
- Order dependence — the discovered skeleton does not depend on the order in which variables are listed in the input.
Installation
pip install cdans
For development:
git clone https://github.com/hferdous/CDANs.git
cd CDANs
pip install -e ".[dev]"
Optional extras:
| Extra | Adds |
|---|---|
gp |
GPy for kernel-width learning |
viz |
matplotlib, pydot for graph drawing |
dev |
pytest, ruff, mypy, build |
docs |
mkdocs and theme |
all |
everything above |
Quickstart
from cdans import CDANs
from cdans.utils import generate_synthetic_cdans
# 1. Generate a dataset with known ground truth.
dataset = generate_synthetic_cdans(
n_vars=5, n_samples=500, tau_max=2, n_changing=2, seed=42,
)
# 2. Fit the model.
model = CDANs(tau_max=2, alpha=0.05, ci_test="fisherz")
result = model.fit(dataset.data)
# 3. Inspect the recovered graph.
print(result.summary())
print("Lagged edges: ", result.graph.lagged_edges)
print("Contemp adjacency:\n", result.graph.contemp_adj)
print("Changing modules: ", result.graph.changing_modules)
A runnable version is in examples/quickstart.py.
Plotting
Two graph visualizations are available via tigramite:
from cdans import plot_process_graph, plot_time_series_graph
plot_process_graph(result.graph, save_path="process.png") # aggregate
plot_time_series_graph(result.graph, save_path="time_series.png") # time-unrolled
Requires the viz extra: pip install "cdans[viz]". See
examples/plotting.py for a complete worked example.
Evaluation
When you have ground truth (e.g. from generate_synthetic_cdans), score
the recovered graph with standard structure-recovery metrics:
from cdans import CDANs, evaluate_graph
from cdans.utils import generate_synthetic_cdans
dataset = generate_synthetic_cdans(n_vars=5, n_samples=400, tau_max=2, seed=42)
result = CDANs(tau_max=2, ci_test="kci").fit(dataset.data)
metrics = evaluate_graph(result.graph, dataset)
print(metrics.summary())
print(f"SHD: {metrics.shd_total}, lagged TPR: {metrics.lagged.tpr:.2f}, FDR: {metrics.lagged.fdr:.2f}")
Reported separately for lagged edges, contemp skeleton, contemp directed, and changing modules: TP / FP / FN / precision / recall (= TPR) / FDR / F1, plus SHD per category and total.
API
The library is built around two layers:
High-level: CDANs class
CDANs(
tau_max=2, # max lag
alpha=0.05, # CI test significance level
pc_alpha=0.2, # looser alpha for PC step inside Step 1
ci_test="fisherz", # or "kci", or your own CITest instance
surrogate="time", # or pass an array of shape (n_samples,)
max_extra_conds=2, # contemp neighbors added to lagged conds
use_independent_change=True,
independent_change_width="auto", # or a positive float
verbose=False,
)
fit(data) returns a CDANsResult with:
.graph—TimeSeriesGraph(lagged edges, contemporaneous adjacency, changing modules).timings— wall-clock per step.config— the configuration used.summary()— pretty-printed report
Low-level: cdans.steps submodule
The four steps can be invoked individually, which is useful for research and for swapping in your own implementations:
from cdans.steps import (
discover_lagged_adjacencies, # Step 1
build_partial_graph, # Step 2
refine_skeleton, # Step 3
orient_edges, # Step 4
)
Each step takes a TimeSeriesGraph (or produces one for Step 1) and mutates
it in place.
Algorithm overview
CDANs has four steps, each mapped to a section of the paper:
| Step | Paper § | Module | What it does |
|---|---|---|---|
| 1 | §3.1 | cdans.steps.step1_lagged |
MCI tests find lagged adjacencies X_i[t-lag] -> X_j[t] |
| 2 | §3.2 | cdans.steps.step2_partial_graph |
Build the partial graph: lagged + fully-connected contemporaneous + surrogate C |
| 3 | §3.3 | cdans.steps.step3_skeleton |
The key step. Prune contemporaneous edges using lagged-parent conditioning sets, and confirm changing modules |
| 4 | §3.4 | cdans.steps.step4_orient |
Orient contemp edges via v-structures + Meek's rules |
The graph is encoded as a TimeSeriesGraph (see cdans.graph).
Independence tests live in cdans.ci_tests and are pluggable.
Project layout
cdans-library/
├── pyproject.toml
├── README.md
├── LICENSE
├── CITATION.cff
├── src/cdans/
│ ├── __init__.py
│ ├── algorithm.py # Top-level CDANs class
│ ├── steps/ # The four algorithm steps
│ ├── ci_tests/ # CI test interface + implementations
│ ├── graph/ # TimeSeriesGraph + Meek's rules
│ └── utils/ # Lagging helpers + synthetic data
├── tests/
│ ├── test_lagging.py
│ ├── test_synthetic.py
│ ├── test_graph.py
│ ├── test_ci_tests.py
│ └── test_end_to_end.py
└── examples/
└── quickstart.py
Status of features vs. the MATLAB reference
The reference MATLAB implementation lives in CDANs/OT/ of the parent repo.
Status of the Python port:
| Feature | Status |
|---|---|
| Step 1: MCI-based lagged adjacencies | ✓ |
| Step 2: partial graph construction | ✓ |
| Step 3: optimized CI tests (lagged-parent conditioning) | ✓ |
| Step 3: changing-module detection | ✓ |
Step 4: surrogate orientation C -> X |
✓ |
| Step 4: v-structure orientation | ✓ |
| Step 4: Meek's rules R1–R4 | ✓ |
| Synthetic data generator | ✓ |
| Independent change principle for direction inference | ✓ (infer_nonsta_dir.m non-GP path; plus a GP-bandwidth path via scikit-learn for the small-T regime) |
| Driving-force visualization (kPCA) | ☐ (TODO; the MATLAB cd_non_con_fun.m is not yet ported) |
| MIMIC-III data preparation | Out of scope (clinical data access required) |
Running the tests
pip install -e ".[dev]"
pytest
Citing
If you use this software, please cite the paper:
@inproceedings{ferdous2023cdans,
title = {CDANs: Temporal Causal Discovery from Autocorrelated and Non-Stationary Time Series Data},
author = {Ferdous, Muhammad Hasan and Hasan, Uzma and Gani, Md Osman},
booktitle = {Proceedings of the 8th Machine Learning for Healthcare Conference},
series = {Proceedings of Machine Learning Research},
volume = {219},
year = {2023},
publisher = {PMLR},
url = {https://proceedings.mlr.press/v219/ferdous23a.html},
}
Acknowledgements
CDANs builds on prior work in constraint-based causal discovery, notably:
- CD-NOD (Huang & Zhang, 2017) — provides the surrogate-variable
construction and the changing-module formalism. The Python implementation
in
causal-learnwas used as a reference when porting the KCI test that ships with this library. - PCMCI (Runge et al., 2019) — provides the MCI conditional
independence framework that Step 1 implements. The original reference
implementation is in
tigramite; CDANs ships an independent self-contained Python port. - KCI (Zhang, Peters, Janzing & Schölkopf, 2011) — kernel-based
conditional independence test. CDANs includes an in-house implementation
in
cdans.ci_tests.kci, validated to produce identical p-values to thecausal-learnreference on a battery of synthetic test cases. - Meek's rules (Meek, 1995) — used in Step 4 for orientation propagation.
Documentation
A full documentation site is included. To build and serve it locally:
pip install "cdans[docs]"
mkdocs serve
The site has installation, quickstart, a step-by-step algorithm walkthrough, per-step deep dives, an auto-generated API reference, and a guide to reproducing the paper experiments.
License
MIT — see LICENSE.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file cdans-0.1.0.tar.gz.
File metadata
- Download URL: cdans-0.1.0.tar.gz
- Upload date:
- Size: 63.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d32291d92158b6c93a8da273ae64e56048f6d4d7ecedd162ae8116522b65480e
|
|
| MD5 |
b4e44ccec8e39d07820323453359b6c7
|
|
| BLAKE2b-256 |
d307a889ea26191d65b91f9854d126974bac0a8ebb21b1f8d008aba5110fccd9
|
File details
Details for the file cdans-0.1.0-py3-none-any.whl.
File metadata
- Download URL: cdans-0.1.0-py3-none-any.whl
- Upload date:
- Size: 53.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
3405327284aba247dc345496b79ba78b919d41609e52e6aeaa91040f6389f9f8
|
|
| MD5 |
12de8317728a0edd41470985e943e768
|
|
| BLAKE2b-256 |
571df445fe7e0b2913ba1f537e804e9777a38217b09ec92fb24039a61bdf121c
|