Skip to main content

Temporal Causal Discovery from Autocorrelated and Non-Stationary Time Series Data

Project description

CDANs

License: MIT Python 3.9+

Temporal Causal Discovery from Autocorrelated and Non-Stationary Time Series Data.

[!NOTE] This repository previously contained a MATLAB implementation that accompanied the 2023 paper. That code had reproducibility issues and has been superseded by this Python library, which implements the same algorithm cleanly with full tests and documentation. The original MATLAB code is preserved at the v0.1-matlab-archive tag.

A Python implementation of the CDANs algorithm described in:

Ferdous, M. H., Hasan, U., & Gani, M. O. (2023). CDANs: Temporal Causal Discovery from Autocorrelated and Non-Stationary Time Series Data. Proceedings of the 8th Machine Learning for Healthcare Conference, PMLR 219. paper · arXiv

CDANs is a constraint-based causal discovery algorithm designed for time series that are both autocorrelated (today depends on yesterday) and non-stationary (the dependence pattern itself changes over time). It addresses three limitations of previous methods:

  1. High dimensionality — by using lagged parents as the conditioning set for contemporaneous CI tests, CDANs avoids conditioning on the full past.
  2. No lagged edges — most CD-NOD-style methods only recover contemporaneous structure; CDANs recovers both lagged and contemporaneous.
  3. Order dependence — the discovered skeleton does not depend on the order in which variables are listed in the input.

Installation

pip install cdans

For development:

git clone https://github.com/hferdous/CDANs.git
cd CDANs
pip install -e ".[dev]"

Optional extras:

Extra Adds
gp GPy for kernel-width learning
viz matplotlib, pydot for graph drawing
dev pytest, ruff, mypy, build
docs mkdocs and theme
all everything above

Quickstart

from cdans import CDANs
from cdans.utils import generate_synthetic_cdans

# 1. Generate a dataset with known ground truth.
dataset = generate_synthetic_cdans(
    n_vars=5, n_samples=500, tau_max=2, n_changing=2, seed=42,
)

# 2. Fit the model.
model = CDANs(tau_max=2, alpha=0.05, ci_test="fisherz")
result = model.fit(dataset.data)

# 3. Inspect the recovered graph.
print(result.summary())
print("Lagged edges:        ", result.graph.lagged_edges)
print("Contemp adjacency:\n", result.graph.contemp_adj)
print("Changing modules:    ", result.graph.changing_modules)

A runnable version is in examples/quickstart.py.

Plotting

Two graph visualizations are available via tigramite:

from cdans import plot_process_graph, plot_time_series_graph

plot_process_graph(result.graph, save_path="process.png")        # aggregate
plot_time_series_graph(result.graph, save_path="time_series.png")  # time-unrolled

Requires the viz extra: pip install "cdans[viz]". See examples/plotting.py for a complete worked example.

Evaluation

When you have ground truth (e.g. from generate_synthetic_cdans), score the recovered graph with standard structure-recovery metrics:

from cdans import CDANs, evaluate_graph
from cdans.utils import generate_synthetic_cdans

dataset = generate_synthetic_cdans(n_vars=5, n_samples=400, tau_max=2, seed=42)
result = CDANs(tau_max=2, ci_test="kci").fit(dataset.data)

metrics = evaluate_graph(result.graph, dataset)
print(metrics.summary())
print(f"SHD: {metrics.shd_total},  lagged TPR: {metrics.lagged.tpr:.2f},  FDR: {metrics.lagged.fdr:.2f}")

Reported separately for lagged edges, contemp skeleton, contemp directed, and changing modules: TP / FP / FN / precision / recall (= TPR) / FDR / F1, plus SHD per category and total.

API

The library is built around two layers:

High-level: CDANs class

CDANs(
    tau_max=2,                # max lag
    alpha=0.05,               # CI test significance level
    pc_alpha=0.2,             # looser alpha for PC step inside Step 1
    ci_test="fisherz",        # or "kci", or your own CITest instance
    surrogate="time",         # or pass an array of shape (n_samples,)
    max_extra_conds=2,        # contemp neighbors added to lagged conds
    use_independent_change=True,
    independent_change_width="auto",  # or a positive float
    verbose=False,
)

fit(data) returns a CDANsResult with:

  • .graphTimeSeriesGraph (lagged edges, contemporaneous adjacency, changing modules)
  • .timings — wall-clock per step
  • .config — the configuration used
  • .summary() — pretty-printed report

Low-level: cdans.steps submodule

The four steps can be invoked individually, which is useful for research and for swapping in your own implementations:

from cdans.steps import (
    discover_lagged_adjacencies,   # Step 1
    build_partial_graph,           # Step 2
    refine_skeleton,               # Step 3
    orient_edges,                  # Step 4
)

Each step takes a TimeSeriesGraph (or produces one for Step 1) and mutates it in place.

Algorithm overview

CDANs has four steps, each mapped to a section of the paper:

Step Paper § Module What it does
1 §3.1 cdans.steps.step1_lagged MCI tests find lagged adjacencies X_i[t-lag] -> X_j[t]
2 §3.2 cdans.steps.step2_partial_graph Build the partial graph: lagged + fully-connected contemporaneous + surrogate C
3 §3.3 cdans.steps.step3_skeleton The key step. Prune contemporaneous edges using lagged-parent conditioning sets, and confirm changing modules
4 §3.4 cdans.steps.step4_orient Orient contemp edges via v-structures + Meek's rules

The graph is encoded as a TimeSeriesGraph (see cdans.graph). Independence tests live in cdans.ci_tests and are pluggable.

Project layout

cdans-library/
├── pyproject.toml
├── README.md
├── LICENSE
├── CITATION.cff
├── src/cdans/
│   ├── __init__.py
│   ├── algorithm.py           # Top-level CDANs class
│   ├── steps/                 # The four algorithm steps
│   ├── ci_tests/              # CI test interface + implementations
│   ├── graph/                 # TimeSeriesGraph + Meek's rules
│   └── utils/                 # Lagging helpers + synthetic data
├── tests/
│   ├── test_lagging.py
│   ├── test_synthetic.py
│   ├── test_graph.py
│   ├── test_ci_tests.py
│   └── test_end_to_end.py
└── examples/
    └── quickstart.py

Status of features vs. the MATLAB reference

The reference MATLAB implementation lives in CDANs/OT/ of the parent repo. Status of the Python port:

Feature Status
Step 1: MCI-based lagged adjacencies
Step 2: partial graph construction
Step 3: optimized CI tests (lagged-parent conditioning)
Step 3: changing-module detection
Step 4: surrogate orientation C -> X
Step 4: v-structure orientation
Step 4: Meek's rules R1–R4
Synthetic data generator
Independent change principle for direction inference ✓ (infer_nonsta_dir.m non-GP path; plus a GP-bandwidth path via scikit-learn for the small-T regime)
Driving-force visualization (kPCA) ☐ (TODO; the MATLAB cd_non_con_fun.m is not yet ported)
MIMIC-III data preparation Out of scope (clinical data access required)

Running the tests

pip install -e ".[dev]"
pytest

Citing

If you use this software, please cite the paper:

@inproceedings{ferdous2023cdans,
  title     = {CDANs: Temporal Causal Discovery from Autocorrelated and Non-Stationary Time Series Data},
  author    = {Ferdous, Muhammad Hasan and Hasan, Uzma and Gani, Md Osman},
  booktitle = {Proceedings of the 8th Machine Learning for Healthcare Conference},
  series    = {Proceedings of Machine Learning Research},
  volume    = {219},
  year      = {2023},
  publisher = {PMLR},
  url       = {https://proceedings.mlr.press/v219/ferdous23a.html},
}

Acknowledgements

CDANs builds on prior work in constraint-based causal discovery, notably:

  • CD-NOD (Huang & Zhang, 2017) — provides the surrogate-variable construction and the changing-module formalism. The Python implementation in causal-learn was used as a reference when porting the KCI test that ships with this library.
  • PCMCI (Runge et al., 2019) — provides the MCI conditional independence framework that Step 1 implements. The original reference implementation is in tigramite; CDANs ships an independent self-contained Python port.
  • KCI (Zhang, Peters, Janzing & Schölkopf, 2011) — kernel-based conditional independence test. CDANs includes an in-house implementation in cdans.ci_tests.kci, validated to produce identical p-values to the causal-learn reference on a battery of synthetic test cases.
  • Meek's rules (Meek, 1995) — used in Step 4 for orientation propagation.

Documentation

A full documentation site is included. To build and serve it locally:

pip install "cdans[docs]"
mkdocs serve

The site has installation, quickstart, a step-by-step algorithm walkthrough, per-step deep dives, an auto-generated API reference, and a guide to reproducing the paper experiments.

License

MIT — see LICENSE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cdans-0.1.0.tar.gz (63.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

cdans-0.1.0-py3-none-any.whl (53.1 kB view details)

Uploaded Python 3

File details

Details for the file cdans-0.1.0.tar.gz.

File metadata

  • Download URL: cdans-0.1.0.tar.gz
  • Upload date:
  • Size: 63.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for cdans-0.1.0.tar.gz
Algorithm Hash digest
SHA256 d32291d92158b6c93a8da273ae64e56048f6d4d7ecedd162ae8116522b65480e
MD5 b4e44ccec8e39d07820323453359b6c7
BLAKE2b-256 d307a889ea26191d65b91f9854d126974bac0a8ebb21b1f8d008aba5110fccd9

See more details on using hashes here.

File details

Details for the file cdans-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: cdans-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 53.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for cdans-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 3405327284aba247dc345496b79ba78b919d41609e52e6aeaa91040f6389f9f8
MD5 12de8317728a0edd41470985e943e768
BLAKE2b-256 571df445fe7e0b2913ba1f537e804e9777a38217b09ec92fb24039a61bdf121c

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page