Causal network discovery using optimal causation entropy
CausationEntropy
A Python library for discovering causal networks from time series data using Optimal Causation Entropy (oCSE).
Overview
CausationEntropy implements state-of-the-art information-theoretic methods for causal discovery from multivariate time series. The library provides robust algorithms that can identify causal relationships while controlling for confounding variables and false discoveries.
What it does
Given time series data, CausationEntropy finds which variables cause changes in other variables by:
- Predictive Testing: Testing if knowing variable X at time t helps predict variable Y at time t+1
- Information Theory: Using conditional mutual information to measure predictive relationships
- Statistical Control: Rigorous statistical testing to avoid false discoveries
- Multiple Methods: Supporting various information estimators and discovery algorithms
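To make the predictive-testing idea concrete, the sketch below shows one way to line up past values with the values they are meant to predict. It uses only NumPy and is not taken from the library; the function name `lagged_design` and its return layout are assumptions made for illustration.

```python
import numpy as np

def lagged_design(data: np.ndarray, max_lag: int):
    """Align lagged predictors with the rows they are meant to predict.

    data has shape (T, n): time as rows, variables as columns.
    Row i of `targets` is data[t] with t = max_lag + i, and row i of
    `predictors` stacks data[t-1], data[t-2], ..., data[t-max_lag].
    """
    T, n = data.shape
    if not 1 <= max_lag < T:
        raise ValueError("max_lag must satisfy 1 <= max_lag < T")
    targets = data[max_lag:]
    # One (T - max_lag, n) block per lag, most recent lag first
    blocks = [data[max_lag - lag : T - lag] for lag in range(1, max_lag + 1)]
    predictors = np.hstack(blocks)  # shape (T - max_lag, n * max_lag)
    return predictors, targets

# Toy series: 6 time steps, 2 variables
data = np.arange(12).reshape(6, 2)
predictors, targets = lagged_design(data, max_lag=2)
print(predictors.shape, targets.shape)
```

Each predictor column can then be tested for how much information it carries about a target column, which is the question the discovery step asks.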
Installation
From PyPI (recommended)
pip install causationentropy
Development Installation
git clone https://github.com/Center-For-Complex-Systems-Science/causationentropy.git
cd causationentropy
pip install -e .
Run the tests
python -m pytest causationentropy/tests/ --cov=causationentropy --cov-report=xml --cov-report=term-missing -v
Quick Start
See our Quick Start Colab notebook.
Basic Usage
Get the relationships as a data frame:
import pandas as pd
from causationentropy import discover_network
from causationentropy.graph import network_to_dataframe
# Load your time series data (variables as columns, time as rows)
data = pd.read_csv('data.csv')
# Discover causal network
network = discover_network(data, method='standard', max_lag=5)
df = network_to_dataframe(network)
df.head()
Plot the causal network:
import pandas as pd
from causationentropy import discover_network
from causationentropy.core.plotting import plot_causal_network
# Load your time series data (variables as columns, time as rows)
data = pd.read_csv('data.csv')
# Discover causal network
network = discover_network(data, method='standard', max_lag=5)
# Plot the network and save the figure to disk
fig, ax = plot_causal_network(network, save_path="network.png")
Note: This implementation runs in O(n^2 T log T) time, where n is the number of variables and T is the length of the time series. Without optimizations the algorithm is computationally intensive, so expect long run times on large data sets. Optimizations that leverage singular value decomposition and KD-trees are planned for a later release; they are not part of the original algorithm. Increasing the number of lags also increases run time.
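As a rough guide to what that complexity means in practice, the snippet below evaluates the leading-order term n^2 T log T for a few problem sizes. It is plain arithmetic with all constant factors dropped, so it only indicates relative growth, not actual run times; `relative_cost` is a name made up for this illustration.

```python
import math

def relative_cost(n_variables: int, n_samples: int) -> float:
    """Leading-order cost n^2 * T * log(T), constants dropped."""
    return n_variables**2 * n_samples * math.log(n_samples)

base = relative_cost(5, 1000)
# Doubling the number of variables quadruples the estimate
print(relative_cost(10, 1000) / base)
# Doubling the series length slightly more than doubles it
print(relative_cost(5, 2000) / base)
```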
Advanced Configuration
from causationentropy import discover_network
# Configure discovery parameters
network = discover_network(
    data,
    method='standard',        # 'standard', 'alternative', 'information_lasso', or 'lasso'
    information='gaussian',   # 'gaussian', 'knn', 'kde', 'geometric_knn', or 'poisson'
    max_lag=5,                # Maximum time lag to consider
    alpha_forward=0.05,       # Forward selection significance
    alpha_backward=0.05,      # Backward elimination significance
    n_shuffles=200            # Permutation test iterations
)
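The significance levels and `n_shuffles` describe a permutation test. The sketch below shows the general idea with NumPy only: shuffle one variable to break any dependence, recompute the statistic many times, and compare the observed value to the upper quantile of the shuffled values. It is a generic illustration, not the library's implementation; the function names and the use of Gaussian mutual information as the statistic are assumptions.

```python
import numpy as np

def gaussian_mi(x, y):
    """Mutual information of two scalar series under a Gaussian assumption."""
    r = np.corrcoef(x, y)[0, 1]
    return -0.5 * np.log(1.0 - r**2)

def permutation_test(x, y, n_shuffles=200, alpha=0.05, seed=0):
    """Return (significant, observed, threshold) for the dependence of y on x."""
    rng = np.random.default_rng(seed)
    observed = gaussian_mi(x, y)
    # Shuffling x destroys its pairing with y, giving samples from the null
    null = np.array([gaussian_mi(rng.permutation(x), y) for _ in range(n_shuffles)])
    threshold = np.quantile(null, 1.0 - alpha)
    return bool(observed > threshold), float(observed), float(threshold)

rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = x + 0.5 * rng.normal(size=500)  # y depends strongly on x
print(permutation_test(x, y))
```

More shuffles give a more stable threshold at proportionally higher cost, and a smaller alpha makes the test stricter.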
Synthetic Data Example
from causationentropy.datasets import synthetic
from causationentropy import discover_network
# Generate synthetic causal time series
data, true_network = synthetic.linear_stochastic_gaussian_process(
    n_variables=5,
    n_samples=1000,
    sparsity=0.3
)
# Discover network
discovered = discover_network(data)
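With a known ground truth, a natural next step is to score the discovered edges against the true ones. The sketch below assumes both networks have already been converted to square adjacency matrices in which entry (i, j) is nonzero when variable i drives variable j. That conversion depends on the object types the library returns and is not shown here, and `edge_scores` is a hypothetical helper, not part of the package.

```python
import numpy as np

def edge_scores(true_adj, found_adj):
    """Precision and recall of discovered edges against ground-truth edges."""
    true_adj = np.asarray(true_adj, dtype=bool)
    found_adj = np.asarray(found_adj, dtype=bool)
    tp = np.sum(true_adj & found_adj)    # edges correctly discovered
    fp = np.sum(~true_adj & found_adj)   # discovered edges that do not exist
    fn = np.sum(true_adj & ~found_adj)   # true edges that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return float(precision), float(recall)

# Hand-written 3-variable example (not library output)
true_adj = [[0, 1, 0],
            [0, 0, 1],
            [0, 0, 0]]
found_adj = [[0, 1, 1],
             [0, 0, 0],
             [0, 0, 0]]
print(edge_scores(true_adj, found_adj))
```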
Key Features
- Multiple Algorithms: Standard, alternative, information lasso, and lasso variants of oCSE
- Flexible Information Estimators: Gaussian, k-NN, KDE, geometric k-NN, and Poisson methods
- Statistical Rigor: Permutation-based significance testing with comprehensive test coverage
- Synthetic Data: Built-in generators for testing and validation
- Visualization: Network plotting and analysis tools
Mathematical Foundation
The algorithm uses conditional mutual information to quantify causal relationships:
$$I(X; Y | Z) = H(X | Z) + H(Y | Z) - H(X, Y | Z)$$
This measures how much variable X tells us about variable Y, beyond what we already know from conditioning set Z.
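For jointly Gaussian variables the entropies in this formula reduce to log-determinants of covariance matrices, so the conditional mutual information can be estimated directly from sample covariances. The sketch below is a minimal NumPy illustration of that identity; it is not the library's Gaussian estimator, and the function names are assumptions.

```python
import numpy as np

def _as_2d(a):
    a = np.asarray(a, dtype=float)
    return a.reshape(-1, 1) if a.ndim == 1 else a

def _logdet_cov(*blocks):
    """Log-determinant of the joint sample covariance of the given blocks."""
    joint = np.hstack([_as_2d(b) for b in blocks])
    cov = np.atleast_2d(np.cov(joint, rowvar=False))
    return np.linalg.slogdet(cov)[1]

def gaussian_cmi(x, y, z=None):
    """I(X; Y | Z) in nats under a Gaussian assumption; z=None gives I(X; Y)."""
    if z is None:
        return 0.5 * (_logdet_cov(x) + _logdet_cov(y) - _logdet_cov(x, y))
    # H(X|Z) + H(Y|Z) - H(X,Y|Z) = H(X,Z) + H(Y,Z) - H(Z) - H(X,Y,Z);
    # the (2*pi*e) constants in the Gaussian entropies cancel.
    return 0.5 * (_logdet_cov(x, z) + _logdet_cov(y, z)
                  - _logdet_cov(z) - _logdet_cov(x, y, z))

# Common driver: Z influences both X and Y, which are otherwise unrelated
rng = np.random.default_rng(0)
z = rng.normal(size=5000)
x = z + 0.5 * rng.normal(size=5000)
y = z + 0.5 * rng.normal(size=5000)
print(gaussian_cmi(x, y))      # clearly positive: X and Y look related
print(gaussian_cmi(x, y, z))   # near zero once Z is conditioned on
```

The example shows why conditioning matters: a shared driver makes two variables look dependent until the driver is included in the conditioning set.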
Causal Discovery Rule: Variable X causes Y if knowing X(t) significantly improves prediction of Y(t+1), even when controlling for all other relevant variables.
The algorithm implements a two-phase approach:
- Forward Selection: Iteratively adds predictors that maximize conditional mutual information
- Backward Elimination: Removes predictors that lose significance when conditioned on others
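The two phases can be sketched in a few lines. The version below is a simplified illustration, not the library's algorithm: it estimates conditional mutual information from partial correlations (a Gaussian assumption) and replaces the permutation-based significance test with a fixed threshold. All names and the threshold value are assumptions.

```python
import numpy as np

def residual(v, Z):
    """Part of v not linearly explained by the columns of Z (plus an intercept)."""
    ones = np.ones((len(v), 1))
    A = np.column_stack([ones, Z]) if Z.shape[1] else ones
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ coef

def cmi(x, y, Z):
    """Gaussian I(x; y | Z) for scalar x and y via their partial correlation."""
    r = np.corrcoef(residual(x, Z), residual(y, Z))[0, 1]
    return -0.5 * np.log(1.0 - min(r**2, 1.0 - 1e-12))

def select_parents(candidates, target, threshold=0.01):
    """candidates: (T, p) lagged predictors; target: (T,) values to predict."""
    chosen = []
    # Forward selection: add the candidate with the largest CMI given those chosen so far
    while len(chosen) < candidates.shape[1]:
        rest = [j for j in range(candidates.shape[1]) if j not in chosen]
        scores = {j: cmi(candidates[:, j], target, candidates[:, chosen]) for j in rest}
        best = max(scores, key=scores.get)
        if scores[best] <= threshold:
            break
        chosen.append(best)
    # Backward elimination: drop predictors that add nothing given the others
    for j in list(chosen):
        others = [k for k in chosen if k != j]
        if cmi(candidates[:, j], target, candidates[:, others]) <= threshold:
            chosen.remove(j)
    return sorted(chosen)

rng = np.random.default_rng(0)
candidates = rng.normal(size=(2000, 4))
target = 0.9 * candidates[:, 0] - 0.7 * candidates[:, 2] + 0.5 * rng.normal(size=2000)
print(select_parents(candidates, target))
```

In this toy setup only columns 0 and 2 influence the target, so those are the predictors the procedure should retain.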
Documentation
📚 Read the full documentation on ReadTheDocs
- API Reference: Complete function and class documentation
- User Guide: Detailed tutorials and examples
- Theory: Mathematical background and algorithms
- Examples: Check the notebooks/ directory
- Research Papers: See the theory glossary in the documentation
Local Documentation
Build documentation locally:
cd docs/
make html
# Open docs/_build/html/index.html
Contributing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
Citation
If you use this library in your research, please cite:
@misc{slote2025causationentropy,
author = {Slote, Kevin and Fish, Jeremie and Bollt, Erik},
title = {CausationEntropy: A Python Library for Causal Discovery},
url = {https://github.com/Center-For-Complex-Systems-Science/causationentropy},
doi = {10.5281/zenodo.17047565}
}
License
This project is licensed under the MIT License - see the LICENSE file for details.
Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: kslote1@gmail.com
Acknowledgments
This work builds upon fundamental research in information theory, causal inference, and time series analysis. Special thanks to the open-source scientific Python community.
LLM Disclosure
Generative AI was used to help with doc strings, documentation, and unit tests.