gCastle is the fundamental package for causal structure learning with Python.

gCastle

Chinese version

Version 1.0.3 released.

Version 1.0.3 was scheduled for release on 2022/08/08.

Introduction

gCastle is a causal structure learning toolchain developed by Huawei Noah's Ark Lab. The package contains various functionality related to causal learning and evaluation, including:

  • Data generation and processing: data simulation, data reading operators, and data pre-processing operators (such as prior injection and variable selection).
  • Causal structure learning: causal structure learning methods, including both classic and recently developed methods, especially gradient-based ones that can handle large problems.
  • Evaluation metrics: various commonly used metrics for causal structure learning, including F1, SHD, FDR, TPR, NNZ, etc.
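
To make the evaluation metrics above concrete, here is a minimal sketch that computes a few of them for two binary adjacency matrices. It is a simplified illustration, not gCastle's `MetricsDAG` implementation: the function name `dag_metrics` is hypothetical, the edge convention (entry `[i][j] == 1` means `i -> j`) is an assumption, and only fully directed graphs are handled.

```python
def dag_metrics(est, true):
    """Compare two binary adjacency matrices given as lists of lists.

    Assumption: entry [i][j] == 1 encodes a directed edge i -> j, and
    neither graph contains an edge in both directions.
    """
    d = len(true)
    pairs = [(i, j) for i in range(d) for j in range(d)]
    est_edges = {(i, j) for i, j in pairs if est[i][j]}
    true_edges = {(i, j) for i, j in pairs if true[i][j]}

    # Correctly oriented edges.
    tp = len(est_edges & true_edges)
    # Estimated edges whose opposite direction is the true one.
    reversed_count = sum((j, i) in true_edges for i, j in est_edges - true_edges)
    # Estimated edges between nodes that are not adjacent in the true graph.
    extra = len(est_edges - true_edges) - reversed_count
    # True edges between nodes that are not adjacent in the estimate.
    missing = sum(
        (i, j) not in est_edges and (j, i) not in est_edges for i, j in true_edges
    )

    nnz = len(est_edges)  # number of estimated edges
    precision = tp / nnz if nnz else 0.0
    recall = tp / len(true_edges) if true_edges else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return {
        "nnz": nnz,
        "tpr": recall,                               # true positive rate
        "fdr": (nnz - tp) / nnz if nnz else 0.0,     # false discovery rate
        "shd": extra + missing + reversed_count,     # structural Hamming distance
        "F1": f1,
    }


# True graph: 0 -> 1 -> 2 -> 3
true_dag = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
# Estimate: 0 -> 1 (correct), 2 -> 1 (reversed), 0 -> 3 (extra); 2 -> 3 is missing
est_dag = [[0, 1, 0, 1], [0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]]

metrics = dag_metrics(est_dag, true_dag)
print(metrics)
```

In this sketch a reversed edge counts once toward SHD and also as a false discovery; other tools may use different conventions, so treat the numbers as illustrative.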

Algorithm List

| Algorithm | Category | Description | Status |
|---|---|---|---|
| PC | IID/Constraint-based | A classic causal discovery algorithm based on conditional independence tests | v1.0.3 |
| ANM | IID/Function-based | Nonlinear causal discovery with additive noise models | v1.0.3 |
| DirectLiNGAM | IID/Function-based | A direct learning algorithm for linear non-Gaussian acyclic model (LiNGAM) | v1.0.3 |
| ICALiNGAM | IID/Function-based | An ICA-based learning algorithm for linear non-Gaussian acyclic model (LiNGAM) | v1.0.3 |
| GES | IID/Score-based | A classical Greedy Equivalence Search algorithm | v1.0.3 |
| PNL | IID/Function-based | Causal discovery based on the post-nonlinear causal assumption | v1.0.3 |
| NOTEARS | IID/Gradient-based | A gradient-based algorithm for linear data models (typically with least-squares loss) | v1.0.3 |
| NOTEARS-MLP | IID/Gradient-based | A gradient-based algorithm using neural network modeling for non-linear causal relationships | v1.0.3 |
| NOTEARS-SOB | IID/Gradient-based | A gradient-based algorithm using Sobolev space modeling for non-linear causal relationships | v1.0.3 |
| NOTEARS-LOW-RANK | IID/Gradient-based | Adapting NOTEARS for large problems with low-rank causal graphs | v1.0.3 |
| DAG-GNN | IID/Gradient-based | DAG structure learning with graph neural networks | v1.0.3 |
| GOLEM | IID/Gradient-based | A more efficient version of NOTEARS that can reduce the number of optimization iterations | v1.0.3 |
| GraNDAG | IID/Gradient-based | A gradient-based algorithm using neural network modeling for non-linear additive noise data | v1.0.3 |
| MCSL | IID/Gradient-based | A gradient-based algorithm for non-linear additive noise data by learning the binary adjacency matrix | v1.0.3 |
| GAE | IID/Gradient-based | A gradient-based algorithm using graph autoencoder to model non-linear causal relationships | v1.0.3 |
| RL | IID/Gradient-based | An RL-based algorithm that can work with flexible score functions (including non-smooth ones) | v1.0.3 |
| CORL | IID/Gradient-based | An RL- and order-based algorithm that improves the efficiency and scalability of the previous RL-based approach | v1.0.3 |
| TTPM | EventSequence/Function-based | A causal structure learning algorithm based on Topological Hawkes process for spatio-temporal event sequences | v1.0.3 |
| HPCI | EventSequence/Hybrid | A causal structure learning algorithm based on Hawkes process and CI tests for event sequences | Under development |
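
Several of the gradient-based entries build on NOTEARS, which replaces the combinatorial "graph must be acyclic" condition with a smooth function h(W) = tr(exp(W∘W)) − d that is zero exactly when the weighted adjacency matrix W describes a DAG. The following pure-Python sketch evaluates that function with a truncated power series. It is an illustration only, not gCastle's implementation; the names `matmul` and `notears_h` and the `terms` parameter are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def notears_h(W, terms=30):
    """Approximate h(W) = tr(exp(W * W)) - d, where * is element-wise.

    Uses tr(exp(A)) - d = sum_{k >= 1} tr(A^k) / k!, truncated after
    `terms` terms. tr(A^k) sums the weights of closed walks of length k,
    so every term is zero when the graph has no cycle.
    """
    d = len(W)
    A = [[w * w for w in row] for row in W]  # element-wise square
    power = [row[:] for row in A]            # holds A^k, starting at k = 1
    h = 0.0
    for k in range(1, terms + 1):
        h += sum(power[i][i] for i in range(d)) / math.factorial(k)
        power = matmul(power, A)
    return h


# A chain 0 -> 1 -> 2 is acyclic, so h is zero.
chain = [[0.0, 1.5, 0.0], [0.0, 0.0, -0.7], [0.0, 0.0, 0.0]]
# A two-node cycle 0 -> 1 -> 0 gives a strictly positive value.
cycle = [[0.0, 1.0], [1.0, 0.0]]

print(notears_h(chain), notears_h(cycle))
```

Because h is differentiable in W, a continuous optimizer can push it toward zero instead of searching over discrete graph structures, which is what makes this family of methods practical on larger problems.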

Installation

Dependencies

gCastle requires:

  • python (>= 3.6, <= 3.9)
  • tqdm (>= 4.48.2)
  • numpy (>= 1.19.1)
  • pandas (>= 0.22.0)
  • scipy (>= 1.7.3)
  • scikit-learn (>= 0.21.1)
  • matplotlib (>= 2.1.2)
  • networkx (>= 2.5)
  • torch (>= 1.9.0)
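
Since the supported Python range is bounded on both sides, it can be worth checking the interpreter before installing. A minimal sketch, assuming only the version range listed above; the helper name `python_supported` is hypothetical and not part of gCastle:

```python
import sys

# Bounds taken from the dependency list above.
SUPPORTED_MIN = (3, 6)
SUPPORTED_MAX = (3, 9)


def python_supported(version=None):
    """Return True if the (major, minor) version lies in the listed range.

    `version` may be any tuple-like value such as sys.version_info or
    (3, 8); it defaults to the running interpreter.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


if python_supported():
    print("This interpreter is within the range listed for gCastle.")
else:
    print("This interpreter is outside the listed range (3.6 to 3.9).")
```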

PIP installation

pip install gcastle==1.0.3

Usage Example (PC algorithm)

from castle.common import GraphDAG
from castle.metrics import MetricsDAG
from castle.datasets import IIDSimulation, DAG
from castle.algorithms import PC

# data simulation, simulate true causal dag and train_data.
weighted_random_dag = DAG.erdos_renyi(n_nodes=10, n_edges=10, 
                                      weight_range=(0.5, 2.0), seed=1)
dataset = IIDSimulation(W=weighted_random_dag, n=2000, method='linear', 
                        sem_type='gauss')
true_causal_matrix, X = dataset.B, dataset.X

# structure learning
pc = PC()
pc.learn(X)

# plot predict_dag and true_dag
GraphDAG(pc.causal_matrix, true_causal_matrix, 'result')

# calculate metrics
mt = MetricsDAG(pc.causal_matrix, true_causal_matrix)
print(mt.metrics)

For more usage examples, see the project's examples.
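
The usage example above simulates data with `method='linear'` and `sem_type='gauss'`. Conceptually, a linear Gaussian structural equation model generates each variable as a weighted sum of its parents plus Gaussian noise. The plain-Python sketch below illustrates that idea. It is not the `IIDSimulation` implementation: the helper names are hypothetical, and reading `W[i][j]` as the weight of edge `i -> j` is an assumption.

```python
import random


def topological_order(W):
    """Kahn's algorithm; W[i][j] != 0 is read as an edge i -> j."""
    d = len(W)
    indegree = [sum(1 for i in range(d) if W[i][j] != 0) for j in range(d)]
    ready = [j for j in range(d) if indegree[j] == 0]
    order = []
    while ready:
        i = ready.pop()
        order.append(i)
        for j in range(d):
            if W[i][j] != 0:
                indegree[j] -= 1
                if indegree[j] == 0:
                    ready.append(j)
    if len(order) != d:
        raise ValueError("W contains a cycle, so it is not a DAG")
    return order


def simulate_linear_gauss(W, n, noise_scale=1.0, seed=0):
    """Draw n samples where x_j = sum_i W[i][j] * x_i + Gaussian noise."""
    rng = random.Random(seed)
    d = len(W)
    order = topological_order(W)  # parents are filled in before children
    samples = []
    for _ in range(n):
        x = [0.0] * d
        for j in order:
            parent_part = sum(W[i][j] * x[i] for i in range(d))
            x[j] = parent_part + rng.gauss(0.0, noise_scale)
        samples.append(x)
    return samples


# Chain 0 -> 1 -> 2 with weights 2.0 and -1.5
W = [[0.0, 2.0, 0.0], [0.0, 0.0, -1.5], [0.0, 0.0, 0.0]]
X = simulate_linear_gauss(W, n=2000)
print(len(X), len(X[0]))
```

Sampling in topological order guarantees that every parent value exists before it is used, which is why the weighted graph must be acyclic.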

Citation

If you find gCastle useful in your research, please consider citing the following paper:

@misc{zhang2021gcastle,
  title={gCastle: A Python Toolbox for Causal Discovery}, 
  author={Keli Zhang and Shengyu Zhu and Marcus Kalander and Ignavier Ng and Junjian Ye and Zhitang Chen and Lujia Pan},
  year={2021},
  eprint={2111.15155},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}

Next Up & Contributing

This is the first released version of gCastle, and we will continue to extend and optimize the code and documentation. We welcome new contributors of all experience levels; guidelines on how to contribute code will be published soon. If you have any questions or suggestions (such as contributing new algorithms, optimizing code, or improving documentation), please submit an issue. We will reply as soon as possible.
