A library for estimating causal effects.
CausalEstimate
CausalEstimate is a Python library for causal inference that provides a suite of methods to estimate treatment effects from observational data. It includes doubly robust techniques such as Augmented IPW (AIPW) and Targeted Maximum Likelihood Estimation (TMLE), alongside propensity score-based methods like inverse probability weighting (IPW) and matching. The library works directly on pandas DataFrames, supports bootstrap-based standard error estimation, and can run multiple estimators in one pass.
Features
- Causal inference methods: IPW, AIPW, TMLE, Matching, etc.
- Supports multiple effect types: ATE, ATT, Risk Ratio, etc.
- Bootstrap standard error estimation and confidence intervals
- Common-support filtering and matching (greedy, optimal)
- Plotting utilities for distribution checks (e.g., propensity score overlap)
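For reference, the effect types listed above are commonly defined as follows in potential-outcomes notation. The symbols are assumptions made here for illustration and are not taken from the library's documentation: Y(a) is the outcome a unit would have under treatment a, and A is the treatment actually received.

```latex
\begin{align*}
\text{ATE} &= \mathbb{E}[\,Y(1) - Y(0)\,] \\
\text{ATT} &= \mathbb{E}[\,Y(1) - Y(0) \mid A = 1\,] \\
\text{Risk Ratio} &= \frac{\mathbb{E}[\,Y(1)\,]}{\mathbb{E}[\,Y(0)\,]}
\end{align*}
```

The ATE averages over the whole population, the ATT only over treated units, and the risk ratio compares the two mean potential outcomes multiplicatively rather than additively.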
Installation
pip install CausalEstimate
Or for local development:
git clone https://github.com/kirilklein/CausalEstimate.git
cd CausalEstimate
pip install -e .
Usage
1) Single Estimator Usage
You can import any estimator class (e.g., IPW, AIPW, TMLE) and call compute_effect(df) directly. The names of the treatment, outcome, and propensity score columns are passed to the estimator's constructor.
import numpy as np
import pandas as pd
from CausalEstimate.estimators import IPW
# Simulate data
np.random.seed(42)
n = 1000
ps = np.random.uniform(0, 1, n)  # true propensity for treatment
treatment = np.random.binomial(1, ps)  # actual treatment assignment
outcome = 2 + 0.5 * treatment + np.random.normal(0, 1, n)
df = pd.DataFrame({
    "ps": ps,
    "treatment": treatment,
    "outcome": outcome,
})
# Create an IPW Estimator for ATE
ipw_estimator = IPW(
    effect_type="ATE",
    treatment_col="treatment",
    outcome_col="outcome",
    ps_col="ps",
    # optionally stabilized=True if you want stabilized IP weights
)
results = ipw_estimator.compute_effect(df)
print("IPW estimated effect:", results)
Here, results is a single floating-point effect estimate computed on the full sample (no bootstrap). For bootstrap standard errors, see MultiEstimator below.
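To make the estimate concrete, here is a minimal NumPy sketch of the textbook IPW estimator of the ATE. It is not the library's implementation: the function name `ipw_ate` and its `normalize` flag are hypothetical, and `normalize=True` (the Hajek form, with weights rescaled within each arm) is not necessarily what the library's `stabilized` option does.

```python
import numpy as np

def ipw_ate(treatment, outcome, ps, normalize=False):
    """Textbook IPW estimate of the ATE (illustrative, not CausalEstimate's code).

    normalize=False: Horvitz-Thompson form, weighted outcomes averaged over n.
    normalize=True:  Hajek form, weights rescaled to sum to one within each arm.
    """
    a = np.asarray(treatment, dtype=float)
    y = np.asarray(outcome, dtype=float)
    p = np.asarray(ps, dtype=float)
    w1 = a / p              # inverse-probability weights for treated units
    w0 = (1 - a) / (1 - p)  # inverse-probability weights for control units
    if normalize:
        return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)
    return np.mean(w1 * y) - np.mean(w0 * y)

# Simulated data with a true effect of 0.5; propensities are kept away
# from 0 and 1 so that no single weight dominates the estimate.
rng = np.random.default_rng(42)
n = 20_000
ps = rng.uniform(0.1, 0.9, n)
treatment = rng.binomial(1, ps)
outcome = 2 + 0.5 * treatment + rng.normal(0, 1, n)

ate_ht = ipw_ate(treatment, outcome, ps)
ate_hajek = ipw_ate(treatment, outcome, ps, normalize=True)
print("Horvitz-Thompson:", ate_ht)
print("Hajek (normalized):", ate_hajek)
```

Both forms should land near the simulated effect of 0.5; the normalized form is usually less variable when some propensity scores are close to 0 or 1.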
2) Multi Estimator Usage
If you want to run multiple estimators (e.g., IPW, TMLE, AIPW) on the same dataset in one pass, optionally with bootstrapping or common-support filtering, you can use MultiEstimator.
from CausalEstimate.estimators import IPW, AIPW, TMLE, MultiEstimator
# Note: besides "treatment", "outcome", and "ps", df must also contain the
# outcome-model prediction columns referenced below.
ipw = IPW(effect_type="ATE", treatment_col="treatment", outcome_col="outcome", ps_col="ps")
aipw = AIPW(
    effect_type="ATE", treatment_col="treatment", outcome_col="outcome", ps_col="ps",
    probas_t1_col="predicted_outcome_treated", probas_t0_col="predicted_outcome_control",
)
tmle = TMLE(
    effect_type="ATE", treatment_col="treatment", outcome_col="outcome", ps_col="ps",
    probas_col="predicted_outcome", probas_t1_col="predicted_outcome_treated",
    probas_t0_col="predicted_outcome_control",
)
multi_estimator = MultiEstimator([ipw, aipw, tmle])
# Apply bootstrap, common support, etc.
results = multi_estimator.compute_effects(
    df,
    bootstrap=True,
    n_bootstraps=50,
    apply_common_support=True,
    common_support_threshold=0.05,
)
print(results)
results will be a dictionary like:
{
    "IPW": {"effect": ..., "std_err": ..., "bootstrap": True, ...},
    "AIPW": {"effect": ..., "std_err": ..., ...},
    "TMLE": {...},
}
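The sketch below illustrates, under stated assumptions, what a bootstrap standard error combined with common-support filtering can look like. It does not reproduce MultiEstimator's internals. The filtering rule (keep rows with threshold <= ps <= 1 - threshold), the percentile confidence interval, the helper names, and the "ci" and "n_used" keys are all assumptions made for illustration; only "effect" and "std_err" mirror the keys shown above.

```python
import numpy as np

def ipw_ate(a, y, p):
    """Normalized (Hajek) IPW estimate of the ATE."""
    w1, w0 = a / p, (1 - a) / (1 - p)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

def bootstrap_ate(a, y, p, n_bootstraps=200, threshold=0.05, seed=0):
    """Point estimate plus bootstrap standard error on the common-support sample."""
    rng = np.random.default_rng(seed)
    # Assumed common-support rule: drop rows with extreme propensity scores.
    keep = (p >= threshold) & (p <= 1 - threshold)
    a, y, p = a[keep], y[keep], p[keep]
    n = len(y)
    point = ipw_ate(a, y, p)
    estimates = np.empty(n_bootstraps)
    for b in range(n_bootstraps):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        estimates[b] = ipw_ate(a[idx], y[idx], p[idx])
    lower, upper = np.percentile(estimates, [2.5, 97.5])
    return {
        "effect": point,
        "std_err": estimates.std(ddof=1),  # spread of the bootstrap estimates
        "ci": (lower, upper),              # hypothetical key: percentile interval
        "n_used": n,                       # hypothetical key: rows after filtering
    }

# Simulated data with a true effect of 0.5
rng = np.random.default_rng(1)
n = 2000
p = rng.uniform(0, 1, n)
a = rng.binomial(1, p)
y = 2 + 0.5 * a + rng.normal(0, 1, n)

result = bootstrap_ate(a, y, p)
print(result)
```

Filtering before resampling means every bootstrap replicate is drawn from the same trimmed population, so the standard error describes the estimate on the common-support sample rather than on the full dataset.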
3) Matching
The library supports both optimal and greedy (a.k.a. eager) matching. For example:
import pandas as pd
import numpy as np
from CausalEstimate.matching import match_optimal, match_eager
df = pd.DataFrame({
    "PID": [101, 102, 103, 202, 203, 204],
    "treatment": [1, 1, 1, 0, 0, 0],
    "ps": [0.30, 0.35, 0.90, 0.31, 0.34, 0.85],
})
# Optimal matching (with caliper=0.05, 1 control per treated)
matched_optimal = match_optimal(
    df, n_controls=1, caliper=0.05,
    treatment_col="treatment", ps_col="ps", pid_col="PID",
)
print("Optimal Matching Results:")
print(matched_optimal)
# Eager (greedy) matching
matched_eager = match_eager(
    df, caliper=0.05, treatment_col="treatment", ps_col="ps", pid_col="PID",
)
print("Eager Matching Results:")
print(matched_eager)
Both functions return a DataFrame of matched pairs (or sets), typically with columns like [treated_pid, control_pid, distance].
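As a rough illustration of the greedy idea, the following pure-Python sketch performs 1:1 nearest-neighbour matching without replacement on the propensity score. The function `greedy_match` is hypothetical and is not the library's match_eager. It reuses the PIDs and scores from the example above, with a caliper of 0.1 so that the last pair (distance about 0.05) is not decided by floating-point rounding at the caliper boundary.

```python
def greedy_match(treated, controls, caliper):
    """1:1 greedy nearest-neighbour matching without replacement.

    treated, controls: lists of (pid, ps) tuples.
    Returns a list of (treated_pid, control_pid, distance) tuples.
    """
    available = dict(controls)  # pid -> ps for controls not yet matched
    pairs = []
    for t_pid, t_ps in treated:
        if not available:
            break
        # Pick the closest remaining control for this treated unit.
        c_pid = min(available, key=lambda pid: abs(available[pid] - t_ps))
        distance = abs(available[c_pid] - t_ps)
        if distance <= caliper:
            pairs.append((t_pid, c_pid, distance))
            del available[c_pid]  # each control is used at most once
    return pairs

treated = [(101, 0.30), (102, 0.35), (103, 0.90)]
controls = [(202, 0.31), (203, 0.34), (204, 0.85)]

pairs = greedy_match(treated, controls, caliper=0.1)
for t_pid, c_pid, distance in pairs:
    print(t_pid, c_pid, round(distance, 2))
```

Greedy matching commits to each pair in the order the treated units are processed, so its result can depend on that order. Optimal matching instead chooses the assignment that minimizes the total distance over all pairs.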
4) Plotting
CausalEstimate provides basic plotting utilities to visualize the distributions of propensity scores or predicted outcome probabilities in the treated and control groups.
Example: Propensity Score Distribution
import matplotlib.pyplot as plt
from CausalEstimate.vis.plotting import plot_propensity_score_dist, plot_outcome_proba_dist
# Suppose df has columns "ps", "treatment", and "predicted_outcome"
fig, ax = plot_propensity_score_dist(df, ps_col="ps", treatment_col="treatment")
plt.show()
fig, ax = plot_outcome_proba_dist(df, outcome_proba_col="predicted_outcome", treatment_col="treatment")
plt.show()
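The information an overlap plot conveys can also be inspected numerically. The sketch below counts propensity scores per treatment arm over shared bins, using only NumPy. The helper `overlap_table` is hypothetical and not part of CausalEstimate.

```python
import numpy as np

def overlap_table(ps, treatment, bins=10):
    """Histogram counts of propensity scores per arm over shared bins on [0, 1]."""
    ps = np.asarray(ps, dtype=float)
    treatment = np.asarray(treatment)
    edges = np.linspace(0.0, 1.0, bins + 1)
    treated_counts, _ = np.histogram(ps[treatment == 1], bins=edges)
    control_counts, _ = np.histogram(ps[treatment == 0], bins=edges)
    return edges, treated_counts, control_counts

# Simulated propensity scores and treatment assignment
rng = np.random.default_rng(0)
ps = rng.uniform(0, 1, 1000)
treatment = rng.binomial(1, ps)

edges, treated_counts, control_counts = overlap_table(ps, treatment)
for lo, hi, t, c in zip(edges[:-1], edges[1:], treated_counts, control_counts):
    print(f"[{lo:.1f}, {hi:.1f})  treated={int(t):4d}  control={int(c):4d}")
```

Bins where one arm has very few units indicate regions of poor overlap, which is where common-support filtering or a caliper would remove observations.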
Development
See CONTRIBUTING.md for details on setting up a dev environment, running tests, and contributing to this project.
License
CausalEstimate is licensed under the MIT License. See LICENSE for more details.
Contact
- GitHub: kirilklein
- Email: kikl@di.ku.dk
Please open issues or pull requests if you find any bugs or want to propose enhancements.
Citation
If you use CausalEstimate in your research, please cite it using the following BibTeX entry:
@software{causalestimate,
  author = {Kiril Klein, ...},
  title = {CausalEstimate: A Python Library for Causal Inference},
  year = {2024},
  url = {https://github.com/kirilklein/CausalEstimate},
  version = {X.Y.Z},
  note = {GitHub repository}
}