The Causal Inference & Econometrics Toolkit for Python

StatsPAI: The Causal Inference & Econometrics Toolkit for Python

StatsPAI is a Python package for causal inference and applied econometrics. It provides a unified, Stata-style API covering the complete empirical research workflow — from estimation to publication-ready tables in Word, Excel, and LaTeX.

It brings the core functionality of the R packages listed in CRAN's Causal Inference Task View (fixest, did, rdrobust, gsynth, DoubleML, MatchIt, CausalImpact) into a single, consistent Python package.

Built by the team behind CoPaper.AI · Stanford REAP Program

Main Features

Regression Models:

Ordinary Least Squares with robust / clustered / HAC standard errors
Instrumental Variables / Two-Stage Least Squares (2SLS), with first-stage F, Sargan, and Hausman tests
Panel data: Fixed Effects, Random Effects, Between, First Differences (via linearmodels)
High-dimensional Fixed Effects (via pyfixest)
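
To make the robust standard error option concrete, here is a minimal pure-Python sketch of a one-regressor OLS fit with an HC1 standard error for the slope. It is an illustration of the textbook formula only; the function name `ols_hc1` and the toy data are assumptions, and this is not how StatsPAI computes it internally.

```python
import math

def ols_hc1(x, y):
    """Fit y = a + b*x by OLS and return (intercept, slope, HC1 SE of slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # HC0 sandwich variance for the slope, then the HC1 factor n / (n - k) with k = 2.
    meat = sum(((xi - x_bar) ** 2) * (e ** 2) for xi, e in zip(x, resid))
    var_hc1 = (n / (n - 2)) * meat / sxx ** 2
    return intercept, slope, math.sqrt(var_hc1)

# Toy data (hypothetical).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope, se = ols_hc1(x, y)
print(f"slope = {slope:.3f}, HC1 SE = {se:.3f}")
```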

Causal Inference — Difference-in-Differences:

Classic 2x2 DID estimator
Staggered DID with heterogeneous treatment effects (Callaway & Sant'Anna 2021)
Event study plots and pre-trend tests
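
The classic 2x2 estimator is the change in the treated group's mean outcome minus the change in the control group's mean outcome. The sketch below computes it from raw rows; the function name `did_2x2_means` and the data are hypothetical and only illustrate the arithmetic.

```python
from statistics import mean

def did_2x2_means(rows):
    """rows: list of (outcome, treated, post) tuples with treated/post in {0, 1}."""
    def cell(t, p):
        return mean(y for y, tr, po in rows if tr == t and po == p)
    # (treated post - treated pre) - (control post - control pre)
    return (cell(1, 1) - cell(1, 0)) - (cell(0, 1) - cell(0, 0))

# Toy data (hypothetical): control rises by 1, treated rises by 6.
rows = [
    (10.0, 0, 0), (12.0, 0, 0),
    (11.0, 0, 1), (13.0, 0, 1),
    (20.0, 1, 0), (22.0, 1, 0),
    (26.0, 1, 1), (28.0, 1, 1),
]
att = did_2x2_means(rows)
print(att)
```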

Causal Inference — Regression Discontinuity:

Sharp and Fuzzy RD with local polynomial estimation
MSE-optimal bandwidth selection (Calonico, Cattaneo & Titiunik 2014)
Robust bias-corrected confidence intervals
RD plots with binned scatter and polynomial fit
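
As a rough illustration of local polynomial estimation in a sharp design, the sketch below fits a triangular-kernel weighted line on each side of the cutoff and takes the difference of the two intercepts. The bandwidth is fixed by hand here rather than chosen MSE-optimally, there is no bias correction, and the names `weighted_intercept` and `sharp_rd` are assumptions for this example.

```python
def weighted_intercept(xs, ys, ws):
    """Intercept of the weighted least-squares line y = a + b*x."""
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    return y_bar - (sxy / sxx) * x_bar

def sharp_rd(x, y, cutoff=0.0, bandwidth=1.0):
    """Local linear sharp RD estimate with a triangular kernel."""
    left, right = ([], [], []), ([], [], [])
    for xi, yi in zip(x, y):
        w = 1.0 - abs((xi - cutoff) / bandwidth)
        if w <= 0:
            continue  # outside the bandwidth
        side = right if xi >= cutoff else left
        side[0].append(xi - cutoff)  # center the running variable at the cutoff
        side[1].append(yi)
        side[2].append(w)
    return weighted_intercept(*right) - weighted_intercept(*left)

# Noise-free toy data (hypothetical) with a true jump of 3 at x = 0.
x = [i / 10 for i in range(-9, 10)]
y = [1 + 2 * xi + (3 if xi >= 0 else 0) for xi in x]
tau = sharp_rd(x, y, cutoff=0.0, bandwidth=1.0)
print(round(tau, 6))
```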

Causal Inference — Matching:

Propensity Score Matching (logit-based PSM)
Mahalanobis distance matching
Coarsened Exact Matching (CEM)
Balance diagnostics with standardized mean differences
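
The standardized mean difference used in balance diagnostics divides the difference in group means by a pooled standard deviation. A minimal sketch under the common definition (square root of the average of the two group variances); the function name and ages are hypothetical.

```python
import math
from statistics import mean, variance

def standardized_mean_difference(treated, control):
    """(mean_t - mean_c) / sqrt((var_t + var_c) / 2), using sample variances."""
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd

# Toy covariate values (hypothetical ages).
treated_age = [30, 40, 50]
control_age = [20, 30, 40]
smd = standardized_mean_difference(treated_age, control_age)
print(smd)
```

A frequently cited rule of thumb treats an absolute value below 0.1 as acceptable balance, although the appropriate threshold depends on the application.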

Causal Inference — Synthetic Control:

Abadie-Diamond-Hainmueller SCM
Penalized / ridge SCM for many donors
Placebo (permutation) inference with MSPE ratios
Donor weight tables and gap plots
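
Placebo inference with MSPE ratios compares the treated unit's post/pre mean squared prediction error ratio against the same ratio for each placebo unit. The sketch below assumes the gaps (actual minus synthetic outcome) are already computed; all names and numbers are hypothetical.

```python
def mspe(gaps):
    """Mean squared prediction error of a list of gaps."""
    return sum(g ** 2 for g in gaps) / len(gaps)

def mspe_ratio(gaps, n_pre):
    """Post-treatment MSPE divided by pre-treatment MSPE."""
    return mspe(gaps[n_pre:]) / mspe(gaps[:n_pre])

def placebo_p_value(treated_ratio, placebo_ratios):
    """Share of units (treated included) whose ratio is at least the treated one."""
    all_ratios = [treated_ratio] + list(placebo_ratios)
    return sum(r >= treated_ratio for r in all_ratios) / len(all_ratios)

# Toy gaps (hypothetical): 4 pre-treatment periods, 2 post-treatment periods.
treated_gaps = [0.1, -0.1, 0.1, -0.1, 1.0, 1.0]
placebo_gaps = [
    [0.2, -0.2, 0.2, -0.2, 0.2, -0.2],
    [0.1, 0.1, 0.1, 0.1, 0.3, 0.3],
    [0.5, -0.5, 0.5, -0.5, 0.1, 0.1],
]
ratio = mspe_ratio(treated_gaps, n_pre=4)
p_value = placebo_p_value(ratio, [mspe_ratio(g, n_pre=4) for g in placebo_gaps])
print(round(ratio, 2), p_value)
```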

Causal Inference — Machine Learning Methods:

Double/Debiased Machine Learning: Partially Linear (PLR) and Interactive (IRM) models with cross-fitting (Chernozhukov et al. 2018)
Causal Forest for heterogeneous treatment effects (HTE)
Compatible with any scikit-learn estimator as first-stage ML model
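
The cross-fitting idea behind the partially linear model can be sketched without any ML library: fit the nuisance functions on one fold, residualize the held-out fold, and regress outcome residuals on treatment residuals. In this sketch a simple linear fit stands in for the first-stage ML learner, there is a single covariate, and the names `fit_line` and `dml_plr` are assumptions; it is not StatsPAI's implementation.

```python
import random

def fit_line(x, y):
    """Return a prediction function for the least-squares line of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    return lambda v: intercept + slope * v

def dml_plr(y, d, x, n_folds=2):
    """Cross-fitted partialling-out estimate of the effect of d on y."""
    n = len(y)
    folds = [list(range(k, n, n_folds)) for k in range(n_folds)]
    res_y, res_d = [0.0] * n, [0.0] * n
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        # Nuisance fits use only the training fold.
        g_hat = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
        m_hat = fit_line([x[i] for i in train_idx], [d[i] for i in train_idx])
        for i in test_idx:
            res_y[i] = y[i] - g_hat(x[i])
            res_d[i] = d[i] - m_hat(x[i])
    return sum(a * b for a, b in zip(res_d, res_y)) / sum(a * a for a in res_d)

# Simulated data (hypothetical) with a true treatment effect of 0.5.
rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
d = [xi + rng.gauss(0, 1) for xi in x]
y = [0.5 * di + 2.0 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]
theta = dml_plr(y, d, x)
print(round(theta, 3))
```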

Causal Inference — Other Methods:

Causal Impact: Bayesian structural time-series intervention analysis (Brodersen et al. 2015)
Causal Mediation Analysis: ACME / ADE decomposition with bootstrap inference (Imai et al. 2010)
Shift-Share / Bartik IV with Rotemberg weight diagnostics (Goldsmith-Pinkham, Sorkin & Swift 2020)
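
For the shift-share case, the instrument for each region is the sum over industries of the region's baseline industry share times the industry-level shock. A minimal sketch of that construction, with hypothetical names and numbers; it does not compute Rotemberg weights.

```python
def bartik_instrument(shares, shocks):
    """shares: {region: {industry: share}}; shocks: {industry: growth rate}."""
    return {
        region: sum(share * shocks[industry] for industry, share in by_industry.items())
        for region, by_industry in shares.items()
    }

# Toy inputs (hypothetical).
shares = {
    "A": {"manufacturing": 0.5, "services": 0.5},
    "B": {"manufacturing": 0.2, "services": 0.8},
}
shocks = {"manufacturing": -0.10, "services": 0.05}
instrument = bartik_instrument(shares, shocks)
print(instrument)
```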

Post-Estimation:

Marginal effects (AME / MEM) with delta-method standard errors, equivalent to Stata's margins, dydx(*)
Wald test for linear restrictions, equivalent to Stata's test
Linear combinations of coefficients with inference, equivalent to Stata's lincom
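
Both the linear combination and the single-restriction Wald test reduce to the same quantities: the estimate w'b and its variance w'Vw. The sketch below works from a coefficient vector and covariance matrix given as plain lists; the numbers and function names are hypothetical, and the p-value uses the chi-squared distribution with one degree of freedom.

```python
import math

def lincom_sketch(coefs, cov, weights):
    """Estimate and standard error of sum_i weights[i] * coefs[i]."""
    k = len(weights)
    est = sum(w * b for w, b in zip(weights, coefs))
    var = sum(weights[i] * cov[i][j] * weights[j] for i in range(k) for j in range(k))
    return est, math.sqrt(var)

def wald_sketch(coefs, cov, weights, value=0.0):
    """Wald statistic and p-value for H0: weights'coefs = value (1 restriction)."""
    est, se = lincom_sketch(coefs, cov, weights)
    stat = ((est - value) / se) ** 2
    p_value = math.erfc(math.sqrt(stat / 2))  # upper tail of chi-squared(1)
    return stat, p_value

# Toy coefficients for (education, experience) and their covariance (hypothetical).
coefs = [0.08, 0.03]
cov = [[0.0004, 0.0001], [0.0001, 0.0009]]
est, se = lincom_sketch(coefs, cov, [1, 1])      # education + experience
stat, p_value = wald_sketch(coefs, cov, [1, -1])  # education = experience
print(round(est, 3), round(se, 4), round(stat, 3), round(p_value, 3))
```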

Diagnostics:

Oster (2019) coefficient stability / selection-on-unobservables bounds
McCrary (2008) density manipulation test for RD validity
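
For intuition on the coefficient stability bound, the sketch below applies the widely used approximation from Oster (2019): the bias-adjusted coefficient equals the controlled coefficient minus delta times the coefficient movement, scaled by (R_max - R_controlled) / (R_controlled - R_uncontrolled). This is the approximation only, with hypothetical inputs; whether `oster_bounds` uses it or the exact estimator is not stated in the source.

```python
def oster_beta_star(beta_uncontrolled, r2_uncontrolled,
                    beta_controlled, r2_controlled,
                    r2_max, delta=1.0):
    """Approximate bias-adjusted coefficient under proportional selection."""
    movement = beta_uncontrolled - beta_controlled
    scale = (r2_max - r2_controlled) / (r2_controlled - r2_uncontrolled)
    return beta_controlled - delta * movement * scale

# Toy regression summaries (hypothetical); R_max set to 1.3 times the controlled R-squared.
beta_star = oster_beta_star(
    beta_uncontrolled=1.0, r2_uncontrolled=0.1,
    beta_controlled=0.8, r2_controlled=0.3,
    r2_max=0.39, delta=1.0,
)
print(round(beta_star, 4))
```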

Publication-Quality Output:

Multi-model comparison tables (equivalent to R's modelsummary / Stata's esttab)
Coefficient forest plots across models
Summary statistics tables (equivalent to Stata's tabstat)
Balance tables for matching / DID / RCT papers
Cross-tabulation with chi-squared / Fisher's exact test (equivalent to Stata's tab, chi2)
Export to Word (.docx), Excel (.xlsx), LaTeX (.tex), HTML — all tables, all formats
Every result object has .summary(), .plot(), .to_latex(), .to_docx(), .cite()
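
As background for the cross-tabulation test, here is a minimal sketch of Pearson's chi-squared statistic for a 2x2 table, without continuity correction. The function name and counts are hypothetical, and the p-value formula is specific to one degree of freedom.

```python
import math

def chi2_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(stat / 2))  # upper tail of chi-squared(1)
    return stat, p_value

# Toy counts (hypothetical): rows are treatment arms, columns are outcomes.
chi2_stat, chi2_p = chi2_2x2([[30, 10], [20, 40]])
print(round(chi2_stat, 3), chi2_p)
```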

Installation

pip install statspai

With optional dependencies:

pip install statspai[plotting]    # matplotlib, seaborn
pip install statspai[fixest]      # pyfixest for high-dimensional FE
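
One way to check which optional extras are usable in the current environment is to look up the modules named in the comments above. This sketch uses only the standard library; the helper name `missing_extras` is an assumption, and it tests for importable modules rather than for the extras themselves.

```python
from importlib.util import find_spec

# Modules taken from the comments on the two optional installs above.
OPTIONAL_EXTRAS = {
    "plotting": ["matplotlib", "seaborn"],
    "fixest": ["pyfixest"],
}

def missing_extras():
    """Map each extra to the list of its modules that cannot be found."""
    return {
        extra: [name for name in modules if find_spec(name) is None]
        for extra, modules in OPTIONAL_EXTRAS.items()
    }

missing = missing_extras()
for extra, modules in missing.items():
    status = "ok" if not modules else "missing " + ", ".join(modules)
    print(f"statspai[{extra}]: {status}")
```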

Requirements: Python >= 3.9

Core dependencies: NumPy, SciPy, Pandas, statsmodels, scikit-learn, linearmodels, patsy, openpyxl, python-docx

Quick Example

import statspai as sp

# Assumes df is a pandas DataFrame that already holds the columns used below.

# --- Estimation ---
r1 = sp.regress("wage ~ education + experience", data=df, robust='hc1')
r2 = sp.ivreg("wage ~ (education ~ parent_edu) + experience", data=df)
r3 = sp.did(df, y='wage', treat='policy', time='year', id='worker')
r4 = sp.rdrobust(df, y='score', x='running_var', c=0)
r5 = sp.match(df, y='outcome', treat='treated', covariates=['age', 'edu'])
r6 = sp.dml(df, y='wage', treat='training', covariates=['age', 'edu', 'exp'])

# --- Post-estimation ---
me = sp.margins(r1, data=df)            # Marginal effects
sp.test(r1, "education = experience")   # Wald test: beta_edu = beta_exp?
sp.lincom(r1, "education + experience") # Linear combination

# --- Tables (to Word / Excel / LaTeX) ---
sp.modelsummary(r1, r2, output='table2.docx')
sp.outreg2(r1, r2, r3, filename='results.xlsx')
sp.sumstats(df, vars=['wage', 'education', 'age'], output='table1.docx')
sp.balance_table(df, treat='treated', covariates=['age', 'edu'], output='balance.docx')
sp.tab(df, 'treatment', 'outcome', output='crosstab.docx')

API Summary

Category	Functions	Description
Regression	`regress`, `ivreg`, `panel`, `fixest.feols`	OLS, IV/2SLS, Panel (FE/RE/FD/BE), High-dimensional FE
DID	`did`, `did_2x2`, `callaway_santanna`	Classic 2x2, Staggered (C&S 2021), Event study
RD	`rdrobust`, `rdplot`	Sharp/Fuzzy RD, CCT robust inference, RD plots
Matching	`match`	PSM, CEM, Mahalanobis, Balance diagnostics
Synth	`synth`	Abadie SCM, Penalized SCM, Placebo inference
ML Causal	`dml`, `causal_forest`	Double ML (PLR/IRM), Causal Forest (HTE)
Other Causal	`causal_impact`, `mediate`, `bartik`	Intervention analysis, Mediation, Shift-share IV
Post-estimation	`margins`, `marginsplot`, `test`, `lincom`	Marginal effects, Wald tests, Linear combinations
Diagnostics	`oster_bounds`, `mccrary_test`	Coefficient stability, Density manipulation
Tables	`modelsummary`, `outreg2`, `sumstats`, `balance_table`, `tab`	Multi-model tables, Summary stats, Balance, Cross-tabs
Plots	`coefplot`, `marginsplot`, `rdplot`, `result.plot()`	Coefficient, Margins, RD, Event study plots
Export	`.to_docx()`, `.to_latex()`, `output='*.xlsx'`	Word, Excel, LaTeX, HTML — all tables, all formats

All causal methods return a unified CausalResult object:

result.estimate       # Point estimate
result.se             # Standard error
result.pvalue         # P-value
result.ci             # Confidence interval
result.summary()      # Formatted text summary
result.plot()         # Appropriate visualization
result.to_latex()     # LaTeX table
result.to_docx()      # Word document
result.cite()         # BibTeX citation for the method
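
To show how the inference attributes relate to one another, here is a toy stand-in built from an estimate and a standard error under a normal approximation. `ToyResult` is a hypothetical class written for this illustration; it is not the CausalResult class and makes no claim about how StatsPAI derives these values.

```python
import math
from dataclasses import dataclass

@dataclass
class ToyResult:
    estimate: float
    se: float

    @property
    def pvalue(self):
        """Two-sided p-value under a normal approximation."""
        z = abs(self.estimate / self.se)
        return math.erfc(z / math.sqrt(2))

    @property
    def ci(self):
        """95% confidence interval under a normal approximation."""
        half_width = 1.959963984540054 * self.se
        return (self.estimate - half_width, self.estimate + half_width)

    def summary(self):
        lo, hi = self.ci
        return (f"estimate={self.estimate:.3f} se={self.se:.3f} "
                f"p={self.pvalue:.3f} 95% CI=[{lo:.3f}, {hi:.3f}]")

toy = ToyResult(estimate=0.5, se=0.25)
print(toy.summary())
```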

Comparison with Stata and R

Task	Stata	R	StatsPAI
OLS with robust SE	`reg y x, r`	`feols(y ~ x, vcov="HC1")`	`sp.regress("y ~ x", robust='hc1')`
IV regression	`ivregress 2sls y (x = z)`	`feols(y ~ 1 | x ~ z)`	`sp.ivreg("y ~ (x ~ z)")`
Staggered DID	`csdid y, ivar(id) time(t) gvar(g)`	`att_gt(y ~ 1, ...)`	`sp.did(df, y, treat, time, id)`
RD design	`rdrobust y x, c(0)`	`rdrobust(Y, X, c=0)`	`sp.rdrobust(df, y, x, c=0)`
PSM matching	`psmatch2 treat x1 x2`	`matchit(treat ~ x1+x2)`	`sp.match(df, y, treat, covs)`
Double ML	—	`DoubleML$new(...)`	`sp.dml(df, y, treat, covs)`
Marginal effects	`margins, dydx(*)`	`margins(model)`	`sp.margins(result, data=df)`
Wald test	`test x1 = x2`	`linearHypothesis(...)`	`sp.test(result, "x1 = x2")`
Export to Word	`outreg2 using r.doc, word`	`modelsummary(output="t.docx")`	`sp.outreg2(r, filename="r.docx")`
Summary stats	`tabstat y x, s(mean sd)`	`datasummary(...)`	`sp.sumstats(df, vars=[...])`

About

StatsPAI Inc. is the research infrastructure company behind CoPaper.AI — the AI co-authoring platform for empirical research, born out of Stanford's REAP program.

CoPaper.AI (copaper.ai): upload your data, set your research question, and produce a fully reproducible academic paper with code, tables, and formatted output. Powered by StatsPAI.

Team:

Bryce Wang — Founder. Economics, Finance, CS & AI. Stanford REAP.
Dr. Scott Rozelle — Co-founder & Strategic Advisor. Stanford Senior Fellow, author of Invisible China.

Contributing

git clone https://github.com/brycewang-stanford/statspai.git
cd statspai
pip install -e ".[dev,plotting,fixest]"
pytest

Citation

@software{wang2025statspai,
  title={StatsPAI: The Causal Inference \& Econometrics Toolkit for Python},
  author={Wang, Bryce},
  year={2025},
  url={https://github.com/brycewang-stanford/statspai},
  version={0.1.0}
}

License

MIT License. See LICENSE.

GitHub · PyPI · Documentation · CoPaper.AI

Project details

Release history

Version	Date
1.7.0	Apr 26, 2026
1.6.6	Apr 24, 2026
1.6.5	Apr 24, 2026
1.6.4	Apr 24, 2026
1.6.3	Apr 24, 2026
1.6.2	Apr 24, 2026
1.6.1	Apr 23, 2026
1.6.0	Apr 22, 2026
1.5.1	Apr 22, 2026
1.5.0	Apr 21, 2026
1.4.2	Apr 21, 2026
1.4.1	Apr 21, 2026
1.4.0	Apr 21, 2026
1.3.0	Apr 21, 2026
1.0.1	Apr 21, 2026
0.9.16	Apr 21, 2026
0.9.3	Apr 20, 2026
0.9.2	Apr 17, 2026
0.9.1	Apr 17, 2026
0.9.0	Apr 16, 2026
0.8.0	Apr 16, 2026
0.7.1	Apr 15, 2026
0.7.0	Apr 15, 2026
0.6.2	Apr 13, 2026
0.6.1	Apr 8, 2026
0.6.0	Apr 6, 2026
0.5.1	Apr 5, 2026
0.5.0	Apr 5, 2026
0.4.0	Apr 5, 2026
0.3.1	Apr 4, 2026
0.3.0	Apr 4, 2026
0.2.0 (this version)	Apr 4, 2026
0.1.0	Jul 26, 2025

Download files

Source Distribution

statspai-0.2.0.tar.gz (128.6 kB)

Uploaded Apr 4, 2026 Source

Built Distribution

statspai-0.2.0-py3-none-any.whl (124.0 kB)

Uploaded Apr 4, 2026 Python 3

File details

Details for the file statspai-0.2.0.tar.gz.

File metadata

Download URL: statspai-0.2.0.tar.gz
Upload date: Apr 4, 2026
Size: 128.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.5

File hashes

Hashes for statspai-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`4af67ae34d057d6b3bf954c5050936f6e20ec22c20d5b795be4686095116e572`
MD5	`5a479bb54ea8ebb9ef956a863615efae`
BLAKE2b-256	`249a1801a0d7edfd502faeef02f6c84bc857a5713f04fe45925961538100f77c`

File details

Details for the file statspai-0.2.0-py3-none-any.whl.

File metadata

Download URL: statspai-0.2.0-py3-none-any.whl
Upload date: Apr 4, 2026
Size: 124.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.5

File hashes

Hashes for statspai-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`aa75bd01948e96f3608fec575dc70160992fe240397176df8eea6e8868304953`
MD5	`64fd54e1830cfbd7139819b859f869b9`
BLAKE2b-256	`d59714b76188d82b7ce8e733727904c4059387ee821da726bd829929cc0509be`
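
The SHA256 digests listed for each file can be checked locally with the standard library. The sketch below hashes a file in chunks and compares it with an expected hex digest; the helper names are assumptions, and the demonstration runs on a throwaway file so that it works without a download.

```python
import hashlib
import os
import tempfile

def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches(path, expected_hex):
    """True if the file's SHA256 digest equals expected_hex (case-insensitive)."""
    return sha256_of(path) == expected_hex.lower()

# Demonstration on a temporary file containing the bytes b"abc".
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo_digest = sha256_of(tmp.name)
demo_ok = matches(
    tmp.name,
    "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
)
os.remove(tmp.name)
print(demo_digest, demo_ok)
```

For a real check, pass the path of the downloaded archive and the SHA256 value from the corresponding table, for example `matches("statspai-0.2.0.tar.gz", "4af67ae34d057d6b3bf954c5050936f6e20ec22c20d5b795be4686095116e572")`.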
