Python replication of Stata's bacondecomp command — Goodman-Bacon (2021) decomposition of TWFE DiD estimators
bacondecomp
Bacon Decomposition of Two-Way Fixed Effects Difference-in-Differences
A Python implementation of the Goodman-Bacon (2021) decomposition, which expresses any two-way fixed effects (TWFE) DiD estimator as a weighted average of all possible 2×2 DiD comparisons. Supports uncontrolled, ddetail, and controlled (FWL) decompositions with optional multi-core parallelism via joblib.
Installation
pip install pybacondecomp
Dependencies: numpy, pandas, pyfixest ≥ 0.25
Optional: joblib (parallel execution), tqdm (progress bars), matplotlib (plots)
Background
In staggered adoption designs, the TWFE estimator is a weighted average of all 2×2 DiD comparisons between pairs of timing groups. Some of these comparisons use already-treated units as the "control" group. When treatment effects change over time, those comparisons bias the estimate and can effectively put negative weight on some treatment effects.
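To make "2×2 DiD comparison" concrete, here is a minimal sketch on a made-up balanced panel with one treated group and one untreated group. It computes the 2×2 DiD from the four group-by-period means and checks that, in this two-group case, the TWFE coefficient (obtained here by two-way demeaning) is the same number. The data, column names, and the `demean_two_way` helper are assumptions for illustration and are not part of the package.

```python
import numpy as np
import pandas as pd

# Made-up balanced panel: 6 units x 8 periods, units 0-2 treated from period 5 on.
rng = np.random.default_rng(0)
units, periods, first_treated = np.arange(6), np.arange(1, 9), 5
df = pd.DataFrame([(u, t) for u in units for t in periods], columns=["unit", "time"])
df["group"] = np.where(df["unit"] < 3, "treated", "control")
df["period"] = np.where(df["time"] >= first_treated, "post", "pre")
df["treat"] = ((df["group"] == "treated") & (df["period"] == "post")).astype(float)
df["y"] = (
    0.5 * df["unit"] + 0.1 * df["time"] + 2.0 * df["treat"]
    + rng.normal(size=len(df))
)

# 2x2 DiD: (treated post - treated pre) - (control post - control pre).
m = df.groupby(["group", "period"])["y"].mean()
did_2x2 = (m.loc[("treated", "post")] - m.loc[("treated", "pre")]) - (
    m.loc[("control", "post")] - m.loc[("control", "pre")]
)


def demean_two_way(s, frame):
    """Two-way within transformation (exact only for a balanced panel)."""
    return (
        s
        - s.groupby(frame["unit"]).transform("mean")
        - s.groupby(frame["time"]).transform("mean")
        + s.mean()
    )


# TWFE coefficient on treat via regression of demeaned y on demeaned treat.
d_tilde = demean_two_way(df["treat"], df)
y_tilde = demean_two_way(df["y"], df)
twfe = float((d_tilde * y_tilde).sum() / (d_tilde**2).sum())

print(f"2x2 DiD = {did_2x2:.6f}, TWFE = {twfe:.6f}")
```

With more than two timing groups the two numbers no longer coincide by construction; the decomposition then describes how the many 2×2 estimates are weighted to form the single TWFE coefficient.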
The decomposition identifies up to four types of comparisons (the last appears only in the controlled decomposition):
| Type | Description |
|---|---|
| Timing groups | Earlier-adopting group vs. later-adopting group (and vice versa) |
| Never vs. timing | Timing group vs. never-treated units |
| Always vs. timing | Timing group vs. always-treated units |
| Within | Within-group variation (controlled decomposition only) |
ddetail mode further splits timing-group comparisons into:
- Early vs. Late — earlier-adopting group treated, later-adopting group as not-yet-treated control
- Late vs. Early — later-adopting group treated, earlier-adopting group as already-treated control
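The Early/Late split can be sketched as an enumeration of ordered pairs of adoption dates. The helper below is hypothetical (it is not the package's code) and assumes each timing group is identified by its first treated period.

```python
from itertools import permutations


def label_timing_pairs(cohorts):
    """Label every ordered (treated, control) pair of adoption dates.

    Hypothetical helper for illustration; not part of pybacondecomp.
    """
    rows = []
    for treated, control in permutations(sorted(cohorts), 2):
        # The treated group adopting first means the control is not yet treated.
        kind = "Early vs Late" if treated < control else "Late vs Early"
        rows.append((treated, control, kind))
    return rows


pairs = label_timing_pairs([2001, 2003, 2005, 2007])
for treated, control, kind in pairs:
    print(treated, control, kind)
```

Four cohorts give 4 × 3 = 12 ordered pairs, six of each kind.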
Citation
This package is a Python port of the Stata command bacondecomp (v1.0.5, Goodman-Bacon, Goldring & Nichols, 2022). Please cite the original paper when using this package:
Goodman-Bacon, Andrew. "Difference-in-differences with variation in treatment timing." Journal of Econometrics 225, no. 2 (2021): 254–277. https://doi.org/10.1016/j.jeconom.2021.03.014
The original working paper version:
Goodman-Bacon, Andrew. "Difference-in-differences with variation in treatment timing." NBER Working Paper No. 25018, 2018. https://www.nber.org/papers/w25018
BibTeX:
@article{goodman-bacon2021,
author = {Goodman-Bacon, Andrew},
title = {Difference-in-differences with variation in treatment timing},
journal = {Journal of Econometrics},
volume = {225},
number = {2},
pages = {254--277},
year = {2021},
doi = {10.1016/j.jeconom.2021.03.014}
}
The Stata implementation this port is based on:
Goodman-Bacon, Andrew, Thomas Goldring, and Austin Nichols. "bacondecomp: Stata module to perform Bacon decomposition of difference-in-differences estimation." Statistical Software Components S458676, Boston College Department of Economics, 2022. https://ideas.repec.org/c/boc/bocode/s458676.html
Usage
Basic (no controls)
import pandas as pd
from pybacondecomp import bacondecomp
result = bacondecomp(
    df,
    y="outcome",   # outcome variable
    tr="treat",    # binary treatment (0/1, weakly increasing)
    unit="state",  # panel unit identifier
    time="year",   # time variable
)
print(result.dd_estimate) # overall TWFE estimate
print(result.summary) # weighted average by comparison type
print(result.two_by_two) # every 2×2 DiD comparison
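The basic example assumes a DataFrame `df` already exists. One way to build a panel in the expected shape is sketched below. The column names match the example; the year range (2000 to 2008), the equal cohort sizes, and the data-generating process are assumptions made for illustration, so this is not the validation dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
years = np.arange(2000, 2009)  # 9 years (assumed range)

# First treated year per state; np.inf marks never-treated states.
adopt = np.repeat([2001, 2003, 2005, 2007, np.inf], [9, 9, 9, 9, 14])

# One row per (state, year): a strongly balanced panel of 50 x 9 rows.
df = pd.DataFrame([(s, y) for s in range(50) for y in years], columns=["state", "year"])

# Treatment switches on at the adoption year and never switches off.
df["treat"] = (df["year"].to_numpy() >= adopt[df["state"].to_numpy()]).astype(int)

# Made-up outcome: state effect + linear trend + constant effect of 0.15 + noise.
df["outcome"] = (
    0.05 * df["state"]
    + 0.02 * (df["year"] - 2000)
    + 0.15 * df["treat"]
    + rng.normal(scale=0.1, size=len(df))
)

print(df.head())
```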
ddetail mode — split Early vs. Late
from pybacondecomp import bacondecomp
result = bacondecomp(df, y="outcome", tr="treat",
                     unit="state", time="year",
                     ddetail=True)
Controlled decomposition (FWL)
from pybacondecomp import bacondecomp
result = bacondecomp(df, y="outcome", tr="treat",
                     unit="state", time="year",
                     x=["log_income", "unemp_rate"])
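For readers unfamiliar with the Frisch-Waugh-Lovell (FWL) theorem that the controlled decomposition is named after, the sketch below illustrates the theorem itself on made-up cross-sectional data: the coefficient on a treatment in a regression with controls equals the coefficient from regressing the outcome on the treatment residualized on those controls. It is a generic illustration and makes no claim about how the package applies FWL internally.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=(n, 2))                            # two made-up controls
d = (x[:, 0] + rng.normal(size=n) > 0).astype(float)   # treatment correlated with x
y = 1.5 * d + x @ np.array([0.7, -0.3]) + rng.normal(size=n)

const = np.ones((n, 1))

# Full regression of y on [1, d, x]; the coefficient on d is element 1.
full = np.linalg.lstsq(np.hstack([const, d[:, None], x]), y, rcond=None)[0]
beta_full = full[1]

# FWL: residualize d on [1, x], then regress y on that residual alone.
z = np.hstack([const, x])
d_resid = d - z @ np.linalg.lstsq(z, d, rcond=None)[0]
beta_fwl = (d_resid @ y) / (d_resid @ d_resid)

print(f"full regression: {beta_full:.6f}, FWL: {beta_fwl:.6f}")
```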
Parallel execution
from pybacondecomp import bacondecomp
result = bacondecomp(df, y="outcome", tr="treat",
                     unit="state", time="year",
                     n_jobs=-1)  # use all cores
Plot
from pybacondecomp import bacon_plot
fig = bacon_plot(result)
fig.savefig("bacon.png", dpi=150)
Stata-style interface
from pybacondecomp import bacondecomp_stata
result = bacondecomp_stata(df, "outcome treat log_income unemp_rate",
                           unit="state", time="year")
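The varlist string follows Stata's ordering: outcome first, treatment second, any remaining names as controls. A minimal sketch of that mapping, using a hypothetical `split_varlist` helper (how `bacondecomp_stata` parses the string internally is not documented here):

```python
def split_varlist(varlist):
    """Split a Stata-style varlist into (y, tr, x).

    Hypothetical helper for illustration; not part of pybacondecomp.
    """
    names = varlist.split()
    if len(names) < 2:
        raise ValueError("varlist needs at least an outcome and a treatment")
    y, tr, *x = names
    # No controls maps to None, mirroring the x=None default of bacondecomp.
    return y, tr, (x or None)


print(split_varlist("outcome treat log_income unemp_rate"))
# ('outcome', 'treat', ['log_income', 'unemp_rate'])
```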
API Reference
bacondecomp(df, y, tr, unit, time, x=None, weights=None, ddetail=False, n_jobs=1, verbose=True)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `df` | `pd.DataFrame` | (required) | Strongly balanced panel |
| `y` | `str` | (required) | Outcome variable |
| `tr` | `str` | (required) | Binary treatment (0/1, weakly increasing) |
| `unit` | `str` | (required) | Panel unit identifier |
| `time` | `str` | (required) | Time variable |
| `x` | `list[str]` | `None` | Control variables (triggers FWL decomposition) |
| `weights` | `str` | `None` | Analytic weight variable |
| `ddetail` | `bool` | `False` | Split timing-group comparisons into Early/Late |
| `n_jobs` | `int` | `1` | Parallel workers (-1 = all cores); requires joblib |
| `verbose` | `bool` | `True` | Print progress and summary |
Returns: BaconResult dataclass with fields:
| Field | Type | Description |
|---|---|---|
| `dd_estimate` | `float` | Overall TWFE DiD estimate |
| `se` | `float` | Standard error of TWFE estimate |
| `two_by_two` | `pd.DataFrame` | All 2×2 comparisons: treated, control, estimate, weight, type |
| `summary` | `pd.DataFrame` | Weighted averages by comparison type: type, avg_estimate, total_weight |
| `n_obs` | `int` | Number of observations |
| `n_groups` | `int` | Number of timing groups |
| `has_always` / `has_never` | `bool` | Whether always/never treated units are present |
| `within_estimate` | `float` | Within-group estimate (controlled only) |
| `elapsed_seconds` | `float` | Wall time |
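The relationship between `two_by_two` and `summary` can be sketched as a weighted group-by. The example below uses made-up rows in the documented column layout and assumes `avg_estimate` is the weight-weighted mean of `estimate` within each type; the `summarize` helper is hypothetical and the package may compute its summary differently.

```python
import pandas as pd

# Made-up rows in the documented two_by_two layout (values are illustrative only).
two_by_two = pd.DataFrame(
    {
        "treated": ["2001", "2003", "2001", "2003"],
        "control": ["2003", "2001", "never", "never"],
        "estimate": [0.20, 0.10, 0.15, 0.25],
        "weight": [0.30, 0.20, 0.25, 0.25],
        "type": ["Timing groups", "Timing groups", "Never vs timing", "Never vs timing"],
    }
)


def summarize(tbt):
    """Weight-weighted mean estimate and total weight per comparison type."""
    weighted = (tbt["estimate"] * tbt["weight"]).groupby(tbt["type"]).sum()
    total = tbt.groupby("type")["weight"].sum()
    out = pd.DataFrame({"avg_estimate": weighted / total, "total_weight": total})
    return out.rename_axis("type").reset_index()


summary = summarize(two_by_two)

# The overall estimate is the weighted sum of all 2x2 estimates.
overall = float((two_by_two["estimate"] * two_by_two["weight"]).sum())

print(summary)
print(f"overall = {overall:.4f}")
```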
bacon_plot(result, figsize=(8,5), show_dd_line=True, title=..., ax=None)
Scatter plot of 2×2 estimates vs. weights, by comparison type.
Data Requirements
- Strongly balanced panel: every unit observed at every time period.
- Binary treatment: tr ∈ {0, 1} in all periods.
- Weakly increasing: once treated, units remain treated (no reversals).
- No missing values on y, tr, unit, time, or any x variables.
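These requirements can be checked before calling the decomposition. The sketch below uses a hypothetical `check_panel` helper written with plain pandas; it is not part of the package, and the package's own validation (if any) may behave differently.

```python
import pandas as pd


def check_panel(df, y, tr, unit, time, x=()):
    """Return a list of violated data requirements (empty list = all good).

    Hypothetical pre-flight helper; not part of pybacondecomp.
    """
    problems = []
    if df[[y, tr, unit, time, *x]].isna().any().any():
        problems.append("missing values")
    # Strongly balanced: each (unit, time) pair appears exactly once.
    n_units, n_times = df[unit].nunique(), df[time].nunique()
    if len(df) != n_units * n_times or df.duplicated([unit, time]).any():
        problems.append("panel is not strongly balanced")
    if not df[tr].dropna().isin([0, 1]).all():
        problems.append("treatment is not binary")
    # Weakly increasing: within each unit, treatment never drops over time.
    steps = df.sort_values([unit, time]).groupby(unit)[tr].diff()
    if (steps < 0).any():
        problems.append("treatment reverses")
    return problems


good = pd.DataFrame(
    {
        "state": [1, 1, 2, 2],
        "year": [2000, 2001, 2000, 2001],
        "treat": [0, 1, 0, 0],
        "outcome": [1.0, 2.0, 1.5, 1.7],
    }
)
print(check_panel(good, "outcome", "treat", "state", "year"))  # []
```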
Stata Correspondence
| Stata | Python |
|---|---|
| `bacondecomp y tr` | `bacondecomp(df, "y", "tr", unit, time)` |
| `bacondecomp y tr, ddetail` | `bacondecomp(..., ddetail=True)` |
| `bacondecomp y tr x1 x2` | `bacondecomp(..., x=["x1","x2"])` |
| `e(sumdd)` | `result.summary` |
| `stub*B`, `stub*S` | `result.two_by_two[["estimate","weight"]]` |
Validation Against Stata
The following results were produced on a synthetic staggered DiD panel (50 states × 9 years, 4 treatment cohorts: 2001/2003/2005/2007, 14 never-treated states; seed = 42) and cross-validated against Stata's bacondecomp v1.0.5.
The data and Stata do-file are available in tests/stata_verify/.
Branch 1 — no controls, no ddetail
Overall DD: Python = 0.165726 | Stata = 0.16572565
| Comparison type | Python Beta | Python Weight | Stata Beta | Stata Weight |
|---|---|---|---|---|
| Timing groups | 0.172517 | 0.506592 | 0.1725168 | 0.5065923 |
| Never vs timing | 0.158753 | 0.493408 | 0.1587530 | 0.4934077 |
Branch 2 — ddetail (no controls)
Overall DD: Python = 0.165726 | Stata = 0.16572565
All 12 timing-group 2×2 comparisons match to 6 decimal places. Summary:
| Comparison type | Python Beta | Python Weight | Stata Beta | Stata Weight |
|---|---|---|---|---|
| Early vs Late | 0.174499 | 0.204361 | 0.174499* | 0.204361* |
| Late vs Early | 0.171176 | 0.302231 | 0.171176* | 0.302231* |
| Never vs timing | 0.158753 | 0.493408 | 0.1587530 | 0.4934077 |
* Stata reports individual dyad rows only; the starred values aggregate those rows the same way the Python summary does.
Branch 3 — controlled (FWL, x = log income + unemployment rate)
Overall DD: Python = 0.163864 | Stata = 0.163864
| Comparison type | Python Beta | Python Weight | Stata Beta | Stata Weight |
|---|---|---|---|---|
| Timing groups | 0.172833 | 0.503206 | 0.172832956 | 0.5032063 |
| Never vs timing | 0.159164 | 0.489632 | 0.1591643654 | 0.4896315 |
| Within | −0.144980 | 0.007162 | −0.1449803561 | 0.0071621 |
All three branches replicate Stata output to at least 5 significant figures.
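As a quick consistency check on the tables above, the weights in each branch should sum to one and the weight-weighted average of the betas should reproduce the overall DD estimate. The sketch below uses only the rounded Python values printed in this section, so agreement is expected up to rounding.

```python
# (beta, weight) pairs copied from the Python columns of the summary tables,
# paired with the overall DD reported for each branch.
branches = {
    "no controls": ([(0.172517, 0.506592), (0.158753, 0.493408)], 0.165726),
    "ddetail": (
        [(0.174499, 0.204361), (0.171176, 0.302231), (0.158753, 0.493408)],
        0.165726,
    ),
    "controlled": (
        [(0.172833, 0.503206), (0.159164, 0.489632), (-0.144980, 0.007162)],
        0.163864,
    ),
}

results = {}
for name, (rows, reported) in branches.items():
    total_weight = sum(w for _, w in rows)
    implied = sum(b * w for b, w in rows)  # weighted average of the betas
    results[name] = (total_weight, implied, reported)
    print(
        f"{name:12s} weights sum to {total_weight:.6f}, "
        f"implied DD {implied:.6f}, reported {reported:.6f}"
    )
```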
License
MIT — see LICENSE.