Python replication of Stata's bacondecomp command — Goodman-Bacon (2021) decomposition of TWFE DiD estimators

bacondecomp

Bacon Decomposition of Two-Way Fixed Effects Difference-in-Differences

A Python implementation of the Goodman-Bacon (2021) decomposition, which expresses the two-way fixed effects (TWFE) difference-in-differences (DiD) estimator as a weighted average of all possible 2×2 DiD comparisons. It supports uncontrolled, ddetail, and controlled (Frisch-Waugh-Lovell, FWL) decompositions, with optional multi-core parallelism via joblib.


Installation

pip install pybacondecomp

Dependencies: numpy, pandas, pyfixest ≥ 0.25
Optional: joblib (parallel execution), tqdm (progress bars), matplotlib (plots)


Background

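In staggered adoption designs, the TWFE estimator is a weighted average of all 2×2 DiD comparisons between pairs of timing groups. The weights on these 2×2 comparisons are positive and sum to one, but some comparisons use already-treated units as the "control" group. When treatment effects change over time, those comparisons subtract changes in the controls' own treatment effects, which biases the TWFE estimate and can effectively place negative weights on some underlying treatment effects.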
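
To make the building block concrete, here is a minimal sketch of a single 2×2 DiD computed from four cell means. The toy data, the column names, and the helper `did_2x2` are illustrative assumptions and are not part of the package.

```python
import pandas as pd


def did_2x2(df, y, group, post):
    """Difference-in-differences from four cell means.

    (treated post - treated pre) - (control post - control pre)
    """
    means = df.groupby([group, post])[y].mean()
    treated_change = means.loc[("treated", 1)] - means.loc[("treated", 0)]
    control_change = means.loc[("control", 1)] - means.loc[("control", 0)]
    return treated_change - control_change


# Toy data (made up): two observations per group-period cell
toy = pd.DataFrame({
    "group": ["treated"] * 4 + ["control"] * 4,
    "post":  [0, 0, 1, 1] * 2,
    "y":     [1.0, 1.2, 2.0, 2.2, 0.5, 0.7, 0.9, 1.1],
})

estimate = did_2x2(toy, "y", "group", "post")
print(round(estimate, 6))
```

If the "control" group in such a comparison is itself already treated, any change in its treatment effect between the two periods enters `control_change` and is subtracted from the estimate.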

The decomposition identifies up to four types of comparisons:

| Type | Description |
|---|---|
| Timing groups | Earlier-adopting group vs. later-adopting group (and vice versa) |
| Never vs. timing | Timing group vs. never-treated units |
| Always vs. timing | Timing group vs. always-treated units |
| Within | Within-group variation (controlled decomposition only) |

ddetail mode further splits timing-group comparisons into:

  • Early vs. Late — earlier-adopting group treated, later-adopting group as not-yet-treated control
  • Late vs. Early — later-adopting group treated, earlier-adopting group as already-treated control
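
The split above can be sketched by enumerating ordered pairs of timing groups. The helper `timing_comparisons` is hypothetical and only illustrates the labelling rule; the adoption years are the four cohorts of the validation panel described in this README.

```python
from itertools import permutations


def timing_comparisons(cohorts):
    """List every ordered (treated, control) pair of timing groups with its label."""
    rows = []
    for treated, control in permutations(sorted(cohorts), 2):
        # The treated group adopts first -> the control is not yet treated
        label = "Early vs. Late" if treated < control else "Late vs. Early"
        rows.append((treated, control, label))
    return rows


pairs = timing_comparisons([2001, 2003, 2005, 2007])
n_early = sum(1 for _, _, label in pairs if label == "Early vs. Late")
n_late = len(pairs) - n_early
print(len(pairs), n_early, n_late)
```

With four cohorts there are 4 × 3 = 12 ordered pairs, split evenly between the two labels.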

Citation

This package is a Python port of the Stata command bacondecomp (v1.0.5, Goodman-Bacon, Goldring & Nichols, 2022). Please cite the original paper when using this package:

Goodman-Bacon, Andrew. "Difference-in-differences with variation in treatment timing." Journal of Econometrics 225, no. 2 (2021): 254–277. https://doi.org/10.1016/j.jeconom.2021.03.014

The original working paper version:

Goodman-Bacon, Andrew. "Difference-in-differences with variation in treatment timing." NBER Working Paper No. 25018, 2018. https://www.nber.org/papers/w25018

BibTeX:

@article{goodman-bacon2021,
  author  = {Goodman-Bacon, Andrew},
  title   = {Difference-in-differences with variation in treatment timing},
  journal = {Journal of Econometrics},
  volume  = {225},
  number  = {2},
  pages   = {254--277},
  year    = {2021},
  doi     = {10.1016/j.jeconom.2021.03.014}
}

The Stata implementation this port is based on:

Goodman-Bacon, Andrew, Thomas Goldring, and Austin Nichols. bacondecomp: Stata module to perform Bacon decomposition of difference-in-differences estimation. Statistical Software Components S458676, Boston College Department of Economics, 2022. https://ideas.repec.org/c/boc/bocode/s458676.html


Usage

Basic (no controls)

import pandas as pd
from pybacondecomp import bacondecomp

# df is assumed to be a long-format pd.DataFrame with one row per unit-period

result = bacondecomp(
    df,
    y     = "outcome",      # outcome variable
    tr    = "treat",        # binary treatment (0/1, weakly increasing)
    unit  = "state",        # panel unit identifier
    time  = "year",         # time variable
)

print(result.dd_estimate)   # overall TWFE estimate
print(result.summary)       # weighted average by comparison type
print(result.two_by_two)    # every 2×2 DiD comparison

ddetail mode — split Early vs. Late

from pybacondecomp import bacondecomp

result = bacondecomp(df, y="outcome", tr="treat",
                     unit="state", time="year",
                     ddetail=True)

Controlled decomposition (FWL)

from pybacondecomp import bacondecomp

result = bacondecomp(df, y="outcome", tr="treat",
                     unit="state", time="year",
                     x=["log_income", "unemp_rate"])
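
For readers unfamiliar with the Frisch-Waugh-Lovell (FWL) theorem behind the controlled decomposition, the sketch below checks it numerically on simulated cross-sectional data. It illustrates the theorem only and is not the package's internal implementation; all variable names and the simulated data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=(n, 2))                              # two controls
d = (x[:, 0] + rng.normal(size=n) > 0).astype(float)     # treatment correlated with controls
y = 0.5 * d + x @ np.array([1.0, -2.0]) + rng.normal(size=n)


def ols(X, target):
    """Least-squares coefficients of target on the columns of X."""
    return np.linalg.lstsq(X, target, rcond=None)[0]


const = np.ones((n, 1))

# Full regression: y on [1, d, x]; keep the coefficient on d
beta_full = ols(np.column_stack([const, d, x]), y)[1]

# FWL: residualize y and d on [1, x], then regress residual on residual
Z = np.column_stack([const, x])
y_res = y - Z @ ols(Z, y)
d_res = d - Z @ ols(Z, d)
beta_fwl = (d_res @ y_res) / (d_res @ d_res)

print(np.isclose(beta_full, beta_fwl))
```

The two coefficients agree up to floating-point error, which is why partialling out controls first leaves the treatment coefficient unchanged.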

Parallel execution

from pybacondecomp import bacondecomp

result = bacondecomp(df, y="outcome", tr="treat",
                     unit="state", time="year",
                     n_jobs=-1)   # use all cores

Plot

from pybacondecomp import bacon_plot

fig = bacon_plot(result)
fig.savefig("bacon.png", dpi=150)

Stata-style interface

from pybacondecomp import bacondecomp_stata

result = bacondecomp_stata(df, "outcome treat log_income unemp_rate",
                           unit="state", time="year")
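
The Stata-style string follows the order outcome, treatment, then controls, as in the Stata correspondence table. A minimal sketch of that mapping is shown below; `split_varlist` is a hypothetical helper written for illustration, and the package's own parsing may differ.

```python
def split_varlist(varlist):
    """Map a Stata-style 'y tr x1 x2 ...' string to keyword arguments."""
    names = varlist.split()
    if len(names) < 2:
        raise ValueError("varlist needs at least an outcome and a treatment")
    y, tr, *x = names
    # No controls -> x=None, matching the default of bacondecomp()
    return {"y": y, "tr": tr, "x": x or None}


args = split_varlist("outcome treat log_income unemp_rate")
print(args)
```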

API Reference

bacondecomp(df, y, tr, unit, time, x=None, weights=None, ddetail=False, n_jobs=1, verbose=True)

| Parameter | Type | Default | Description |
|---|---|---|---|
| df | pd.DataFrame | required | Strongly balanced panel |
| y | str | required | Outcome variable |
| tr | str | required | Binary treatment (0/1, weakly increasing) |
| unit | str | required | Panel unit identifier |
| time | str | required | Time variable |
| x | list[str] | None | Control variables (triggers FWL decomposition) |
| weights | str | None | Analytic weight variable |
| ddetail | bool | False | Split timing-group comparisons into Early/Late |
| n_jobs | int | 1 | Parallel workers (-1 = all cores); requires joblib |
| verbose | bool | True | Print progress and summary |

Returns: BaconResult dataclass with fields:

| Field | Type | Description |
|---|---|---|
| dd_estimate | float | Overall TWFE DiD estimate |
| se | float | Standard error of TWFE estimate |
| two_by_two | pd.DataFrame | All 2×2 comparisons: treated, control, estimate, weight, type |
| summary | pd.DataFrame | Weighted averages by comparison type: type, avg_estimate, total_weight |
| n_obs | int | Number of observations |
| n_groups | int | Number of timing groups |
| has_always / has_never | bool | Whether always/never treated units are present |
| within_estimate | float | Within-group estimate (controlled only) |
| elapsed_seconds | float | Wall time |
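
The relationship between two_by_two and summary can be sketched as a weighted group-by. The column names follow the field table above, but the values are made up and `summarize` is a hypothetical helper; whether the package computes its summary exactly this way is an assumption.

```python
import pandas as pd

# Toy stand-in for result.two_by_two (values are invented for illustration)
two_by_two = pd.DataFrame({
    "treated":  [2001, 2003, 2001, 2003],
    "control":  [2003, 2001, "never", "never"],
    "estimate": [0.20, 0.10, 0.15, 0.25],
    "weight":   [0.10, 0.20, 0.30, 0.40],
    "type":     ["Timing groups", "Timing groups",
                 "Never vs timing", "Never vs timing"],
})


def summarize(tbt):
    """Weight-averaged estimate and total weight per comparison type."""
    grouped = tbt.assign(wx=tbt["estimate"] * tbt["weight"]).groupby("type", sort=False)
    out = grouped.agg(total_weight=("weight", "sum"), wx=("wx", "sum"))
    out["avg_estimate"] = out["wx"] / out["total_weight"]
    return out[["avg_estimate", "total_weight"]].reset_index()


summary = summarize(two_by_two)
# The weighted sum of all 2x2 estimates is the overall estimate
overall = (two_by_two["estimate"] * two_by_two["weight"]).sum()
print(summary)
print(overall)
```

Because the weights sum to one, the overall estimate can be recovered either from the individual 2×2 rows or from the per-type averages multiplied by their total weights.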

bacon_plot(result, figsize=(8,5), show_dd_line=True, title=..., ax=None)

Scatter plot of 2×2 estimates vs. weights, by comparison type.


Data Requirements

  • Strongly balanced panel: every unit observed at every time period.
  • Binary treatment: tr ∈ {0, 1} in all periods.
  • Weakly increasing: once treated, units remain treated (no reversals).
  • No missing values on y, tr, unit, time, or any x variables.
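
The requirements above can be checked before calling the decomposition. The following is a minimal sketch using pandas; `check_panel` is a hypothetical helper, not a function exported by the package, and the toy panels are invented.

```python
import pandas as pd


def check_panel(df, y, tr, unit, time, x=()):
    """Return a list of problems; an empty list means all checks pass."""
    problems = []
    cols = [y, tr, unit, time, *x]
    if df[cols].isna().any().any():
        problems.append("missing values")
    # Strongly balanced: every unit-period pair appears exactly once
    n_units, n_times = df[unit].nunique(), df[time].nunique()
    if len(df) != n_units * n_times or df.duplicated([unit, time]).any():
        problems.append("panel is not strongly balanced")
    if not df[tr].isin([0, 1]).all():
        problems.append("treatment is not binary")
    # Weakly increasing: within a unit, treatment never drops from 1 to 0
    ordered = df.sort_values([unit, time])
    if (ordered.groupby(unit)[tr].diff() < 0).any():
        problems.append("treatment reverses")
    return problems


good = pd.DataFrame({
    "state":   ["A"] * 3 + ["B"] * 3,
    "year":    [2000, 2001, 2002] * 2,
    "treat":   [0, 1, 1, 0, 0, 0],
    "outcome": [1.0, 2.0, 2.5, 1.0, 1.1, 1.2],
})
bad = good.assign(treat=[0, 1, 0, 0, 0, 0])  # unit A switches back to untreated

print(check_panel(good, "outcome", "treat", "state", "year"))
print(check_panel(bad, "outcome", "treat", "state", "year"))
```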

Stata Correspondence

| Stata | Python |
|---|---|
| bacondecomp y tr | bacondecomp(df, "y", "tr", unit, time) |
| bacondecomp y tr, ddetail | bacondecomp(..., ddetail=True) |
| bacondecomp y tr x1 x2 | bacondecomp(..., x=["x1","x2"]) |
| e(sumdd) | result.summary |
| stub*B, stub*S | result.two_by_two[["estimate","weight"]] |

Validation Against Stata

The following results were produced on a synthetic staggered DiD panel (50 states × 9 years, 4 treatment cohorts: 2001/2003/2005/2007, 14 never-treated states; seed = 42) and cross-validated against Stata's bacondecomp v1.0.5.

The data and Stata do-file are available in tests/stata_verify/.

Branch 1 — no controls, no ddetail

Overall DD: Python = 0.165726 | Stata = 0.16572565

| Comparison type | Python Beta | Python Weight | Stata Beta | Stata Weight |
|---|---|---|---|---|
| Timing groups | 0.172517 | 0.506592 | 0.1725168 | 0.5065923 |
| Never vs timing | 0.158753 | 0.493408 | 0.1587530 | 0.4934077 |

Branch 2 — ddetail (no controls)

Overall DD: Python = 0.165726 | Stata = 0.16572565

All 12 timing-group 2×2 comparisons match to 6 decimal places. Summary:

| Comparison type | Python Beta | Python Weight | Stata Beta | Stata Weight |
|---|---|---|---|---|
| Early vs Late | 0.174499 | 0.204361 | 0.174499* | 0.204361* |
| Late vs Early | 0.171176 | 0.302231 | 0.171176* | 0.302231* |
| Never vs timing | 0.158753 | 0.493408 | 0.1587530 | 0.4934077 |

* Stata reports individual dyad rows; Python summary aggregates identically.

Branch 3 — controlled (FWL, x = log income + unemployment rate)

Overall DD: Python = 0.163864 | Stata = 0.163864

| Comparison type | Python Beta | Python Weight | Stata Beta | Stata Weight |
|---|---|---|---|---|
| Timing groups | 0.172833 | 0.503206 | 0.172832956 | 0.5032063 |
| Never vs timing | 0.159164 | 0.489632 | 0.1591643654 | 0.4896315 |
| Within | −0.144980 | 0.007162 | −0.1449803561 | 0.0071621 |

All three branches replicate Stata output to at least 5 significant figures.
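
As a quick internal consistency check, the summary rows reported above can be recombined: in each branch the weights should sum to one and the weighted sum of the betas should reproduce the overall DD. The sketch below uses only the Python columns from the three tables; the dictionary layout and the helper name are illustrative.

```python
# (overall DD, [(beta, weight), ...]) taken from the Python columns above
branches = {
    "no controls": (0.165726, [(0.172517, 0.506592), (0.158753, 0.493408)]),
    "ddetail": (0.165726, [(0.174499, 0.204361), (0.171176, 0.302231),
                           (0.158753, 0.493408)]),
    "controlled": (0.163864, [(0.172833, 0.503206), (0.159164, 0.489632),
                              (-0.144980, 0.007162)]),
}


def weighted_sum(rows):
    """Sum of beta * weight over the summary rows."""
    return sum(beta * weight for beta, weight in rows)


for name, (overall, rows) in branches.items():
    total_weight = sum(weight for _, weight in rows)
    print(f"{name}: weights sum to {total_weight:.6f}, "
          f"weighted sum = {weighted_sum(rows):.6f}, reported = {overall:.6f}")
```

Small discrepancies in the last digit are expected because the table entries are rounded to six decimal places.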


License

MIT — see LICENSE.
