diffindiff: Python library for convenient Difference-in-Differences Analyses

diffindiff: Difference-in-Differences (DiD) Analysis Python Library

This Python library is designed for performing Difference-in-Differences (DiD) analyses in a convenient way. It allows users to construct datasets, define treatment and control groups, and set treatment periods. DiD model analyses may be conducted with both datasets created by built-in functions and ready-to-use external datasets. Both simultaneous and staggered adoption are supported. The library allows for various extensions, such as two-way fixed effects models, group- or individual-specific effects, and post-treatment periods. Additionally, it includes functions for visualizing results, such as plotting DiD coefficients with confidence intervals and illustrating the temporal evolution of staggered treatments.
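
For orientation, the canonical two-group, two-period DiD estimate is the change in the treatment group's mean outcome minus the change in the control group's mean outcome. The sketch below computes that quantity by hand with the Python standard library on made-up numbers. It does not use diffindiff; the helper name `did_2x2` and the labels `"treatment"`, `"control"`, `"pre"`, and `"post"` are assumptions made only for this illustration.

```python
from statistics import mean


def did_2x2(rows):
    """Return the 2x2 DiD estimate from (group, period, outcome) tuples.

    group is "treatment" or "control"; period is "pre" or "post".
    These labels are hypothetical and used only in this sketch.
    """
    # Collect outcomes per (group, period) cell
    cells = {}
    for group, period, outcome in rows:
        cells.setdefault((group, period), []).append(outcome)
    cell_mean = {key: mean(values) for key, values in cells.items()}

    # Pre-to-post change within each group
    change_treated = cell_mean["treatment", "post"] - cell_mean["treatment", "pre"]
    change_control = cell_mean["control", "post"] - cell_mean["control", "pre"]

    # Difference of the two differences
    return change_treated - change_control


# Made-up toy data: two observations per group and period
toy_rows = [
    ("treatment", "pre", 10.0), ("treatment", "pre", 12.0),
    ("treatment", "post", 15.0), ("treatment", "post", 17.0),
    ("control", "pre", 9.0), ("control", "pre", 11.0),
    ("control", "post", 12.0), ("control", "post", 14.0),
]

did_estimate = did_2x2(toy_rows)
print(did_estimate)
```

In a regression setting, the same quantity corresponds to the coefficient on the interaction of the group and period indicators, which is the form that extensions such as covariates or fixed effects build on.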

Author

Thomas Wieland

Features

  • Data preparation and pre-analysis:
    • Define custom treatment and control groups as well as treatment periods
    • Create ready-to-fit DiD data objects
    • Create predictive counterfactuals
  • DiD analysis:
    • Perform standard DiD analysis
    • Model extensions:
      • Staggered adoption
      • Multiple treatments
      • Two-way fixed effects models
      • Group- or individual-specific treatment effects
      • Group- or individual-specific time trends
      • Including covariates
      • Including after-treatment period
      • Triple Difference (DDD)
      • Own counterfactuals
      • Bonferroni correction for treatment effects
      • Placebo test
  • Visualization:
    • Plot observed and expected time course of treatment and control group
    • Plot expected time course of treatment group and counterfactual
    • Plot model coefficients with confidence intervals
    • Plot individual or group-specific treatment effects with confidence intervals
    • Visualize the temporal evolution of staggered treatments
  • Diagnosis tools:
    • Test for control conditions
    • Test for type of adoption
    • Test whether the panel dataset is balanced
    • Test for parallel trend assumption
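
As an illustration of one of the diagnosis items above: a panel is balanced when every unit is observed exactly once in every time period. The following is a minimal standard-library sketch of that check. The function name `is_balanced_panel` and the toy records are assumptions made for this example; this is not diffindiff's own implementation or API.

```python
from collections import Counter


def is_balanced_panel(observations):
    """Check whether (unit, time) pairs form a balanced panel.

    Balanced here means: each unit appears exactly once in each
    time period that occurs anywhere in the data.
    """
    pair_counts = Counter(observations)
    units = {unit for unit, _ in pair_counts}
    times = {time for _, time in pair_counts}
    # No duplicated (unit, time) pairs ...
    no_duplicates = all(count == 1 for count in pair_counts.values())
    # ... and every unit/time combination is present
    complete = len(pair_counts) == len(units) * len(times)
    return no_duplicates and complete


balanced = [("A", 1), ("A", 2), ("B", 1), ("B", 2)]
unbalanced = [("A", 1), ("A", 2), ("B", 1)]  # unit B lacks period 2

print(is_balanced_panel(balanced), is_balanced_panel(unbalanced))
```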

Literature

  • Baker AC, Larcker DF, Wang CCY (2022) How much should we trust staggered difference-in-differences estimates? Journal of Financial Economics 144(2): 370-395. 10.1016/j.jfineco.2022.01.004
  • Card D, Krueger AB (1994) Minimum Wages and Employment: A Case Study of the Fast Food Industry in New Jersey and Pennsylvania. The American Economic Review 84(4): 772-793. JSTOR
  • de Haas S, Götz G, Heim S (2022) Measuring the effect of COVID‑19‑related night curfews in a bundled intervention within Germany. Scientific Reports 12: 19732. 10.1038/s41598-022-24086-9
  • Goodman-Bacon A (2021) Difference-in-differences with variation in treatment timing. Journal of Econometrics 225(2): 254-277. 10.1016/j.jeconom.2021.03.014
  • Greene WH (2012) Econometric Analysis. Chapter 6.2.5.
  • Goldfarb A, Tucker C, Wang Y (2022) Conducting Research in Marketing with Quasi-Experiments. Journal of Marketing 86(3): 1-19. 10.1177/00222429221082977
  • Isphording IE, Lipfert M, Pestel N (2021) Does re-opening schools contribute to the spread of SARS-CoV-2? Evidence from staggered summer breaks in Germany. Journal of Public Economics 198: 104426. 10.1016/j.jpubeco.2021.104426
  • Li KT, Luo L, Pattabhiramaiah A (2024) Causal Inference with Quasi-Experimental Data. IMPACT at JMR November 13, 2024. AMA
  • Olden A, Moen J (2022) The triple difference estimator. The Econometrics Journal 25(3): 531-553. 10.1093/ectj/utac010
  • Villa JM (2016) diff: Simplifying the estimation of difference-in-differences treatment effects. The Stata Journal 16(1): 52-71. 10.1177/1536867X1601600108
  • von Bismarck-Osten C, Borusyak K, Schönberg U (2022) The role of schools in transmission of the SARS-CoV-2 virus: quasi-experimental evidence from Germany. Economic Policy 37(109): 87–130. 10.1093/epolic/eiac001
  • Wieland T (2024) Assessing the effectiveness of non-pharmaceutical interventions in the SARS-CoV-2 pandemic: results of a natural experiment regarding Baden-Württemberg (Germany) and Switzerland in the second infection wave. Journal of Public Health: From Theory to Practice. 10.1007/s10389-024-02218-x
  • Wooldridge JM (2012) Introductory Econometrics. A Modern Approach. Chapter 13.2.

Examples

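import pandas as pd
from diffindiff.diddata import create_data
# Import path assumed from the package layout; adjust if your installed version differs

curfew_DE = pd.read_csv("data/curfew_DE.csv", sep=";", decimal=",")
# Test dataset: Daily and cumulative COVID-19 infections in German counties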

curfew_data=create_data(
    outcome_data=curfew_DE,
    unit_id_col="county",
    time_col="infection_date",
    outcome_col="infections_cum_per100000",
    treatment_group= 
        curfew_DE.loc[curfew_DE["Bundesland"].isin([9,10,14])]["county"],
    control_group= 
        curfew_DE.loc[~curfew_DE["Bundesland"].isin([9,10,14])]["county"],
    study_period=["2020-03-01", "2020-05-15"],
    treatment_period=["2020-03-21", "2020-05-05"],
    freq="D"
    )
# Creating DiD dataset by defining groups and treatment time

curfew_data.summary()
# Summary of created treatment data

curfew_model = curfew_data.analysis()
# Model analysis of created data

curfew_model.summary()
# Model summary

curfew_model.plot(
    y_label="Cumulative infections per 100,000",
    plot_title="Curfew effectiveness - Groups over time",
    plot_observed=True
    )
# Plot observed vs. predicted (means) separated by group (treatment and control)

curfew_model.plot_effects(
    x_label="Coefficients with 95% CI",
    plot_title="Curfew effectiveness - DiD effects"
    )
# Plot DiD effects (coefficients with confidence intervals)

counties_DE=pd.read_csv("data/counties_DE.csv", sep=";", decimal=",", encoding='latin1')
# Dataset with German county data

curfew_data_withgroups = curfew_data.add_covariates(
    additional_df=counties_DE, 
    unit_col="county",
    time_col=None, 
    variables=["BL"])
# Adding federal state column as covariate

curfew_model_withgroups = curfew_data_withgroups.analysis(
    GTE=True,
    group_by="BL")
# Model analysis with group-specific treatment effects, grouped by federal state

curfew_model_withgroups.summary()
# Model summary

curfew_model_withgroups.plot_group_treatment_effects(
    treatment_group_only=True
    )
# Plot of group-specific treatment effects

See the /tests directory for usage examples of most of the included functions.
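
The examples above read semicolon-separated files that use a decimal comma, hence `sep=";"` and `decimal=","` in `pd.read_csv`. A small self-contained sketch of that parsing step follows. It uses an in-memory string instead of the real `data/curfew_DE.csv`; the two rows and their values are invented for illustration, and only the column names are taken from the example above.

```python
import io

import pandas as pd

# Made-up sample in the same style: ';' separates fields, ',' marks decimals
raw_csv = (
    "county;infection_date;infections_cum_per100000\n"
    "1001;2020-03-01;0,5\n"
    "1001;2020-03-02;1,25\n"
)

sample = pd.read_csv(io.StringIO(raw_csv), sep=";", decimal=",")

# With decimal=",", the outcome column is parsed as float rather than text
print(sample.dtypes)
print(sample["infections_cum_per100000"].sum())
```

Without `decimal=","`, values such as `0,5` would be read as strings, and the outcome column could not be used numerically in a model.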

Installation

To install the package, use pip:

pip install diffindiff
