Dynamic panel GMM estimation replicating Stata's xtabond2

PyXtabond2: Dynamic Panel Data Estimation in Python

pyxtabond2 is a comprehensive Python package for estimating dynamic panel data models using the Generalized Method of Moments (GMM). It aims to faithfully replicate the functionality, matrix algebra, and robust diagnostics of Stata's highly popular xtabond2 command developed by David Roodman.

Ideal for applied econometrics and macroeconomic research, this package bridges the gap between Python's data science ecosystem and advanced dynamic panel methodologies.


🌟 Key Features

  • Difference GMM (Arellano-Bond 1991): Standard first-differenced GMM estimation.
  • System GMM (Blundell-Bond 1998): Combined level and differenced equations for highly persistent series.
  • Interactive Fixed Effects (PCA-GMM): Automatically detect and purge unobserved common factors using Bai & Ng (2002) and Ahn & Horenstein (2013) criteria.
  • Windmeijer (2005) Correction: Finite-sample correction for two-step GMM standard errors, which are otherwise biased downward in small samples.
  • Forward Orthogonal Deviations (FOD): Arellano-Bover (1995) transformation, maximizing sample size in unbalanced panels with gaps.
  • Instrument Collapsing: Prevents instrument proliferation and the weakening of overidentification tests.
  • Comprehensive Diagnostics: Arellano-Bond AR(1)/AR(2) tests, Sargan/Hansen J-tests, and Difference-in-Hansen tests for instrument exogeneity.
  • Direct Export: Export publication-ready tables directly to LaTeX or Microsoft Word.
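
To make the instrument-count point concrete, here is a minimal NumPy sketch of the GMM-style instrument block for a single panel unit in difference GMM, where lags 2 and deeper of y instrument each differenced equation. The function name `gmm_instruments` and its `collapse` flag are hypothetical and chosen for illustration only; they are not part of the pyxtabond2 API, and the package's own construction may differ (for example in column ordering or lag limits).

```python
import numpy as np

def gmm_instruments(y, collapse=False):
    """Instrument block for one unit, using y_{t-2}, y_{t-3}, ... as
    instruments for the differenced equation at period t.

    Hypothetical helper for illustration; not the pyxtabond2 API.
    """
    y = np.asarray(y, dtype=float)
    T = y.shape[0]
    eq_periods = range(2, T)      # differenced equations exist for t = 2..T-1
    n_eq = T - 2

    if collapse:
        # One column per lag distance (2, 3, ...): T-2 columns in total.
        Z = np.zeros((n_eq, n_eq))
        for row, t in enumerate(eq_periods):
            for lag in range(2, t + 1):
                Z[row, lag - 2] = y[t - lag]
        return Z

    # One column per (period, lag) pair: (T-2)(T-1)/2 columns in total.
    n_cols = n_eq * (n_eq + 1) // 2
    Z = np.zeros((n_eq, n_cols))
    col = 0
    for row, t in enumerate(eq_periods):
        n_avail = t - 1           # y_0 .. y_{t-2} are available at period t
        Z[row, col:col + n_avail] = y[:n_avail]
        col += n_avail
    return Z

y = [1.0, 2.0, 3.0, 4.0, 5.0]
Z_full = gmm_instruments(y)
Z_collapsed = gmm_instruments(y, collapse=True)
print(Z_full.shape, Z_collapsed.shape)
```

With T periods, the uncollapsed block grows quadratically in T while the collapsed block grows linearly, which is why collapsing keeps the instrument count manageable in longer panels.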

📦 Installation

pip install pyxtabond2

For export to Word/Excel and example datasets:

pip install "pyxtabond2[export]"

Development install from source:

git clone https://github.com/Kahindo048/pyxtabond2.git
cd pyxtabond2
pip install -e ".[dev,export]"
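
After installing, a quick way to confirm which version is active in the current environment is to query the distribution metadata from the standard library. This is a generic sketch; the helper name `installed_version` is hypothetical and not provided by the package.

```python
from importlib import metadata

def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None

version = installed_version("pyxtabond2")
print(version if version else "pyxtabond2 is not installed in this environment")
```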

🚀 Quick Start

pyxtabond2 comes with integrated example datasets so you can start experimenting immediately.

from pyxtabond2.data_utils import PanelData
from pyxtabond2.api import PyXtabond2
from pyxtabond2.load_data import load_dataset

# 1. Loading the data
# Make sure the df_panel.xlsx file is in the same folder
try:
    df = load_dataset('df_panel.xlsx')
except FileNotFoundError:
    # Stop here: every later step needs df
    raise SystemExit("Error: The file 'df_panel.xlsx' could not be found.")
print(f"Data loaded successfully: {df.shape[0]} observations.")


# 2. Data preparation
panel = PanelData(df, id_col='Country', time_col='Year')
panel.data['L1_Growth'] = panel.get_lag('Growth', 1)
df_ready = panel.data.reset_index()

id_col = 'Country'                # Group identifier (country, firm)
time_col = 'Year'                 # Time identifier
dep_var = 'Growth'                # Dependent variable
# Explanatory variables, including the lagged dependent variable built above
x_vars = ['L1_Growth', 'Capital', 'Labor', 'Wage', 'Investment', 'Ide']
gmm_vars = ['Growth', 'Capital']  # Variables for Arellano-Bond instruments
iv_vars = ['Ide']                 # Variables for standard instruments

modele = PyXtabond2(df_ready,
                    id_col=id_col,
                    time_col=time_col,
                    dep_var=dep_var,
                    x_vars=x_vars,
                    gmm_vars=gmm_vars,
                    iv_vars=iv_vars,
                    model_type='difference')

# 3. Estimation
result = modele.fit()
result.summary()

# 4. Export results for publication
result.to_latex("gmm_results.tex", full_output=False)
result.to_word("gmm_results.docx", full_output=False)
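
For readers who want to see what the lag step in "2. Data preparation" amounts to, the sketch below builds a within-unit lag in plain pandas on toy data. It is an illustration only: the helper `lag_within_group` is hypothetical, it is not how `PanelData.get_lag` is implemented, and the toy values are made up. It matches on the time value rather than on row position, so a gap in the years yields a missing lag instead of silently reusing an older observation.

```python
import pandas as pd

def lag_within_group(df, id_col, time_col, var, k=1):
    """Value of `var` observed k periods earlier for the same unit (NaN if absent).

    Hypothetical helper for illustration; not the pyxtabond2 implementation.
    """
    shifted = df[[id_col, time_col, var]].copy()
    # Move each observation k periods forward so the row for t-k lines up with t.
    shifted[time_col] = shifted[time_col] + k
    shifted = shifted.rename(columns={var: f"L{k}_{var}"})
    merged = df.merge(shifted, on=[id_col, time_col], how="left")
    return merged[f"L{k}_{var}"].to_numpy()

toy = pd.DataFrame({
    "Country": ["A", "A", "A", "B", "B", "B"],
    "Year":    [2000, 2001, 2003, 2000, 2001, 2002],  # country A has no 2002
    "Growth":  [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})
toy["L1_Growth"] = lag_within_group(toy, "Country", "Year", "Growth", k=1)
print(toy)
```

In this toy panel, the 2003 row for country A gets a missing lag because 2002 is not observed, whereas a positional `groupby(...).shift(1)` would have filled it with the 2001 value.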

See examples/example_basique.py for a full workflow with six specifications and comparative export.

Performance

Version 0.2.0 includes vectorized NumPy/Numba implementations of the heaviest routines (panel transforms, instrument construction, GMM engine). On a standard macro panel (about 2,000 observations), a full six-model workflow is roughly 40× faster than v0.1.x.
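
As an illustration of what a loop-free panel transform can look like, the sketch below computes forward orthogonal deviations with NumPy array operations. It assumes a balanced panel stored as an (N, T) array with no gaps; the function name is hypothetical and this is not the package's implementation, which also has to handle unbalanced panels.

```python
import numpy as np

def forward_orthogonal_deviations(Y):
    """Forward orthogonal deviations of a balanced (N, T) panel.

    Each observation is replaced by its deviation from the mean of all
    later observations of the same unit, scaled by sqrt(n / (n + 1)),
    where n is the number of later observations. Returns an (N, T-1) array.
    Hypothetical helper for illustration; not the pyxtabond2 implementation.
    """
    Y = np.asarray(Y, dtype=float)
    T = Y.shape[1]
    # Reverse cumulative sum gives the sum over s >= t; subtract y_t for s > t.
    rev_cumsum = np.cumsum(Y[:, ::-1], axis=1)[:, ::-1]
    future_sum = rev_cumsum - Y
    n_future = np.arange(T - 1, 0, -1)        # T-1, ..., 1 for t = 0..T-2
    future_mean = future_sum[:, :-1] / n_future
    scale = np.sqrt(n_future / (n_future + 1))
    return scale * (Y[:, :-1] - future_mean)

Y = np.array([[1.0, 2.0, 3.0],
              [5.0, 5.0, 5.0]])   # second unit is constant over time
Y_fod = forward_orthogonal_deviations(Y)
print(Y_fod)
```

A unit that is constant over time maps to zeros, which shows that the transform removes unit fixed effects. The scaling keeps the transformed errors uncorrelated with equal variance when the original errors are i.i.d.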

Tests

pytest tests/

📖 References & Methodology

This package implements the algorithms and corrections outlined in the following seminal papers:

Arellano, M., & Bond, S. (1991). Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. The Review of Economic Studies.

Arellano, M., & Bover, O. (1995). Another look at the instrumental variable estimation of error-components models. Journal of Econometrics.

Blundell, R., & Bond, S. (1998). Initial conditions and moment restrictions in dynamic panel data models. Journal of Econometrics.

Windmeijer, F. (2005). A finite sample correction for the variance of linear efficient two-step GMM estimators. Journal of Econometrics.

Roodman, D. (2009). How to do xtabond2: An introduction to difference and system GMM in Stata. The Stata Journal.

Bai, J., & Ng, S. (2002). Determining the number of factors in approximate factor models. Econometrica.

Ahn, S. C., & Horenstein, A. R. (2013). Eigenvalue ratio test for the number of factors. Econometrica.


🤝 Contributing

Contributions, issues, and feature requests are welcome! Feel free to check the issues page on the GitHub repository.

License

MIT License. See LICENSE if included in the distribution.
