Panel data econometrics in Python: Fixed Effects, Random Effects, GMM (Arellano-Bond, Blundell-Bond), Experiment Pattern with Result Containers (Validation, Comparison, Residual), Test Runners & Master Reports, Interactive Visualizations (35 Charts), Professional HTML Reports, Robust Standard Errors (HC, Clustered, Driscoll-Kraay, Newey-West), Comprehensive Diagnostics

PanelBox

Panel Data Econometrics in Python

PanelBox is a comprehensive Python library for panel data econometrics, with 70+ models across 13 families, 50+ diagnostic tests, and 35+ interactive charts. It brings the capabilities of Stata's xtabond2, xtreg, and xtfrontier and of R's plm, splm, and frontier packages to Python through a modern, unified API.

Installation

pip install panelbox

Quick Start

import panelbox as pb

# Load bundled dataset (103 datasets available)
data = pb.datasets.load_grunfeld()

# Fixed Effects model
fe = pb.FixedEffects(
    formula="invest ~ value + capital",
    data=data,
    entity_col="firm",
    time_col="year"
)
results = fe.fit(cov_type='clustered')
print(results.summary())

Model Families

Static Panel Models

| Model | Description |
|---|---|
| PooledOLS | Pooled OLS estimation |
| FixedEffects | Within estimator (entity/time/two-way) |
| RandomEffects | GLS estimation |
| BetweenEstimator | Between-groups estimator |
| FirstDifferenceEstimator | First-difference estimator |
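
To make the within estimator concrete, here is a minimal pure-Python sketch on a made-up two-firm panel. It is an illustration of the textbook technique, not PanelBox's implementation; the helper names `slope` and `within_transform` are hypothetical.

```python
from collections import defaultdict


def slope(x, y):
    """OLS slope of y on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def within_transform(values, groups):
    """Subtract each entity's own mean from its observations."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


# Toy data (made up): y = alpha_firm + 2 * x, with alpha_A = 10 and alpha_B = -5
firm = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [12.0, 14.0, 16.0, 3.0, 5.0, 7.0]

pooled_slope = slope(x, y)  # ignores the firm effects
fe_slope = slope(within_transform(x, firm), within_transform(y, firm))
print(f"pooled: {pooled_slope:.3f}, within: {fe_slope:.3f}")
```

Because the firm effects are correlated with x in this toy data, the pooled slope comes out negative, while the within slope recovers the true value of 2.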

Dynamic Panel GMM

| Model | Description |
|---|---|
| DifferenceGMM | Arellano-Bond (1991) |
| SystemGMM | Blundell-Bond (1998) |
| ContinuousUpdatedGMM | CUE-GMM (Hansen-Heaton-Yaron 1996) |
| BiasCorrectedGMM | Hahn-Kuersteiner (2002) bias correction |

Full diagnostic suite: Hansen J, Sargan, AR(1)/AR(2), Windmeijer correction, instrument ratio monitoring, overfit diagnostics.
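
Why monitor the instrument count? In difference GMM the differenced equation at period t can use y lagged two or more periods as instruments, so the count grows quadratically in T. The sketch below counts only the instruments for the lagged dependent variable. The function names and the ratio definition (instruments divided by groups) are assumptions for illustration, not PanelBox's API.

```python
def ab_instrument_count(n_periods, max_lags=None):
    """Instruments for the lagged dependent variable in difference GMM.

    The differenced equation at period t (t = 3..T) can use y_{t-2}, ..., y_1,
    i.e. t - 2 lags, optionally capped at max_lags.
    """
    total = 0
    for t in range(3, n_periods + 1):
        available = t - 2
        total += available if max_lags is None else min(available, max_lags)
    return total


def instrument_ratio(n_instruments, n_groups):
    """Assumed definition: instruments per cross-sectional group."""
    return n_instruments / n_groups


full = ab_instrument_count(10)                # all available lags
capped = ab_instrument_count(10, max_lags=2)  # at most two lags per period
print(full, capped, instrument_ratio(full, n_groups=30))
```

A common rule of thumb is to keep the instrument count below the number of groups; capping the lag depth is one way to get there.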

Panel VAR

| Model | Description |
|---|---|
| PanelVAR | Panel Vector Autoregression (OLS/GMM) |
| PanelVECM | Panel Vector Error Correction Model |

Includes IRF, FEVD, Granger causality network visualization, lag selection (AIC/BIC/HQIC), and Johansen cointegration rank test.
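
As a reminder of what an IRF measures, the following sketch computes reduced-form (non-orthogonalized) impulse responses for a VAR(1), where the response at horizon h is the coefficient matrix raised to the power h. The coefficient values are made up and the helper names are hypothetical; this is not how PanelBox computes or exposes IRFs.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def impulse_responses(coef, horizons):
    """Return [A^0, A^1, ..., A^horizons] for a VAR(1) y_t = A y_{t-1} + e_t.

    Entry [h][i][j] is the response of variable i, h periods after a unit
    shock to variable j.
    """
    k = len(coef)
    current = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    out = [current]
    for _ in range(horizons):
        current = matmul(coef, current)
        out.append(current)
    return out


A = [[0.5, 0.1], [0.0, 0.4]]  # made-up coefficients
irf = impulse_responses(A, horizons=3)
print(irf[2][0][1])  # response of variable 0 at h=2 to a shock in variable 1
```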

Spatial Models

| Model | Description |
|---|---|
| SpatialLag | Spatial Autoregressive Model (SAR) |
| SpatialError | Spatial Error Model (SEM) |
| SpatialDurbin | Spatial Durbin Model (SDM) |
| GeneralNestingSpatial | General Nesting Spatial (GNS) |
| DynamicSpatialPanel | Dynamic spatial panel models |
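
All of these models are built around a spatial weights matrix W. The sketch below builds a row-standardized W from a made-up neighbor list and computes the spatial lag Wy, the term that enters the SAR model. Function names are hypothetical and this is not PanelBox's API.

```python
def row_standardize(neighbors, n):
    """Build an n x n weights matrix whose rows sum to 1 over each unit's neighbors."""
    w = [[0.0] * n for _ in range(n)]
    for i, js in neighbors.items():
        for j in js:
            w[i][j] = 1.0 / len(js)
    return w


def spatial_lag(w, y):
    """Compute Wy: for each unit, the weighted average of its neighbors' values."""
    return [sum(wij * yj for wij, yj in zip(row, y)) for row in w]


# Three regions on a line: 0 - 1 - 2 (made-up contiguity)
neighbors = {0: [1], 1: [0, 2], 2: [1]}
w = row_standardize(neighbors, 3)
wy = spatial_lag(w, [10.0, 20.0, 60.0])
print(wy)
```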

Stochastic Frontier Analysis

| Model | Description |
|---|---|
| StochasticFrontier | SFA with half-normal, exponential, truncated-normal, gamma |
| FourComponentSFA | Persistent/transient inefficiency decomposition |

JLMS, BC, and Mode efficiency estimators. TFP decomposition and frontier visualization.
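
For reference, the JLMS estimator is the conditional mean E[u | eps] of the inefficiency term given the composed residual. The sketch below implements the standard closed form for a production frontier with half-normal inefficiency (eps = v - u). The function names are hypothetical and the parameter values are made up; it is not PanelBox's implementation.

```python
import math


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def jlms_half_normal(eps, sigma_u, sigma_v):
    """E[u | eps] for a production frontier with u ~ half-normal, v ~ normal."""
    sigma2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / sigma2
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma2)
    z = mu_star / sigma_star
    return mu_star + sigma_star * norm_pdf(z) / norm_cdf(z)


for eps in (-1.0, 0.0, 1.0):
    u_hat = jlms_half_normal(eps, sigma_u=1.0, sigma_v=1.0)
    # exp(-u_hat) is a simple efficiency score; the BC estimator differs slightly
    print(eps, round(u_hat, 4), round(math.exp(-u_hat), 4))
```

More negative residuals map to larger estimated inefficiency, which is the behavior one expects for a production frontier.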

Count Data Models

| Model | Description |
|---|---|
| PoissonFixedEffects | Conditional MLE (Hausman-Hall-Griliches 1984) |
| RandomEffectsPoisson | RE Poisson (Gamma/Normal mixing) |
| NegativeBinomial | NB2 for overdispersion |
| ZeroInflatedPoisson | ZIP model |
| ZeroInflatedNegativeBinomial | ZINB model |
| PPML | Poisson Pseudo-ML (gravity models) |
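
To illustrate what zero inflation means, this sketch writes out the ZIP probability mass function: a point mass at zero mixed with an ordinary Poisson. The function name and parameter values are made up for illustration and are not part of PanelBox.

```python
import math


def zip_pmf(k, lam, pi_zero):
    """P(Y = k) when Y is 0 with probability pi_zero, else Poisson(lam)."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson


lam, pi_zero = 2.0, 0.3
p_zero_poisson = math.exp(-lam)           # share of zeros under plain Poisson
p_zero_zip = zip_pmf(0, lam, pi_zero)     # share of zeros under ZIP
mean_zip = sum(k * zip_pmf(k, lam, pi_zero) for k in range(61))
print(round(p_zero_poisson, 4), round(p_zero_zip, 4), round(mean_zip, 4))
```

The ZIP mean is (1 - pi_zero) * lam, so excess zeros pull the mean down relative to a plain Poisson with the same rate.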

Discrete Choice Models

| Model | Description |
|---|---|
| FixedEffectsLogit | Conditional logit (Chamberlain 1980) |
| RandomEffectsProbit | RE probit with GHQ integration |
| OrderedLogit / OrderedProbit | Ordered choice models |
| MultinomialLogit | Multinomial choice (FE/RE/Pooled) |

Quantile Regression

| Model | Description |
|---|---|
| FixedEffectsQuantile | Koenker (2004) FE quantile regression |
| CanayTwoStep | Canay (2011) two-step estimator |
| LocationScale | MSS (2019) location-scale models |
| DynamicQuantile | Dynamic panel quantile |
| QuantileTreatmentEffects | Quantile treatment effects |
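
Quantile regression estimators minimize the check (pinball) loss rather than squared error. The sketch below shows the loss and verifies, on made-up numbers, that the sample median minimizes it at tau = 0.5. Function names are hypothetical; this is the generic building block, not any of the estimators listed above.

```python
def check_loss(u, tau):
    """Koenker-Bassett check function: u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def quantile_by_check_loss(values, tau):
    """Pick the sample point that minimizes total check loss."""
    return min(
        sorted(values),
        key=lambda c: sum(check_loss(v - c, tau) for v in values),
    )


data = [1.0, 2.0, 3.0, 4.0, 100.0]  # one large outlier
median = quantile_by_check_loss(data, tau=0.5)
print(median, sum(data) / len(data))
```

The outlier drags the mean far to the right but leaves the check-loss minimizer at the middle observation, which is why quantile methods are robust to heavy tails.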

Selection & Censored Models

| Model | Description |
|---|---|
| PanelHeckman | Two-step Heckman (Wooldridge 1995) and MLE |
| PanelIV | Panel IV/2SLS estimation |

Diagnostic Tests (50+)

| Category | Tests |
|---|---|
| Unit Root | LLC, IPS, Fisher, Hadri, Breitung |
| Cointegration | Kao, Pedroni (7 stats), Westerlund (4 stats) |
| Specification | Hausman, Mundlak, RESET, Chow, Davidson-MacKinnon J/Cox |
| Heteroskedasticity | Breusch-Pagan, White, Modified Wald |
| Serial Correlation | Wooldridge AR, Breusch-Godfrey, Baltagi-Wu |
| Cross-Sectional Dependence | Pesaran CD, Frees, Breusch-Pagan LM |
| Spatial | LM Lag/Error (standard + robust), Moran's I, Local LISA |
| GMM | Hansen J, Sargan, AR(1)/AR(2), weak instruments |
| Frontier | LR, Wald, skewness, Vuong, inefficiency presence |
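
As an example of what one of these tests computes, here is the Hausman statistic in the single-coefficient case, where it reduces to a squared difference over a variance difference and is chi-squared with one degree of freedom under the null. The function name and input numbers are made up; the full test works with coefficient vectors and covariance matrices.

```python
import math


def hausman_scalar(b_fe, b_re, var_fe, var_re):
    """Hausman statistic and p-value for one coefficient (chi-squared, 1 df)."""
    var_diff = var_fe - var_re
    if var_diff <= 0:
        raise ValueError("var_fe must exceed var_re for the statistic to be defined")
    stat = (b_fe - b_re) ** 2 / var_diff
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


stat, p_value = hausman_scalar(b_fe=1.0, b_re=0.8, var_fe=0.02, var_re=0.01)
print(round(stat, 3), round(p_value, 4))
```

A small p-value suggests the random effects assumption (effects uncorrelated with regressors) is violated, which favors fixed effects.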

Robust Standard Errors (8 types)

  • HC0-HC3: Heteroskedasticity-consistent (White, leverage-adjusted)
  • Clustered: One-way (entity/time) and two-way (Cameron-Gelbach-Miller 2011)
  • Driscoll-Kraay: Spatial and temporal dependence
  • Newey-West: HAC for serial correlation
  • PCSE: Panel-corrected (Beck-Katz 1995)
  • Spatial HAC: For spatial panel models
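
To show what the HAC correction does, the sketch below computes Bartlett kernel weights and a Newey-West long-run variance for a scalar series. In a regression setting the same weights are applied to the autocovariances of the score. The function names and the toy series are assumptions for illustration, not PanelBox's API.

```python
def bartlett_weights(max_lag):
    """Bartlett kernel weights 1 - lag / (max_lag + 1) for lag = 1..max_lag."""
    return [1.0 - lag / (max_lag + 1) for lag in range(1, max_lag + 1)]


def newey_west_lrv(x, max_lag):
    """Long-run variance: gamma_0 + 2 * sum_lag w_lag * gamma_lag."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]

    def gamma(lag):
        # Sample autocovariance at the given lag (divided by n)
        return sum(d[t] * d[t - lag] for t in range(lag, n)) / n

    lrv = gamma(0)
    for lag, weight in zip(range(1, max_lag + 1), bartlett_weights(max_lag)):
        lrv += 2.0 * weight * gamma(lag)
    return lrv


series = [1.0, -1.0, 1.0, -1.0]  # strongly negatively autocorrelated
print(bartlett_weights(3), newey_west_lrv(series, max_lag=1))
```

For this alternating series the long-run variance is well below the plain variance of 1, showing how ignoring serial correlation can misstate precision in either direction.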

Visualization (35+ interactive charts)

from panelbox.visualization import (
    create_residual_diagnostics,
    create_validation_charts,
    create_comparison_charts,
    create_panel_charts,
    export_charts,
)

# Residual diagnostics (Q-Q, fitted vs residual, scale-location, etc.)
charts = create_residual_diagnostics(results)
export_charts(charts, "diagnostics.html")

# Entity/time effects, between-within decomposition, panel structure
charts = create_panel_charts(results)

Three built-in themes are available: professional, academic, and presentation. Charts can be exported to HTML, JSON, PNG, SVG, or PDF.

Experiment Pattern

The PanelExperiment class provides a factory-based workflow for comparing models:

import panelbox as pb

data = pb.datasets.load_grunfeld()

experiment = pb.PanelExperiment(
    data=data,
    formula="invest ~ value + capital",
    entity_col="firm",
    time_col="year"
)

# Fit and compare models
experiment.fit_all_models(names=['pooled', 'fe', 're'])
comparison = experiment.compare_models(['pooled', 'fe', 're'])
print(f"Best model: {comparison.best_model}")

# Validate specification
validation = experiment.validate_model('fe')
validation.save_html('validation.html', test_type='validation')

# Residual diagnostics
residuals = experiment.analyze_residuals('fe')
print(residuals.summary())

# Master report linking all sub-reports
experiment.save_master_report('report.html', theme='professional', reports=[...])

AutoExperiment (AutoML for Panel Data)

AutoExperiment automates the entire panel data modeling pipeline, from variable transformation and selection to multi-model estimation, econometric validation, and ranking:

from panelbox.autoexperiment import AutoExperiment

auto = AutoExperiment(
    data=data,
    depvar="invest",
    entity_col="firm",
    time_col="year",
    sign_constraints={"value": "+", "capital": "+"},
)
results = auto.run()
print(results.summary())
results.report("report.html")

  • Variable transformations: lags, diffs, logs, growth rates, rolling means, squares
  • Forward stepwise selection by BIC/AIC with sign constraints from economic theory
  • Multi-model comparison: Pooled OLS, Fixed Effects, Random Effects, First Difference
  • Automatic validation: Hausman, Pesaran CD, RESET, Modified Wald, and more
  • Auto-selected standard errors based on diagnostic test results
  • Composite ranking combining fit, diagnostics, theory compliance, and parsimony
  • Data mining warnings when too many combinations are tested
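
To give a feel for forward stepwise selection by BIC with sign constraints, here is a minimal NumPy sketch on simulated data. It uses plain OLS and hypothetical helper names (`ols_bic`, `forward_stepwise`); AutoExperiment's actual algorithm, scoring, and defaults may differ.

```python
import numpy as np


def ols_bic(y, X):
    """Return (coefficients, BIC) for OLS of y on X (X includes the constant)."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    bic = n * np.log(float(resid @ resid) / n) + k * np.log(n)
    return beta, bic


def forward_stepwise(y, candidates, sign_constraints=None):
    """Add one variable at a time while BIC improves and signs match theory."""
    sign_constraints = sign_constraints or {}
    const = np.ones((len(y), 1))
    selected = []
    _, best_bic = ols_bic(y, const)
    while True:
        best_name = None
        for name in candidates:
            if name in selected:
                continue
            trial = selected + [name]
            X = np.column_stack([const] + [candidates[v] for v in trial])
            beta, bic = ols_bic(y, X)
            # Reject the trial if any included coefficient has the wrong sign
            signs_ok = all(
                not (sign_constraints.get(v) == "+" and beta[j] <= 0)
                and not (sign_constraints.get(v) == "-" and beta[j] >= 0)
                for j, v in enumerate(trial, start=1)
            )
            if signs_ok and bic < best_bic:
                best_bic, best_name = bic, name
        if best_name is None:
            return selected, best_bic
        selected.append(best_name)


# Simulated data: x1 has a positive effect, x3 a negative one, x2 is noise
rng = np.random.default_rng(0)
n = 200
data = {name: rng.normal(size=n) for name in ("x1", "x2", "x3")}
y = 1.0 + 2.0 * data["x1"] - 1.0 * data["x3"] + rng.normal(size=n)

selected, bic = forward_stepwise(y, data, sign_constraints={"x1": "+", "x3": "+"})
print(selected)
```

Here x3 genuinely affects y, but because its estimated coefficient is negative and the constraint demands a positive sign, it is never admitted. That is the intended trade-off: theory compliance over raw fit.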

Bundled Datasets (103 datasets)

from panelbox.datasets import load_dataset, list_datasets, list_categories

# Browse categories
print(list_categories())
# ['censored', 'count', 'diagnostics', 'discrete', 'frontier', 'gmm',
#  'marginal_effects', 'production', 'quantile', 'spatial', 'standard_errors',
#  'validation', 'var']

data = load_dataset("healthcare_visits")
grunfeld = load_dataset("grunfeld")

All 80+ example notebooks use load_dataset() and work directly in Google Colab.

Comparison with Other Packages

| Feature | PanelBox | linearmodels | pyfixest | splm (R) |
|---|---|---|---|---|
| Static panel (FE/RE) | 5 models | 5 models | 2 models | - |
| Dynamic GMM | 4 models | - | - | - |
| Spatial models | 5 models | - | - | 4 models |
| Count data | 9 models | - | Poisson | - |
| Discrete choice | 9 models | - | - | - |
| Quantile regression | 8 models | - | - | - |
| Stochastic frontier | 2 models | - | - | - |
| Panel VAR/VECM | 2 models | - | - | - |
| Diagnostic tests | 50+ | ~5 | ~5 | ~10 |
| Interactive charts | 35+ | - | - | - |
| Robust SE types | 8 | 4 | 3 | 2 |
| Bundled datasets | 103 | 10 | 5 | - |

Requirements

  • Python >= 3.9
  • NumPy, Pandas, SciPy, statsmodels, scikit-learn
  • Plotly, Matplotlib, Seaborn (visualization)
  • Numba, Joblib (performance)

See pyproject.toml for full dependency list.

Documentation

Citation

@software{panelbox2026,
  author = {Haase, Gustavo and Dourado, Paulo},
  title = {PanelBox: Panel Data Econometrics in Python},
  year = {2026},
  version = {1.0.0},
  url = {https://github.com/PanelBox-Econometrics-Model/panelbox}
}

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.

License

MIT License - see LICENSE.

Support


Made with care for econometricians and researchers
