Panel data econometrics in Python: Fixed Effects, Random Effects, GMM (Arellano-Bond, Blundell-Bond), Experiment Pattern with Result Containers (Validation, Comparison, Residual), Test Runners & Master Reports, Interactive Visualizations (35 Charts), Professional HTML Reports, Robust Standard Errors (HC, Clustered, Driscoll-Kraay, Newey-West), Comprehensive Diagnostics
PanelBox is a comprehensive Python library for panel data econometrics, with 70+ models across 13 families, 50+ diagnostic tests, and 35+ interactive charts. It brings the capabilities of Stata's xtabond2, xtreg, and xtfrontier and of R's plm, splm, and frontier to Python through a modern, unified API.
Installation
pip install panelbox
Quick Start
import panelbox as pb
# Load bundled dataset (103 datasets available)
data = pb.datasets.load_grunfeld()
# Fixed Effects model
fe = pb.FixedEffects(
    formula="invest ~ value + capital",
    data=data,
    entity_col="firm",
    time_col="year"
)
results = fe.fit(cov_type='clustered')
print(results.summary())
Model Families
Static Panel Models
| Model | Description |
|---|---|
| PooledOLS | Pooled OLS estimation |
| FixedEffects | Within estimator (entity/time/two-way) |
| RandomEffects | GLS estimation |
| BetweenEstimator | Between-groups estimator |
| FirstDifferenceEstimator | First-difference estimator |
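To make the "within estimator" entry concrete, here is a minimal NumPy sketch on simulated data. It is an independent illustration, not PanelBox's implementation: the helper `demean_by_group` and all variable names are assumptions made for this example. It shows that OLS on entity-demeaned data gives the same slope as OLS with one dummy per entity (LSDV).

```python
import numpy as np

rng = np.random.default_rng(0)
n_entities, n_periods = 5, 8
entity = np.repeat(np.arange(n_entities), n_periods)

# Simulated panel: y = alpha_i + 2.0 * x + noise, with x correlated with alpha_i
alpha = rng.normal(size=n_entities)
x = alpha[entity] + rng.normal(size=entity.size)
y = alpha[entity] + 2.0 * x + 0.1 * rng.normal(size=entity.size)


def demean_by_group(v, groups):
    """Subtract each group's mean from its observations (hypothetical helper)."""
    sums = np.bincount(groups, weights=v)
    counts = np.bincount(groups)
    return v - (sums / counts)[groups]


# Within estimator: OLS on entity-demeaned data
x_w = demean_by_group(x, entity)
y_w = demean_by_group(y, entity)
beta_within = (x_w @ y_w) / (x_w @ x_w)

# LSDV: OLS with one dummy per entity; the slope on x is the first coefficient
dummies = (entity[:, None] == np.arange(n_entities)).astype(float)
X_lsdv = np.column_stack([x, dummies])
beta_lsdv = np.linalg.lstsq(X_lsdv, y, rcond=None)[0][0]

print(beta_within, beta_lsdv)
```

The two slopes agree up to floating-point error, which is why demeaning is the usual way to absorb entity effects without building a large dummy matrix.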
Dynamic Panel GMM
| Model | Description |
|---|---|
| DifferenceGMM | Arellano-Bond (1991) |
| SystemGMM | Blundell-Bond (1998) |
| ContinuousUpdatedGMM | CUE-GMM (Hansen-Heaton-Yaron 1996) |
| BiasCorrectedGMM | Hahn-Kuersteiner (2002) bias correction |
Full diagnostic suite: Hansen J, Sargan, AR(1)/AR(2), Windmeijer correction, instrument ratio monitoring, overfit diagnostics.
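As a rough illustration of why instrument counts are worth monitoring, the sketch below counts the lagged-level instruments for the lagged dependent variable in a difference GMM setup. It is a plain-Python sketch under stated assumptions, not PanelBox code: the function name and the `max_per_period` cap are hypothetical, and other regressors would add further instruments. For the differenced equation at period t (t = 3, ..., T), the levels dated 1 through t-2 are available, so the uncapped total is (T-2)(T-1)/2.

```python
def diff_gmm_instrument_count(n_periods, max_per_period=None):
    """Count lagged-level instruments for the lagged dependent variable.

    Hypothetical helper: at period t the levels y_1 .. y_{t-2} are available,
    optionally capped at `max_per_period` instruments per period.
    """
    total = 0
    for t in range(3, n_periods + 1):
        available = t - 2
        if max_per_period is not None:
            available = min(available, max_per_period)
        total += available
    return total


n_groups = 10  # illustrative number of cross-sectional units
full = diff_gmm_instrument_count(7)
capped = diff_gmm_instrument_count(7, max_per_period=2)

print(full, full / n_groups)      # 15 instruments, ratio 1.5
print(capped, capped / n_groups)  # 9 instruments, ratio 0.9
```

A common rule of thumb is to keep the instrument count at or below the number of groups; in this illustrative case the uncapped set exceeds it while the capped set does not.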
Panel VAR
| Model | Description |
|---|---|
| PanelVAR | Panel Vector Autoregression (OLS/GMM) |
| PanelVECM | Panel Vector Error Correction Model |
Includes impulse response functions (IRF), forecast error variance decomposition (FEVD), Granger causality network visualization, lag selection (AIC/BIC/HQIC), and the Johansen cointegration rank test.
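For intuition about impulse responses, here is a minimal NumPy sketch for a VAR(1) with a known coefficient matrix. It is not PanelBox's routine: the function name and the matrix `A` are assumptions for illustration, and the responses are non-orthogonalized (no Cholesky or other identification step). In a VAR(1), the response at horizon h to a unit shock at time 0 is simply the h-th power of the coefficient matrix.

```python
import numpy as np


def var1_impulse_responses(A, horizons):
    """Non-orthogonalized impulse responses of y_t = A y_{t-1} + e_t.

    Returns an array of shape (horizons + 1, k, k); entry [h, i, j] is the
    response of variable i at horizon h to a unit shock in variable j.
    """
    k = A.shape[0]
    out = np.empty((horizons + 1, k, k))
    out[0] = np.eye(k)
    for h in range(1, horizons + 1):
        out[h] = A @ out[h - 1]
    return out


# Illustrative stable coefficient matrix (eigenvalues 0.5 and 0.3)
A = np.array([[0.5, 0.1],
              [0.0, 0.3]])
irf = var1_impulse_responses(A, horizons=10)

print(irf[1])   # equals A
print(irf[10])  # responses have nearly died out
```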
Spatial Models
| Model | Description |
|---|---|
| SpatialLag | Spatial Autoregressive Model (SAR) |
| SpatialError | Spatial Error Model (SEM) |
| SpatialDurbin | Spatial Durbin Model (SDM) |
| GeneralNestingSpatial | General Nesting Spatial (GNS) |
| DynamicSpatialPanel | Dynamic spatial panel models |
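The spatial models above all rest on a spatial weights matrix W. The following NumPy sketch, with a made-up four-unit contiguity structure and illustrative parameter values, shows row standardization and the noise-free SAR reduced form y = (I - rho W)^{-1} X beta. It is an assumption-laden illustration and does not use PanelBox.

```python
import numpy as np

# Binary contiguity for four units arranged on a line: 0 - 1 - 2 - 3
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Row-standardize so that each row of W sums to one
W = adjacency / adjacency.sum(axis=1, keepdims=True)

rho, beta = 0.4, 1.5           # illustrative parameter values
x = np.array([1.0, 2.0, 3.0, 4.0])

# SAR reduced form (noise omitted for clarity): y = (I - rho W)^{-1} (x beta)
y = np.linalg.solve(np.eye(4) - rho * W, beta * x)

# The spatial lag is the weighted average of the neighbours' outcomes
spatial_lag = W @ y
print(y)
print(spatial_lag)
```

Because y appears on both sides of the structural equation, OLS of y on `spatial_lag` and x is inconsistent when noise is present, which is why SAR-type models are typically estimated by maximum likelihood or IV/GMM.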
Stochastic Frontier Analysis
| Model | Description |
|---|---|
| StochasticFrontier | SFA with half-normal, exponential, truncated-normal, gamma |
| FourComponentSFA | Persistent/transient inefficiency decomposition |
Supports the JLMS, BC, and Mode efficiency estimators, plus TFP decomposition and frontier visualization.
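As a sketch of what a JLMS-type estimator computes, the function below evaluates E[u | eps] for a production frontier with normal noise v and half-normal inefficiency u, where eps = v - u. It uses only the standard library and is not PanelBox's implementation: the function name and the parameter values are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def jlms_inefficiency(eps, sigma_u, sigma_v):
    """E[u | eps] for the normal / half-normal production frontier.

    u | eps is N(mu_star, sigma_star^2) truncated at zero, with
    mu_star = -eps * sigma_u^2 / sigma^2 and
    sigma_star = sigma_u * sigma_v / sigma, where sigma^2 = sigma_u^2 + sigma_v^2.
    """
    sigma2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / sigma2
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma2)
    z = mu_star / sigma_star
    return mu_star + sigma_star * _STD_NORMAL.pdf(z) / _STD_NORMAL.cdf(z)


# Illustrative composed residuals and variance parameters
for eps in (-0.5, 0.0, 0.5):
    u_hat = jlms_inefficiency(eps, sigma_u=0.3, sigma_v=0.2)
    print(eps, u_hat, math.exp(-u_hat))  # residual, inefficiency, efficiency score
```

More negative residuals map to larger expected inefficiency, so the implied efficiency score exp(-E[u | eps]) falls as the unit sits further below the frontier.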
Count Data Models
| Model | Description |
|---|---|
| PoissonFixedEffects | Conditional MLE (Hausman-Hall-Griliches 1984) |
| RandomEffectsPoisson | RE Poisson (Gamma/Normal mixing) |
| NegativeBinomial | NB2 for overdispersion |
| ZeroInflatedPoisson | ZIP model |
| ZeroInflatedNegativeBinomial | ZINB model |
| PPML | Poisson Pseudo-ML (gravity models) |
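To show the estimating equations behind Poisson-type models such as PPML, here is a minimal Newton/IRLS sketch in NumPy on simulated data. It is a pooled estimator without fixed effects, written for illustration only; the function name, starting values, and data are assumptions, and it is not how PanelBox fits these models. The first-order condition is X'(y - exp(X beta)) = 0, which requires only a correctly specified conditional mean.

```python
import numpy as np


def poisson_irls(X, y, max_iter=50, tol=1e-10):
    """Solve the Poisson score equations X'(y - exp(X b)) = 0 by Newton steps."""
    # Rough starting values from a log-linear regression
    beta = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)[0]
    for _ in range(max_iter):
        mu = np.exp(X @ beta)
        score = X.T @ (y - mu)
        hessian = X.T @ (X * mu[:, None])  # X' diag(mu) X
        step = np.linalg.solve(hessian, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([0.5, 0.3])
y = rng.poisson(np.exp(X @ true_beta)).astype(float)

beta_hat = poisson_irls(X, y)
print(beta_hat)
```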
Discrete Choice Models
| Model | Description |
|---|---|
| FixedEffectsLogit | Conditional logit (Chamberlain 1980) |
| RandomEffectsProbit | RE probit with GHQ integration |
| OrderedLogit / OrderedProbit | Ordered choice models |
| MultinomialLogit | Multinomial choice (FE/RE/Pooled) |
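The idea behind the conditional (fixed effects) logit can be checked numerically in the two-period case. The standard-library sketch below, with hypothetical function names and made-up values, shows that the probability of the sequence (0, 1) conditional on exactly one success does not depend on the entity effect alpha; it reduces to a logit in the change of the regressor.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def prob_switch_up(alpha, beta, x1, x2):
    """P(y1 = 0, y2 = 1 | y1 + y2 = 1) under a logit with entity effect alpha."""
    p1 = logistic(alpha + beta * x1)
    p2 = logistic(alpha + beta * x2)
    up = (1.0 - p1) * p2      # sequence (0, 1)
    down = p1 * (1.0 - p2)    # sequence (1, 0)
    return up / (up + down)


beta, x1, x2 = 0.8, 0.5, 1.5  # illustrative values
for alpha in (-3.0, 0.0, 2.0):
    print(alpha, prob_switch_up(alpha, beta, x1, x2))

# The same number for every alpha: logistic(beta * (x2 - x1))
print(logistic(beta * (x2 - x1)))
```

Because alpha cancels, only entities whose outcome changes over time contribute to the conditional likelihood.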
Quantile Regression
| Model | Description |
|---|---|
| FixedEffectsQuantile | Koenker (2004) FE quantile regression |
| CanayTwoStep | Canay (2011) two-step estimator |
| LocationScale | MSS (2019) location-scale models |
| DynamicQuantile | Dynamic panel quantile |
| QuantileTreatmentEffects | Quantile treatment effects |
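All of these estimators build on the quantile regression check (pinball) loss. The NumPy sketch below uses hypothetical helper names and a tiny made-up sample, and is not PanelBox code. It shows that minimizing the summed check loss over a constant recovers a sample quantile.

```python
import numpy as np


def check_loss(u, tau):
    """Check loss: tau * u for u >= 0 and (tau - 1) * u for u < 0."""
    return u * (tau - (u < 0))


def quantile_by_check_loss(y, tau):
    """Return the data point that minimizes the summed check loss."""
    losses = [check_loss(y - q, tau).sum() for q in y]
    return y[int(np.argmin(losses))]


y = np.array([1.0, 2.0, 4.0, 7.0, 100.0])
median = quantile_by_check_loss(y, tau=0.5)
upper = quantile_by_check_loss(y, tau=0.9)
print(median, upper)
```

With tau = 0.5 the minimizer is the sample median, which is unaffected by the outlier at 100; quantile regression replaces the constant with a linear index in the regressors.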
Selection & Censored Models
| Model | Description |
|---|---|
| PanelHeckman | Two-step Heckman (Wooldridge 1995) and MLE |
| PanelIV | Panel IV/2SLS estimation |
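Two-step selection corrections add an inverse Mills ratio term, computed from a first-stage probit index, to the outcome equation. The standard-library sketch below shows only that term; the function name is an assumption and the index values are made up, so it illustrates the general idea rather than PanelBox's PanelHeckman.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def inverse_mills_ratio(z):
    """lambda(z) = phi(z) / Phi(z), the mean of a standard normal truncated below at -z."""
    return _STD_NORMAL.pdf(z) / _STD_NORMAL.cdf(z)


# Illustrative first-stage probit index values
for z in (-1.0, 0.0, 1.0, 2.0):
    print(z, inverse_mills_ratio(z))
```

The ratio shrinks toward zero as the selection index grows, so observations that are very likely to be selected receive almost no correction.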
Diagnostic Tests (50+)
| Category | Tests |
|---|---|
| Unit Root | LLC, IPS, Fisher, Hadri, Breitung |
| Cointegration | Kao, Pedroni (7 stats), Westerlund (4 stats) |
| Specification | Hausman, Mundlak, RESET, Chow, Davidson-MacKinnon J/Cox |
| Heteroskedasticity | Breusch-Pagan, White, Modified Wald |
| Serial Correlation | Wooldridge AR, Breusch-Godfrey, Baltagi-Wu |
| Cross-Sectional Dependence | Pesaran CD, Frees, Breusch-Pagan LM |
| Spatial | LM Lag/Error (standard + robust), Moran's I, Local LISA |
| GMM | Hansen J, Sargan, AR(1)/AR(2), weak instruments |
| Frontier | LR, Wald, skewness, Vuong, inefficiency presence |
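As an example of what one of these diagnostics computes, here is a NumPy sketch of the Pesaran CD statistic for a balanced panel of residuals. It is an independent illustration on simulated data, with an assumed function name and no connection to PanelBox's internals. The statistic is sqrt(2T / (N(N-1))) times the sum of the pairwise residual correlations and is approximately standard normal under cross-sectional independence.

```python
import numpy as np


def pesaran_cd(residuals):
    """Pesaran CD statistic for a balanced (N, T) array of residuals."""
    n, t = residuals.shape
    corr = np.corrcoef(residuals)              # N x N pairwise correlations
    upper = corr[np.triu_indices(n, k=1)]      # each pair i < j once
    return np.sqrt(2.0 * t / (n * (n - 1))) * upper.sum()


rng = np.random.default_rng(2)
n_units, n_periods = 10, 50

independent = rng.normal(size=(n_units, n_periods))
common_shock = rng.normal(size=n_periods)
dependent = independent + common_shock         # same shock hits every unit

cd_independent = pesaran_cd(independent)
cd_dependent = pesaran_cd(dependent)
print(cd_independent, cd_dependent)
```

A large absolute value, as in the common-shock case, is evidence against cross-sectional independence and a reason to consider Driscoll-Kraay or similar standard errors.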
Robust Standard Errors (8 types)
- HC0-HC3: Heteroskedasticity-consistent (White, leverage-adjusted)
- Clustered: One-way (entity/time) and two-way (Cameron-Gelbach-Miller 2011)
- Driscoll-Kraay: Spatial and temporal dependence
- Newey-West: HAC for serial correlation
- PCSE: Panel-corrected (Beck-Katz 1995)
- Spatial HAC: For spatial panel models
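To illustrate the sandwich form shared by these estimators, the NumPy sketch below computes a one-way cluster-robust covariance for OLS on simulated data. It omits small-sample corrections and is not PanelBox's implementation; the function name and data are assumptions for this example. When every observation is its own cluster, the formula collapses to HC0.

```python
import numpy as np


def ols_cluster_cov(X, y, clusters):
    """OLS coefficients and one-way cluster-robust covariance (no df correction)."""
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(clusters):
        mask = clusters == g
        score_g = X[mask].T @ resid[mask]      # summed score within cluster g
        meat += np.outer(score_g, score_g)
    return beta, bread @ meat @ bread


rng = np.random.default_rng(1)
n_clusters, per_cluster = 20, 5
clusters = np.repeat(np.arange(n_clusters), per_cluster)
x = rng.normal(size=clusters.size)
cluster_shock = rng.normal(size=n_clusters)[clusters]
y = 1.0 + 0.5 * x + cluster_shock + rng.normal(size=clusters.size)
X = np.column_stack([np.ones_like(x), x])

beta, cov_cluster = ols_cluster_cov(X, y, clusters)
_, cov_hc0 = ols_cluster_cov(X, y, np.arange(len(y)))  # singleton clusters

print(np.sqrt(np.diag(cov_cluster)))  # clustered standard errors
print(np.sqrt(np.diag(cov_hc0)))      # HC0 standard errors
```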
Visualization (35+ interactive charts)
from panelbox.visualization import (
    create_residual_diagnostics,
    create_validation_charts,
    create_comparison_charts,
    create_panel_charts,
    export_charts,
)
# Residual diagnostics (Q-Q, fitted vs residual, scale-location, etc.)
charts = create_residual_diagnostics(results)
export_charts(charts, "diagnostics.html")
# Entity/time effects, between-within decomposition, panel structure
charts = create_panel_charts(results)
Three built-in themes: professional, academic, and presentation. Charts export to HTML, JSON, PNG, SVG, and PDF.
Experiment Pattern
The PanelExperiment class provides a factory-based workflow for comparing models:
import panelbox as pb
data = pb.datasets.load_grunfeld()
experiment = pb.PanelExperiment(
    data=data,
    formula="invest ~ value + capital",
    entity_col="firm",
    time_col="year"
)
# Fit and compare models
experiment.fit_all_models(names=['pooled', 'fe', 're'])
comparison = experiment.compare_models(['pooled', 'fe', 're'])
print(f"Best model: {comparison.best_model}")
# Validate specification
validation = experiment.validate_model('fe')
validation.save_html('validation.html', test_type='validation')
# Residual diagnostics
residuals = experiment.analyze_residuals('fe')
print(residuals.summary())
# Master report linking all sub-reports
experiment.save_master_report('report.html', theme='professional', reports=[...])
AutoExperiment (AutoML for Panel Data)
AutoExperiment automates the panel data modeling pipeline, from variable transformation and selection through multi-model estimation, econometric validation, and ranking:
from panelbox.autoexperiment import AutoExperiment
auto = AutoExperiment(
    data=data,
    depvar="invest",
    entity_col="firm",
    time_col="year",
    sign_constraints={"value": "+", "capital": "+"},
)
results = auto.run()
print(results.summary())
results.report("report.html")
- Variable transformations: lags, diffs, logs, growth rates, rolling means, squares
- Forward stepwise selection by BIC/AIC with sign constraints from economic theory
- Multi-model comparison: Pooled OLS, Fixed Effects, Random Effects, First Difference
- Automatic validation: Hausman, Pesaran CD, RESET, Modified Wald, and more
- Auto-selected standard errors based on diagnostic test results
- Composite ranking combining fit, diagnostics, theory compliance, and parsimony
- Data mining warnings when too many combinations are tested
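The selection step in the list above can be sketched as a forward stepwise search that adds the candidate with the lowest BIC, subject to sign constraints. The NumPy code below is a simplified stand-in under stated assumptions, not AutoExperiment's algorithm: the function names, the pooled OLS fit, and the simulated data are all hypothetical, and BIC is computed as n ln(RSS/n) + k ln(n).

```python
import numpy as np


def ols_bic(X, y):
    """Return (BIC, coefficients) for an OLS fit of y on X."""
    n = len(y)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + X.shape[1] * np.log(n), beta


def forward_stepwise(candidates, y, sign_constraints=None):
    """Greedy forward selection by BIC; candidates maps name -> 1-D array."""
    sign_constraints = sign_constraints or {}
    const = np.ones(len(y))
    selected = []
    best_bic, _ = ols_bic(const[:, None], y)
    while True:
        best_name = None
        for name in candidates:
            if name in selected:
                continue
            trial = selected + [name]
            X = np.column_stack([const] + [candidates[v] for v in trial])
            score, beta = ols_bic(X, y)
            coef = dict(zip(trial, beta[1:]))
            # Reject any fit whose constrained coefficients have the wrong sign
            signs_ok = all(
                (coef[v] > 0) == (sign_constraints[v] == "+")
                for v in trial if v in sign_constraints
            )
            if signs_ok and score < best_bic:
                best_bic, best_name = score, name
        if best_name is None:
            return selected
        selected.append(best_name)


rng = np.random.default_rng(3)
n = 200
candidates = {name: rng.normal(size=n) for name in ("x1", "x2", "x3")}
y = 1.0 + 2.0 * candidates["x1"] + 1.0 * candidates["x2"] + 0.5 * rng.normal(size=n)

unconstrained = forward_stepwise(candidates, y)
constrained = forward_stepwise(candidates, y, sign_constraints={"x2": "-"})
print(unconstrained, constrained)
```

In the constrained run, x2 is excluded because its estimated coefficient is positive while the constraint demands a negative sign. Searching over many specifications in this way inflates the chance of spurious fits, which is the concern behind the data mining warnings.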
Bundled Datasets (103 datasets)
from panelbox.datasets import load_dataset, list_datasets, list_categories
# Browse categories
print(list_categories())
# ['censored', 'count', 'diagnostics', 'discrete', 'frontier', 'gmm',
# 'marginal_effects', 'production', 'quantile', 'spatial', 'standard_errors',
# 'validation', 'var']
data = load_dataset("healthcare_visits")
grunfeld = load_dataset("grunfeld")
All 80+ example notebooks use load_dataset() and work directly in Google Colab.
Comparison with Other Packages
| Feature | PanelBox | linearmodels | pyfixest | splm (R) |
|---|---|---|---|---|
| Static panel (FE/RE) | 5 models | 5 models | 2 models | - |
| Dynamic GMM | 4 models | - | - | - |
| Spatial models | 5 models | - | - | 4 models |
| Count data | 9 models | - | Poisson | - |
| Discrete choice | 9 models | - | - | - |
| Quantile regression | 8 models | - | - | - |
| Stochastic frontier | 2 models | - | - | - |
| Panel VAR/VECM | 2 models | - | - | - |
| Diagnostic tests | 50+ | ~5 | ~5 | ~10 |
| Interactive charts | 35+ | - | - | - |
| Robust SE types | 8 | 4 | 3 | 2 |
| Bundled datasets | 103 | 10 | 5 | - |
Requirements
- Python >= 3.9
- NumPy, Pandas, SciPy, statsmodels, scikit-learn
- Plotly, Matplotlib, Seaborn (visualization)
- Numba, Joblib (performance)
See pyproject.toml for full dependency list.
Documentation
- User Guide - Comprehensive guides for all model families
- API Reference - Full API documentation
- Tutorials - Interactive Jupyter notebooks
- Examples - 80+ example notebooks across all model families
- Theory - Econometric theory guides
- Benchmarks - Validation against R/Stata
Citation
@software{panelbox2026,
  author = {Haase, Gustavo and Dourado, Paulo},
  title = {PanelBox: Panel Data Econometrics in Python},
  year = {2026},
  version = {1.0.0},
  url = {https://github.com/PanelBox-Econometrics-Model/panelbox}
}
Contributing
Contributions are welcome! See CONTRIBUTING.md for guidelines.
License
MIT License - see LICENSE.
Support
Made with care for econometricians and researchers