A Python implementation of Stata's outreg2 for exporting regression results
PyOutreg
A Python implementation of Stata's popular outreg2 command for exporting regression results to Excel and Word formats with publication-quality formatting.
Features
- Regression Export: Export results from statsmodels and linearmodels to Excel (.xlsx) and Word (.docx)
- Model Support: OLS, Fixed Effects, Random Effects, Logit, Probit, IV, Panel Data
- Professional Formatting: Publication-ready tables with significance stars, standard errors
- Model Comparison: Side-by-side comparison of multiple models in single tables
- Customization: Extensive options for decimal places, variable selection, titles, notes
- Summary Statistics: Descriptive statistics and cross-tabulation export
- Ecosystem Integration: Part of the PyStataR ecosystem for comprehensive Stata-like functionality in Python
- Future-Ready: Designed for seamless integration with pdtab, StasPAI, and other statistical tools
Installation
pip install pyoutreg
Integration with Broader Ecosystem
PyOutreg is part of a comprehensive econometric and statistical analysis ecosystem:
PyStataR
The PyOutreg library will be integrated into PyStataR, a comprehensive Python package that bridges Stata and R functionality in Python. PyStataR aims to provide Stata users with familiar commands and workflows while leveraging Python's powerful data science ecosystem.
Key Integration Features:
- Unified Command Interface: PyOutreg's outreg() function will be accessible as ps.outreg() within PyStataR
- Seamless Workflow: Direct integration with PyStataR's regression commands and data manipulation functions
- Consistent Syntax: Stata-like command structure for familiar user experience
- Enhanced Functionality: Combined with other statistical tools for comprehensive analysis
StasPAI
For users interested in AI-powered econometric analysis, StasPAI offers a related project focused on integrating statistical analysis with artificial intelligence methods. StasPAI provides advanced econometric modeling capabilities enhanced by machine learning approaches.
Ecosystem Components:
- PyStataR - Main package integrating PyOutreg and other Stata-like tools
- pdtab - Pandas-based tabulation library for cross-tabulation and summary statistics
- StasPAI - AI-powered econometric analysis and machine learning integration
- PyOutreg - Regression table export functionality (this package)
Future Integration Examples
# Future PyStataR integration
import PyStataR as ps
# Regression analysis with immediate export
ps.regress('wage education experience age', data)
ps.outreg('regression_results.xlsx', title="Wage Analysis")
# Combined workflow
ps.summarize(data)
ps.tabulate('gender region', data)
ps.outreg_compare([model1, model2], 'comparison.xlsx')
Quick Start
Basic Regression Export
import pandas as pd
import statsmodels.api as sm
from pyoutreg import outreg
# Load data and run regression
data = pd.read_csv('your_data.csv')
y = data['wage']
X = sm.add_constant(data[['education', 'experience', 'age']])
result = sm.OLS(y, X).fit()
# Export to Excel with professional formatting
outreg(result, 'regression_results.xlsx',
       title="Wage Regression Analysis",
       ctitle="OLS Model",
       replace=True)
# Export to Word with custom notes
outreg(result, 'regression_results.docx',
       title="Wage Regression Analysis",
       addnote="Standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1",
       replace=True)
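The examples in this section read your_data.csv, which is not part of the package. The following is a minimal sketch for generating a stand-in dataset with the column names used throughout this README. The helper names, value ranges, and wage formula are invented for illustration and do not come from PyOutreg.

```python
import csv
import io
import random

# Column names match those used in the README examples; all values are synthetic.
COLUMNS = ["individual_id", "year", "wage", "education", "experience",
           "age", "employed", "gender", "region"]


def make_demo_rows(n_individuals=50, years=(2018, 2019, 2020), seed=0):
    """Return a list of dicts forming a small balanced panel (hypothetical helper)."""
    rng = random.Random(seed)
    rows = []
    for pid in range(1, n_individuals + 1):
        education = rng.randint(8, 18)
        base_age = rng.randint(22, 55)
        base_exp = rng.randint(0, base_age - 18)
        gender = rng.choice(["female", "male"])
        region = rng.choice(["north", "south", "east", "west"])
        for offset, year in enumerate(years):
            # Occasionally add a year of schooling so education varies over time.
            if rng.random() < 0.2:
                education += 1
            experience = base_exp + offset
            # Invented wage equation plus noise, purely for illustration.
            wage = 10000 + 1500 * education + 300 * experience + rng.gauss(0, 5000)
            rows.append({
                "individual_id": pid, "year": year, "wage": round(wage, 2),
                "education": education, "experience": experience,
                "age": base_age + offset,
                "employed": int(rng.random() < 0.85),
                "gender": gender, "region": region,
            })
    return rows


def write_demo_csv(stream, rows):
    """Write rows as CSV to any text stream (an open file or a StringIO)."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


rows = make_demo_rows()
buffer = io.StringIO()
write_demo_csv(buffer, rows)
header = buffer.getvalue().splitlines()[0]
```

To create the file on disk, pass an open file instead of the buffer, for example `with open('your_data.csv', 'w', newline='') as f: write_demo_csv(f, rows)`.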
Multiple Model Comparison
from pyoutreg import outreg_compare
# Fit multiple models
X1 = sm.add_constant(data[['education']])
model1 = sm.OLS(y, X1).fit()
X2 = sm.add_constant(data[['education', 'experience']])
model2 = sm.OLS(y, X2).fit()
X3 = sm.add_constant(data[['education', 'experience', 'age']])
model3 = sm.OLS(y, X3).fit()
# Compare models side-by-side
outreg_compare(
    [model1, model2, model3],
    'model_comparison.xlsx',
    model_names=['Basic', 'Add Experience', 'Full Model'],
    title='Progressive Model Specification',
    replace=True
)
Panel Data Analysis
import linearmodels.panel as lmp
# Prepare panel data
panel_data = data.set_index(['individual_id', 'year'])
# Fixed Effects Model
dependent = panel_data['wage']
exogenous = panel_data[['education', 'experience']]
fe_model = lmp.PanelOLS(dependent, exogenous, entity_effects=True)
fe_result = fe_model.fit(cov_type='clustered', cluster_entity=True)
# Random Effects Model
re_model = lmp.RandomEffects(dependent, exogenous)
re_result = re_model.fit()
# Compare panel models
outreg_compare(
    [fe_result, re_result],
    'panel_comparison.xlsx',
    model_names=['Fixed Effects', 'Random Effects'],
    title='Panel Data Model Comparison',
    replace=True
)
Logistic Regression with Odds Ratios
# Binary outcome regression
y_binary = data['employed'] # 1=employed, 0=unemployed
X_logit = sm.add_constant(data[['education', 'experience', 'age']])
logit_model = sm.Logit(y_binary, X_logit)
logit_result = logit_model.fit()
# Export coefficients
outreg(logit_result, 'logit_coefficients.xlsx',
       title="Employment Probability Analysis",
       ctitle="Coefficients",
       replace=True)
# Export odds ratios
outreg(logit_result, 'logit_odds_ratios.xlsx',
       title="Employment Probability Analysis",
       ctitle="Odds Ratios",
       eform=True,  # Convert to odds ratios
       replace=True)
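The eform=True option reports exponentiated coefficients. As a rough sketch of the arithmetic involved, the snippet below converts a logit coefficient (a log-odds effect) into an odds ratio. The delta-method standard error is a common convention and is assumed here; it is not a confirmed description of PyOutreg's internals. The helper name and the input numbers are made up.

```python
import math


def to_odds_ratio(coef, std_err):
    """Exponentiate a logit coefficient; standard error via the delta method (assumed)."""
    odds_ratio = math.exp(coef)
    # Delta method: Var(exp(b)) is approximately exp(b)^2 * Var(b).
    or_std_err = odds_ratio * std_err
    return odds_ratio, or_std_err


# Hypothetical coefficient and standard error for `education` from a logit fit.
odds_ratio, or_se = to_odds_ratio(coef=0.25, std_err=0.04)
print(f"odds ratio = {odds_ratio:.3f}, std. err. = {or_se:.3f}")
```

An odds ratio above 1 means the odds of the outcome rise with the regressor, and a value below 1 means they fall.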
Summary Statistics
from pyoutreg import summary_stats
# Basic descriptive statistics
summary_stats(
    data,
    'summary_stats.xlsx',
    variables=['wage', 'education', 'experience', 'age'],
    title="Descriptive Statistics",
    replace=True
)
# Grouped statistics
summary_stats(
    data,
    'grouped_stats.xlsx',
    variables=['wage', 'education'],
    by='gender',  # Group by gender
    title="Statistics by Gender",
    replace=True
)
# Detailed statistics with percentiles
summary_stats(
    data,
    'detailed_stats.xlsx',
    variables=['wage', 'education'],
    detail=True,  # Include percentiles, skewness, kurtosis
    title="Detailed Descriptive Statistics",
    replace=True
)
Cross-tabulation
from pyoutreg import cross_tab
# Cross-tabulation with counts and percentages
cross_tab(
    data,
    'gender',  # Row variable
    'region',  # Column variable
    'crosstab_gender_region.xlsx',
    title="Gender by Region Cross-tabulation",
    replace=True
)
Advanced Customization
# Extensive customization options
outreg(result, 'customized_output.xlsx',
       replace=True,
       title="Wage Regression with Custom Formatting",
       ctitle="Full Model",
       # Decimal control
       dec=3,   # Overall decimal places
       bdec=4,  # Coefficient decimal places
       sdec=5,  # Standard error decimal places
       # Variable selection
       keep=['education', 'experience'],  # Only show these variables
       # drop=['age'],  # Alternative: drop specific variables
       # Additional statistics
       addstat={
           'Mean Wage': data['wage'].mean(),
           'Sample Size': len(data),
           'Data Period': '2010-2020'
       },
       # Notes and formatting
       addnote="Robust standard errors. Data from national survey.",
       font_size=12
)
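The three decimal options interact with each other. Below is a minimal sketch of one plausible resolution rule, in which bdec and sdec override dec when they are given. This mirrors the convention of Stata's outreg2 and is an assumption about PyOutreg's behavior; format_cell is a hypothetical helper, not part of the package.

```python
def format_cell(coef, std_err, dec=3, bdec=None, sdec=None):
    """Return (coefficient text, standard-error text) for one table cell."""
    # Fall back to the overall `dec` when a specific option is not supplied.
    coef_places = dec if bdec is None else bdec
    se_places = dec if sdec is None else sdec
    return f"{coef:.{coef_places}f}", f"({std_err:.{se_places}f})"


default_cell = format_cell(482.13456, 24.72612)                # dec applies to both
custom_cell = format_cell(482.13456, 24.72612, bdec=4, sdec=5)  # specific options win
print(default_cell)
print(custom_cell)
```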
API Reference
Main Functions
outreg(model_result, filename, **options)
Export single regression model to Excel or Word.
Parameters:
- model_result: Fitted regression model (statsmodels or linearmodels)
- filename: Output filename (.xlsx or .docx) or None for preview
- ctitle: Column title for the model
- title: Table title
- replace: Replace existing file (default: False)
- append: Append to existing file (default: False)
- dec/bdec/sdec: Decimal places for overall/coefficients/standard errors
- keep/drop: Variable selection
- addstat: Dictionary of additional statistics
- addnote: Custom notes
- eform: Export odds ratios for logistic regression
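The keep and drop options select which regressors appear in the table. The sketch below shows one way such a filter could work on a list of variable names. Treating the two options as mutually exclusive, and letting keep set the display order, are assumptions made for illustration; select_variables is a hypothetical helper and not a PyOutreg function.

```python
def select_variables(names, keep=None, drop=None):
    """Filter regressor names using either `keep` or `drop` (assumed exclusive)."""
    if keep is not None and drop is not None:
        raise ValueError("use either keep or drop, not both")
    if keep is not None:
        # Follow the order given in `keep`, skipping names the model lacks.
        return [name for name in keep if name in names]
    if drop is not None:
        return [name for name in names if name not in drop]
    return list(names)


names = ["const", "education", "experience", "age"]
kept = select_variables(names, keep=["education", "experience"])
dropped = select_variables(names, drop=["age"])
print(kept)
print(dropped)
```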
outreg_compare(models_list, filename, **options)
Compare multiple models side-by-side.
Parameters:
- models_list: List of fitted regression models
- filename: Output filename or None for preview
- model_names: List of model names
- title: Table title
- Other options same as outreg
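To illustrate what a side-by-side layout involves, here is a minimal sketch that lines up coefficients from several models and leaves cells blank where a model omits a variable. The input format (a dict mapping each variable to a coefficient and standard error) and the build_comparison helper are hypothetical; this shows the layout idea only and is not how outreg_compare is implemented.

```python
def build_comparison(models, model_names):
    """Build table rows from dicts mapping variable -> (coef, std_err)."""
    # Collect variables in first-seen order across all models.
    variables = []
    for model in models:
        for name in model:
            if name not in variables:
                variables.append(name)

    rows = [["Variable", *model_names]]
    for name in variables:
        coef_row, se_row = [name], [""]
        for model in models:
            if name in model:
                coef, std_err = model[name]
                coef_row.append(f"{coef:.3f}")
                se_row.append(f"({std_err:.3f})")
            else:
                # Variable not in this model: leave both cells empty.
                coef_row.append("")
                se_row.append("")
        rows.extend([coef_row, se_row])
    return rows


basic = {"education": (482.135, 24.726)}
extended = {"education": (462.891, 25.018), "experience": (301.274, 18.642)}
table = build_comparison([basic, extended], ["Basic", "Add Experience"])
for row in table:
    print("  ".join(f"{cell:<16}" for cell in row))
```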
summary_stats(data, filename, **options)
Export descriptive statistics.
Parameters:
- data: pandas DataFrame
- filename: Output filename or None for preview
- variables: List of variables to include
- by: Grouping variable
- detail: Include percentiles and distribution statistics
cross_tab(data, row_var, col_var, filename, **options)
Export cross-tabulation table.
Parameters:
- data: pandas DataFrame
- row_var: Row variable name
- col_var: Column variable name
- filename: Output filename or None for preview
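A cross-tabulation counts how often each pair of row and column values occurs. The sketch below computes such counts along with row percentages using only the standard library. Whether cross_tab reports row, column, or overall percentages is not stated here, so the choice of row percentages is an assumption, and cross_counts is a hypothetical helper.

```python
from collections import Counter


def cross_counts(records, row_var, col_var):
    """Return (counts, row percentages) as nested dicts keyed by row then column."""
    pair_counts = Counter((rec[row_var], rec[col_var]) for rec in records)
    row_levels = sorted({row for row, _ in pair_counts})
    col_levels = sorted({col for _, col in pair_counts})

    counts = {
        row: {col: pair_counts.get((row, col), 0) for col in col_levels}
        for row in row_levels
    }
    row_pct = {}
    for row in row_levels:
        total = sum(counts[row].values())
        row_pct[row] = {col: 100.0 * counts[row][col] / total for col in col_levels}
    return counts, row_pct


records = [
    {"gender": "female", "region": "north"},
    {"gender": "female", "region": "south"},
    {"gender": "female", "region": "north"},
    {"gender": "male", "region": "south"},
]
counts, row_pct = cross_counts(records, "gender", "region")
print(counts)
print(row_pct)
```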
Output Examples
Regression Table Output
Variable        Model 1        Model 2        Model 3
education       482.135***     462.891***     458.023***
                (24.726)       (25.018)       (25.134)
experience                     301.274***     287.345***
                               (18.642)       (19.123)
age                                           156.789***
                                              (12.456)
Constant        15234.567***   12845.321***   11234.789***
                (387.234)      (425.178)      (456.234)
Observations    1,000          1,000          1,000
R-squared       0.234          0.287          0.312
F-statistic     152.34         189.45         167.23
*** p<0.01, ** p<0.05, * p<0.1
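The star legend in the note line maps p-values to markers. A minimal sketch of that mapping, using the thresholds from the legend, is shown below; significance_stars is a hypothetical helper, and the strict inequalities follow the "p<" wording of the note.

```python
def significance_stars(p_value):
    """Return the star marker for a p-value: *** p<0.01, ** p<0.05, * p<0.1."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.1:
        return "*"
    return ""


# Hypothetical coefficient and p-value, formatted as in the table above.
cell = f"{482.135:.3f}{significance_stars(0.0004)}"
print(cell)
```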
Summary Statistics Output
Variable      Obs      Mean        Std. Dev.   Min      Max
wage          1,000    45,234.56   12,456.78   15,000   120,000
education     1,000    15.8        2.4         8        25
experience    1,000    12.3        8.9         0        40
age           1,000    35.2        10.1        18       65
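The columns in the table above are standard descriptive statistics. The sketch below computes them for one variable with the standard library. Using the sample standard deviation (n - 1 denominator), as Stata's summarize does, is an assumption about PyOutreg; describe is a hypothetical helper and the input values are made up.

```python
import statistics


def describe(values):
    """Return Obs, Mean, Std. Dev., Min, and Max for a list of numbers."""
    return {
        "Obs": len(values),
        "Mean": statistics.mean(values),
        "Std. Dev.": statistics.stdev(values),  # sample SD, n - 1 denominator
        "Min": min(values),
        "Max": max(values),
    }


education = [12, 16, 14, 18, 16]
stats = describe(education)
print(stats)
```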
Integration with PyStataR
PyOutreg is designed to be integrated into the PyStataR package, which aims to provide comprehensive Stata-like functionality in Python. As part of the broader econometric ecosystem, PyOutreg will work seamlessly with other statistical tools:
# Future integration (planned)
import PyStataR as ps
# Direct regression analysis and export
ps.regress('wage education experience age', data)
ps.outreg('wage_analysis.xlsx', title="Wage Regression Results")
# Summary statistics and cross-tabulation
ps.summarize(data, by='gender')
ps.tabulate('education region', data)
# Advanced model comparison workflow
model1 = ps.regress('wage education', data)
model2 = ps.regress('wage education experience', data)
model3 = ps.regress('wage education experience age', data)
ps.outreg_compare([model1, model2, model3],
                  'progressive_models.xlsx',
                  model_names=['Basic', 'Add Experience', 'Full Model'])
# Integration with other ecosystem tools
ps.pdtab.crosstab(data, 'gender', 'region') # pdtab integration
ps.summary_stats(data, detail=True) # PyOutreg functionality
Integrated Ecosystem Benefits:
- Unified Interface: Single import for all Stata-like functionality
- Seamless Workflow: No need to switch between different packages
- Consistent Documentation: Integrated help system and examples
- Enhanced Performance: Optimized integration between components
Related Projects in the Ecosystem:
- PyStataR: Main integration package providing Stata-like functionality
- pdtab: Pandas-based tabulation library for statistical summaries
- StasPAI: AI-powered econometric analysis with machine learning integration
- PyOutreg: Regression table export (this package)
Documentation
For comprehensive documentation and more examples:
- Tutorial: See tutorial.ipynb for a complete walkthrough
- Examples: Check the examples/ directory for specific use cases
- API Reference: Detailed function documentation
- Tests: tests/ directory contains validation examples
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
📄 License
MIT License
Copyright (c) 2025 Bryce Wang
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.