Stata2Python: A Python econometrics toolkit aligned with Stata 17
Statapy
Statapy is a Python library that reproduces a focused subset of common Stata econometrics workflows. It offers a Stata-like command surface, and its results are validated against Stata 17 for the implemented subset.
This open-source package is a cleaned-up extract of the original development workspace. It keeps the core library, public examples, validation evidence, public datasets, and release-facing documentation, and excludes most internal planning, review, and task-tracking material.
What is included
- Core package code in src/statapy
- Public examples in examples
- Command support documentation in docs/command-support-matrix
- Validation evidence in docs/validation
- Public datasets and validation artifacts in research/data/public and research/results/validation
- A concise user manual in docs/USER_GUIDE.md
Chinese documentation:
Installation
pip install StataFlow
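Python 3.10+ is required. Dependencies: NumPy, pandas, SciPy, PyYAML.
The distribution is installed under the name StataFlow, while the import package is statapy (as used in Quick start).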
For development (editable install from source):
git clone https://github.com/ZhenHaoFu810/StataFlow.git
cd StataFlow
pip install -e .
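Before installing, it can help to confirm that the environment meets the requirements stated above. The following is a minimal sketch using only the standard library; the helper name check_environment is hypothetical and is not part of the package. It assumes the usual import names for the listed dependencies (PyYAML is imported as yaml).

```python
import sys
from importlib.util import find_spec

# Import names for the dependencies listed above (PyYAML is imported as "yaml").
REQUIRED_MODULES = ("numpy", "pandas", "scipy", "yaml")


def check_environment(min_version=(3, 10), modules=REQUIRED_MODULES):
    """Return (python_ok, missing_modules) without importing the modules."""
    python_ok = sys.version_info >= min_version
    missing = [name for name in modules if find_spec(name) is None]
    return python_ok, missing


if __name__ == "__main__":
    python_ok, missing = check_environment()
    print("Python version OK:", python_ok)
    print("Missing modules:", missing or "none")
```

pip normally resolves the dependencies on its own, so this check is mainly useful in locked-down or offline environments.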
Quick start
import pandas as pd
from statapy.compat.stata import regress, reghdfe

# The dataset path is relative to the repository root.
df = pd.read_csv("research/data/public/panel/oos/airfare.csv")

# OLS with robust standard errors.
ols_res = regress(
    df=df,
    y="lfare",
    x=["ldist", "y98", "y99"],
    vce="robust",
)

# Absorb id and year fixed effects; cluster standard errors by id.
hdfe_res = reghdfe(
    df=df,
    y="lfare",
    x=["ldist", "y98##y99"],
    absorb="id year",
    vce="cluster",
    cluster="id",
)
The compat.stata wrappers return stable result schemas for command-style usage. Lower-level estimators remain available in the core package for programmatic workflows.
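For readers unfamiliar with what vce="robust" means in Stata's regress, the sketch below computes a heteroskedasticity-robust (HC1) standard error for a one-regressor model using only the standard library. It illustrates the textbook formula and is not a description of how statapy implements it; the function name ols_hc1 and the returned keys are hypothetical.

```python
import math


def ols_hc1(x, y):
    """Simple regression y = a + b*x with an HC1 robust standard error for b."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    if sxx == 0:
        raise ValueError("x has no variation")
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # HC0 "sandwich" variance of the slope, then the n / (n - k) scaling (k = 2).
    hc0 = sum((d * e) ** 2 for d, e in zip(dx, resid)) / sxx ** 2
    hc1 = hc0 * n / (n - 2)
    return {
        "intercept": intercept,
        "slope": slope,
        "se_slope_robust": math.sqrt(hc1),
    }
```

With several regressors the same idea is expressed in matrix form, which is where NumPy would normally be used.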
Supported command families
Current coverage focuses on validated subsets of:
- regress
- xtreg, fe
- areg
- reghdfe
- ivregress 2sls
- ivreghdfe
- logit
- probit
- poisson
- ppmlhdfe
- did_imputation
- eventstudyinteract
- csdid
- rdrobust
Support is command-specific and subset-specific. Do not assume full Stata parity from a command name alone. Check the command matrices in docs/command-support-matrix.
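Several of the families above (xtreg, fe; areg; reghdfe) rest on absorbing fixed effects before estimation. As background, here is a minimal standard-library sketch of the within transformation, alternated over several groupings until the values stop changing. It shows the general technique only and makes no claim about statapy's internals; demean_by and absorb are hypothetical names.

```python
from collections import defaultdict


def demean_by(values, groups):
    """Subtract each group's mean from its members (one within transformation)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def absorb(values, group_keys, tol=1e-10, max_iter=1000):
    """Alternate the within transformation over each grouping until convergence."""
    current = list(values)
    for _ in range(max_iter):
        previous = current
        for groups in group_keys:
            current = demean_by(current, groups)
        if max(abs(a - b) for a, b in zip(current, previous)) < tol:
            return current
    raise RuntimeError("absorption did not converge")


# Toy balanced panel: value = unit effect + year effect, so nothing is left over.
ids = [1, 1, 2, 2]
years = [0, 1, 0, 1]
values = [11.0, 13.0, 21.0, 23.0]
residual = absorb(values, [ids, years])
```

In practice both the outcome and every regressor are transformed this way, and the regression is then run on the transformed columns.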
Validation evidence
The main public evidence entry points are in docs/validation, with generated artifacts in research/results/validation.
The validation policy is strict field-level comparison against Stata 17 for the implemented subset. Synthetic development tests exist in the codebase, but the public evidence book emphasizes real public-data dual runs.
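A field-level comparison of this kind can be pictured as follows. This is a minimal sketch under stated assumptions: results are flat dictionaries, numeric fields are compared with a tolerance, and everything else must match exactly. The function name compare_fields, the field names, and the tolerance values are all hypothetical and are not taken from the project's validation scripts.

```python
import math


def compare_fields(python_result, stata_result, rel_tol=1e-8, abs_tol=1e-10):
    """Compare two flat result dicts field by field; return a list of mismatches."""
    mismatches = []
    for field in sorted(set(python_result) | set(stata_result)):
        if field not in python_result or field not in stata_result:
            mismatches.append((field, "missing on one side"))
            continue
        a, b = python_result[field], stata_result[field]
        if isinstance(a, (int, float)) and isinstance(b, (int, float)):
            # Numeric fields: allow only a tiny floating-point difference.
            if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
                mismatches.append((field, f"{a!r} != {b!r}"))
        elif a != b:
            # Non-numeric fields (for example option strings) must match exactly.
            mismatches.append((field, f"{a!r} != {b!r}"))
    return mismatches


# Placeholder values for illustration only; these are not real estimation results.
py_side = {"coef_x": 0.5, "n_obs": 100, "vce": "robust"}
stata_side = {"coef_x": 0.5 + 1e-12, "n_obs": 100, "vce": "robust"}
report = compare_fields(py_side, stata_side)
```

An empty report means every field agrees within tolerance; any entry names the field that differs or is missing on one side.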
Repository structure
- src: package source
- examples: runnable examples
- docs: user-facing documentation
- scripts/validation: validation runners and summary builders
- research/data/public: public datasets used for validation
- research/results/validation: generated validation artifacts
Known limitations
- Several community-contributed commands are implemented only as validated subsets, not as full reimplementations.
- Validation evidence is strongest for the documented subset and the included public datasets.
- Some internal development reports were intentionally excluded from this clean package.
For current release notes and known issues: