StataFlow: A Python econometrics toolkit aligned with Stata 17

Project description

StataFlow

StataFlow is a Python library that mirrors a focused subset of common Stata econometrics workflows. It provides a Stata-like command surface and validates its results against Stata 17 for the implemented subset.

This open-source package is a clean release separated from the original development workspace. It keeps the core library, public examples, validation evidence, public datasets, and release-facing documentation, and excludes most internal planning, review, and task-tracking material.

What is included

Chinese documentation:

Installation

pip install StataFlow

Python 3.10+ required. Dependencies: NumPy, pandas, SciPy, PyYAML.
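
One way to confirm that an environment meets these requirements before installing is to check the interpreter version and look up each dependency by its import name. The sketch below is not part of StataFlow: `python_is_supported` and `missing_modules` are hypothetical helpers, and the import names (`numpy`, `pandas`, `scipy`, `yaml`) are the usual ones for the packages listed above.

```python
import importlib.util
import sys

# Import names for the listed dependencies (PyYAML is imported as `yaml`).
REQUIRED_MODULES = ["numpy", "pandas", "scipy", "yaml"]


def python_is_supported(version_info=sys.version_info):
    """Return True when the interpreter is Python 3.10 or newer."""
    return tuple(version_info[:2]) >= (3, 10)


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("Missing modules:", missing_modules(REQUIRED_MODULES))
```

An empty "Missing modules" list together with "Python OK: True" suggests the environment is ready; pip normally installs any missing dependencies automatically.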

For development (editable install from source):

git clone https://github.com/ZhenHaoFu810/StataFlow.git
cd StataFlow
pip install -e .

Quick start

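import pandas as pd
from stataflow.compat.stata import regress, reghdfe

# Public airfare panel; the relative path assumes you run from the repository root.
df = pd.read_csv("research/data/public/panel/oos/airfare.csv")

# OLS with heteroskedasticity-robust standard errors.
ols_res = regress(
    df=df,
    y="lfare",
    x=["ldist", "y98", "y99"],
    vce="robust",
)

# Absorb id and year fixed effects; cluster standard errors by id.
hdfe_res = reghdfe(
    df=df,
    y="lfare",
    x=["ldist", "y98##y99"],
    absorb="id year",
    vce="cluster",
    cluster="id",
)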

The compat.stata wrappers return stable result schemas for command-style usage. Lower-level estimators remain available in the core package for programmatic workflows.
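
The `"y98##y99"` term in the quick start uses Stata's factorial-interaction notation, where `a##b` is shorthand for the main effects of `a` and `b` plus their interaction `a#b`. The sketch below only illustrates that notation. It does not show how StataFlow parses terms internally, and `expand_factorial` and `expand_terms` are hypothetical names.

```python
from itertools import combinations


def expand_factorial(term):
    """Expand a Stata-style `a##b` term into main effects and interactions.

    A term without `##` is returned unchanged as a one-element list.
    """
    parts = term.split("##")
    expanded = []
    # Emit main effects first, then two-way interactions, and so on.
    for size in range(1, len(parts) + 1):
        for combo in combinations(parts, size):
            expanded.append("#".join(combo))
    return expanded


def expand_terms(terms):
    """Expand every term in a regressor list, preserving order."""
    result = []
    for term in terms:
        result.extend(expand_factorial(term))
    return result


print(expand_terms(["ldist", "y98##y99"]))
# ['ldist', 'y98', 'y99', 'y98#y99']
```

In Stata itself, variables joined by `#` or `##` are treated as factor (categorical) variables unless prefixed with `c.`.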

Supported command families

Current coverage focuses on validated subsets of:

  • regress
  • xtreg, fe
  • areg
  • reghdfe
  • ivregress 2sls
  • ivreghdfe
  • logit
  • probit
  • poisson
  • ppmlhdfe
  • did_imputation
  • eventstudyinteract
  • csdid
  • rdrobust

Support is command-specific and subset-specific. Do not assume full Stata parity from a command's name alone. Check the command matrices:

Validation evidence

The main public evidence entry points are:

The validation policy is strict field-level comparison against Stata 17 for the implemented subset. Synthetic development tests exist in the codebase, but the public evidence book emphasizes real public-data dual runs.
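
To make the idea of a field-level comparison concrete, here is a minimal sketch that checks two flat result dictionaries one field at a time. It is an illustration under stated assumptions, not the project's validation harness: `compare_fields`, the field names, the tolerances, and the numbers are all hypothetical.

```python
import math


def compare_fields(python_result, stata_result, rel_tol=1e-8, abs_tol=1e-10):
    """Compare two flat numeric result dictionaries field by field.

    Returns a dict mapping each mismatched field to a
    (python_value, stata_value) pair; an empty dict means full agreement.
    A field present on only one side is reported with None for the other.
    """
    mismatches = {}
    for field in sorted(set(python_result) | set(stata_result)):
        py_val = python_result.get(field)
        st_val = stata_result.get(field)
        if py_val is None or st_val is None:
            mismatches[field] = (py_val, st_val)
        elif not math.isclose(py_val, st_val, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches[field] = (py_val, st_val)
    return mismatches


# Made-up numbers for illustration only; not actual StataFlow or Stata output.
py = {"b_x": 0.5000000001, "se_x": 0.1200, "N": 100}
st = {"b_x": 0.5, "se_x": 0.1250, "N": 100}
print(compare_fields(py, st))
# {'se_x': (0.12, 0.125)}
```

Reporting every mismatched field, rather than stopping at the first, makes it easier to see whether a discrepancy is isolated (for example, only standard errors) or systematic.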

Repository structure

Known limitations

  • Several community commands are validated subsets rather than full command reimplementations.
  • Validation evidence is strongest for the documented subset and the included public datasets.
  • Some internal development reports were intentionally excluded from this clean package.

For current release notes and known issues:
