Python implementation of Stata's egen command for pandas DataFrames

PyEgen

PyEgen is a Python implementation of Stata's egen command for pandas DataFrames. It provides Stata-style data manipulation functions, so researchers can move from Stata to Python while keeping familiar syntax and functionality.

Quick Start

pip install pyegen

import pandas as pd
import pyegen as egen

# Create sample data
df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'B', 'C', 'C'],
    'value': [10, 20, 30, 40, 50, 60]
})

# Generate ranks
df['rank'] = egen.rank(df['value'])

# Calculate group means
df['group_mean'] = egen.mean(df['value'], by=df['group'])

# Row-wise operations
df['row_sum'] = egen.rowtotal(df, ['value'])
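
For readers who know pandas better than Stata, the three Quick Start calls correspond roughly to the operations below. This is a plain-pandas sketch of the Stata semantics (average ranks for ties, a group mean broadcast to every row, a row total that counts missing values as zero), not PyEgen's actual implementation. The `*_pd` column names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'B', 'C', 'C'],
    'value': [10, 20, 30, 40, 50, 60]
})

# Stata: egen rank_pd = rank(value)  (tied values receive the average rank)
df['rank_pd'] = df['value'].rank(method='average')

# Stata: egen group_mean_pd = mean(value), by(group)
# transform() returns one value per row, aligned with the original index
df['group_mean_pd'] = df.groupby('group')['value'].transform('mean')

# Stata: egen row_sum_pd = rowtotal(value)  (missing values count as 0)
df['row_sum_pd'] = df[['value']].sum(axis=1, skipna=True, min_count=0)
```

Comparing these columns with the PyEgen output on your own data is a quick way to confirm that both follow the conventions you expect.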

Available Functions

PyEgen supports 40+ functions covering all major Stata egen capabilities:

Row-wise Functions

  • rowmean(), rowtotal(), rowmax(), rowmin(), rowsd()
  • rowfirst(), rowlast(), rowmedian(), rowmiss(), rownonmiss(), rowpctile()
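
As a rough guide to what the row-wise functions compute, the sketch below reproduces a few of them with plain pandas, following Stata's documented handling of missing values. It does not call PyEgen, and the column names `q1` to `q3` are invented for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'q1': [1.0, np.nan, 3.0],
    'q2': [2.0, np.nan, np.nan],
    'q3': [6.0, np.nan, 5.0],
})
cols = ['q1', 'q2', 'q3']

# rowmean: mean of the non-missing values; missing if every value is missing
row_mean = df[cols].mean(axis=1)

# rowtotal: sum with missing values treated as 0
row_total = df[cols].sum(axis=1)

# rowmiss / rownonmiss: counts of missing and non-missing values per row
row_miss = df[cols].isna().sum(axis=1)
row_nonmiss = df[cols].notna().sum(axis=1)

# rowfirst: first non-missing value scanning left to right
row_first = df[cols].bfill(axis=1).iloc[:, 0]
```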

Statistical Functions

  • rank(), count(), mean(), sum(), max(), min(), sd()
  • median(), mode(), iqr(), kurt(), skew(), mad(), mdev()
  • pc(), pctile(), std(), total()
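
Several of the statistical functions have close pandas counterparts. The following sketch shows plain-pandas versions of `total()` and `sd()` by group, plus `pc()` and `std()` over the whole column, assuming the usual Stata definitions (sample standard deviation with n - 1, percentages on a 0 to 100 scale). The variable names are illustrative and nothing here relies on PyEgen itself.

```python
import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'B'],
    'value': [10.0, 30.0, 20.0, 40.0],
})
g = df.groupby('group')['value']

# total(value), by(group): group sum repeated on every row of the group
group_total = g.transform('sum')

# sd(value), by(group): sample standard deviation (divides by n - 1)
group_sd = g.transform('std')

# pc(value): each value as a percentage of the overall total
pct_of_total = 100 * df['value'] / df['value'].sum()

# std(value): standardized values with mean 0 and standard deviation 1
z = (df['value'] - df['value'].mean()) / df['value'].std()
```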

Utility Functions

  • tag(), group(), seq(), anycount(), anymatch(), anyvalue()
  • concat(), cut(), diff(), ends(), fill()
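
The utility functions `tag()`, `group()`, and `seq()` are the ones Stata users most often ask about when moving to pandas. Here is a minimal pandas-only sketch of the behaviour they are expected to have: one flagged row per group, consecutive integer group IDs in sorted key order, and a running counter within each group. PyEgen's own signatures and edge-case handling may differ.

```python
import pandas as pd

df = pd.DataFrame({'group': ['B', 'A', 'B', 'C', 'A']})

# tag(group): 1 on the first row of each distinct group, 0 elsewhere
tag = (~df.duplicated(subset=['group'])).astype(int)

# group(group): integer IDs 1, 2, ... assigned in sorted order of the key
group_id = df.groupby('group').ngroup() + 1

# seq(), by(group): 1, 2, ... restarting within each group
seq_in_group = df.groupby('group').cumcount() + 1
```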

💡 Migration Recommendation

For new projects, we recommend using the unified PyStataR package which provides a comprehensive suite of Stata-equivalent commands:

pip install py-stata-commands

from py_stata_commands import egen
df['rank_var'] = egen.rank(df['income'])

Why Consider PyStataR?

  • Single installation for all Stata-equivalent commands (tabulate, egen, reghdfe, winsor2)
  • Consistent API across all modules
  • Enhanced documentation and examples
  • Active development and long-term support

PyStataR Repository: https://github.com/brycewang-stanford/PyStataR

Documentation & Examples

For comprehensive examples and function documentation, see:

🔧 Project Status

PyEgen will continue to be maintained for existing users, but new feature development will primarily focus on PyStataR. This ensures:

  • ✅ Bug fixes and compatibility updates for PyEgen
  • ✅ Stable API for existing codebases
  • 🚀 Enhanced features and new capabilities in PyStataR

Installation & Requirements

pip install pyegen

Requirements:

  • Python 3.7+
  • pandas >= 1.3.0
  • numpy >= 1.20.0
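
To check an existing environment against these minimums before installing, something like the sketch below can help. The helper `at_least` is a hypothetical name introduced here, not part of PyEgen, and it compares only the leading numeric components of each version string.

```python
import re
import sys

import numpy as np
import pandas as pd


def at_least(version: str, minimum: tuple) -> bool:
    """Return True if the leading numeric parts of `version` are >= `minimum`."""
    parts = tuple(int(p) for p in re.findall(r'\d+', version)[:len(minimum)])
    # Pad short versions such as "1.3" with zeros so tuples compare correctly
    parts += (0,) * (len(minimum) - len(parts))
    return parts >= minimum


checks = {
    'Python >= 3.7': sys.version_info >= (3, 7),
    'pandas >= 1.3.0': at_least(pd.__version__, (1, 3, 0)),
    'numpy >= 1.20.0': at_least(np.__version__, (1, 20, 0)),
}
for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'too old'}")
```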

🤝 Contributing

We welcome contributions! For major changes, please consider contributing to PyStataR for maximum impact.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🔗 Related Projects

  • PyStataR - Unified Stata-equivalent commands and R functions (recommended for new projects)
  • StatsPAI - Stats + Econometrics + ML + AI + LLMs
