Python implementation of Stata's egen command for pandas DataFrames
PyEgen
Python implementation of Stata's egen command for pandas DataFrames. This package provides Stata-style data manipulation functions, making it easier for researchers to transition from Stata to Python while maintaining familiar syntax and functionality.
Quick Start
pip install pyegen
import pandas as pd
import pyegen as egen
# Create sample data
df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'B', 'C', 'C'],
    'value': [10, 20, 30, 40, 50, 60]
})
# Generate ranks
df['rank'] = egen.rank(df['value'])
# Calculate group means
df['group_mean'] = egen.mean(df['value'], by=df['group'])
# Row-wise operations
df['row_sum'] = egen.rowtotal(df, ['value'])
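To cross-check results against plain pandas, the sketch below shows rough pandas counterparts of the three Quick Start calls. It does not call PyEgen. The mapping is an assumption based on Stata's documented egen behavior (for example, average ranks for ties), not on the PyEgen source.

```python
import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'B', 'C', 'C'],
    'value': [10, 20, 30, 40, 50, 60],
})

# Rank of each value. Stata's egen rank() averages tied ranks by default,
# which corresponds to method='average' (assumed to match PyEgen's default).
pandas_rank = df['value'].rank(method='average')

# Group mean, broadcast back to every row of its group.
pandas_group_mean = df.groupby('group')['value'].transform('mean')

# Row-wise total across the listed columns.
pandas_row_sum = df[['value']].sum(axis=1)
```

If a PyEgen result differs from these, the likely cause is a different default (tie handling or missing-value treatment) rather than a data problem.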
Available Functions
PyEgen supports 40+ functions covering all major Stata egen capabilities:
Row-wise Functions
rowmean(), rowtotal(), rowmax(), rowmin(), rowsd(), rowfirst(), rowlast(), rowmedian(), rowmiss(), rownonmiss(), rowpctile()
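As a reference for what the row-wise functions compute, here is a minimal pandas-only sketch of four of them. The missing-value handling follows Stata's documented egen semantics and is assumed, not verified, to match PyEgen; the column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical survey items with some missing answers
scores = pd.DataFrame({
    'q1': [1.0, 4.0, np.nan],
    'q2': [2.0, np.nan, np.nan],
    'q3': [3.0, 6.0, np.nan],
})
cols = ['q1', 'q2', 'q3']

# rowtotal(): missing values count as 0, so an all-missing row sums to 0
row_total = scores[cols].sum(axis=1)

# rowmean(): mean of the non-missing values; NaN when every value is missing
row_mean = scores[cols].mean(axis=1)

# rowmiss() / rownonmiss(): counts of missing and non-missing values per row
row_miss = scores[cols].isna().sum(axis=1)
row_nonmiss = scores[cols].notna().sum(axis=1)
```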
Statistical Functions
rank(), count(), mean(), sum(), max(), min(), sd(), median(), mode(), iqr(), kurt(), skew(), mad(), mdev(), pc(), pctile(), std(), total()
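The statistical functions mostly reduce to a group-wise statistic broadcast back to each row. The following pandas-only sketch illustrates that pattern for a few of them. It describes the Stata egen definitions (sample standard deviation, percent of total, z-score) and assumes PyEgen follows them.

```python
import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'B'],
    'value': [10.0, 30.0, 20.0, 40.0],
})

by_group = df.groupby('group')['value']

group_total = by_group.transform('sum')    # total() within each group
group_count = by_group.transform('count')  # count() of non-missing values
group_sd = by_group.transform('std')       # sd(): sample standard deviation (ddof=1)

# pc(): each value as a percentage (0 to 100) of the overall total
pct_of_total = 100 * df['value'] / df['value'].sum()

# std(): rescale to mean 0 and standard deviation 1
standardized = (df['value'] - df['value'].mean()) / df['value'].std()
```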
Utility Functions
tag(), group(), seq(), anycount(), anymatch(), anyvalue(), concat(), cut(), diff(), ends(), fill()
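For orientation, this is a pandas-only sketch of what `group()`, `tag()`, and `anymatch()` produce under Stata's definitions. The data and column names are illustrative, and it is an assumption that PyEgen numbers groups in sorted key order as Stata does.

```python
import pandas as pd

df = pd.DataFrame({
    'state': ['NY', 'CA', 'NY', 'CA', 'TX'],
    'year': [2020, 2020, 2020, 2021, 2020],
})
keys = ['state', 'year']

# group(): consecutive ids 1, 2, ... for each distinct key combination,
# assigned in sorted key order
group_id = df.groupby(keys).ngroup() + 1

# tag(): 1 for the first row of each distinct combination, 0 otherwise
tag = (~df.duplicated(subset=keys)).astype(int)

# anymatch(): 1 if the variable equals any of the listed integer values
is_2021 = df['year'].isin([2021]).astype(int)
```

Summing `tag` gives the number of distinct groups, which is the usual reason to create it.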
💡 Migration Recommendation
For new projects, we recommend using the unified PyStataR package which provides a comprehensive suite of Stata-equivalent commands:
pip install py-stata-commands
from py_stata_commands import egen
df['rank_var'] = egen.rank(df['income'])
Why Consider PyStataR?
- Single installation for all Stata-equivalent commands (tabulate, egen, reghdfe, winsor2)
- Consistent API across all modules
- Enhanced documentation and examples
- Active development and long-term support
PyStataR Repository: https://github.com/brycewang-stanford/PyStataR
Documentation & Examples
For comprehensive examples and function documentation, see:
🔧 Project Status
PyEgen will continue to be maintained for existing users, but new feature development will primarily focus on PyStataR. This ensures:
- ✅ Bug fixes and compatibility updates for PyEgen
- ✅ Stable API for existing codebases
- 🚀 Enhanced features and new capabilities in PyStataR
Installation & Requirements
pip install pyegen
Requirements:
- Python 3.7+
- pandas >= 1.3.0
- numpy >= 1.20.0
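pip normally enforces these minimums at install time. For environments assembled another way, a minimal sketch of a manual check follows. The helper names (`version_tuple`, `meets_minimum`, `check_environment`) are hypothetical and not part of PyEgen, and `importlib.metadata` itself needs Python 3.8 or newer.

```python
import sys
from importlib import metadata  # standard library from Python 3.8

MINIMUMS = {'pandas': '1.3.0', 'numpy': '1.20.0'}

def version_tuple(version):
    """Turn '1.3.0' into (1, 3, 0), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split('.')[:3]:
        digits = ''
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:  # pad '1.3' to (1, 3, 0)
        parts.append(0)
    return tuple(parts)

def meets_minimum(installed, minimum):
    """Compare versions numerically, so '1.20.0' counts as newer than '1.3.0'."""
    return version_tuple(installed) >= version_tuple(minimum)

def check_environment():
    """Return a dict of unmet requirements; empty when everything is satisfied."""
    problems = {}
    if sys.version_info < (3, 7):
        problems['python'] = 'Python 3.7+ required'
    for package, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            problems[package] = 'not installed'
            continue
        if not meets_minimum(installed, minimum):
            problems[package] = f'{installed} < {minimum}'
    return problems
```

Calling `check_environment()` before importing the package gives a readable message instead of an import error deep inside pandas or numpy.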
🤝 Contributing
We welcome contributions! For major changes, please consider contributing to PyStataR for maximum impact.
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🔗 Related Projects
Download files
File details
Details for the file pyegen-0.2.0.tar.gz.
File metadata
- Download URL: pyegen-0.2.0.tar.gz
- Upload date:
- Size: 13.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2376b06d0502743ad794a490e64e27bd3e4853645db2fd45177777512deff928 |
| MD5 | ff9ef6eb53442bc4f03e7cabd713d40e |
| BLAKE2b-256 | 06771e9503e0ccdd0213c37e0c3da5dc8883891a2e8157f2409a96e0acb75be1 |
File details
Details for the file pyegen-0.2.0-py3-none-any.whl.
File metadata
- Download URL: pyegen-0.2.0-py3-none-any.whl
- Upload date:
- Size: 10.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b3cce299ebe6506058d49e92e72e544c4a75528cd1aedfadd4a3d2e455af0128 |
| MD5 | ba1aec95b93eba75a9483232b0717bda |
| BLAKE2b-256 | 45b95281b341dbc2d64d8a47a43411185025d4c01f7e419940005eb0911cae3c |