High-performance statistical operations for Polars DataFrames
causers
A high-performance statistical package for Polars (and pandas) DataFrames, powered by Rust.
🚀 Overview
causers provides fast statistical operations for both Polars and pandas DataFrames, with the core computations written in Rust and exposed to Python through PyO3 bindings. It is designed for data scientists and analysts who need production-grade performance without sacrificing ease of use.
✨ Key Features
- 🏎️ High Performance: Linear regression on 1M rows in ~250ms with HC3 standard errors
- 📊 Multiple Regression: Support for multiple covariates with matrix-based OLS
- 🔮 Logistic Regression: Binary outcome regression with Newton-Raphson MLE, fixed effects via Mundlak strategy
- 📈 Robust Standard Errors: HC3 heteroskedasticity-consistent standard errors included
- 🎯 Flexible Models: Optional intercept for fully saturated models
- 🏢 Clustered Standard Errors: Cluster-robust SE for panel/grouped data
- 🔄 Bootstrap Methods: Wild cluster bootstrap (linear) and score bootstrap (logistic)
- 📋 Two-Way Fixed Effects: Panel data regression with entity and time fixed effects
- 🧪 Synthetic DID: Synthetic Difference-in-Differences for causal inference with panel data
- 🎯 Synthetic Control: Classic SC with 4 method variants (traditional, penalized, robust, augmented)
- 🔬 Two-Stage Least Squares (IV): Instrumental variables estimation for causal inference with endogeneity
- 🤖 Double Machine Learning: Debiased/orthogonalized ML for causal inference with cross-fitting
- 🔧 Native Polars Integration: Zero-copy operations on Polars DataFrames
- 🐼 pandas Support: Seamlessly pass pandas DataFrames - automatic conversion with minimal overhead
- 🦀 Rust-Powered: Core computations in Rust for maximum throughput
- 🐍 Pythonic API: Clean, intuitive interface with full type hints
- 🌍 Cross-Platform: Works on Linux, macOS (Intel/ARM), and Windows
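To make the HC3 feature above concrete, here is a minimal, dependency-free sketch of what an HC3 standard error computes for a regression with an intercept and a single covariate. It is an illustration only, not the causers implementation or API: the function name `ols_hc3` is hypothetical, and the closed-form leverage used here applies only to the one-covariate case.

```python
from math import sqrt


def ols_hc3(x, y):
    """Fit y = a + b*x by OLS and return (a, b, HC3 standard error of b).

    Hypothetical helper for illustration; it does not call causers.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length (at least 3)")

    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")

    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar

    var_slope = 0.0
    for xi, yi in zip(x, y):
        resid = yi - (intercept + slope * xi)
        # Leverage of observation i in a simple regression with intercept.
        leverage = 1.0 / n + (xi - x_bar) ** 2 / sxx
        if 1.0 - leverage <= 1e-12:
            raise ValueError("HC3 is undefined when an observation has leverage 1")
        # HC3 rescales each squared residual by 1 / (1 - h_i)^2.
        var_slope += (xi - x_bar) ** 2 * resid ** 2 / (1.0 - leverage) ** 2
    var_slope /= sxx ** 2

    return intercept, slope, sqrt(var_slope)


intercept, slope, se = ols_hc3([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 8.0])
print(f"slope={slope:.3f}, HC3 SE={se:.3f}")
```

The leverage adjustment is what distinguishes HC3 from the unadjusted sandwich estimator: high-leverage observations get their squared residuals inflated, which tends to give more conservative standard errors in small samples.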
📦 Installation
From PyPI (Recommended)
pip install causers
# With pandas support
pip install "causers[pandas]"
From Source (Development)
# Prerequisites: Python 3.8+ and Rust 1.70+
git clone https://github.com/causers/causers.git
cd causers
# Install build dependencies
pip install "maturin>=1.4,<2.0" "polars>=0.52" numpy
# Build and install in development mode
maturin develop --release
Quick Start
For comprehensive examples demonstrating all causers functions with realistic data, see the notebook:
📓 examples/basic_examples.ipynb
The notebook includes:
| Function | Description |
|---|---|
| `dml()` | Double Machine Learning |
| `linear_regression()` | OLS with clustered standard errors |
| `logistic_regression()` | Maximum likelihood with clustered SEs |
| `two_stage_least_squares()` | IV/2SLS for causal inference with endogeneity |
| `synthetic_control()` | Synthetic control method |
| `synthetic_did()` | Synthetic difference-in-differences |
🛠️ Development
Prerequisites
- Python 3.8 or higher
- Rust 1.70 or higher
- Polars 0.52 or higher
Building from Source
# Clone the repository
git clone https://github.com/causers/causers.git
cd causers
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install development dependencies
pip install -e ".[dev]"
# Build the Rust extension
maturin develop --release
Running Tests
# Install test dependencies (required for running tests)
pip install -e ".[test]"
# Run all tests with coverage
pytest tests/ --cov=causers --cov-report=html
# Run specific test categories
pytest tests/test_performance.py -v # Performance benchmarks
pytest tests/test_edge_cases.py -v # Edge case handling
# Run Rust tests
cargo test
Code Quality
# Format Python code
black python/ tests/
# Lint Python code
ruff check python/ tests/
# Type check
mypy python/
# Format Rust code
cargo fmt
# Lint Rust code
cargo clippy
📜 License
MIT License - see LICENSE file for details.
🙏 Acknowledgments
- Polars for the excellent DataFrame library
- PyO3 for seamless Python-Rust integration
- maturin for simplified packaging
🐛 Found a Bug?
Please open an issue with:
- Minimal reproducible example
- Expected vs actual behavior
- Environment details (OS, Python version, etc.)
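One way to gather the environment details for a report is a short script like the sketch below. The helper name `environment_report` and the default package list are assumptions made for illustration; adjust the list to whatever is relevant to your issue.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("causers", "polars", "pandas", "numpy")):
    """Return a plain-text summary of the OS, Python, and package versions."""
    lines = [
        f"OS: {platform.platform()}",
        f"Python: {sys.version.split()[0]}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing.
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)


print(environment_report())
```

Paste the printed output into the issue alongside the reproducible example.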
Made with ❤️ and 🦀 by the causers team
File details
Details for the file causers-0.7.0.tar.gz.
File metadata
- Download URL: causers-0.7.0.tar.gz
- Upload date:
- Size: 921.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.10.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `fd2d62217a18e8fde93a37006fe55e3a4b7bd8b1e74f4e127331391070e2752d` |
| MD5 | `94dd8f7b2d8797d28d5c8c1c4bbc4aea` |
| BLAKE2b-256 | `a690e9afac2843196041ccd4674d9145bce24a6bf79a4c11e6ee2cec6f978edf` |
File details
Details for the file causers-0.7.0-cp38-abi3-manylinux_2_39_x86_64.whl.
File metadata
- Download URL: causers-0.7.0-cp38-abi3-manylinux_2_39_x86_64.whl
- Upload date:
- Size: 4.7 MB
- Tags: CPython 3.8+, manylinux: glibc 2.39+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.10.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `492f9eb069078614e2114f10cb1f7e74f367910febc5c7dae33f202937896cf0` |
| MD5 | `508a69dd4d59a732cbe8bed9e7b76b87` |
| BLAKE2b-256 | `558c89879ee17206e3291a83cd83114d4e741172f52f1a7caea4d214103896e7` |
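A downloaded file can be checked against its published SHA256 digest with the standard library. The sketch below assumes the file has already been downloaded to a local path; the helper names `sha256_of` and `matches_published_sha256` are hypothetical and not part of causers.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sha256(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published value."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example (assumes the sdist is in the current directory):
# matches_published_sha256(
#     "causers-0.7.0.tar.gz",
#     "fd2d62217a18e8fde93a37006fe55e3a4b7bd8b1e74f4e127331391070e2752d",
# )
```

Prefer SHA256 or BLAKE2b-256 for this check; MD5 is listed for completeness but is not collision-resistant.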