High-performance statistical operations for Polars DataFrames
causers
A high-performance statistical package for Polars DataFrames, powered by Rust.
Overview
causers provides fast statistical operations for Polars DataFrames, using Rust for the heavy computation through PyO3 bindings. It is designed for data scientists and analysts who need production-grade performance without sacrificing ease of use.
Key Features
- High Performance: Linear regression on 1M rows in ~250ms with HC3 standard errors
- Multiple Regression: Support for multiple covariates with matrix-based OLS
- Logistic Regression: Binary outcome regression with Newton-Raphson MLE
- Robust Standard Errors: HC3 heteroskedasticity-consistent standard errors included
- Flexible Models: Optional intercept for fully saturated models
- Clustered Standard Errors: Cluster-robust SE for panel/grouped data
- Bootstrap Methods: Wild cluster bootstrap (linear) and score bootstrap (logistic)
- Synthetic DID: Synthetic Difference-in-Differences for causal inference with panel data
- Synthetic Control: Classic SC with 4 method variants (traditional, penalized, robust, augmented)
- Native Polars Integration: Zero-copy operations on Polars DataFrames
- Rust-Powered: Core computations in Rust for maximum throughput
- Pythonic API: Clean, intuitive interface with full type hints
- Production Ready: Comprehensive test coverage, security rating B+
- Cross-Platform: Works on Linux, macOS (Intel/ARM), and Windows
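For readers unfamiliar with HC3, the following is a minimal pure-Python sketch of OLS with HC3 standard errors for one covariate plus an intercept. It is an illustration only, not the causers implementation or API (the package does this work in Rust); `ols_hc3` and the simulated data are hypothetical names chosen for this example.

```python
import math
import random


def ols_hc3(x, y):
    """Fit y = a + b*x by OLS; return (a, b, se_a, se_b) with HC3 standard errors."""
    n = len(x)
    # Entries of X'X for the design matrix X = [1, x]
    sx = sum(x)
    sxx = sum(v * v for v in x)
    det = n * sxx - sx * sx
    # (X'X)^{-1} as a symmetric 2x2 matrix
    i00, i01, i11 = sxx / det, -sx / det, n / det
    # Coefficients: beta = (X'X)^{-1} X'y
    sy = sum(y)
    sxy = sum(xv * yv for xv, yv in zip(x, y))
    a = i00 * sy + i01 * sxy
    b = i01 * sy + i11 * sxy
    # Diagonal of the sandwich (X'X)^{-1} [sum_i x_i x_i' w_i] (X'X)^{-1}
    var_a = var_b = 0.0
    for xv, yv in zip(x, y):
        # (X'X)^{-1} x_i: how observation i loads on the intercept and slope
        c0 = i00 + i01 * xv
        c1 = i01 + i11 * xv
        leverage = c0 + c1 * xv  # h_i = x_i' (X'X)^{-1} x_i
        resid = yv - (a + b * xv)
        w = (resid / (1.0 - leverage)) ** 2  # HC3 rescaling of the squared residual
        var_a += w * c0 * c0
        var_b += w * c1 * c1
    return a, b, math.sqrt(var_a), math.sqrt(var_b)


# Simulated heteroskedastic data with a fixed seed (noise grows with x)
rng = random.Random(0)
x = [rng.uniform(0, 10) for _ in range(500)]
y = [1.0 + 2.0 * xv + rng.gauss(0, 0.5 + 0.3 * xv) for xv in x]

a, b, se_a, se_b = ols_hc3(x, y)
print(f"intercept={a:.3f} (SE {se_a:.3f}), slope={b:.3f} (SE {se_b:.3f})")
```

HC3 divides each residual by (1 - h_i) before squaring, so high-leverage observations contribute more to the variance estimate than they would under the classical formula.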
Installation
From PyPI (Recommended)
pip install causers
From Source (Development)
# Prerequisites: Python 3.8+ and Rust 1.70+
git clone https://github.com/causers/causers.git
cd causers
# Install build dependencies
pip install maturin polars numpy
# Build and install in development mode
maturin develop --release
Quick Start
For comprehensive examples demonstrating all causers functions with realistic data, see the notebook:
examples/basic_examples.ipynb
The notebook includes:
| Function | Description |
|---|---|
| `linear_regression()` | OLS with clustered standard errors |
| `logistic_regression()` | Maximum likelihood with clustered SEs |
| `synthetic_control()` | Abadie-style synthetic control method |
| `synthetic_did()` | Synthetic difference-in-differences |
All examples use reproducible random seeds and include interpretation guidance.
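As background for the maximum-likelihood entry, here is a small pure-Python sketch of Newton-Raphson for a logistic model with an intercept and one covariate, run on seeded synthetic data. It does not call causers and makes no claim about the signature of `logistic_regression()`; the helper names `sigmoid` and `logistic_newton` and the true coefficients (-0.5, 1.5) are assumptions made for this illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def logistic_newton(x, y, max_iter=25, tol=1e-10):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by Newton-Raphson; return (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = 0.0          # score: gradient of the log-likelihood
        h00 = h01 = h11 = 0.0  # X'WX: negative Hessian of the log-likelihood
        for xv, yv in zip(x, y):
            p = sigmoid(b0 + b1 * xv)
            w = p * (1.0 - p)
            g0 += yv - p
            g1 += (yv - p) * xv
            h00 += w
            h01 += w * xv
            h11 += w * xv * xv
        # Newton step: solve the 2x2 system (X'WX) * step = score
        det = h00 * h11 - h01 * h01
        s0 = (h11 * g0 - h01 * g1) / det
        s1 = (h00 * g1 - h01 * g0) / det
        b0 += s0
        b1 += s1
        if max(abs(s0), abs(s1)) < tol:
            break
    return b0, b1


# Seeded synthetic data so the run is reproducible
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(2000)]
y = [1 if rng.random() < sigmoid(-0.5 + 1.5 * xv) else 0 for xv in x]

b0, b1 = logistic_newton(x, y)
print(f"intercept={b0:.3f}, slope={b1:.3f}")
```

The sketch assumes the classes are not perfectly separable; with separable data the Hessian becomes near-singular and the iteration does not converge to finite coefficients.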
Architecture
graph TD
A[Python API] --> B[PyO3 Bridge]
B --> C[Rust Core]
C --> D[Statistical Engine]
E[Polars DataFrame] --> B
D --> F[Results]
F --> A
Project Structure
causers/
├── src/                  # Rust source code
│   ├── lib.rs            # PyO3 bindings and module definition
│   ├── stats.rs          # Linear regression (OLS)
│   ├── logistic.rs       # Logistic regression (MLE)
│   ├── cluster.rs        # Clustered SE and bootstrap
│   ├── linalg.rs         # Linear algebra utilities (faer integration)
│   ├── sdid.rs           # Synthetic Difference-in-Differences
│   └── synth_control.rs  # Synthetic Control methods
├── python/               # Python package
├── tests/                # Comprehensive test suite (193+ tests)
├── examples/             # Usage examples and benchmarks
├── docs/                 # Sphinx documentation
└── scripts/              # Development and build scripts
Development
Prerequisites
- Python 3.8 or higher
- Rust 1.70 or higher
- Polars 0.52 or higher
Building from Source
# Clone the repository
git clone https://github.com/causers/causers.git
cd causers
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install development dependencies
pip install -e ".[dev]"
# Build the Rust extension
maturin develop --release
Running Tests
# Run all tests with coverage
pytest tests/ --cov=causers --cov-report=html
# Run specific test categories
pytest tests/test_performance.py -v # Performance benchmarks
pytest tests/test_edge_cases.py -v # Edge case handling
# Run Rust tests
cargo test
Code Quality
# Format Python code
black python/ tests/
# Lint Python code
ruff check python/ tests/
# Type check
mypy python/
# Format Rust code
cargo fmt
# Lint Rust code
cargo clippy
License
MIT License. See the LICENSE file for details.
Acknowledgments
- Polars for the excellent DataFrame library
- PyO3 for seamless Python-Rust integration
- maturin for simplified packaging
Resources
Found a Bug?
Please open an issue with:
- Minimal reproducible example
- Expected vs actual behavior
- Environment details (OS, Python version, etc.)
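The environment details can be gathered with the standard library alone. The snippet below is one possible way to do it; `environment_report` is a hypothetical helper written for this example, not part of causers, and the package list is an assumption based on the dependencies named in the installation steps.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("causers", "polars", "numpy")):
    """Return a short plain-text summary of the OS, Python, and package versions."""
    lines = [
        f"OS: {platform.platform()}",
        f"Python: {sys.version.split()[0]}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)


print(environment_report())
```

Pasting the printed text into the issue covers the OS, the Python version, and the installed package versions in one step.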
Made with ❤️ and 🦀 by the causers team
Download files
File details
Details for the file causers-0.6.0.tar.gz.
File metadata
- Download URL: causers-0.6.0.tar.gz
- Upload date:
- Size: 900.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.10.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 65ba5c5595f6942776e8d121bfb6e9101f3df212346f8c74b4828f03e24d971c |
| MD5 | 0727a1d7f29c675162831162a779efc6 |
| BLAKE2b-256 | 375097eeb87764a32f39df96bcbc171da9767c2bd0787c1632e6248b57c8fec2 |
File details
Details for the file causers-0.6.0-cp38-abi3-manylinux_2_39_x86_64.whl.
File metadata
- Download URL: causers-0.6.0-cp38-abi3-manylinux_2_39_x86_64.whl
- Upload date:
- Size: 4.6 MB
- Tags: CPython 3.8+, manylinux: glibc 2.39+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.10.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a7477193d261255708a803ba470e91d7102d6bfdbce007055dc2c73f452262ed |
| MD5 | 5b35d3c1228cf41f260f1e3122c855a7 |
| BLAKE2b-256 | 8ee7cd1db50cc5125240597748e53ecf822bd396117d98aa1e6fc0824ebbb36f |