High-performance statistical operations for Polars DataFrames

causers

License: MIT | Coverage: 100%

A high-performance statistical package for Polars (and pandas) DataFrames, powered by Rust.

🚀 Overview

causers provides fast statistical operations for both Polars and pandas DataFrames, with the core computations written in Rust and exposed to Python through PyO3 bindings. It is designed for data scientists and analysts who need production-grade performance without sacrificing ease of use.

✨ Key Features

  • 🏎️ High Performance: Linear regression on 1M rows in ~250ms with HC3 standard errors
  • 📊 Multiple Regression: Support for multiple covariates with matrix-based OLS
  • 🔮 Logistic Regression: Binary outcome regression with Newton-Raphson MLE, fixed effects via Mundlak strategy
  • 📈 Robust Standard Errors: HC3 heteroskedasticity-consistent standard errors included
  • 🎯 Flexible Models: Optional intercept for fully saturated models
  • 🏢 Clustered Standard Errors: Cluster-robust SE for panel/grouped data
  • 🔄 Bootstrap Methods: Wild cluster bootstrap (linear) and score bootstrap (logistic)
  • 📋 Two-Way Fixed Effects: Panel data regression with entity and time fixed effects
  • 🧪 Synthetic DID: Synthetic Difference-in-Differences for causal inference with panel data
  • 🎯 Synthetic Control: Classic SC with 4 method variants (traditional, penalized, robust, augmented)
  • 🔬 Two-Stage Least Squares (IV): Instrumental variables estimation for causal inference with endogeneity
  • 🤖 Double Machine Learning: Debiased/orthogonalized ML for causal inference with cross-fitting
  • 🔧 Native Polars Integration: Zero-copy operations on Polars DataFrames
  • 🐼 pandas Support: Seamlessly pass pandas DataFrames - automatic conversion with minimal overhead
  • 🦀 Rust-Powered: Core computations in Rust for maximum throughput
  • 🐍 Pythonic API: Clean, intuitive interface with full type hints
  • 🌍 Cross-Platform: Works on Linux, macOS (Intel/ARM), and Windows
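
To make the HC3 standard errors from the feature list concrete, here is a minimal pure-Python sketch for the simplest case: a regression with an intercept and a single covariate. It illustrates the textbook HC3 sandwich formula and is not the Rust implementation inside causers. The function name `ols_hc3` is hypothetical and is not part of the causers API.

```python
import math


def ols_hc3(x, y):
    """Fit y = a + b*x by OLS; return (a, b, se_a, se_b) with HC3 standard errors.

    Hypothetical helper for illustration only, not a causers function.
    """
    n = len(x)
    # Entries of X'X for the design matrix X = [1, x]
    sx = sum(x)
    sxx = sum(v * v for v in x)
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("x must contain at least two distinct values")

    # (X'X)^-1 written out for the 2x2 case
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]

    # OLS coefficients: beta = (X'X)^-1 X'y
    sy = sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    a = inv[0][0] * sy + inv[0][1] * sxy
    b = inv[1][0] * sy + inv[1][1] * sxy

    # "Meat" of the sandwich: sum_i w_i * x_i x_i', with w_i = (e_i / (1 - h_ii))^2
    m00 = m01 = m11 = 0.0
    for xi, yi in zip(x, y):
        e = yi - (a + b * xi)  # residual
        h = inv[0][0] + 2 * inv[0][1] * xi + inv[1][1] * xi * xi  # leverage h_ii
        if 1.0 - h < 1e-12:
            raise ValueError("an observation has leverage 1; HC3 is undefined")
        w = (e / (1.0 - h)) ** 2
        m00 += w
        m01 += w * xi
        m11 += w * xi * xi

    def sandwich_diag(row):
        # Diagonal entry of (X'X)^-1 M (X'X)^-1 for the given coefficient
        v0, v1 = inv[row]
        return v0 * v0 * m00 + 2 * v0 * v1 * m01 + v1 * v1 * m11

    return a, b, math.sqrt(sandwich_diag(0)), math.sqrt(sandwich_diag(1))


# Small made-up data set, for illustration only
a, b, se_a, se_b = ols_hc3([0, 1, 2, 3, 4], [1.0, 2.9, 5.2, 6.8, 9.1])
print(f"intercept={a:.3f} (se {se_a:.3f}), slope={b:.3f} (se {se_b:.3f})")
```

HC3 divides each residual by one minus its leverage before squaring, which inflates the contribution of high-leverage points relative to the unadjusted (HC0) estimator.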

📦 Installation

From PyPI (Recommended)

pip install causers

# With pandas support
pip install "causers[pandas]"  # quotes keep shells such as zsh from expanding the brackets

From Source (Development)

# Prerequisites: Python 3.8+ and Rust 1.70+
git clone https://github.com/causers/causers.git
cd causers

# Install build dependencies
pip install "maturin>=1.4,<2.0" "polars>=0.52" numpy

# Build and install in development mode
maturin develop --release
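
After installing, a quick sanity check along these lines can confirm that the package imports. This is a sketch under two assumptions: that the import name is `causers` (matching the PyPI name), and that the functions named in this README are exposed as top-level attributes. `check_install` is a hypothetical helper, not part of causers.

```python
import importlib
import importlib.util

# Function names taken from this README; whether they are exposed at the
# top level of the package is an assumption.
EXPECTED = [
    "dml",
    "linear_regression",
    "logistic_regression",
    "two_stage_least_squares",
    "synthetic_control",
    "synthetic_did",
]


def check_install(module_name="causers", expected=EXPECTED):
    """Return None if the module is not installed, else {name: is_exposed}."""
    if importlib.util.find_spec(module_name) is None:
        return None
    module = importlib.import_module(module_name)
    return {name: hasattr(module, name) for name in expected}


result = check_install()
if result is None:
    print("causers is not installed in this environment")
else:
    for name, present in result.items():
        print(f"{name}: {'found' if present else 'missing'}")
```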

Quick Start

For comprehensive examples demonstrating all causers functions with realistic data, see the notebook:

📓 examples/basic_examples.ipynb

The notebook includes:

Function                    Description
dml()                       Double Machine Learning
linear_regression()         OLS with clustered standard errors
logistic_regression()       Maximum likelihood with clustered SEs
two_stage_least_squares()   IV/2SLS for causal inference with endogeneity
synthetic_control()         Synthetic control method
synthetic_did()             Synthetic difference-in-differences
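
For intuition about what an IV/2SLS estimator computes, the sketch below runs the two stages by hand in pure Python for one endogenous regressor and one instrument. It does not call causers, and it makes no claim about the signature of `two_stage_least_squares()`. The helper names `simple_ols` and `tsls_sketch` and the toy data are hypothetical.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def tsls_sketch(y, x, z):
    """Two-stage least squares with one endogenous regressor x and one instrument z."""
    # Stage 1: project the endogenous regressor onto the instrument
    a1, b1 = simple_ols(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    # Stage 2: regress the outcome on the fitted values from stage 1
    return simple_ols(x_hat, y)


# Toy data: u is an unobserved confounder that is uncorrelated with z.
# By construction, the true effect of x on y is 2.0.
z = [0.0, 1.0, 2.0, 3.0]
u = [1.0, -1.0, -1.0, 1.0]
x = [zi + ui for zi, ui in zip(z, u)]
y = [2.0 * xi + 3.0 * ui for xi, ui in zip(x, u)]

_, naive_slope = simple_ols(x, y)   # biased upward by the confounder (about 3.33)
_, iv_slope = tsls_sketch(y, x, z)  # recovers about 2.0
print(f"naive OLS slope: {naive_slope:.3f}, 2SLS slope: {iv_slope:.3f}")
```

The sketch returns point estimates only. Standard errors taken from a plain second-stage OLS would be invalid, because they ignore that the stage-one fitted values are themselves estimated.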

🛠️ Development

Prerequisites

  • Python 3.8 or higher
  • Rust 1.70 or higher
  • Polars 0.52 or higher

Building from Source

# Clone the repository
git clone https://github.com/causers/causers.git
cd causers

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install development dependencies
pip install -e ".[dev]"

# Build the Rust extension
maturin develop --release

Running Tests

# Install test dependencies (required for running tests)
pip install -e ".[test]"

# Run all tests with coverage
pytest tests/ --cov=causers --cov-report=html

# Run specific test categories
pytest tests/test_performance.py -v  # Performance benchmarks
pytest tests/test_edge_cases.py -v   # Edge case handling

# Run Rust tests
cargo test

Code Quality

# Format Python code
black python/ tests/

# Lint Python code
ruff check python/ tests/

# Type check
mypy python/

# Format Rust code
cargo fmt

# Lint Rust code
cargo clippy

📜 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments

  • Polars for the excellent DataFrame library
  • PyO3 for seamless Python-Rust integration
  • maturin for simplified packaging

🐛 Found a Bug?

Please open an issue with:

  • Minimal reproducible example
  • Expected vs actual behavior
  • Environment details (OS, Python version, etc.)

Made with ❤️ and 🦀 by the causers team
