High-performance statistical operations for Polars DataFrames

causers

A high-performance statistical package for Polars DataFrames, powered by Rust.

🚀 Overview

causers provides fast statistical operations for Polars DataFrames, running its core computations in Rust and exposing them to Python through PyO3 bindings. It is designed for data scientists and analysts who need production-grade performance without sacrificing ease of use.

✨ Key Features

  • 🏎️ High Performance: Linear regression on 1M rows in ~250ms with HC3 standard errors
  • 📊 Multiple Regression: Support for multiple covariates with matrix-based OLS
  • 🔮 Logistic Regression: Binary outcome regression with Newton-Raphson MLE
  • 📈 Robust Standard Errors: HC3 heteroskedasticity-consistent standard errors included
  • 🎯 Flexible Models: Optional intercept for fully saturated models
  • 🏢 Clustered Standard Errors: Cluster-robust SE for panel/grouped data
  • 🔄 Bootstrap Methods: Wild cluster bootstrap (linear) and score bootstrap (logistic)
  • 🧪 Synthetic DID: Synthetic Difference-in-Differences for causal inference with panel data
  • 🎯 Synthetic Control: Classic SC with 4 method variants (traditional, penalized, robust, augmented)
  • 🔧 Native Polars Integration: Zero-copy operations on Polars DataFrames
  • 🦀 Rust-Powered: Core computations in Rust for maximum throughput
  • 🐍 Pythonic API: Clean, intuitive interface with full type hints
  • 🛡️ Production Ready: Comprehensive test coverage, security rating B+
  • 🌍 Cross-Platform: Works on Linux, macOS (Intel/ARM), and Windows
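
To make the HC3 bullet above concrete, here is a minimal pure-Python sketch of OLS with HC3 standard errors for a single covariate plus an intercept. It shows the textbook estimator only: each squared residual is scaled by 1 / (1 - h_i)^2, where h_i is the observation's leverage. It is not causers' Rust implementation or its API; the helper name `ols_hc3` and the sample data are hypothetical.

```python
import math


def ols_hc3(x, y):
    """Fit y = b0 + b1 * x by OLS; return ((b0, b1), (se_b0, se_b1)) with HC3 SEs.

    Hypothetical helper for illustration; not part of the causers API.
    """
    n = len(x)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))

    # (X'X)^{-1} for the design matrix whose rows are (1, x_i)
    det = n * sxx - sx * sx
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]

    # beta = (X'X)^{-1} X'y
    b0 = inv[0][0] * sy + inv[0][1] * sxy
    b1 = inv[1][0] * sy + inv[1][1] * sxy

    # "Meat" of the sandwich: sum_i (e_i / (1 - h_i))^2 * x_i x_i'
    meat = [[0.0, 0.0], [0.0, 0.0]]
    for xi, yi in zip(x, y):
        row = (1.0, xi)
        resid = yi - (b0 + b1 * xi)
        # Leverage h_i = x_i' (X'X)^{-1} x_i
        lev = sum(row[j] * inv[j][k] * row[k] for j in range(2) for k in range(2))
        w = (resid / (1.0 - lev)) ** 2
        for j in range(2):
            for k in range(2):
                meat[j][k] += w * row[j] * row[k]

    # Sandwich covariance: (X'X)^{-1} * meat * (X'X)^{-1}
    tmp = [[sum(inv[i][k] * meat[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
    cov = [[sum(tmp[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]

    # Clamp at zero to guard against tiny negative values from rounding
    se = (math.sqrt(max(cov[0][0], 0.0)), math.sqrt(max(cov[1][1], 0.0)))
    return (b0, b1), se


coefs, ses = ols_hc3([0, 1, 2, 3, 4], [1.0, 3.1, 4.9, 7.2, 8.8])
print("coefficients:", coefs)
print("HC3 standard errors:", ses)
```

The sketch hard-codes a 2x2 design so that it needs no linear algebra library; a general version would replace the hand-written inverse with a matrix solve.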

📦 Installation

From PyPI (Recommended)

pip install causers

From Source (Development)

# Prerequisites: Python 3.8+ and Rust 1.70+
git clone https://github.com/causers/causers.git
cd causers

# Install build dependencies
pip install maturin polars numpy

# Build and install in development mode
maturin develop --release

Quick Start

For comprehensive examples demonstrating all causers functions with realistic data, see the notebook:

📓 examples/basic_examples.ipynb

The notebook includes:

Function                 Description
linear_regression()      OLS with clustered standard errors
logistic_regression()    Maximum likelihood with clustered SEs
synthetic_control()      Abadie-style synthetic control method
synthetic_did()          Synthetic difference-in-differences

All examples use reproducible random seeds and include interpretation guidance.
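
The notebook is the place to see the actual function calls. As a library-independent illustration of what "maximum likelihood" via Newton-Raphson means for logistic regression, the following pure-Python sketch fits an intercept and one slope. It is an assumption-laden teaching example, not causers' implementation: the function name `logistic_newton`, its parameters, and the toy data are all hypothetical.

```python
import math


def logistic_newton(x, y, max_iter=25, tol=1e-10):
    """Fit P(y=1) = sigmoid(b0 + b1 * x) by Newton-Raphson.

    Hypothetical helper for illustration; not part of the causers API.
    Assumes the data are not perfectly separable.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # X'WX, the negative Hessian
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi

        # Newton step: solve (X'WX) d = gradient for the 2x2 system
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (-h01 * g0 + h00 * g1) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Toy data: 1 of 4 successes when x = 0, 3 of 4 successes when x = 1
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 1, 0, 1, 1, 1]
print(logistic_newton(x, y))
```

With a binary covariate the model is saturated, so the fitted probabilities should match the group success rates (0.25 and 0.75), which makes the result easy to check by hand.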

🏗️ Architecture

graph TD
    A[Python API] --> B[PyO3 Bridge]
    B --> C[Rust Core]
    C --> D[Statistical Engine]
    
    E[Polars DataFrame] --> B
    D --> F[Results]
    F --> A

Project Structure

causers/
├── src/                    # Rust source code
│   ├── lib.rs             # PyO3 bindings and module definition
│   ├── stats.rs           # Linear regression (OLS)
│   ├── logistic.rs        # Logistic regression (MLE)
│   ├── cluster.rs         # Clustered SE and bootstrap
│   ├── linalg.rs          # Linear algebra utilities (faer integration)
│   ├── sdid.rs            # Synthetic Difference-in-Differences
│   └── synth_control.rs   # Synthetic Control methods
├── python/                # Python package
├── tests/                 # Comprehensive test suite (193+ tests)
├── examples/              # Usage examples and benchmarks
├── docs/                  # Sphinx documentation
├── scripts/               # Development and build scripts

🛠️ Development

Prerequisites

  • Python 3.8 or higher
  • Rust 1.70 or higher
  • Polars 0.52 or higher

Building from Source

# Clone the repository
git clone https://github.com/causers/causers.git
cd causers

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install development dependencies
pip install -e ".[dev]"

# Build the Rust extension
maturin develop --release

Running Tests

# Run all tests with coverage
pytest tests/ --cov=causers --cov-report=html

# Run specific test categories
pytest tests/test_performance.py -v  # Performance benchmarks
pytest tests/test_edge_cases.py -v   # Edge case handling

# Run Rust tests
cargo test

Code Quality

# Format Python code
black python/ tests/

# Lint Python code
ruff check python/ tests/

# Type check
mypy python/

# Format Rust code
cargo fmt

# Lint Rust code
cargo clippy

📜 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments

  • Polars for the excellent DataFrame library
  • PyO3 for seamless Python-Rust integration
  • maturin for simplified packaging

📚 Resources

🐛 Found a Bug?

Please open an issue with:

  • Minimal reproducible example
  • Expected vs actual behavior
  • Environment details (OS, Python version, etc.)
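
One way to collect the environment details for an issue is a short standard-library script like the sketch below. The helper name `environment_report` and the default package list are assumptions made for illustration; adjust the list to whatever is relevant to your report.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("causers", "polars", "numpy")):
    """Return OS, Python, and package versions as text to paste into an issue.

    Hypothetical helper for illustration; not shipped with causers.
    """
    lines = [
        f"OS: {platform.platform()}",
        f"Python: {sys.version.split()[0]}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)


print(environment_report())
```

It reads installed versions from package metadata rather than importing each package, so it still works when an import itself is what fails.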

Made with ❤️ and 🦀 by the causers team



