Polars Normal Stats

Fast normal distribution functions (CDF, PPF, PDF) for Polars DataFrames, implemented as a Polars plugin in Rust.

This plugin provides optimized implementations of the Normal (Gaussian) distribution functions. It is intended as a faster replacement for calling SciPy's norm functions through Polars' map_batches or map_elements (formerly apply); see the Benchmarks section for measured speedups against map_batches.

Features

  • normal_cdf(x, mean=0.0, std=1.0): Cumulative Distribution Function.
  • normal_ppf(p, mean=0.0, std=1.0): Percent Point Function (Inverse CDF).
  • normal_pdf(x, mean=0.0, std=1.0): Probability Density Function.
  • Fully compatible with Polars' lazy execution and expression API.
  • Supports both literal values and Polars expressions for mean and std.
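As a reference for what the three functions compute, here is a minimal sketch using only Python's standard library. The names ref_cdf, ref_ppf, and ref_pdf are hypothetical and are not part of this plugin; the sketch mirrors the signatures listed above but says nothing about how the plugin is implemented internally.

```python
from statistics import NormalDist


def ref_cdf(x, mean=0.0, std=1.0):
    """P(X <= x) for X ~ Normal(mean, std)."""
    return NormalDist(mean, std).cdf(x)


def ref_ppf(p, mean=0.0, std=1.0):
    """Inverse CDF: the x with P(X <= x) == p, defined for 0 < p < 1."""
    return NormalDist(mean, std).inv_cdf(p)


def ref_pdf(x, mean=0.0, std=1.0):
    """Density of Normal(mean, std) evaluated at x."""
    return NormalDist(mean, std).pdf(x)


# PPF and CDF are inverses of each other, so a round trip recovers p.
x = ref_ppf(0.9, mean=10.0, std=2.0)
assert abs(ref_cdf(x, mean=10.0, std=2.0) - 0.9) < 1e-9
```

A scalar reference like this can be handy for spot-checking a few values of a column computed by the plugin.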

Installation

Install using uv:

uv add polars-normal-stats

Install using pip:

pip install polars-normal-stats

(Note: Ensure you have polars installed as well.)

Usage

The functions are designed to work directly within Polars expressions.

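import polars as pl
from polars_normal_stats import normal_cdf, normal_ppf, normal_pdf

# "x" holds points to evaluate; "p" holds probabilities in (0, 1).
df = pl.DataFrame({
    "x": [-1.0, 0.0, 1.0],
    "p": [0.1, 0.5, 0.9]
})

result = df.select([
    # Standard normal CDF (mean=0.0 and std=1.0 are the defaults).
    normal_cdf(pl.col("x")).alias("cdf"),
    # Quantiles of a normal distribution with mean 10 and standard deviation 2.
    normal_ppf(pl.col("p"), mean=10.0, std=2.0).alias("ppf_shifted"),
    # Standard normal density, with the defaults written out explicitly.
    normal_pdf(pl.col("x"), mean=0.0, std=1.0).alias("pdf")
])

print(result)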
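To sanity-check the output of the example above, the expected values can be computed independently with the standard library. This is only a cross-check sketch, not the plugin's code; it relies on the location-scale identity ppf(p; mean, std) = mean + std * ppf(p; 0, 1), and the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

xs = [-1.0, 0.0, 1.0]
ps = [0.1, 0.5, 0.9]

# Same quantities as the "cdf", "ppf_shifted" and "pdf" columns.
expected_cdf = [std_normal.cdf(x) for x in xs]
expected_ppf = [10.0 + 2.0 * std_normal.inv_cdf(p) for p in ps]
expected_pdf = [std_normal.pdf(x) for x in xs]

for name, values in [
    ("cdf", expected_cdf),
    ("ppf_shifted", expected_ppf),
    ("pdf", expected_pdf),
]:
    print(name, [round(v, 4) for v in values])
```

The cdf column should come out near 0.1587, 0.5, 0.8413; ppf_shifted near 7.4369, 10.0, 12.5631; and pdf near 0.2420, 0.3989, 0.2420.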

Lazy Execution

Because these functions return Polars expressions, they can be used in the lazy API without any special handling. Polars can then optimize the entire query plan, including these statistical operations.

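import polars as pl
from polars_normal_stats import normal_cdf

# Assumes data.parquet contains "value", "mean" and "std" columns.
lazy_result = (
    pl.scan_parquet("data.parquet")
    .with_columns(
        # The result is a probability (the CDF at "value"), not a z-score.
        cdf = normal_cdf(pl.col("value"), mean=pl.col("mean"), std=pl.col("std"))
    )
    .collect()
)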
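When mean and std are given as column expressions, as in the query above, the natural reading is that each row is evaluated with its own parameters. The sketch below illustrates that row-wise semantics in plain Python; it is an assumption about the plugin's behavior based on the feature list, and the sample rows and the row_cdf helper are hypothetical.

```python
from statistics import NormalDist

# Hypothetical rows standing in for the "value", "mean" and "std" columns.
rows = [
    {"value": 12.0, "mean": 10.0, "std": 2.0},
    {"value": 3.0, "mean": 5.0, "std": 4.0},
]


def row_cdf(value, mean, std):
    """Standardize with this row's parameters, then apply the standard normal CDF."""
    z = (value - mean) / std  # the z-score of this row
    return NormalDist().cdf(z)


cdf_values = [row_cdf(r["value"], r["mean"], r["std"]) for r in rows]
print([round(v, 4) for v in cdf_values])
```

The first row sits one standard deviation above its mean and the second half a standard deviation below, so the two probabilities differ even though the same expression is applied to both.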

Benchmarks

The plugin is faster than SciPy's normal distribution functions called through Polars' map_batches at every size tested, with speedups ranging from 1.29x to 4.27x depending on the function and data size. The table below compares execution times.

Results averaged over 10 iterations:

Function  Size        SciPy (s)  Plugin (s)  Speedup
CDF       100,000     0.0025     0.0019      1.29x
PPF       100,000     0.0035     0.0016      2.23x
PDF       100,000     0.0018     0.0006      2.86x
CDF       1,000,000   0.0256     0.0191      1.34x
PPF       1,000,000   0.0355     0.0147      2.42x
PDF       1,000,000   0.0234     0.0064      3.65x
CDF       10,000,000  0.2702     0.1903      1.42x
PPF       10,000,000  0.3637     0.1436      2.53x
PDF       10,000,000  0.2520     0.0604      4.17x
CDF       25,000,000  0.6841     0.4680      1.46x
PPF       25,000,000  0.9122     0.3587      2.54x
PDF       25,000,000  0.6424     0.1506      4.27x

At 10,000,000 rows and above, PDF calculations show the largest gain: 4.17x at 10,000,000 rows and 4.27x at 25,000,000 rows.
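The benchmark script itself is not shown here, so the following is only a sketch of how an "averaged over 10 iterations" timing and a speedup figure could be computed. The helpers mean_seconds and speedup are hypothetical names, not part of the plugin; in a real comparison, fn would wrap the SciPy map_batches query or the plugin query.

```python
import time


def mean_seconds(fn, iterations=10):
    """Average wall-clock seconds per call of fn over the given number of runs."""
    total = 0.0
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        total += time.perf_counter() - start
    return total / iterations


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


# Recomputing one table row from its published timings (PDF, 10,000,000 rows).
pdf_10m = speedup(0.2520, 0.0604)
print(f"{pdf_10m:.2f}x")
```

Applied to the published timings, this gives about 4.17x for the PDF row at 10,000,000 rows. For the smallest sizes the recomputed ratio can differ slightly from the table, presumably because the published timings are rounded to four decimal places.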

Credits

This plugin was developed using the excellent polars-xdt as a template and acknowledges the work of Marco Gorelli, Ritchie Vink, and the Polars contributors for making Python-Rust plugin development accessible.

It also relies on the statrs crate for statistical computations and PyO3 for Rust-Python bindings.

License

This project is licensed under the MIT License - see the LICENSE file for details.
