Polars Normal Stats

Fast normal distribution functions (CDF, PPF, PDF) for Polars DataFrames, implemented as a Polars plugin in Rust.

This plugin provides optimized implementations of the normal (Gaussian) distribution functions. It is faster than calling SciPy's norm functions through Polars' map_batches or map_elements (formerly apply).

Features

  • normal_cdf(x, mean=0.0, std=1.0): Cumulative Distribution Function.
  • normal_ppf(p, mean=0.0, std=1.0): Percent Point Function (Inverse CDF).
  • normal_pdf(x, mean=0.0, std=1.0): Probability Density Function.
  • Fully compatible with Polars' lazy execution and expression API.
  • Distribution parameters (mean, std) are passed to the Rust code as plugin kwargs.
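
For readers who want the definitions behind the three functions listed above, here is a minimal pure-Python sketch using only the standard library. It is not the plugin's implementation (that lives in Rust and relies on statrs); the names ref_cdf, ref_pdf and ref_ppf are hypothetical and exist only for this illustration.

```python
import math
from statistics import NormalDist


def ref_cdf(x: float, mean: float = 0.0, std: float = 1.0) -> float:
    """P(X <= x) for X ~ Normal(mean, std), written via the error function."""
    z = (x - mean) / (std * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def ref_pdf(x: float, mean: float = 0.0, std: float = 1.0) -> float:
    """Density of Normal(mean, std) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def ref_ppf(p: float, mean: float = 0.0, std: float = 1.0) -> float:
    """Inverse CDF for 0 < p < 1, delegated to the standard library."""
    return NormalDist(mean, std).inv_cdf(p)
```

Scalar references like these can be handy for spot-checking a few values of a vectorized result.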

Installation

Install using uv:

uv add polars-normal-stats

Install using pip:

pip install polars-normal-stats

(Note: Ensure you have polars installed as well.)

Usage

The functions are designed to work directly within Polars expressions.

import polars as pl
from polars_normal_stats import normal_cdf, normal_ppf, normal_pdf

df = pl.DataFrame({
    "x": [-1.0, 0.0, 1.0],
    "p": [0.1, 0.5, 0.9]
})

result = df.select([
    normal_cdf(pl.col("x")).alias("cdf"),
    normal_ppf(pl.col("p"), mean=10.0, std=2.0).alias("ppf_shifted"),
    normal_pdf(pl.col("x"), mean=0.0, std=1.0).alias("pdf")
])

print(result)
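
The mean and std arguments shift and scale the standard normal distribution. The sketch below illustrates that relationship with Python's statistics.NormalDist rather than the plugin itself; shifted_ppf and shifted_cdf are hypothetical helper names, and the values mean=10.0, std=2.0 are taken from the example above.

```python
import math
from statistics import NormalDist

STANDARD = NormalDist()  # mean 0.0, standard deviation 1.0


def shifted_ppf(p: float, mean: float, std: float) -> float:
    """Quantile of Normal(mean, std) from the standard normal quantile."""
    return mean + std * STANDARD.inv_cdf(p)


def shifted_cdf(x: float, mean: float, std: float) -> float:
    """CDF of Normal(mean, std) by standardizing x first."""
    return STANDARD.cdf((x - mean) / std)


# Compare against NormalDist constructed with the parameters directly.
direct = NormalDist(mu=10.0, sigma=2.0)
for p in (0.1, 0.5, 0.9):
    assert math.isclose(shifted_ppf(p, 10.0, 2.0), direct.inv_cdf(p), rel_tol=1e-12)
for x in (8.0, 10.0, 12.0):
    assert math.isclose(shifted_cdf(x, 10.0, 2.0), direct.cdf(x), rel_tol=1e-12)
```

In particular, p = 0.5 always maps to the mean, so the middle row of ppf_shifted should equal 10.0.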

Lazy Execution

Since these functions return Polars expressions, they integrate seamlessly into Polars' lazy API. This allows Polars to optimize the entire query plan, including these statistical operations.

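import polars as pl
from polars_normal_stats import normal_cdf

lazy_result = (
    pl.scan_parquet("data.parquet")
    .with_columns(
        # Probability that a Normal(100, 15) variable is <= value.
        cdf = normal_cdf(pl.col("value"), mean=100.0, std=15.0)
    )
    .collect()  # the query runs only here
)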

Benchmarks

In the measurements below, the plugin runs about 1.5x to 6.1x faster than SciPy's normal distribution functions called through Polars' map_batches, for data sizes from 100,000 to 25,000,000 rows.

Results averaged over 10 iterations:

| Function | Rows       | SciPy (s) | Plugin (s) | Speedup |
|----------|------------|-----------|------------|---------|
| CDF      | 100,000    | 0.0026    | 0.0017     | 1.47x   |
| PPF      | 100,000    | 0.0036    | 0.0015     | 2.32x   |
| PDF      | 100,000    | 0.0020    | 0.0005     | 3.94x   |
| CDF      | 1,000,000  | 0.0270    | 0.0164     | 1.64x   |
| PPF      | 1,000,000  | 0.0367    | 0.0148     | 2.47x   |
| PDF      | 1,000,000  | 0.0249    | 0.0046     | 5.44x   |
| CDF      | 10,000,000 | 0.2767    | 0.1596     | 1.73x   |
| PPF      | 10,000,000 | 0.3781    | 0.1445     | 2.62x   |
| PDF      | 10,000,000 | 0.2632    | 0.0432     | 6.10x   |
| CDF      | 25,000,000 | 0.7047    | 0.3971     | 1.77x   |
| PPF      | 25,000,000 | 0.9460    | 0.3607     | 2.62x   |
| PDF      | 25,000,000 | 0.6588    | 0.1088     | 6.06x   |

The largest speedup, 6.10x, is for PDF at 10,000,000 rows; at 25,000,000 rows the PDF speedup is 6.06x.
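
The benchmark script itself is not shown here. As a rough sketch of how an "average over 10 iterations" timing and the Speedup column can be computed, the following uses only the standard library. The helpers average_seconds and speedup are hypothetical names, and the timed workload is a stand-in, not the plugin or SciPy.

```python
import time
from statistics import mean


def average_seconds(fn, n_iter: int = 10) -> float:
    """Mean wall-clock time of calling fn() n_iter times."""
    timings = []
    for _ in range(n_iter):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return mean(timings)


def speedup(baseline_s: float, candidate_s: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_s / candidate_s


# Stand-in workload; replace with the expressions being compared.
elapsed = average_seconds(lambda: sum(range(100_000)))

# Reproduces the PDF row at 25,000,000 rows from the published timings.
pdf_speedup = round(speedup(0.6588, 0.1088), 2)
```

Because the published timings are rounded to four decimals, recomputing a speedup from them can differ slightly from the Speedup column in some rows.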

Credits

This plugin was developed using the excellent polars-xdt as a template and acknowledges the work of Marco Gorelli, Ritchie Vink, and the Polars contributors for making Python-Rust plugin development accessible.

It also relies on the statrs crate for statistical computations and PyO3 for Rust-Python bindings.

License

This project is licensed under the MIT License - see the LICENSE file for details.
