High-performance statistical testing and regression for Polars DataFrames, powered by Rust
polars-statistics
Note: This extension is in early-stage development. APIs may change and some features are experimental.
Features
- Native Polars Expressions: Full support for group_by, over, and lazy evaluation
- Statistical Tests: Parametric, non-parametric, distributional, and forecast comparison tests
- Regression Models: OLS, Ridge, Elastic Net, WLS, GLMs, ALM (24+ distributions)
- Formula Syntax: R-style formulas with polynomial and interaction effects
- High Performance: Rust-powered with zero-copy data transfer
Installation
pip install polars-statistics
Quick Start
All functions work as Polars expressions, integrating with group_by and over:
import polars as pl
import polars_statistics as ps

df = pl.DataFrame({
    "group": ["A"] * 50 + ["B"] * 50,
    "y": [...],   # placeholder: supply 100 numeric values
    "x1": [...],  # placeholder: supply 100 numeric values
    "x2": [...],  # placeholder: supply 100 numeric values
})

# Run OLS regression per group
result = df.group_by("group").agg(
    ps.ols("y", "x1", "x2").alias("model")
)

# Extract results from struct
result.with_columns(
    pl.col("model").struct.field("r_squared"),
    pl.col("model").struct.field("coefficients"),
)
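To make the grouped call above concrete, here is a minimal pure-Python sketch of what "one regression per group" means: rows are partitioned by key and a separate least-squares line is fitted to each partition. It uses a single predictor, and the helpers `fit_line` and `fit_per_group` are hypothetical names for illustration; this is not the library's implementation.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x, plus R^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return {"coefficients": [intercept, slope], "r_squared": 1.0 - ss_res / ss_tot}


def fit_per_group(rows):
    """rows: iterable of (group, x, y). Returns {group: model dict}."""
    buckets = defaultdict(lambda: ([], []))
    for group, x, y in rows:
        buckets[group][0].append(x)
        buckets[group][1].append(y)
    return {g: fit_line(xs, ys) for g, (xs, ys) in buckets.items()}


# Two groups with different, exactly linear relationships.
rows = [("A", float(x), 1.0 + 2.0 * x) for x in range(5)]
rows += [("B", float(x), 4.0 - 0.5 * x) for x in range(5)]
models = fit_per_group(rows)
```

Each group gets its own coefficients and its own R², which is the shape of result the struct column in the Quick Start carries per group.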
Statistical Tests
Statistical tests are powered by anofox-statistics, which provides full API parity with R's statistical functions and is validated against R implementations.
# Parametric tests
ps.ttest_ind("treatment", "control", alternative="two-sided")
ps.ttest_paired("before", "after")
# Non-parametric tests
ps.mann_whitney_u("x", "y")
ps.kruskal_wallis("group1", "group2", "group3")
# Normality tests
ps.shapiro_wilk("x")
# Forecast comparison
ps.diebold_mariano("errors1", "errors2", horizon=1)
All tests return a struct with statistic and p_value fields.
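As an illustration of the statistic/p_value pair, the following sketch computes a Diebold-Mariano comparison by hand. It assumes squared-error loss, a one-step horizon (so no autocovariance correction), the sample variance of the loss differential, and a normal approximation for the p-value. The function name `dm_test` and the error series are made up for the example, and the library may differ in these choices.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def dm_test(errors1, errors2):
    """Diebold-Mariano statistic for horizon 1 under squared-error loss."""
    # Loss differential: positive values mean forecast 1 is worse.
    d = [e1 ** 2 - e2 ** 2 for e1, e2 in zip(errors1, errors2)]
    n = len(d)
    statistic = mean(d) / sqrt(variance(d) / n)
    # Two-sided p-value from the standard normal distribution.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(statistic)))
    return {"statistic": statistic, "p_value": p_value}


# Illustrative forecast errors: the first forecast is clearly less accurate.
errors1 = [0.9, -1.2, 1.1, -0.8, 1.3, -1.0, 0.7, -1.1]
errors2 = [0.2, -0.3, 0.1, -0.2, 0.4, -0.1, 0.3, -0.2]
result = dm_test(errors1, errors2)
```

A positive statistic with a small p-value indicates that the second forecast has significantly lower squared error.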
Regression Models
Regression models are powered by anofox-regression, which provides implementations validated against R.
Expression API
# Linear models
ps.ols("y", "x1", "x2")
ps.ridge("y", "x1", "x2", lambda_=1.0)
ps.elastic_net("y", "x1", "x2", lambda_=1.0, alpha=0.5)
# GLM models
ps.logistic("y", "x1", "x2") # Binary classification
ps.poisson("y", "x1", "x2") # Count data
# ALM - 24+ distributions
ps.alm("y", "x1", "x2", distribution="laplace") # Robust to outliers
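To show what a penalty strength such as `lambda_` does, here is a sketch of ridge regression with one predictor and an unpenalized intercept, where the slope has the closed form Sxy / (Sxx + lambda). The helper `ridge_slope` is hypothetical, and the README does not state how the library scales its penalty, so treat the exact numbers as illustrative only.

```python
def ridge_slope(xs, ys, lam):
    """Ridge slope for one predictor with an unpenalized intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # lam = 0 recovers ordinary least squares; larger lam shrinks toward 0.
    return sxy / (sxx + lam)


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 2.1, 3.9, 6.2, 7.8]
slopes = {lam: ridge_slope(xs, ys, lam) for lam in (0.0, 1.0, 10.0)}
```

The slope shrinks monotonically toward zero as the penalty grows, which is the trade-off the ridge and elastic net expressions expose.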
Formula Syntax
R-style formulas with polynomial and interaction effects:
# Main effects + interaction
ps.ols_formula("y ~ x1 * x2") # Expands to: x1 + x2 + x1:x2
# Polynomial regression (centered per group)
ps.ols_formula("y ~ poly(x, 2)")
# Explicit transform
ps.ols_formula("y ~ x1 + I(x^2)")
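The `*` expansion noted above follows the usual R convention: main effects plus every interaction among the listed factors. A minimal sketch of that expansion, using hypothetical helpers `expand_term` and `expand_formula` (not the library's parser, and without support for operators nested inside parentheses):

```python
from itertools import combinations


def expand_term(term):
    """Expand 'a * b * ...' into main effects plus all interactions."""
    factors = [f.strip() for f in term.split("*")]
    expanded = []
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            expanded.append(":".join(combo))
    return expanded


def expand_formula(formula):
    """Return (response, expanded right-hand-side terms)."""
    response, rhs = (part.strip() for part in formula.split("~"))
    terms = []
    for term in rhs.split("+"):
        for t in expand_term(term):
            if t not in terms:  # drop duplicates such as 'x1 + x1 * x2'
                terms.append(t)
    return response, terms


response, terms = expand_formula("y ~ x1 * x2")
three_way = expand_formula("y ~ a * b * c")[1]
```

Here `terms` holds `x1`, `x2`, and `x1:x2`, and the three-factor formula additionally yields all pairwise interactions and `a:b:c`.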
Predictions with Intervals
df.with_columns(
    ps.ols_predict("y", "x1", "x2", interval="prediction", level=0.95)
    .over("group")
    .alias("pred")
).unnest("pred")  # Columns: prediction, lower, upper
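For intuition about the prediction, lower, and upper columns, the sketch below computes a large-sample prediction interval for a one-predictor least-squares fit. It deliberately uses a normal quantile in place of the Student t quantile (the standard library has no t distribution), so the interval is slightly too narrow for small samples. The function name `predict_with_interval` is hypothetical and the library's exact method is not documented here.

```python
from math import sqrt
from statistics import NormalDist


def predict_with_interval(xs, ys, x_new, level=0.95):
    """Point prediction with an approximate prediction interval."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance with n - 2 degrees of freedom.
    s2 = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    # Standard error for a single new observation at x_new.
    se = sqrt(s2 * (1.0 + 1.0 / n + (x_new - x_bar) ** 2 / sxx))
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # normal approximation to t
    prediction = intercept + slope * x_new
    return {
        "prediction": prediction,
        "lower": prediction - z * se,
        "upper": prediction + z * se,
    }


xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.2, 0.9, 2.1, 2.9, 4.2, 4.8]
pred_center = predict_with_interval(xs, ys, x_new=2.5)
pred_far = predict_with_interval(xs, ys, x_new=10.0)
```

The interval is narrowest at the mean of the predictor and widens as `x_new` moves away from the observed data.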
Tidy Coefficient Summary
df.group_by("group").agg(
    ps.ols_summary("y", "x1", "x2").alias("coef")
).explode("coef").unnest("coef")
# Columns: term, estimate, std_error, statistic, p_value
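The explode-then-unnest step turns one list of coefficient records per group into one flat row per coefficient. A plain-Python sketch of that reshaping, with a hypothetical helper `explode_unnest`, made-up numbers, and an assumed term name `intercept`:

```python
def explode_unnest(grouped):
    """Flatten {group: [coef_record, ...]} into one flat row per coefficient."""
    rows = []
    for group, coefs in grouped.items():
        for coef in coefs:  # explode: one output row per list element
            rows.append({"group": group, **coef})  # unnest: fields become columns
    return rows


# Made-up numbers, for illustration only.
grouped = {
    "A": [
        {"term": "intercept", "estimate": 1.0, "std_error": 0.5},
        {"term": "x1", "estimate": 2.0, "std_error": 0.4},
    ],
    "B": [
        {"term": "intercept", "estimate": 0.5, "std_error": 0.25},
        {"term": "x1", "estimate": -1.0, "std_error": 0.5},
    ],
}
rows = explode_unnest(grouped)
```

The result has one row per (group, term) pair, which is the tidy layout suited to filtering or plotting coefficients across groups.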
Model Classes
For direct model access outside Polars expressions:
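import numpy as np
from polars_statistics import OLS, Ridge, Logistic, ALM

# Synthetic data for illustration (assumes fit() accepts NumPy arrays)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(scale=0.1, size=100)

# Fit model
model = OLS(compute_inference=True).fit(X, y)
print(model.coefficients, model.r_squared, model.p_values)

# ALM with various distributions
alm = ALM.laplace().fit(X, y)  # Robust to outliers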
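Why a Laplace likelihood is robust to outliers can be seen in the simplest possible model, one with only an intercept: the normal likelihood is maximized by the mean, while the Laplace likelihood is maximized by the median. The sketch below uses only the standard library and made-up data; it illustrates the principle, not the ALM fitting routine.

```python
from statistics import mean, median

# Five well-behaved observations and one gross outlier.
y = [1.0, 1.2, 0.9, 1.1, 1.0, 25.0]

# Normal errors: the maximum-likelihood location is the mean (least squares).
gaussian_location = mean(y)

# Laplace errors: the maximum-likelihood location is the median
# (least absolute deviations), which the outlier barely moves.
laplace_location = median(y)
```

The mean is dragged above 5 by the single outlier, while the median stays near 1.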
Test Model Classes
Statistical tests are also available as model classes with .fit(), .statistic, .p_value, and .summary():
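from polars_statistics import TTestInd, ShapiroWilk, KruskalWallis
import numpy as np

# Synthetic samples for illustration (assumes fit() accepts NumPy arrays)
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 50)
y = rng.normal(0.5, 1.0, 50)
g1, g2, g3 = (rng.normal(loc, 1.0, 30) for loc in (0.0, 0.5, 1.0))

# Two-sample t-test
test = TTestInd(alternative="two-sided").fit(x, y)
print(test.statistic, test.p_value)
print(test.summary())

# Normality test
test = ShapiroWilk().fit(x)
print(test.p_value)

# Multi-group comparison
test = KruskalWallis().fit(g1, g2, g3)
print(test.summary())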
Available test classes: TTestInd, TTestPaired, BrownForsythe, YuenTest, MannWhitneyU, WilcoxonSignedRank, KruskalWallis, BrunnerMunzel, ShapiroWilk, DAgostino.
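For reference, the multi-group comparison above rests on a rank-based statistic that is short enough to compute by hand. The sketch below implements the Kruskal-Wallis H statistic for exactly three groups, without a tie correction, and uses the fact that the chi-square distribution with 2 degrees of freedom has survival function exp(-h / 2). The helper names are hypothetical and the library's handling of ties may differ.

```python
from math import exp


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_3(g1, g2, g3):
    """H statistic (no tie correction) and chi-square(2) p-value."""
    groups = [g1, g2, g3]
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)
    return {"statistic": h, "p_value": exp(-h / 2.0)}


result = kruskal_wallis_3([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0])
```

With three fully separated groups of three observations each, H is 7.2 and the p-value is exp(-3.6), roughly 0.027.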
API Reference
See docs/API_REFERENCE.md for complete documentation of all functions, parameters, and output structures.
Performance
Built on high-performance Rust libraries:
- faer: Fast linear algebra with SIMD
- Zero-copy: Direct memory sharing between Python and Rust
- Automatic parallelization: For group_by operations
Development
git clone https://github.com/DataZooDE/polars-statistics.git
cd polars-statistics
python -m venv .venv && source .venv/bin/activate
pip install maturin numpy polars pytest
maturin develop --release
pytest
License
MIT License - see LICENSE for details.