
Ultra-fast Rust-powered statistics and time-series utilities for Python.


🚀 bunker-stats-rs

Ultra-fast Rust-powered statistics + time-series utilities for Python. Designed for data scientists, quants, researchers, analysts, and ML engineers who need NumPy-compatible accuracy with massive speedups on rolling statistics, covariance/correlation, outlier detection, ECDF, KDE, and more.

Goal: a lightweight, high-performance alternative to many NumPy / Pandas / SciPy statistical operations, with minimal dependencies and predictable performance on large arrays.

📦 Installation

    pip install bunker-stats-rs

⚡️ Why bunker-stats?

Pure Rust kernels

No Python loops

No Pandas overhead

Predictable vectorized performance

Numerically equivalent results (within floating-point tolerance)

Minimal dependencies

Up to ~1,700× faster than a pure-Python reference (rolling z-score), depending on the operation

Built for large 1D/2D NumPy arrays

🔥 Benchmark Summary

Benchmarks run on: Windows 10 • Intel i7 • Python 3.10 • NumPy 1.26 • Pandas 2.2

Dataset sizes: 1,000,000-element 1D arrays and 200,000×10 2D matrices

Below is a curated “top wins” summary:

Top Speedups (reference_time / bunker_time)

| Group | Operation | Ref Backend | Ref Time (ms) | Bunker (ms) | Speedup | Allclose | Max Diff |
|---|---|---|---|---|---|---|---|
| rolling | rolling_zscore | python_ref | 33934.42 | 19.49 | ×1741.47 | True | 4.12e-11 |
| diff_cum_etc | cummean | python_ref | 297.37 | 2.35 | ×126.72 | True | 0.0 |
| rolling | ewma | numpy_ref | 376.98 | 4.85 | ×77.79 | True | 0.0 |
| diff_cum_etc | sign_mask | python_ref | 14.62 | 0.60 | ×24.34 | True | 0.0 |
| cov_corr | rolling_cov | pandas | 157.25 | 14.06 | ×11.18 | True | 4.48e-14 |
| rolling | rolling_mean | pandas | 54.68 | 5.14 | ×10.63 | True | 7.99e-15 |
| cov_corr | cov_pair | numpy | 15.27 | 4.08 | ×3.74 | True | 3.03e-18 |
| outliers | zscore_outliers | python_ref | 16.03 | 4.60 | ×3.48 | True | 0.0 |
| diff_cum_etc | quantile_bins_10 | pandas | 82.57 | 44.68 | ×1.85 | True | 0.0 |
| scipy_compare | iqr_scipy | scipy | 35.82 | 15.90 | ×2.25 | True | 0.0 |

Full benchmark results are available in /benchmarks.
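To make the table columns concrete, here is a minimal sketch of how a speedup, an "Allclose" flag, and a "Max Diff" value could be computed. It is not the project's benchmark script: the helper names (`time_ms`, `speedup`, `max_abs_diff`, `all_close`) and the tolerances are assumptions made for illustration.

```python
import math
import time


def time_ms(fn, *args, repeats=5):
    """Best wall-clock time of fn(*args) over several runs, in milliseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, (time.perf_counter() - start) * 1000.0)
    return best


def speedup(ref_ms, bunker_ms):
    """Speedup as defined in the table heading: reference_time / bunker_time."""
    return ref_ms / bunker_ms


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


def all_close(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """Element-wise closeness check; the tolerances here are assumed, not the project's."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol) for x, y in zip(a, b)
    )
```

For example, `speedup(54.68, 5.14)` gives roughly 10.6 for the rolling_mean row. Small differences from the published ratio are expected because the table shows rounded timings.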

🧩 Features

Basic Stats

mean / std / var (ddof=1)

percentiles

IQR, MAD

min-max scaling

robust scaling (median/MAD)

winsorizing
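As a reference for what some of these quantities mean, the sketch below uses only the Python standard library. It illustrates the ddof=1 convention and median/MAD scaling. It is not the library's code, and whether bunker-stats-rs applies a consistency constant (such as 1.4826) to the MAD is not stated here, so the unscaled form is an assumption.

```python
import statistics


def sample_std(xs):
    """Sample standard deviation (ddof=1), i.e. divide by n - 1."""
    return statistics.stdev(xs)


def mad(xs):
    """Unscaled median absolute deviation: median(|x - median(x)|)."""
    med = statistics.median(xs)
    return statistics.median([abs(x - med) for x in xs])


def robust_scale(xs):
    """Center on the median and divide by the MAD (assumed unscaled)."""
    med = statistics.median(xs)
    scale = mad(xs)
    if scale == 0:
        raise ValueError("MAD is zero; robust scaling is undefined")
    return [(x - med) / scale for x in xs]


data = [1.0, 2.0, 3.0, 4.0, 100.0]
scaled = robust_scale(data)  # the outlier 100.0 does not distort the scale
```

Note that NumPy's `std` and `var` default to ddof=0, so pass `ddof=1` when comparing against the values listed above.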

Rolling Windows

rolling mean

rolling std

rolling z-score (z-score of the last element in each window)

EWMA (exponential smoothing)
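The two less obvious entries above can be pinned down with a pure-Python reference. This is a sketch under stated assumptions, not the Rust kernel: it assumes trailing windows, NaN output until the window is full, ddof=1 inside each window, and the recursive (non-adjusted) EWMA form. The actual conventions in bunker-stats-rs may differ.

```python
import math


def rolling_zscore(xs, window):
    """z-score of the last element of each trailing window; NaN until the window is full."""
    if window < 2:
        raise ValueError("window must be at least 2")
    out = [math.nan] * len(xs)
    for i in range(window - 1, len(xs)):
        win = xs[i - window + 1 : i + 1]
        mean = sum(win) / window
        var = sum((v - mean) ** 2 for v in win) / (window - 1)  # ddof=1 (assumption)
        out[i] = (xs[i] - mean) / math.sqrt(var) if var > 0 else math.nan
    return out


def ewma(xs, alpha):
    """Recursive smoothing: y[0] = x[0], y[t] = alpha * x[t] + (1 - alpha) * y[t-1]."""
    out = []
    for i, x in enumerate(xs):
        out.append(x if i == 0 else alpha * x + (1.0 - alpha) * out[-1])
    return out
```

A loop like this is the kind of pure-Python baseline that the rolling kernels are meant to replace on large arrays.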

Diff / Cumulative Operations

diff

pct_change

cumsum

cummean

ECDF

quantile binning

sign masks

demean with sign mask
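For reference, a minimal pure-Python sketch of two of these operations follows. The function names are illustrative, and the ECDF convention (sorted samples paired with (i + 1) / n) is an assumption; the library may handle ties or output layout differently.

```python
def cummean(xs):
    """Running mean: out[i] is the mean of xs[0..i]."""
    out, total = [], 0.0
    for i, x in enumerate(xs, start=1):
        total += x
        out.append(total / i)
    return out


def ecdf(xs):
    """Return the sorted samples and the cumulative proportion (i + 1) / n for each."""
    vals = sorted(xs)
    n = len(vals)
    cdf = [(i + 1) / n for i in range(n)]
    return vals, cdf


vals, cdf = ecdf([3.0, 1.0, 2.0, 4.0])
```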

Covariance & Correlation

covariance (pair)

correlation (pair)

covariance matrix

correlation matrix

rolling covariance

rolling correlation
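The pairwise quantities can be written out in a few lines of plain Python. This sketch assumes the sample convention (ddof=1) for covariance, in line with the ddof=1 noted under Basic Stats. The names `cov_pair` and `corr_pair` are illustrative and are not claimed to be the library's function names.

```python
import math


def cov_pair(xs, ys, ddof=1):
    """Covariance of two equal-length sequences (ddof=1 is an assumption)."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("inputs must have the same length")
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - ddof)


def corr_pair(xs, ys):
    """Pearson correlation: cov(x, y) / sqrt(var(x) * var(y))."""
    return cov_pair(xs, ys) / math.sqrt(cov_pair(xs, xs) * cov_pair(ys, ys))


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]  # exact linear relation
```

For an exact linear relation y = 2x + 1, the covariance equals 2 · var(x) and the correlation is 1, which makes a handy sanity check for any covariance kernel.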

KDE (Kernel Density Estimate)

Fast Gaussian KDE
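As a reference for what a Gaussian KDE computes, here is a direct O(n · m) evaluation in plain Python. It is a sketch, not the library's implementation: the bandwidth is passed explicitly because the bandwidth rule used by bunker-stats-rs is not documented here, and the function name is hypothetical.

```python
import math


def gaussian_kde(samples, points, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each query point."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    n = len(samples)
    # Normalization so that the estimate integrates to 1.
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((p - s) / bandwidth) ** 2) for s in samples)
        for p in points
    ]


density = gaussian_kde([0.0], [0.0, 1.0, -1.0], bandwidth=1.0)
```

With a single sample at 0 and bandwidth 1, the estimate reduces to the standard normal density, so the value at 0 is 1 / sqrt(2π).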

📌 Examples

    import numpy as np
    import bunker_stats_rs as bs

    x = np.random.randn(1_000_000)

    # Fast std
    s = bs.std_np(x)

    # Rolling mean
    r = bs.rolling_mean_np(x, window=50)

    # Covariance
    cov = bs.cov_np(x, x * 2.0 + 1.0)

    # ECDF
    vals, cdf = bs.ecdf_np(x)

🧱 Design Goals

Be a surgical, ultra-fast replacement for statistical hot paths in Python workflows

Work directly with NumPy arrays (input/output stays NumPy)

Zero hidden state, deterministic execution

Predictable performance across large inputs

Low-level but ergonomic API

⚠️ Limitations (v0.1.0)

float64 only

1D and 2D arrays only

No nan* functions yet (nanmean, nanstd, nanpercentile)

Rolling windows do not skip NaNs

Percentile + KDE slower than NumPy/SciPy on small arrays

Not a drop-in replacement for pandas — focuses on raw NumPy data

These will improve in future releases.
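Until NaN-aware functions land, one way to work within the float64-only and no-NaN constraints is to normalize inputs with NumPy before calling into the library. The helper below is a hypothetical sketch (`prepare_1d` is not part of bunker-stats-rs), and whether the library requires contiguous memory is an assumption.

```python
import numpy as np


def prepare_1d(x):
    """Coerce input to a contiguous 1D float64 array and drop NaN entries."""
    arr = np.ascontiguousarray(x, dtype=np.float64)
    if arr.ndim != 1:
        raise ValueError("expected a 1D array")
    return arr[~np.isnan(arr)]


clean = prepare_1d([1, 2, float("nan"), 4])
```

Dropping NaNs shifts element positions, so this suits whole-array statistics (mean, std, ECDF) better than rolling windows, where alignment with the original index matters.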

🗺 Roadmap

v0.2: NaN-Aware API

nanmean / nanstd / nanvar

nanpercentile

NaN-friendly rolling windows

v0.3: 2D Rolling Stats

rolling mean/std/cov/corr for matrices

v0.4: Parallelism

Optional Rayon parallel kernels for 50M+ elements

v0.5: sklearn-like Transformers

Scaling transformers

Outlier detectors

Binning transformers

🧪 Running Benchmarks

    cd benchmarks
    python bench_all.py

📜 License

This project is licensed under the MIT License. See the LICENSE file for details.

🤝 Contributing

PRs welcome — especially for:

new statistical kernels

rolling ops

SciPy parity

tests + benchmarks

performance improvements

⭐️ Support

If this library speeds up your workflow, please ⭐ the repo!
