Ultra-fast Rust-powered statistics and time-series utilities for Python.

💥 bunker-stats

A Rust-powered statistical toolkit with a Python API and pandas Styler integration.

🔧 Overview

bunker-stats is a hybrid Rust and Python library providing:

Fast statistical primitives
Rolling window analytics
Distribution tools
pandas Styler visualizations

The numeric kernels are implemented in Rust for speed and correctness; the Python layer handles the API and pandas integration.

🧭 Project Philosophy and Status

v0.1 is an intentional early release.

This library focuses on correctness, clean APIs, and solid statistical foundations.

🔮 Future Focus

Performance tuning (SIMD, fused loops, BLAS ops)
Smarter rolling window engines
More visualization helpers
NaN-safe variants
Multi-column Rust kernels
Faster correlation matrix engine

🚀 Features

Core statistics (Rust)

Mean, variance, standard deviation
Sample vs population versions
Z-scores
MAD
Percentiles and quantiles
IQR and Tukey fences
Covariance, correlation
Welford one-pass algorithms
EWMA
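
As a rough illustration of the one-pass idea named in the list above, here is a pure-Python sketch of Welford's update for the mean and sample variance (`ddof=1`). It is not the library's Rust code; the function name `welford` and the NaN return for inputs shorter than two values are assumptions made for this example.

```python
import math


def welford(values):
    """Return (mean, sample_variance) in a single pass over `values`."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        # Uses the deviation from the old mean and from the updated mean.
        m2 += delta * (x - mean)
    if count == 0:
        return math.nan, math.nan
    if count < 2:
        return mean, math.nan  # sample variance is undefined for n < 2
    return mean, m2 / (count - 1)


mean_value, sample_var = welford([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(mean_value, sample_var)
```

The update never stores the full input, which is why this style of algorithm suits streaming and rolling computations.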

Rolling analytics

Rolling mean, std, z-score
Rolling covariance, correlation
Planned fused pipelines
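
To make the rolling semantics concrete, the following pure-Python sketch shows one way rolling mean, sample std, and z-score can be defined. It assumes a truncated output of length `len(values) - window + 1` (as the comparison table describes for `rolling_mean`), `ddof=1` for the std, and a z-score taken for the last value of each window. These choices, and the function bodies themselves, are illustrative assumptions rather than the library's actual kernels.

```python
import math


def rolling_mean(values, window):
    """Rolling mean with truncated output, using a running sum."""
    if window <= 0:
        raise ValueError("window must be positive")
    if window > len(values):
        return []
    total = sum(values[:window])
    out = [total / window]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]  # slide the window by one
        out.append(total / window)
    return out


def rolling_std(values, window):
    """Rolling sample std (ddof=1), recomputed per window for clarity."""
    if window < 2:
        raise ValueError("window must be at least 2 for a sample std")
    out = []
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        mean = sum(chunk) / window
        ss = sum((x - mean) ** 2 for x in chunk)
        out.append(math.sqrt(ss / (window - 1)))
    return out


def rolling_zscore(values, window):
    """Z-score of the last value in each window (assumed convention)."""
    means = rolling_mean(values, window)
    stds = rolling_std(values, window)
    out = []
    for k, (m, s) in enumerate(zip(means, stds)):
        last = values[k + window - 1]
        out.append((last - m) / s if s > 0 else math.nan)
    return out


series = [1.0, 2.0, 3.0, 4.0, 5.0]
means = rolling_mean(series, 3)
stds = rolling_std(series, 3)
zscores = rolling_zscore(series, 3)
print(means, stds, zscores)
```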

Distribution tools

ECDF
Gaussian KDE
Quantile binning
Winsorization
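
For readers unfamiliar with two of the tools listed above, here is a small pure-Python sketch of an ECDF and of quantile-based winsorization. It assumes linear interpolation between order statistics for the quantiles (NumPy's default rule) and a non-empty input; the interpolation rule that bunker-stats actually uses is not stated here, so treat the helper `quantile` and the exact clipping bounds as assumptions.

```python
import math


def ecdf(values):
    """Return sorted values and their cumulative probabilities i/n."""
    xs = sorted(values)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]


def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted, non-empty list."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def winsorize(values, lower_q=0.05, upper_q=0.95):
    """Clip values to the [lower_q, upper_q] quantile range."""
    xs = sorted(values)
    low = quantile(xs, lower_q)
    high = quantile(xs, upper_q)
    return [min(max(x, low), high) for x in values]


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0]
sorted_vals, probs = ecdf([3.0, 1.0, 2.0, 4.0])
clipped = winsorize(data, lower_q=0.1, upper_q=0.9)
print(sorted_vals, probs)
print(clipped)
```

Winsorization keeps the array length unchanged and only pulls the extreme values in to the chosen quantiles, so the outlier `100.0` is replaced rather than dropped.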

Transforms

Robust scaling using median and MAD
diff, pct_change, cumsum, cummean
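
The sketch below illustrates robust scaling with the median and the median absolute deviation (MAD) in plain Python. It mirrors the `(scaled, median, MAD)` return shape shown in the comparison table, but how `scale_factor` is applied (here it multiplies the MAD in the denominator) and the NaN result for a zero MAD are assumptions for this example, not documented behaviour.

```python
import math
import statistics


def mad(values):
    """Median absolute deviation around the median."""
    med = statistics.median(values)
    return statistics.median([abs(x - med) for x in values])


def robust_scale(values, scale_factor=1.0):
    """Return (scaled, median, mad) with scaled = (x - median) / (mad * scale_factor)."""
    med = statistics.median(values)
    spread = mad(values)
    denom = spread * scale_factor
    if denom == 0:
        # Assumed convention: scaling is undefined when the MAD is zero.
        scaled = [math.nan] * len(values)
    else:
        scaled = [(x - med) / denom for x in values]
    return scaled, med, spread


scaled, med, spread = robust_scale([1.0, 2.0, 3.0, 4.0, 100.0])
print(scaled, med, spread)
```

Because both the centre and the spread are medians, the single outlier `100.0` does not distort the scaling of the other values, which is the main reason to prefer this over mean and standard deviation on contaminated data.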

pandas Styler

demean_style(df, column)
zscore_style(df, column, threshold=...)
iqr_outlier_style(df, column)
corr_heatmap(df)
robust_scale_column(df, column)

Function	Bunker-stats syntax	NumPy equivalent	pandas equivalent	Unique feature in `bunker-stats`
`mean`	`bs.mean(x)`	`np.mean(x)`	`s.mean()`	1D mean helper; always treats input as 1D numeric, thin Rust-backed wrapper.
`mean_skipna`	`bs.mean_skipna(x)`	`np.nanmean(x)` / manual mask	`s.mean(skipna=True)`	NaN-aware mean with explicit “skipna” semantics, matching pandas mental model.
`var`	`bs.var(x)`	`np.var(x, ddof=1)`	`s.var(ddof=1)`	1D sample variance (`ddof=1`) by default; matches stats textbooks.
`var_skipna`	`bs.var_skipna(x)`	`np.nanvar(x, ddof=1)` / mask	`s.var(skipna=True, ddof=1)`	NaN-aware sample variance in one call.
`std`	`bs.std(x)`	`np.std(x, ddof=1)`	`s.std(ddof=1)`	1D sample std with fixed `ddof=1`, consistent with `var`.
`std_skipna`	`bs.std_skipna(x)`	`np.nanstd(x, ddof=1)` / mask	`s.std(skipna=True, ddof=1)`	NaN-aware sample std; avoids writing masks every time.
`percentile`	`bs.percentile(x, q=0.95)`	`np.quantile(x, 0.95)` / `np.percentile(x, 95)`	`s.quantile(0.95)`	Clean 1D percentile with the library's interpolation rule; integrated with the other robust stats.
`mad`	`bs.mad(x)`	manual median/MAD	custom; `s.mad()` was mean absolute deviation (not median) and was removed in pandas 2.0	True median absolute deviation used by `robust_scale`.
`iqr`	`q1, q3, iqr = bs.iqr(x)`	`scipy.stats.iqr(x, rng=(25,75))`	`s.quantile([0.25, 0.75])`	Returns `(q1, q3, iqr)` in one go; no juggling multiple calls / indices.
`mean_axis`	`bs.mean_axis(X, axis=0, skipna=False)`	`np.mean(X, axis=0)`	`df.mean(axis=0, skipna=...)`	Axis-wise mean for 1D/2D arrays with optional `skipna`.
`var_axis`	`bs.var_axis(X, axis=1, skipna=True)`	`np.var(X, axis=1, ddof=1)` (no `skipna` flag; use `np.nanvar`)	`df.var(axis=1, skipna=...)`	Axis-wise sample variance with built-in NaN handling.
`std_axis`	`bs.std_axis(X, axis=1, skipna=True)`	`np.std(X, axis=1, ddof=1)` (no `skipna` flag; use `np.nanstd`)	`df.std(axis=1, skipna=...)`	Axis-wise sample std + `skipna`; aligns pandas mental model with NumPy arrays.
`mean_last_axis`*	`bs.mean_last_axis(X)` (if exposed)	`np.mean(X, axis=-1)`	`df.to_numpy().mean(axis=-1)`	N-D mean over last axis, consistent with the N-D rolling API.
`rolling_mean_last_axis`	`bs.rolling_mean_last_axis(X, window=3)`	manual reshape + loop / `np.apply_along_axis`	no built-in; need groupby+apply / custom logic	Shape-preserving N-D rolling mean over last axis (e.g. `(batch, feat, time)`).
`rolling_std_last_axis`	`bs.rolling_std_last_axis(X, window=3)`	same as above	same	N-D rolling std over last axis; perfect for batched time-series / ML tensors.
`rolling_mean`	`bs.rolling_mean(x, window=5)`	manual loop or `np.convolve` trick	`s.rolling(5).mean()`	Fast 1D rolling mean (truncated length) with no index overhead.
`rolling_std`	`bs.rolling_std(x, window=5)`	manual loop	`s.rolling(5).std()`	1D rolling std at Rust speed, sample variance convention.
`rolling_zscore`	`bs.rolling_zscore(x, window=20)`	manual window loop	`s.rolling(20).apply(custom)`	Rolling z-score in a single function; avoids `apply`/UDF overhead.
`ewma`	`bs.ewma(x, alpha=0.1)`	manual recurrence	`s.ewm(alpha=0.1).mean()`	Minimal EWMA for pure numeric arrays, no pandas object overhead.
`df_rolling_mean`	`bs.df_rolling_mean(df, window=5)`	`np.convolve` per column	`df.rolling(5).mean()`	DataFrame in / out, but columns powered by Rust rolling mean.
`df_rolling_std`	`bs.df_rolling_std(df, window=5)`	manual per-column	`df.rolling(5).std()`	Same for std; uses the Rust rolling core but preserves the pandas index.
`df_ewma`	`bs.df_ewma(df, alpha=0.1)`	manual per-column EWMA	`df.ewm(alpha=0.1).mean()`	Per-column EWMA with Rust engine, lighter than full pandas EWM machinery.
`col_mean`	`bs.col_mean(df, skipna=True)`	`np.mean(df.to_numpy(), axis=0)`	`df.mean(axis=0, skipna=True)`	Column-wise mean; internally uses `mean_axis` + `skipna`, returns labeled Series.
`row_mean`	`bs.row_mean(df, skipna=True)`	`np.mean(df.to_numpy(), axis=1)`	`df.mean(axis=1, skipna=True)`	Row-wise mean with Rust numeric core + pandas index.
`cov_df`	`bs.cov_df(df)`	`np.cov(df.to_numpy().T, ddof=1)`	`df.cov()`	Full covariance matrix via Rust `cov_matrix`, but returned as a DataFrame.
`corr_df`	`bs.corr_df(df)`	`np.corrcoef(df.to_numpy().T)`	`df.corr()`	Correlation matrix backed by the Rust correlation engine.
`rolling_mean_series`	`bs.rolling_mean_series(s, window=10)`	manual 1D loop	`s.rolling(10).mean()`	Series-in / Series-out convenience wrapper around Rust rolling mean.
`rolling_std_series`	`bs.rolling_std_series(s, window=10)`	manual 1D loop	`s.rolling(10).std()`	Same for std; keeps index alignment, uses Rust core.
`iqr_outliers`	`bs.iqr_outliers(x, k=1.5)`	`iqr = scipy.stats.iqr(x); mask = ...`	quantiles + boolean mask	Returns a boolean outlier mask in one call using IQR rule.
`zscore_outliers`	`bs.zscore_outliers(x, threshold=3.0)`	`np.abs((x - x.mean()) / x.std(ddof=1)) > 3`	same logic on `Series`	One-liner z-score outlier mask; integrates with the library's `mean`/`std` semantics.
`minmax_scale`	`scaled, mn, mx = bs.minmax_scale(x)`	manual `(x-mn)/(mx-mn)`	use `MinMaxScaler` from sklearn	Returns both scaled data and the `(min, max)` used (for inverse-transform/reuse).
`robust_scale`	`scaled, med, mad = bs.robust_scale(x, scale_factor)`	manual MAD calculation	`RobustScaler` or custom	All-in-one robust scaling with returned `(median, MAD)`; pairs with `mad`.
`winsorize`	`bs.winsorize(x, lower_q=0.05, upper_q=0.95)`	`scipy.stats.mstats.winsorize(x, limits=...)`	custom quantile clipping	1D winsorization in Rust, single call returning a full adjusted array.
`diff`	`bs.diff(x, periods=1)`	`np.diff(x, n=1)` (shorter) / manual padding	`s.diff(periods=1)`	Full-length diff with NaNs where necessary; supports negative `periods`.
`pct_change`	`bs.pct_change(x, periods=1)`	manual `(x[i]-x[i-p]) / x[i-p]`	`s.pct_change(periods=1)`	Includes divide-by-zero → NaN handling; symmetric for positive/negative lags.
`cumsum`	`bs.cumsum(x)`	`np.cumsum(x)`	`s.cumsum()`	Rust implementation; value is performance on large 1D arrays.
`cummean`	`bs.cummean(x)`	`np.cumsum(x)/np.arange(1,len(x)+1)`	`s.expanding().mean()`	Streaming cumulative mean without constructing expanding windows.
`ecdf`	`vals, probs = bs.ecdf(x)`	manual sort + rank	custom `rank`/`value_counts`	Returns sorted values + CDF in one go; perfect for ECDF plots.
`quantile_bins`	`bins = bs.quantile_bins(x, n_bins=10)`	manual rank + binning	`pd.qcut(x, q=10)` (Categorical)	Returns plain integer bin labels `0..n_bins-1` as a NumPy array (ML-friendly).
`sign_mask`	`mask = bs.sign_mask(x)`	`np.sign(x).astype(np.int8)`	`np.sign(s).astype("int8")`	Encodes sign into `{-1, 0, 1}`; useful for discrete signal features.
`demean_with_signs`	`demeaned, signs = bs.demean_with_signs(x)`	`(x - x.mean(), np.sign(x - x.mean()))`	custom	Returns both demeaned data and sign mask in one pass.
`cov`	`bs.cov(x, y)`	`np.cov(x, y, ddof=1)[0,1]`	`s1.cov(s2)`	1D sample covariance as a simple scalar function.
`corr`	`bs.corr(x, y)`	`np.corrcoef(x, y)[0,1]`	`s1.corr(s2)`	1D Pearson correlation using the library's var/std core.
`cov_skipna`	`bs.cov_skipna(x, y)`	manual pairwise dropna + `np.cov`	`s1.cov(s2)` with aligned/dropna	Pairwise NaN dropping built in for 1D covariance.
`corr_skipna`	`bs.corr_skipna(x, y)`	manual pairwise dropna + `np.corrcoef`	`s1.corr(s2)` with dropna	Same but for correlation; hides the messy mask-bookkeeping.
`cov_matrix`	`bs.cov_matrix(X)`	`np.cov(X, rowvar=False, ddof=1)`	`df.cov()`	Symmetric covariance matrix with Rust loops; tuned for tabular X.
`corr_matrix`	`bs.corr_matrix(X)`	`np.corrcoef(X, rowvar=False)`	`df.corr()`	Correlation matrix built on the library's cov/std stack; consistent behaviour across code paths.
`rolling_cov`	`bs.rolling_cov(x, y, window=50)`	manual sliding window + `np.cov`	`df['x'].rolling(50).cov(df['y'])`	Rolling 1D covariance without pandas overhead; good for streaming stats.
`rolling_corr`	`bs.rolling_corr(x, y, window=50)`	manual sliding window + `np.corrcoef`	`df['x'].rolling(50).corr(df['y'])`	Rolling 1D correlation in one Rust call; no custom loop needed in Python.
`kde_gaussian`	`grid, dens = bs.kde_gaussian(x, n_points=256)`	`scipy.stats.gaussian_kde(x)` + evaluation	no direct builtin (need SciPy)	Lightweight 1D Gaussian KDE; returns `(grid, density)` using a simple bandwidth rule by default.
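
The `diff` and `pct_change` rows above describe full-length outputs with NaN padding, support for negative `periods`, and NaN on division by zero. The pure-Python sketch below spells out one reading of those semantics; it is an assumption-based illustration of the described behaviour, not the Rust implementation. Note that pandas returns `inf` rather than NaN when dividing by zero in `pct_change`, so the two differ in that case.

```python
import math


def diff(values, periods=1):
    """Full-length difference: out[i] = values[i] - values[i - periods]."""
    n = len(values)
    out = [math.nan] * n  # positions without a partner stay NaN
    for i in range(n):
        j = i - periods
        if 0 <= j < n:
            out[i] = values[i] - values[j]
    return out


def pct_change(values, periods=1):
    """Full-length relative change; a zero base value yields NaN."""
    n = len(values)
    out = [math.nan] * n
    for i in range(n):
        j = i - periods
        if 0 <= j < n and values[j] != 0:
            out[i] = (values[i] - values[j]) / values[j]
    return out


squares = [1.0, 4.0, 9.0, 16.0]
forward = diff(squares, periods=1)
backward = diff(squares, periods=-1)
changes = pct_change([2.0, 0.0, 4.0, 5.0])
print(forward, backward, changes)
```

With a positive `periods` the NaN padding sits at the start of the output; with a negative `periods` it moves to the end, which keeps the result aligned with the input in both directions.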

📦 Installation

git clone https://github.com/bunker-stats.git
cd bunker-stats

python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate

pip install maturin
maturin develop

Release history

0.2.9

Jan 24, 2026

0.2.8

Jan 6, 2026

0.2.7

Dec 31, 2025

0.2.5

Dec 25, 2025

0.2.4

Dec 25, 2025

0.2.3

Dec 8, 2025

0.2.2

Dec 7, 2025

0.2.1

Dec 6, 2025

This version

0.2a0 pre-release

Dec 8, 2025

0.1.0

Nov 30, 2025

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

bunker_stats_rs-0.2a0-cp310-cp310-win_amd64.whl (168.0 kB view details)

Uploaded Dec 8, 2025. CPython 3.10, Windows x86-64.

File details

Details for the file bunker_stats_rs-0.2a0-cp310-cp310-win_amd64.whl.

File metadata

Download URL: bunker_stats_rs-0.2a0-cp310-cp310-win_amd64.whl
Upload date: Dec 8, 2025
Size: 168.0 kB
Tags: CPython 3.10, Windows x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.10.2

File hashes

Hashes for bunker_stats_rs-0.2a0-cp310-cp310-win_amd64.whl
Algorithm	Hash digest
SHA256	`63875022226cfc92f1e0e2f4340f9b6f7b93e75e96a4d5c8ccda225e6c2a4e2b`
MD5	`396427a138a979641362f78f9d8c9f08`
BLAKE2b-256	`bb73237c548ef93a5b40b654659eed4e0827eb091503ceb58667801a8269951d`
