Ultra-fast Rust-powered statistics and time-series utilities for Python.
💥 bunker-stats
A Rust powered statistical toolkit with a Python API and pandas Styler integration.
🔧 Overview
bunker-stats is a hybrid Rust and Python library providing:
- Fast statistical primitives
- Rolling window analytics
- Distribution tools
- pandas Styler visualizations
The numeric kernels are implemented in Rust for speed and correctness; Python provides the API layer.
🧭 Project Philosophy and Status
v0.1 is an intentional early release.
This library focuses on correctness, clean APIs, and solid statistical foundations.
🔮 Future Focus
- Performance tuning (SIMD, fused loops, BLAS ops)
- Smarter rolling window engines
- More visualization helpers
- NaN safe variants
- Multi column Rust kernels
- Faster correlation matrix engine
🚀 Features
Core statistics (Rust)
- Mean, variance, standard deviation
- Sample vs population versions
- Z scores
- MAD
- Percentiles and quantiles
- IQR and Tukey fences
- Covariance, correlation
- Welford one pass algorithms
- EWMA
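As an illustration of the one-pass idea in the list above, here is a pure-Python sketch of Welford's update for the mean and sample variance (ddof=1). It shows the standard textbook algorithm and is not the library's Rust code; the name `welford_mean_var` is hypothetical.

```python
import math


def welford_mean_var(values):
    """Return (mean, sample variance) in a single pass over `values`.

    Hypothetical helper for illustration only; uses ddof=1.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean          # deviation from the old mean
        mean += delta / n         # update the mean
        m2 += delta * (x - mean)  # uses both old and new mean
    variance = m2 / (n - 1) if n > 1 else math.nan
    return mean, variance


mean, var = welford_mean_var([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(mean, var)
```

The update never stores the full data set and avoids the cancellation problems of the naive "sum of squares minus square of sum" formula.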
Rolling analytics
- Rolling mean, std, z score
- Rolling covariance, correlation
- Planned fused pipelines
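A pure-Python reference sketch of the rolling statistics listed above may help pin down the semantics. It assumes a truncated output (length `n - window + 1`), the sample standard deviation (ddof=1), and a z-score of the last value in each window; those last two choices and the `_sketch` function names are assumptions for illustration, not confirmed library behaviour.

```python
import math
import statistics


def rolling_mean_sketch(x, window):
    """Mean of each full window; output has len(x) - window + 1 entries."""
    return [statistics.fmean(x[i:i + window])
            for i in range(len(x) - window + 1)]


def rolling_zscore_sketch(x, window):
    """Z-score of the newest value in each window (assumed convention).

    Requires window >= 2 because the sample std needs two points.
    """
    out = []
    for i in range(len(x) - window + 1):
        w = x[i:i + window]
        mu = statistics.fmean(w)
        sd = statistics.stdev(w)  # sample std, ddof=1
        out.append((w[-1] - mu) / sd if sd > 0 else math.nan)
    return out


data = [1.0, 2.0, 3.0, 4.0, 5.0]
means = rolling_mean_sketch(data, window=3)
zscores = rolling_zscore_sketch(data, window=3)
print(means, zscores)
```

This version recomputes every window from scratch, which costs O(n * window); a production kernel would normally update running sums instead.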
Distribution tools
- ECDF
- Gaussian KDE
- Quantile binning
- Winsorization
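Two of the distribution tools above are easy to sketch in pure Python. The ECDF below assigns probability `(i + 1) / n` to the i-th sorted value, and the winsorization clips values to a lower and upper quantile computed with linear interpolation. Both conventions and all `_sketch` names are assumptions made for illustration; the library may use different rules.

```python
import math


def ecdf_sketch(x):
    """Return (sorted values, cumulative probabilities)."""
    vals = sorted(x)
    n = len(vals)
    probs = [(i + 1) / n for i in range(n)]
    return vals, probs


def quantile_sketch(sorted_vals, q):
    """Quantile with linear interpolation between order statistics."""
    pos = q * (len(sorted_vals) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def winsorize_sketch(x, lower_q, upper_q):
    """Clip every value into [Q(lower_q), Q(upper_q)]."""
    s = sorted(x)
    lo, hi = quantile_sketch(s, lower_q), quantile_sketch(s, upper_q)
    return [min(max(v, lo), hi) for v in x]


vals, probs = ecdf_sketch([3.0, 1.0, 2.0, 4.0])
clipped = winsorize_sketch(
    [-50.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 100.0],
    lower_q=0.125, upper_q=0.875,
)
print(vals, probs, clipped)
```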
Transforms
- Robust scaling using Median and MAD
- diff, pct_change, cumsum, cummean
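Robust scaling with the median and MAD can be sketched as follows. The sketch assumes that `scale_factor` multiplies the MAD in the denominator (for example 1.4826 to make the MAD comparable to a standard deviation under normality); that interpretation and the name `robust_scale_sketch` are assumptions, since the exact semantics are not stated here.

```python
import math
import statistics


def robust_scale_sketch(x, scale_factor=1.0):
    """Return (scaled values, median, MAD).

    scaled[i] = (x[i] - median) / (MAD * scale_factor)  # assumed formula
    """
    med = statistics.median(x)
    mad = statistics.median([abs(v - med) for v in x])
    denom = mad * scale_factor
    if denom == 0:
        # Degenerate case: at least half the values equal the median.
        return [math.nan] * len(x), med, mad
    return [(v - med) / denom for v in x], med, mad


scaled, med, mad = robust_scale_sketch([1.0, 2.0, 3.0, 4.0, 100.0])
print(scaled, med, mad)
```

Because both the centre and the spread are medians, the single outlier (100.0) does not distort the scale applied to the other points, which is the main reason to prefer this over mean/std scaling on contaminated data.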
pandas Styler
- demean_style(df, column)
- zscore_style(df, column, threshold=...)
- iqr_outlier_style(df, column)
- corr_heatmap(df)
- robust_scale_column(df, column)
| Function | bunker-stats syntax | NumPy equivalent | pandas equivalent | Unique feature in bunker-stats |
|---|---|---|---|---|
| `mean` | `bs.mean(x)` | `np.mean(x)` | `s.mean()` | 1D mean helper; always treats input as 1D numeric; thin Rust-backed wrapper. |
| `mean_skipna` | `bs.mean_skipna(x)` | `np.nanmean(x)` / manual mask | `s.mean(skipna=True)` | NaN-aware mean with explicit “skipna” semantics, matching the pandas mental model. |
| `var` | `bs.var(x)` | `np.var(x, ddof=1)` | `s.var(ddof=1)` | 1D sample variance (ddof=1) by default; matches stats textbooks. |
| `var_skipna` | `bs.var_skipna(x)` | `np.nanvar(x, ddof=1)` / mask | `s.var(skipna=True, ddof=1)` | NaN-aware sample variance in one call. |
| `std` | `bs.std(x)` | `np.std(x, ddof=1)` | `s.std(ddof=1)` | 1D sample std with fixed ddof=1, consistent with `var`. |
| `std_skipna` | `bs.std_skipna(x)` | `np.nanstd(x, ddof=1)` / mask | `s.std(skipna=True, ddof=1)` | NaN-aware sample std; avoids writing masks every time. |
| `percentile` | `bs.percentile(x, q=0.95)` | `np.quantile(x, 0.95)` / `np.percentile` | `np.quantile(s, 0.95)` | Clean 1D percentile using the library's interpolation; integrated with the other robust stats. |
| `mad` | `bs.mad(x)` | manual median/MAD | custom or `s.mad()` (mean abs dev, not median) | True median absolute deviation, as used by `robust_scale`. |
| `iqr` | `q1, q3, iqr = bs.iqr(x)` | `scipy.stats.iqr(x, rng=(25,75))` | `s.quantile([0.25, 0.75])` | Returns (q1, q3, iqr) in one go; no juggling multiple calls / indices. |
| `mean_axis` | `bs.mean_axis(X, axis=0, skipna=False)` | `np.mean(X, axis=0)` | `df.mean(axis=0, skipna=...)` | Axis-wise mean for 1D/2D arrays with optional skipna. |
| `var_axis` | `bs.var_axis(X, axis=1, skipna=True)` | `np.var(X, axis=1, ddof=1)` (no native skipna) | `df.var(axis=1, skipna=...)` | Axis-wise sample variance with built-in NaN handling. |
| `std_axis` | `bs.std_axis(X, axis=1, skipna=True)` | `np.std(X, axis=1, ddof=1)` (no native skipna) | `df.std(axis=1, skipna=...)` | Axis-wise sample std + skipna; brings the pandas mental model to NumPy arrays. |
| `mean_last_axis`* | `bs.mean_last_axis(X)` (if exposed) | `np.mean(X, axis=-1)` | `df.to_numpy().mean(axis=-1)` | N-D mean over the last axis, consistent with the N-D rolling API. |
| `rolling_mean_last_axis` | `bs.rolling_mean_last_axis(X, window=3)` | manual reshape + loop / `np.apply_along_axis` | no built-in; need groupby+apply / custom logic | Shape-preserving N-D rolling mean over the last axis (e.g. (batch, feat, time)). |
| `rolling_std_last_axis` | `bs.rolling_std_last_axis(X, window=3)` | same as above | same | N-D rolling std over the last axis; suited to batched time-series / ML tensors. |
| `rolling_mean` | `bs.rolling_mean(x, window=5)` | manual loop or `np.convolve` trick | `s.rolling(5).mean()` | Fast 1D rolling mean (truncated length) with no index overhead. |
| `rolling_std` | `bs.rolling_std(x, window=5)` | manual loop | `s.rolling(5).std()` | 1D rolling std at Rust speed, sample variance convention. |
| `rolling_zscore` | `bs.rolling_zscore(x, window=20)` | manual window loop | `s.rolling(20).apply(custom)` | Rolling z-score in a single function; avoids apply/UDF overhead. |
| `ewma` | `bs.ewma(x, alpha=0.1)` | manual recurrence | `s.ewm(alpha=0.1).mean()` | Minimal EWMA for pure numeric arrays, no pandas object overhead. |
| `df_rolling_mean` | `bs.df_rolling_mean(df, window=5)` | `np.convolve` per column | `df.rolling(5).mean()` | DataFrame in / out, with columns powered by the Rust rolling mean. |
| `df_rolling_std` | `bs.df_rolling_std(df, window=5)` | manual per-column | `df.rolling(5).std()` | Same for std; uses the Rust rolling core but preserves the pandas index. |
| `df_ewma` | `bs.df_ewma(df, alpha=0.1)` | manual per-column EWMA | `df.ewm(alpha=0.1).mean()` | Per-column EWMA with Rust engine, lighter than the full pandas EWM machinery. |
| `col_mean` | `bs.col_mean(df, skipna=True)` | `np.mean(df.to_numpy(), axis=0)` | `df.mean(axis=0, skipna=True)` | Column-wise mean; internally uses `mean_axis` + skipna, returns a labeled Series. |
| `row_mean` | `bs.row_mean(df, skipna=True)` | `np.mean(df.to_numpy(), axis=1)` | `df.mean(axis=1, skipna=True)` | Row-wise mean with Rust numeric core + pandas index. |
| `cov_df` | `bs.cov_df(df)` | `np.cov(df.to_numpy().T, ddof=1)` | `df.cov()` | Full covariance matrix via Rust `cov_matrix`, returned as a DataFrame. |
| `corr_df` | `bs.corr_df(df)` | `np.corrcoef(df.to_numpy().T)` | `df.corr()` | Correlation matrix backed by the Rust correlation engine. |
| `rolling_mean_series` | `bs.rolling_mean_series(s, window=10)` | manual 1D loop | `s.rolling(10).mean()` | Series-in / Series-out convenience wrapper around the Rust rolling mean. |
| `rolling_std_series` | `bs.rolling_std_series(s, window=10)` | manual 1D loop | `s.rolling(10).std()` | Same for std; keeps index alignment, uses Rust core. |
| `iqr_outliers` | `bs.iqr_outliers(x, k=1.5)` | `iqr = scipy.stats.iqr(x); mask = ...` | quantiles + boolean mask | Returns a boolean outlier mask in one call using the IQR rule. |
| `zscore_outliers` | `bs.zscore_outliers(x, threshold=3.0)` | `(np.abs((x-x.mean())/x.std()) > 3)` | same logic on Series | One-liner z-score outlier mask; uses the library's mean/std semantics. |
| `minmax_scale` | `scaled, mn, mx = bs.minmax_scale(x)` | manual `(x-mn)/(mx-mn)` | use `MinMaxScaler` from sklearn | Returns both scaled data and the (min, max) used (for inverse-transform/reuse). |
| `robust_scale` | `scaled, med, mad = bs.robust_scale(x, scale_factor)` | manual MAD calculation | `RobustScaler` or custom | All-in-one robust scaling with returned (median, MAD); pairs with `mad`. |
| `winsorize` | `bs.winsorize(x, lower_q=0.05, upper_q=0.95)` | `scipy.stats.mstats.winsorize(x, limits=...)` | custom quantile clipping | 1D winsorization in Rust, single call returning a full adjusted array. |
| `diff` | `bs.diff(x, periods=1)` | `np.diff(x, n=1)` (shorter) / manual padding | `s.diff(periods=1)` | Full-length diff with NaNs where necessary; supports negative periods. |
| `pct_change` | `bs.pct_change(x, periods=1)` | manual `(x[i]-x[i-p]) / x[i-p]` | `s.pct_change(periods=1)` | Includes divide-by-zero → NaN handling; symmetric for positive/negative lags. |
| `cumsum` | `bs.cumsum(x)` | `np.cumsum(x)` | `s.cumsum()` | Rust implementation; value is performance on large 1D arrays. |
| `cummean` | `bs.cummean(x)` | `np.cumsum(x)/np.arange(1,len(x)+1)` | `s.expanding().mean()` | Streaming cumulative mean without constructing expanding windows. |
| `ecdf` | `vals, probs = bs.ecdf(x)` | manual sort + rank | custom rank/value_counts | Returns sorted values + CDF in one go; convenient for ECDF plots. |
| `quantile_bins` | `bins = bs.quantile_bins(x, n_bins=10)` | manual rank + binning | `pd.qcut(x, q=10)` (Categorical) | Returns plain integer bin labels 0..n_bins-1 as a NumPy array (ML-friendly). |
| `sign_mask` | `mask = bs.sign_mask(x)` | `np.sign(x).astype(np.int8)` | `(s > 0).astype(int) - (s < 0).astype(int)` | Encodes sign into {-1, 0, 1}; useful for discrete signal features. |
| `demean_with_signs` | `demeaned, signs = bs.demean_with_signs(x)` | `(x - x.mean(), np.sign(x - x.mean()))` | custom | Returns both demeaned data and sign mask in one pass. |
| `cov` | `bs.cov(x, y)` | `np.cov(x, y, ddof=1)[0,1]` | `s1.cov(s2)` | 1D sample covariance as a simple scalar function. |
| `corr` | `bs.corr(x, y)` | `np.corrcoef(x, y)[0,1]` | `s1.corr(s2)` | 1D Pearson correlation using the library's var/std core. |
| `cov_skipna` | `bs.cov_skipna(x, y)` | manual pairwise dropna + `np.cov` | `s1.cov(s2)` with aligned/dropna | Pairwise NaN dropping built in for 1D covariance. |
| `corr_skipna` | `bs.corr_skipna(x, y)` | manual pairwise dropna + `np.corrcoef` | `s1.corr(s2)` with dropna | Same for correlation; hides the mask bookkeeping. |
| `cov_matrix` | `bs.cov_matrix(X)` | `np.cov(X, rowvar=False, ddof=1)` | `df.cov()` | Symmetric covariance matrix with Rust loops; tuned for tabular X. |
| `corr_matrix` | `bs.corr_matrix(X)` | `np.corrcoef(X, rowvar=False)` | `df.corr()` | Correlation matrix built on the library's cov/std stack; consistent behaviour across code paths. |
| `rolling_cov` | `bs.rolling_cov(x, y, window=50)` | manual sliding window + `np.cov` | `df['x'].rolling(50).cov(df['y'])` | Rolling 1D covariance without pandas overhead; good for streaming stats. |
| `rolling_corr` | `bs.rolling_corr(x, y, window=50)` | manual sliding window + `np.corrcoef` | `df['x'].rolling(50).corr(df['y'])` | Rolling 1D correlation in one Rust call; no custom loop needed in Python. |
| `kde_gaussian` | `grid, dens = bs.kde_gaussian(x, n_points=256)` | `scipy.stats.gaussian_kde(x)` + evaluation | no direct builtin (need SciPy) | Lightweight 1D Gaussian KDE; returns (grid, density) using a simple bandwidth rule by default. |
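The `diff` and `pct_change` rows describe full-length outputs padded with NaN, support for negative periods, and NaN on division by zero. A pure-Python sketch of those stated semantics follows; the `_sketch` names are hypothetical and the code is a reference illustration under those assumptions, not the library's implementation.

```python
import math


def diff_sketch(x, periods=1):
    """x[i] - x[i - periods], with NaN where the lagged index is out of range."""
    n = len(x)
    out = [math.nan] * n
    for i in range(n):
        j = i - periods
        if 0 <= j < n:
            out[i] = x[i] - x[j]
    return out


def pct_change_sketch(x, periods=1):
    """(x[i] - x[i - periods]) / x[i - periods]; NaN if the base is zero."""
    n = len(x)
    out = [math.nan] * n
    for i in range(n):
        j = i - periods
        if 0 <= j < n and x[j] != 0:
            out[i] = (x[i] - x[j]) / x[j]
    return out


series = [1.0, 4.0, 9.0, 16.0]
forward = diff_sketch(series, periods=1)
backward = diff_sketch(series, periods=-1)
pct = pct_change_sketch([0.0, 2.0, 3.0])
print(forward, backward, pct)
```

Note that pandas returns `inf` rather than NaN when the base value is zero, so the NaN convention described in the table is a deliberate difference.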
📦 Installation
git clone https://github.com/bunker-stats.git
cd bunker-stats
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install maturin
maturin develop
File details
Details for the file bunker_stats_rs-0.2.5.tar.gz.
File metadata
- Download URL: bunker_stats_rs-0.2.5.tar.gz
- Upload date:
- Size: 67.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.10.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 57f862d8a594a4796ec29e1eab8d4229da824ab8966a234ae53aa829e2903bbf |
| MD5 | b8934b1b0dc77d39433f1974413f64f0 |
| BLAKE2b-256 | 18fe8f9188cc65e295eb0b11e4299328b5df26e0aaf9c5ec8e3c46b6f8de8578 |
File details
Details for the file bunker_stats_rs-0.2.5-cp310-cp310-win_amd64.whl.
File metadata
- Download URL: bunker_stats_rs-0.2.5-cp310-cp310-win_amd64.whl
- Upload date:
- Size: 223.8 kB
- Tags: CPython 3.10, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.10.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4a7937fa2f78d04136e6695cece458965fddd3501e84d8bba00ea3b451e1b1c7 |
| MD5 | 8fe7c9296fcda4a31fdbc0a966b5d297 |
| BLAKE2b-256 | 946df1e7f7b08160004009a64ec1d4ea302698ec0c358e25e3e02f5215233c96 |