Ultra-fast Rust-powered statistics and time-series utilities for Python.

💥 bunker-stats

A Rust-powered statistical toolkit with a Python API and pandas Styler integration.


🔧 Overview

bunker-stats is a hybrid Rust and Python library providing:

  • Fast statistical primitives
  • Rolling window analytics
  • Distribution tools
  • pandas Styler visualizations

The numeric kernels are implemented in Rust for speed and correctness; the Python layer supplies the API and the pandas integration.


🧭 Project Philosophy and Status

v0.1 is an intentional early release.

This library focuses on correctness, clean APIs, and solid statistical foundations.

🔮 Future Focus

  • Performance tuning (SIMD, fused loops, BLAS ops)
  • Smarter rolling window engines
  • More visualization helpers
  • NaN-safe variants
  • Multi-column Rust kernels
  • Faster correlation matrix engine

🚀 Features

Core statistics (Rust)

  • Mean, variance, standard deviation
  • Sample vs population versions
  • Z-scores
  • MAD
  • Percentiles and quantiles
  • IQR and Tukey fences
  • Covariance, correlation
  • Welford one-pass algorithms
  • EWMA
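
As an illustration of the one-pass idea mentioned in the list above, here is a minimal pure-Python sketch of Welford's update for the mean and sample variance (ddof=1). The function name `welford_mean_var` is hypothetical and is not part of bunker-stats; the Rust kernels may be organized differently.

```python
import math

def welford_mean_var(values):
    """Return (mean, sample_variance) in a single pass over `values`.

    Hypothetical reference helper, not the bunker-stats API.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # uses the already-updated mean
    if n == 0:
        return math.nan, math.nan
    if n == 1:
        return mean, math.nan  # sample variance is undefined for one value
    return mean, m2 / (n - 1)

mean, var = welford_mean_var([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(mean, var)
```

The update never stores the full input, which is why the same recurrence suits streaming data.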

Rolling analytics

  • Rolling mean, std, z-score
  • Rolling covariance, correlation
  • Planned fused pipelines
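
To make the rolling statistics above concrete, the following pure-Python sketch computes a rolling z-score. It assumes a truncated output (one value per full window, so `len(x) - window + 1` results) and scores the last value of each window against that window's mean and sample standard deviation. Both conventions are assumptions for illustration, and `rolling_zscore_ref` is a hypothetical name rather than the library function.

```python
import math
import statistics

def rolling_zscore_ref(values, window):
    """Z-score of each full window's last value (hypothetical reference helper)."""
    if window < 2 or window > len(values):
        return []
    out = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        mu = statistics.fmean(chunk)
        sd = statistics.stdev(chunk)  # sample std, ddof=1
        # A constant window has zero spread, so the z-score is undefined.
        out.append((chunk[-1] - mu) / sd if sd > 0 else math.nan)
    return out

print(rolling_zscore_ref([1.0, 2.0, 3.0, 4.0], window=3))
```

This reference recomputes every window from scratch, costing O(n * window); a production kernel would typically update running sums instead.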

Distribution tools

  • ECDF
  • Gaussian KDE
  • Quantile binning
  • Winsorization
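
Two of the distribution tools listed above, the ECDF and winsorization, can be sketched in a few lines of pure Python. The helper names (`ecdf_ref`, `quantile_linear`, `winsorize_ref`) are hypothetical, and the linear-interpolation quantile rule is an assumption; bunker-stats may use a different interpolation.

```python
def ecdf_ref(values):
    """Return (sorted values, cumulative probabilities i/n). Hypothetical helper."""
    xs = sorted(values)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]

def quantile_linear(sorted_xs, q):
    """Quantile of a non-empty sorted list with linear interpolation (assumed rule)."""
    pos = q * (len(sorted_xs) - 1)
    lo = int(pos)  # floor, since pos >= 0
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (pos - lo) * (sorted_xs[hi] - sorted_xs[lo])

def winsorize_ref(values, lower_q=0.05, upper_q=0.95):
    """Clip values to the [lower_q, upper_q] quantile range, keeping input order."""
    xs = sorted(values)
    lo = quantile_linear(xs, lower_q)
    hi = quantile_linear(xs, upper_q)
    return [min(max(x, lo), hi) for x in values]

vals, probs = ecdf_ref([3.0, 1.0, 2.0, 4.0])
clipped = winsorize_ref([float(i) for i in range(11)], lower_q=0.1, upper_q=0.9)
print(vals, probs)
print(clipped)
```

Winsorization replaces extreme values with the boundary quantiles instead of dropping them, so the output has the same length as the input.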

Transforms

  • Robust scaling using median and MAD
  • diff, pct_change, cumsum, cummean
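
Robust scaling with the median and MAD, as listed above, can be sketched as follows. The name `robust_scale_ref` is hypothetical, and the meaning of `scale_factor` here (a multiplier on the MAD, for example 1.4826 to make it comparable to a standard deviation under normality) is an assumption rather than documented bunker-stats behaviour.

```python
import math
import statistics

def robust_scale_ref(values, scale_factor=1.0):
    """Return (scaled, median, mad) where scaled = (x - median) / (mad * scale_factor).

    Hypothetical reference helper; `scale_factor` semantics are assumed.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(x - med) for x in values])
    denom = mad * scale_factor
    if denom == 0:
        # Over half the values equal the median, so the scale is undefined.
        return [math.nan] * len(values), med, mad
    return [(x - med) / denom for x in values], med, mad

scaled, med, mad = robust_scale_ref([1.0, 2.0, 3.0, 4.0, 100.0])
print(scaled, med, mad)
```

Because both the centre and the spread are medians, the single outlier (100.0) does not distort the scaling of the other four values.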

pandas Styler

  • demean_style(df, column)
  • zscore_style(df, column, threshold=...)
  • iqr_outlier_style(df, column)
  • corr_heatmap(df)
  • robust_scale_column(df, column)

| Function | bunker-stats syntax | NumPy equivalent | pandas equivalent | Unique feature in bunker-stats |
| --- | --- | --- | --- | --- |
| mean | bs.mean(x) | np.mean(x) | s.mean() | 1D mean helper; always treats input as 1D numeric; thin Rust-backed wrapper. |
| mean_skipna | bs.mean_skipna(x) | np.nanmean(x) / manual mask | s.mean(skipna=True) | NaN-aware mean with explicit “skipna” semantics, matching the pandas mental model. |
| var | bs.var(x) | np.var(x, ddof=1) | s.var(ddof=1) | 1D sample variance (ddof=1) by default; matches stats textbooks. |
| var_skipna | bs.var_skipna(x) | np.nanvar(x, ddof=1) / mask | s.var(skipna=True, ddof=1) | NaN-aware sample variance in one call. |
| std | bs.std(x) | np.std(x, ddof=1) | s.std(ddof=1) | 1D sample std with fixed ddof=1, consistent with var. |
| std_skipna | bs.std_skipna(x) | np.nanstd(x, ddof=1) / mask | s.std(skipna=True, ddof=1) | NaN-aware sample std; avoids writing masks every time. |
| percentile | bs.percentile(x, q=0.95) | np.quantile(x, 0.95) / np.percentile | np.quantile(s, 0.95) | Clean 1D percentile using the library's own interpolation; integrated with the other robust stats. |
| mad | bs.mad(x) | manual median/MAD | custom or s.mad() (mean abs dev, not median; removed in pandas 2.0) | True median absolute deviation, as used by robust_scale. |
| iqr | q1, q3, iqr = bs.iqr(x) | scipy.stats.iqr(x, rng=(25,75)) | s.quantile([0.25, 0.75]) | Returns (q1, q3, iqr) in one go; no juggling multiple calls or indices. |
| mean_axis | bs.mean_axis(X, axis=0, skipna=False) | np.mean(X, axis=0) | df.mean(axis=0, skipna=...) | Axis-wise mean for 1D/2D arrays with optional skipna. |
| var_axis | bs.var_axis(X, axis=1, skipna=True) | np.var(X, axis=1, ddof=1) (no native skipna) | df.var(axis=1, skipna=...) | Axis-wise sample variance with built-in NaN handling. |
| std_axis | bs.std_axis(X, axis=1, skipna=True) | np.std(X, axis=1, ddof=1) (no native skipna) | df.std(axis=1, skipna=...) | Axis-wise sample std with skipna; aligns the pandas mental model with NumPy arrays. |
| mean_last_axis* | bs.mean_last_axis(X) (if exposed) | np.mean(X, axis=-1) | df.to_numpy().mean(axis=-1) | N-D mean over the last axis, consistent with the N-D rolling API. |
| rolling_mean_last_axis | bs.rolling_mean_last_axis(X, window=3) | manual reshape + loop / np.apply_along_axis | no built-in; need groupby+apply / custom logic | Shape-preserving N-D rolling mean over the last axis (e.g. (batch, feat, time)). |
| rolling_std_last_axis | bs.rolling_std_last_axis(X, window=3) | same as above | same | N-D rolling std over the last axis; suited to batched time-series / ML tensors. |
| rolling_mean | bs.rolling_mean(x, window=5) | manual loop or np.convolve trick | s.rolling(5).mean() | Fast 1D rolling mean (truncated length) with no index overhead. |
| rolling_std | bs.rolling_std(x, window=5) | manual loop | s.rolling(5).std() | 1D rolling std at Rust speed, sample variance convention. |
| rolling_zscore | bs.rolling_zscore(x, window=20) | manual window loop | s.rolling(20).apply(custom) | Rolling z-score in a single function; avoids apply/UDF overhead. |
| ewma | bs.ewma(x, alpha=0.1) | manual recurrence | s.ewm(alpha=0.1).mean() | Minimal EWMA for pure numeric arrays, no pandas object overhead. |
| df_rolling_mean | bs.df_rolling_mean(df, window=5) | np.convolve per column | df.rolling(5).mean() | DataFrame in / out, with columns powered by the Rust rolling mean. |
| df_rolling_std | bs.df_rolling_std(df, window=5) | manual per-column | df.rolling(5).std() | Same for std; uses the rolling core but preserves the pandas index. |
| df_ewma | bs.df_ewma(df, alpha=0.1) | manual per-column EWMA | df.ewm(alpha=0.1).mean() | Per-column EWMA with the Rust engine, lighter than the full pandas EWM machinery. |
| col_mean | bs.col_mean(df, skipna=True) | np.mean(df.to_numpy(), axis=0) | df.mean(axis=0, skipna=True) | Column-wise mean; internally uses mean_axis + skipna, returns a labeled Series. |
| row_mean | bs.row_mean(df, skipna=True) | np.mean(df.to_numpy(), axis=1) | df.mean(axis=1, skipna=True) | Row-wise mean with the Rust numeric core + pandas index. |
| cov_df | bs.cov_df(df) | np.cov(df.to_numpy().T, ddof=1) | df.cov() | Full covariance matrix via Rust cov_matrix, returned as a DataFrame. |
| corr_df | bs.corr_df(df) | np.corrcoef(df.to_numpy().T) | df.corr() | Correlation matrix backed by the Rust correlation engine. |
| rolling_mean_series | bs.rolling_mean_series(s, window=10) | manual 1D loop | s.rolling(10).mean() | Series-in / Series-out convenience wrapper around the Rust rolling mean. |
| rolling_std_series | bs.rolling_std_series(s, window=10) | manual 1D loop | s.rolling(10).std() | Same for std; keeps index alignment, uses the Rust core. |
| iqr_outliers | bs.iqr_outliers(x, k=1.5) | iqr = scipy.stats.iqr(x); mask = ... | quantiles + boolean mask | Returns a boolean outlier mask in one call using the IQR rule. |
| zscore_outliers | bs.zscore_outliers(x, threshold=3.0) | (np.abs((x-x.mean())/x.std()) > 3) | same logic on Series | One-liner z-score outlier mask; integrates with the library's mean/std semantics. |
| minmax_scale | scaled, mn, mx = bs.minmax_scale(x) | manual (x-mn)/(mx-mn) | use MinMaxScaler from sklearn | Returns both the scaled data and the (min, max) used (for inverse-transform/reuse). |
| robust_scale | scaled, med, mad = bs.robust_scale(x, scale_factor) | manual MAD calculation | RobustScaler or custom | All-in-one robust scaling with returned (median, MAD); pairs with mad. |
| winsorize | bs.winsorize(x, lower_q=0.05, upper_q=0.95) | scipy.stats.mstats.winsorize(x, limits=...) | custom quantile clipping | 1D winsorization in Rust; a single call returns the full adjusted array. |
| diff | bs.diff(x, periods=1) | np.diff(x, n=1) (shorter) / manual padding | s.diff(periods=1) | Full-length diff with NaNs where necessary; supports negative periods. |
| pct_change | bs.pct_change(x, periods=1) | manual (x[i]-x[i-p]) / x[i-p] | s.pct_change(periods=1) | Includes divide-by-zero → NaN handling; symmetric for positive/negative lags. |
| cumsum | bs.cumsum(x) | np.cumsum(x) | s.cumsum() | Rust implementation; the value is performance on large 1D arrays. |
| cummean | bs.cummean(x) | np.cumsum(x)/np.arange(1,len(x)+1) | s.expanding().mean() | Streaming cumulative mean without constructing expanding windows. |
| ecdf | vals, probs = bs.ecdf(x) | manual sort + rank | custom rank/value_counts | Returns sorted values + CDF in one go; convenient for ECDF plots. |
| quantile_bins | bins = bs.quantile_bins(x, n_bins=10) | manual rank + binning | pd.qcut(x, q=10) (Categorical) | Returns plain integer bin labels 0..n_bins-1 as a NumPy array (ML-friendly). |
| sign_mask | mask = bs.sign_mask(x) | np.sign(x).astype(np.int8) | (s > 0).astype(int) - (s < 0).astype(int) | Encodes sign into {-1, 0, 1}; useful for discrete signal features. |
| demean_with_signs | demeaned, signs = bs.demean_with_signs(x) | (x - x.mean(), np.sign(x - x.mean())) | custom | Returns both the demeaned data and the sign mask in one pass. |
| cov | bs.cov(x, y) | np.cov(x, y, ddof=1)[0,1] | s1.cov(s2) | 1D sample covariance as a simple scalar function. |
| corr | bs.corr(x, y) | np.corrcoef(x, y)[0,1] | s1.corr(s2) | 1D Pearson correlation using the var/std core. |
| cov_skipna | bs.cov_skipna(x, y) | manual pairwise dropna + np.cov | s1.cov(s2) with aligned/dropna | Pairwise NaN dropping built in for 1D covariance. |
| corr_skipna | bs.corr_skipna(x, y) | manual pairwise dropna + np.corrcoef | s1.corr(s2) with dropna | Same for correlation; hides the mask bookkeeping. |
| cov_matrix | bs.cov_matrix(X) | np.cov(X, rowvar=False, ddof=1) | df.cov() | Symmetric covariance matrix with Rust loops; tuned for tabular X. |
| corr_matrix | bs.corr_matrix(X) | np.corrcoef(X, rowvar=False) | df.corr() | Correlation matrix built on the cov/std stack; consistent behaviour across code paths. |
| rolling_cov | bs.rolling_cov(x, y, window=50) | manual sliding window + np.cov | df['x'].rolling(50).cov(df['y']) | Rolling 1D covariance without pandas overhead; good for streaming stats. |
| rolling_corr | bs.rolling_corr(x, y, window=50) | manual sliding window + np.corrcoef | df['x'].rolling(50).corr(df['y']) | Rolling 1D correlation in one Rust call; no custom Python loop needed. |
| kde_gaussian | grid, dens = bs.kde_gaussian(x, n_points=256) | scipy.stats.gaussian_kde(x) + evaluation | no direct builtin (need SciPy) | Lightweight 1D Gaussian KDE; returns (grid, density) using a simple bandwidth rule by default. |
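
The diff and pct_change rows describe full-length outputs padded with NaN, support for negative periods, and NaN on division by zero. The sketch below shows one way those semantics could look in pure Python. The `_ref` names are hypothetical, and the exact edge-case behaviour of bunker-stats is assumed from the table rather than verified.

```python
import math

def diff_ref(values, periods=1):
    """Full-length difference: out[i] = x[i] - x[i - periods], NaN where undefined."""
    n = len(values)
    out = [math.nan] * n
    for i in range(n):
        j = i - periods  # a negative `periods` looks ahead instead of back
        if 0 <= j < n:
            out[i] = values[i] - values[j]
    return out

def pct_change_ref(values, periods=1):
    """Full-length relative change; NaN where the lagged value is missing or zero."""
    n = len(values)
    out = [math.nan] * n
    for i in range(n):
        j = i - periods
        if 0 <= j < n and values[j] != 0:
            out[i] = (values[i] - values[j]) / values[j]
    return out

print(diff_ref([1.0, 4.0, 9.0, 16.0]))
print(diff_ref([1.0, 4.0, 9.0, 16.0], periods=-1))
print(pct_change_ref([0.0, 2.0, 3.0]))
```

Keeping the output the same length as the input means the result lines up index-for-index with the original array, unlike `np.diff`, which returns a shorter array.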

📦 Installation

git clone https://github.com/bunker-stats.git
cd bunker-stats

python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate

pip install maturin
maturin develop
