Rust-backed reductions for NumPy arrays (plain + NaN-aware)
reducers
The name combines reduction functions with Rust (rs); the conventional short name is rd.
Rust-backed reduction functions for NumPy arrays - plain (numpy-like) and NaN-aware.
- Full documentation: https://ysbach.github.io/reducers/
- Rust API reference: https://docs.rs/reducers
The goal of this toy project was to be:
- much faster than numpy in many use cases,
- much faster than bottleneck in many use cases, and
- as fast as possible for median and variance calculations in particular, which are often bottlenecks in data processing pipelines.
reducers might be slower than numpy or bottleneck for small arrays. However, the most time-consuming reductions (large arrays or deep stacks; median, percentile, or quantile; var and std) are frequently several times faster than numpy and bottleneck, and more than 100 times faster for nanpercentile.
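Speed-ups depend on array size, dtype, and CPU, so it is worth timing your own workload. The helper below is a minimal sketch using only the standard library; best_seconds is a hypothetical name, not part of reducers.

```python
import timeit


def best_seconds(func, *args, repeat=5, number=3, **kwargs):
    """Return the best per-call wall time (seconds) of func(*args, **kwargs)."""
    timer = timeit.Timer(lambda: func(*args, **kwargs))
    # Take the minimum over repeats: it is the least disturbed by system noise.
    return min(timer.repeat(repeat=repeat, number=number)) / number


# Stand-in workload so the sketch runs without NumPy or reducers installed.
data = list(range(10_000))
t_sum = best_seconds(sum, data)

# With both packages installed, a comparison could look like:
#   best_seconds(np.nanmedian, stack, axis=0)
#   best_seconds(rd.nanmedian, stack, axis=0)
```

Run the comparison on arrays shaped like your real data, since small arrays may favor numpy or bottleneck.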
Install
pip install reducers
To use the Rust crate, add it to Cargo.toml:
[dependencies]
reducers = "<version>"
After Installation
Run the autotuner once on each machine where reducers will run:
python -m reducers.autotuner
It saves parallel-grain settings for that CPU and workload profile. Future
import reducers calls apply those settings automatically. The built-in
defaults are still valid; use python -m reducers.autotuner --reset to remove
the saved tuning file and return to them.
import numpy as np
import reducers as rd
a = np.array([1.0, 2.0, np.nan, np.inf, 5.0])
rd.mean(a) # nan: plain reducers propagate NaN/inf
rd.nanmean(a) # inf: skip NaN, keep inf
rd.nanmean(a, ignore_inf=True) # finite-only
rd.nanminmax(a) # one fused 1-D scan for nanmin + nanmax
rd.nanpercentile(a, [16, 50, 84])
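The three policies in the comments above (propagate, skip NaN, finite-only) can be reproduced with plain NumPy for reference. This is only a sketch of the semantics, not how reducers computes them; treating ignore_inf=True as equivalent to an np.isfinite mask is an assumption based on the "finite-only" comment.

```python
import numpy as np

a = np.array([1.0, 2.0, np.nan, np.inf, 5.0])

# Plain reduction: any NaN poisons the result.
plain = np.mean(a)

# Skip NaN but keep inf: the remaining values still contain inf.
skip_nan = np.nanmean(a)

# Finite-only: drop both NaN and +/-inf before averaging (assumed
# equivalent of ignore_inf=True).
finite_only = a[np.isfinite(a)].mean()  # (1 + 2 + 5) / 3
```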
Axis reductions cover the layouts this package optimizes:
rng = np.random.default_rng(20250311)
stack = rng.normal(size=(31, 256, 256)).astype("f4")
rows = rng.normal(size=(256, 256, 31)).astype("f4")
rd.nanmedian(stack, axis=0) # stack reduction -> shape (256, 256)
rd.nanmean(rows, axis=-1) # contiguous trailing-axis reduction
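The two layouts differ in how far apart the reduced elements sit in memory. The NumPy-only sketch below inspects strides on smaller arrays than the ones above; it illustrates the layouts and does not call reducers.

```python
import numpy as np

rng = np.random.default_rng(20250311)
stack = rng.normal(size=(31, 64, 64)).astype("f4")  # 31 frames of 64 x 64
rows = rng.normal(size=(64, 64, 31)).astype("f4")   # 31 samples per pixel

# Stack reduction (axis=0): consecutive samples of one pixel are a whole
# frame apart in memory.
stack_step = stack.strides[0]   # 64 * 64 * 4 bytes

# Trailing-axis reduction (axis=-1): consecutive samples are adjacent.
rows_step = rows.strides[-1]    # 4 bytes

# Both reductions produce one value per pixel.
stack_out = np.nanmedian(stack, axis=0)
rows_out = np.nanmedian(rows, axis=-1)
```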
- For [nan]var and [nan]std, return_mean=True returns the already-computed mean together with the variance or standard deviation, avoiding duplicate work when both are needed. [nan]sum(a, weights=w) offers something similar: return_sum_weights=True and return_unweighted_sum=True expose quantities already available during the fused weighted scan, avoiding separate sum(a * w), sum(w), or sum(a) passes when a caller needs them together.
w = np.array([1.0, 2.0, 1.0, 1.0, 0.5])  # example weights, same shape as a
std, mean = rd.nanstd(a, ddof=1, return_mean=True)
weighted_sum, sum_of_weights = rd.nansum(a, weights=w, return_sum_weights=True)
weighted_sum, unweighted_sum = rd.nansum(a, weights=w, return_unweighted_sum=True)
weighted_sum, unweighted_sum, sum_of_weights = rd.nansum(
    a, weights=w, return_unweighted_sum=True, return_sum_weights=True
)
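To see why return_mean=True costs nothing extra, note that any standard deviation computation needs the mean anyway. The sketch below uses Welford's one-pass update in pure Python; it is an illustration only, and nanstd_with_mean is a hypothetical helper, not necessarily the algorithm the Rust kernels use.

```python
import math


def nanstd_with_mean(values, ddof=0):
    """Return (std, mean) over the non-NaN entries in a single pass."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        if math.isnan(x):
            continue  # NaN-aware: skip missing values
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    if count == 0:
        return math.nan, math.nan
    if count - ddof <= 0:
        return math.nan, mean
    return math.sqrt(m2 / (count - ddof)), mean


std, mean = nanstd_with_mean([1.0, 2.0, math.nan, 4.0, 5.0], ddof=1)
```

The mean falls out of the same loop that accumulates the variance, so returning it adds no second pass over the data.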
Dual use: the kernel modules are pure Rust (no PyO3/NumPy) and usable as a crate.
Maximum-performance Python calls
For fixed production hot loops, import the low-level Python API as rdl:
import reducers.lowlevel as rdl
The rdl functions call the same Rust kernels but skip the high-level Python
normalization layer. Weighted hot loops can choose the narrow fused primitive
for the output terms they actually need:
weighted_sum = rdl.weighted_sum_only_skip_nonfinite(a, w)
weighted_sum, sum_weights = rdl.weighted_sum_and_weights_skip_nonfinite(a, w)
weighted_sum, sum_weights, unweighted_sum = rdl.weighted_sum_skip_nonfinite(a, w)
average = rdl.weighted_average_skip_nonfinite(a, w)
See the documentation for details on achieving maximum performance.
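As a reference for what a fused weighted scan produces, the pure-Python sketch below accumulates all three terms in one loop. It is not the library's kernel; in particular, skipping a pair when either the value or the weight is non-finite is an assumption drawn from the "skip_nonfinite" naming, and weighted_sums_skip_nonfinite is a hypothetical name.

```python
import math


def weighted_sums_skip_nonfinite(values, weights):
    """One pass returning (sum(x * w), sum(w), sum(x)) over finite pairs."""
    weighted_sum = 0.0
    sum_weights = 0.0
    unweighted_sum = 0.0
    for x, w in zip(values, weights):
        # Assumed policy: drop the pair if either element is NaN or +/-inf.
        if not (math.isfinite(x) and math.isfinite(w)):
            continue
        weighted_sum += x * w
        sum_weights += w
        unweighted_sum += x
    return weighted_sum, sum_weights, unweighted_sum


values = [1.0, 2.0, math.nan, math.inf, 5.0]
weights = [1.0, 2.0, 1.0, 1.0, 0.5]
ws, sw, us = weighted_sums_skip_nonfinite(values, weights)
average = ws / sw  # weighted average from the same scan
```

A caller that needs only one of the terms pays for fewer accumulators, which is the motivation for the narrower primitives listed above.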
Current limits
- axis may be None (default, whole-array), 0, or -1 (identical to a.ndim - 1); other axes raise NotImplementedError. This keeps hidden transpose/copy costs out of the API and lets the Rust kernels specialize for the supported layouts.
- NumPy-like subset: many NumPy parameters are unsupported, such as out, keepdims, where, dtype, and the percentile method (linear only). Adding them is unlikely to be considered unless there is a strong use case, as they add complexity and maintenance burden. The main focus is on the core reduction logic and, more importantly, performance.
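When a reduction along another axis is needed, the caller can make the transpose and copy explicit before reducing along the trailing axis. The sketch below uses np.nanmean as a stand-in reducer so it runs without reducers installed; reduce_along is a hypothetical helper, and passing rd.nanmean in its place is an assumption that relies only on the axis=-1 support described above.

```python
import numpy as np


def reduce_along(func, a, axis):
    """Apply a trailing-axis reducer func(arr, axis=-1) along any axis of a."""
    moved = np.moveaxis(a, axis, -1)           # view: target axis moved last
    contiguous = np.ascontiguousarray(moved)   # the copy cost is paid here, visibly
    return func(contiguous, axis=-1)


cube = np.arange(24, dtype="f8").reshape(2, 3, 4)
cube[0, 1, 2] = np.nan

# Reduce over the middle axis, which is neither 0 nor -1.
middle = reduce_along(np.nanmean, cube, axis=1)  # shape (2, 4)
```

Keeping this step in user code means the copy shows up in profiles instead of being hidden inside the reduction call.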
See the documentation for detailed API semantics, performance notes, axis behavior, and release wheels: https://ysbach.github.io/reducers/.
Download files
File details
Details for the file reducers-0.3.0.tar.gz.
File metadata
- Download URL: reducers-0.3.0.tar.gz
- Upload date:
- Size: 41.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6df3506976b3f96231e31747b77d5db86ee932a7c6c820ed3d1630d41c7e225b |
| MD5 | 7a7c68a72eb9113b2d4abd8fe06ad0e2 |
| BLAKE2b-256 | 3121a87a68349e6ea472e3803db313b048180ba98d4063a1102b3b15bf473917 |
File details
Details for the file reducers-0.3.0-cp310-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: reducers-0.3.0-cp310-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 1.7 MB
- Tags: CPython 3.10+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4febaadb7443b6765e548065f889675673d25539d1d84fd45670552e06914c5f |
| MD5 | 3dadbfee97e830edee15949ef1edcea3 |
| BLAKE2b-256 | ab63acaaaea44c124da7203597d655acf39a7aafcc2ae0ee21a766bf19336227 |