Rust-backed reductions for NumPy arrays (plain + NaN-aware)

reducers

The name combines "reduction functions" with Rust (rs); the short import alias is rd.

Rust-backed reduction functions for NumPy arrays - plain (numpy-like) and NaN-aware.

The goals of this toy project were:

  1. to be much faster than NumPy in many use cases,
  2. to be much faster than Bottleneck in many use cases, and
  3. above all, to deliver maximum performance for median and variance calculations, which are often bottlenecks in data processing pipelines.

reducers may be slower than NumPy or Bottleneck for small arrays. However, the most time-consuming reductions (those over large arrays or deep stacks, and median, percentile or quantile, var, and std) are frequently several times faster than NumPy and Bottleneck, and more than 100 times faster for nanpercentile.

Install

pip install reducers

For Rust crate use:

[dependencies]
reducers = "<version>"

After Installation

Run the autotuner once on the machine where reducers will run:

python -m reducers.autotuner

It saves parallel-grain settings for that CPU and workload profile. Later imports of reducers apply those settings automatically. The built-in defaults remain valid; use python -m reducers.autotuner --reset to remove the saved tuning file and return to them.

import numpy as np
import reducers as rd

a = np.array([1.0, 2.0, np.nan, np.inf, 5.0])

rd.mean(a)                      # nan: plain reducers propagate NaN/inf
rd.nanmean(a)                   # inf: skip NaN, keep inf
rd.nanmean(a, ignore_inf=True)  # finite-only
rd.nanminmax(a)                 # one fused 1-D scan for nanmin + nanmax
rd.nanpercentile(a, [16, 50, 84])
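
For readers more used to NumPy, the NaN and inf behavior described in the comments above can be spelled out with plain NumPy calls. This is only a reference sketch of the stated semantics, not how reducers computes them; the finite-only mask is an assumption about what ignore_inf=True means.

```python
import numpy as np

a = np.array([1.0, 2.0, np.nan, np.inf, 5.0])

plain_mean = np.mean(a)                  # NaN propagates through a plain mean
nan_mean = np.nanmean(a)                 # NaN skipped, inf kept, so the mean is inf
finite_mean = a[np.isfinite(a)].mean()   # NaN and inf both dropped: (1 + 2 + 5) / 3
lo, hi = np.nanmin(a), np.nanmax(a)      # two separate scans in NumPy

print(plain_mean, nan_mean, finite_mean, lo, hi)
```

The last line is where a fused min/max scan would save a pass: NumPy walks the array once for each of the two results.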

Axis reductions cover the layouts this package optimizes:

rng = np.random.default_rng(20250311)
stack = rng.normal(size=(31, 256, 256)).astype("f4")
rows = rng.normal(size=(256, 256, 31)).astype("f4")

rd.nanmedian(stack, axis=0)      # stack reduction -> shape (256, 256)
rd.nanmean(rows, axis=-1)        # contiguous trailing-axis reduction
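
The two layouts differ in how the reduced axis sits in memory. A small NumPy-only sketch shows the strides and output shapes involved; it uses smaller arrays than the example above, and np.nanmedian and np.nanmean are stand-ins for the rd calls rather than the reducers implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
stack = rng.normal(size=(5, 8, 8)).astype("f4")   # frames stacked along axis 0
rows = rng.normal(size=(8, 8, 5)).astype("f4")    # samples along the last axis

# Strides are in bytes: in a C-contiguous f4 array the last axis steps by 4.
print(stack.strides)   # axis 0 steps over a whole 8x8 frame at a time
print(rows.strides)    # axis -1 steps over adjacent elements

stack_median = np.nanmedian(stack, axis=0)   # stand-in for rd.nanmedian(stack, axis=0)
row_mean = np.nanmean(rows, axis=-1)         # stand-in for rd.nanmean(rows, axis=-1)
print(stack_median.shape, row_mean.shape)
```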

  • For [nan]var and [nan]std, return_mean=True returns the already-computed mean together with the variance or standard deviation, avoiding duplicate work when both are needed.
  • [nan]sum(a, weights=w) offers something similar: return_sum_weights=True and return_unweighted_sum=True expose quantities that are already available during the fused weighted scan, avoiding separate sum(a * w), sum(w), or sum(a) passes when a caller needs them together.

w = np.array([1.0, 1.0, 2.0, 2.0, 0.5])  # example weights, one per element of a

std, mean = rd.nanstd(a, ddof=1, return_mean=True)
weighted_sum, sum_of_weights = rd.nansum(a, weights=w, return_sum_weights=True)
weighted_sum, unweighted_sum = rd.nansum(a, weights=w, return_unweighted_sum=True)
weighted_sum, unweighted_sum, sum_of_weights = rd.nansum(
    a, weights=w, return_unweighted_sum=True, return_sum_weights=True
)
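
The reason return_mean costs nothing extra is that a variance calculation needs the mean anyway. A minimal NumPy sketch of a two-pass NaN-skipping variance makes this visible. The helper name nanvar_with_mean is hypothetical, and the two-pass formula is an assumption chosen for illustration; the actual Rust kernel may use a different algorithm.

```python
import numpy as np

def nanvar_with_mean(a, ddof=0):
    """Hypothetical reference: NaN-skipping variance that also returns the mean."""
    valid = a[~np.isnan(a)]                          # drop NaN, keep everything else
    n = valid.size
    mean = valid.sum() / n                           # first pass: the mean
    var = ((valid - mean) ** 2).sum() / (n - ddof)   # second pass reuses that mean
    return var, mean

a = np.array([1.0, 2.0, np.nan, 4.0, 5.0])
var, mean = nanvar_with_mean(a, ddof=1)
std = np.sqrt(var)
print(std, mean)
```

A caller who asked for the standard deviation and then called a separate mean function would repeat the first pass; returning both from one call avoids that.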

Dual use: the kernel modules are pure Rust (no PyO3/NumPy) and usable as a crate.

Maximum-performance Python calls

For fixed production hot loops, import the low-level Python API as rdl:

import reducers.lowlevel as rdl

The rdl functions call the same Rust kernels but skip the high-level Python normalization layer. Weighted hot loops can choose the narrow fused primitive for the output terms they actually need:

weighted_sum = rdl.weighted_sum_only_skip_nonfinite(a, w)
weighted_sum, sum_weights = rdl.weighted_sum_and_weights_skip_nonfinite(a, w)
weighted_sum, sum_weights, unweighted_sum = rdl.weighted_sum_skip_nonfinite(a, w)
average = rdl.weighted_average_skip_nonfinite(a, w)
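
What "fused" means here can be illustrated with a single loop that accumulates all three terms at once. This is a plain-Python sketch under assumptions, not the Rust kernel: the function name fused_weighted_sums is hypothetical, an element is assumed to be skipped whenever its value is non-finite (how non-finite weights are treated is not modeled), and the return order simply mirrors the three-term call above.

```python
import math
import numpy as np

def fused_weighted_sums(a, w):
    """Hypothetical one-pass reference for the three weighted terms."""
    weighted_sum = sum_weights = unweighted_sum = 0.0
    for value, weight in zip(a.tolist(), w.tolist()):
        if not math.isfinite(value):   # assumed skip rule: NaN and +/-inf values
            continue
        weighted_sum += value * weight
        sum_weights += weight
        unweighted_sum += value
    return weighted_sum, sum_weights, unweighted_sum

a = np.array([1.0, 2.0, np.nan, np.inf, 5.0])
w = np.array([0.5, 1.0, 2.0, 4.0, 2.0])
weighted_sum, sum_weights, unweighted_sum = fused_weighted_sums(a, w)
average = weighted_sum / sum_weights   # a weighted average is the ratio of two terms
print(weighted_sum, sum_weights, unweighted_sum, average)
```

The narrower primitives correspond to dropping accumulators from this loop, which is why a hot loop that needs only one term can avoid paying for the others.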

See the documentation for details on achieving maximum performance.

Current limits

  • axis may be None (the default, whole-array), 0, or -1 (equivalent to a.ndim - 1); other axes raise NotImplementedError. This keeps hidden transpose/copy costs out of the API and lets the Rust kernels specialize for the supported layouts.

  • NumPy-like subset: many NumPy parameters are unsupported, including out, keepdims, where, dtype, and the percentile method (linear only). Adding them is unlikely to be considered without a strong use case, because they add complexity and maintenance burden. The main focus is the core reduction logic and, more importantly, performance.
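
Regarding the axis restriction in the first item: when data needs a reduction along some other axis, one option is to make the transpose and copy explicit on the caller's side and then reduce along the last axis. The sketch below uses only NumPy, with np.nanmean standing in for rd.nanmean; the helper reduce_along is hypothetical and not part of reducers.

```python
import numpy as np

def reduce_along(func, a, axis):
    """Hypothetical helper: reduce along any axis via a trailing-axis call."""
    axis = axis % a.ndim                   # normalize negative axes
    if axis in (0, a.ndim - 1):
        return func(a, axis=axis)          # already one of the supported layouts
    moved = np.moveaxis(a, axis, -1)       # a view; no data copied yet
    moved = np.ascontiguousarray(moved)    # the copy, now visible to the caller
    return func(moved, axis=-1)

cube = np.arange(24, dtype="f8").reshape(2, 3, 4)
middle = reduce_along(np.nanmean, cube, axis=1)   # np.nanmean stands in for rd.nanmean
print(middle.shape)
```

Written this way, the cost of the copy appears in the caller's own code, where it can be measured or hoisted out of a loop.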

See the documentation for detailed API semantics, performance notes, axis behavior, and release wheels: https://ysbach.github.io/reducers/.
