Skip to main content

Very fast percentile calculation for small-integer dtypes via parallel histogram

Project description

fastpercentile: Memory-bandwidth-bound percentile for small-integer arrays

Tests PyPI version License

np.percentile is O(n) but in practice has a brutal constant factor — on a 1.7-billion-element uint16 volume it takes ~22 s, vs ~0.15 s for np.max. There is no good reason for percentile to be so much slower than max: both can be done in a single pass over the array.

For small-integer dtypes (int8, uint8, int16, uint16) the data only takes one of at most 65 536 distinct values, so a single parallel pass into a histogram captures everything you need to compute any percentile. After the histogram is built, walking the cumulative count to find the bin holding each requested rank costs essentially nothing.

This package implements that, in a few hundred lines of numba. On a 32-thread workstation it runs at DRAM bandwidth — about as fast as np.max, and ~300× faster than np.percentile:

np.max         : 0.148 s
fastpercentile : 0.072 s   <- four percentiles in one pass
np.percentile  : 22.2  s

Auxiliary memory is ~16 MB regardless of input size (32 threads × one 65 536-bin local table each, plus a final reduced histogram), so it adds no measurable RAM pressure on top of the input.

Usage

import numpy as np
import fastpercentile

arr = np.random.randint(0, 65536, size=(305, 96, 69, 846), dtype=np.uint16)

# A scalar percentile
p99 = fastpercentile.percentile(arr, 99)

# Multiple percentiles in a single pass over the data
p1, p50, p99, p99_9 = fastpercentile.percentile(arr, [1, 50, 99, 99.9])

# Or just grab the histogram if you want to do something else with it
hist = fastpercentile.histogram(arr)  # length 65536 for uint16

Results match numpy.percentile(arr, q) with the default 'linear' interpolation method (typically exact for integer inputs).

Supported dtypes

int8, uint8, int16, uint16. Floats and 32/64-bit integers are not supported because a direct histogram is not feasible for them — for those, use numpy.percentile or bottleneck.nanpercentile.

Memory layout

A 4D array loaded by pynrrd is typically Fortran-contiguous. fastpercentile walks raw memory, so it does not care about C vs F order — both are handled as a no-copy view. Arbitrarily-strided arrays fall back to a copy.

Installation

Option 1: pip install from PyPI:

pip install fastpercentile

Option 2: pip install directly from GitHub:

pip install git+https://github.com/jasper-tms/fastpercentile.git

Option 3: First git clone this repo and then pip install it from your clone:

cd ~/repos
git clone https://github.com/jasper-tms/fastpercentile.git
cd fastpercentile
pip install '.[dev]'

Notes on threading

fastpercentile uses every logical core on the machine by default (via numba.get_num_threads()). To limit it for a particular call, pass n_threads=N; to set it globally, use numba.set_num_threads(N) or the NUMBA_NUM_THREADS environment variable. On most systems the workload saturates DRAM bandwidth around nproc / 2 threads, so reserving a few cores for the rest of the machine costs little throughput.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fastpercentile-0.1.0.tar.gz (8.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

fastpercentile-0.1.0-py3-none-any.whl (7.9 kB view details)

Uploaded Python 3

File details

Details for the file fastpercentile-0.1.0.tar.gz.

File metadata

  • Download URL: fastpercentile-0.1.0.tar.gz
  • Upload date:
  • Size: 8.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for fastpercentile-0.1.0.tar.gz
Algorithm Hash digest
SHA256 2bba6e69d98bb6ee4f4c3489cb77f156b19f4571d1471b59359e3756d903754e
MD5 1e9ae56c0ae351200381c38b43ac8088
BLAKE2b-256 f56c836e3243d9112ec74e9edc74a32539b7a9087469710b9f78d453adada62e

See more details on using hashes here.

File details

Details for the file fastpercentile-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: fastpercentile-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 7.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for fastpercentile-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 cfa78ea325844e000ede7623d1e50233983a3b92d11cf8713cfe2b449ecdffbc
MD5 fddce044201e3e616df974ea5e34584a
BLAKE2b-256 e604614d6269f2098342b981dfcf284792d60613db3ed33ac09ce6c61a5fa151

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page