A simple package to support fast statistics.

speedystats

speedystats is a Python package designed to accelerate NumPy statistical operations using Numba. The package maintains a clean API that mirrors NumPy's interface while providing significant performance improvements through parallel processing.

"Is it possible to use numba to speed up my median computation?"

Yep. Just use speedystats.

"Is it possible to use numba to make np.std go faster?"

Yep. Just use speedystats.

Features

  • Drop-in replacement for common NumPy statistical functions
  • Significant performance improvements through Numba optimization
  • Maintains NumPy-like API for easy integration

Installation

pip install speedystats

Quick Start

import numpy as np
import speedystats as fs

# Create some test data
data = np.random.randn(1000, 1000)

# Use just like numpy
mean = fs.mean(data, axis=0)
std = fs.std(data, axis=1)
median = fs.median(data)

# Works with nan values too
data_with_nans = data.copy()
data_with_nans[0, 0] = np.nan
nanmean = fs.nanmean(data_with_nans, axis=0)

Available Functions

  • Basic Statistics: mean, median, std, var, sum
  • Range Statistics: ptp (peak-to-peak)
  • Percentile Functions: percentile, quantile
  • NaN-aware Variants: nanmean, nanmedian, nanstd, nanvar, nansum
  • Additional Functions: average, zscore

Performance Note

While speedystats is designed for performance, the actual speedup depends on your specific use case, data size, and hardware. The package is most effective with:

  • Large arrays (typically > 100,000 elements)
  • Multi-core processors (for parallel execution)
  • Functions that benefit most from Numba (some are sped up far more over NumPy than others)
  • Favorable axis / dimension combinations (some get huge speedups, others are usually comparable to NumPy)
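
Because the benefit is concentrated in large arrays, one option is to choose the implementation by array size yourself. The following is only a sketch under stated assumptions: `make_router` is a hypothetical helper, not part of speedystats, and the 100,000-element default simply reuses the rule of thumb above, so you should tune it from your own measurements.

```python
def make_router(fast_fn, fallback_fn, min_size=100_000):
    """Build a function that calls fast_fn on large inputs, else fallback_fn.

    Hypothetical helper for illustration; not part of speedystats.
    """
    def routed(array, *args, **kwargs):
        # NumPy arrays expose `.size`; fall back to len() for plain sequences.
        size = getattr(array, "size", None)
        if size is None:
            size = len(array)
        # The threshold is a rule of thumb, not a measured crossover point.
        fn = fast_fn if size > min_size else fallback_fn
        return fn(array, *args, **kwargs)

    return routed


# Example usage (assumes numpy and speedystats are installed):
# import numpy as np
# import speedystats as fs
# median = make_router(fs.median, np.median)
# result = median(np.random.randn(1000, 1000), axis=0)
```

A single size threshold ignores the function and axis effects noted above, so a per-function or per-axis threshold may suit your workload better.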

Note: The benchmarking tools that will automatically route between the speedystats implementation and the default NumPy methods aren't finished, so you are responsible for determining whether speedystats is faster. Here's an example of how to test it quickly:

from time import perf_counter
import numpy as np
import speedystats as fs

# Suppose you want to test whether median is faster for arrays of a certain shape and size
array = np.random.randn(1000, 10000)
axis = 1

# Repeat the measurement to get a better estimate
num_repeats = 20

# Numba functions have to be compiled on first use, which takes a moment
# (a second or so), but speedystats caches them so you only wait once.
# It's good to do a "warmup" call first. You'll need to warm up each
# combination of number of dimensions (2 here) and axis.
_ = fs.median(np.zeros((10, 10)), axis=axis)

# Time the NumPy implementation
t = perf_counter()
for _ in range(num_repeats):
    _ = np.median(array, axis=axis)
numpy_time = perf_counter() - t

# Time the speedystats implementation
t = perf_counter()
for _ in range(num_repeats):
    _ = fs.median(array, axis=axis)
speedystats_time = perf_counter() - t

# A value above 1 means speedystats was faster
print("Speedup: ", numpy_time / speedystats_time)

Comprehensive benchmarking tools are under development. When they're released, this will be automated.
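
In the meantime, the manual timing loop can be wrapped in a small reusable helper. This is a minimal sketch, not part of speedystats: `measure_speedup` is a hypothetical name, and it relies only on the standard library's `timeit.repeat`. It warms up both functions first so that any compilation happens outside the timed region.

```python
from timeit import repeat


def measure_speedup(baseline, candidate, *args, repeats=5, number=3, **kwargs):
    """Return the best baseline time divided by the best candidate time.

    A value above 1.0 means `candidate` was faster on these inputs.
    Hypothetical helper for illustration; not part of speedystats.
    """
    # Warm-up calls: trigger JIT compilation (if any) before timing starts.
    candidate(*args, **kwargs)
    baseline(*args, **kwargs)

    # timeit.repeat returns one total time per repeat; the minimum is the
    # measurement least disturbed by other activity on the machine.
    t_base = min(repeat(lambda: baseline(*args, **kwargs),
                        repeat=repeats, number=number))
    t_cand = min(repeat(lambda: candidate(*args, **kwargs),
                        repeat=repeats, number=number))
    return t_base / t_cand


# Example usage (assumes numpy and speedystats are installed):
# import numpy as np
# import speedystats as fs
# array = np.random.randn(1000, 10000)
# print("Speedup:", measure_speedup(np.median, fs.median, array, axis=1))
```

Because the warm-up uses the same arguments as the timed calls, it covers the specific number of dimensions and axis being measured.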

Development Status

This package is in beta. While the core functionality is stable, we're actively working on:

  • Comprehensive benchmarking suite
  • Performance optimization guides
  • Additional statistical functions
  • Advanced documentation

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the GNU General Public License v3 (GPLv3) - see the LICENSE file for details.
