A simple package to support fast statistics.

Project description

speedystats

speedystats is a Python package designed to accelerate NumPy statistical operations using Numba. The package maintains a clean API that mirrors NumPy's interface while providing significant performance improvements through parallel processing.

"Is it possible to use numba to speed up my median computation?"

Yep. Just use speedystats.

"Is it possible to use numba to make np.std go faster?"

Yep. Just use speedystats.

Features

  • Drop-in replacement for common NumPy statistical functions
  • Significant performance improvements through Numba optimization
  • Maintains NumPy-like API for easy integration

Installation

pip install speedystats

Quick Start

import numpy as np
import speedystats as fs

# Create some test data
data = np.random.randn(1000, 1000)

# Use just like numpy
mean = fs.mean(data, axis=0)
std = fs.std(data, axis=1)
median = fs.median(data)

# Works with nan values too
data_with_nans = data.copy()
data_with_nans[0, 0] = np.nan
nanmean = fs.nanmean(data_with_nans, axis=0)
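
For readers unsure what the NaN-aware variants buy them, here is the behaviour of the NumPy functions they are named after. The sketch uses only NumPy; that fs.nanmean returns the same values as np.nanmean is an assumption based on the package's stated goal of mirroring NumPy, not something checked here.

```python
import numpy as np

# Small array so the results are easy to verify by hand
data = np.arange(6, dtype=float).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
data[0, 0] = np.nan

# The plain mean propagates the NaN into every result that touches it
plain = np.mean(data, axis=0)

# The NaN-aware mean skips missing values instead
nan_aware = np.nanmean(data, axis=0)

print(plain)      # first column is nan
print(nan_aware)  # first column is the mean of the remaining value, 3.0
```

If the assumption holds, swapping np.nanmean for fs.nanmean in the last call should give the same numbers.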

Available Functions

  • Basic Statistics: mean, median, std, var, sum
  • Range Statistics: ptp (peak-to-peak)
  • Percentile Functions: percentile, quantile
  • NaN-aware Variants: nanmean, nanmedian, nanstd, nanvar, nansum
  • Additional Functions: average, zscore
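
Of the functions listed, zscore is the one without a direct NumPy counterpart. As a point of reference, the conventional definition can be sketched in plain NumPy as below. The helper name zscore_reference and its ddof parameter are hypothetical, and whether speedystats' zscore uses the same defaults is an assumption to check against the package itself.

```python
import numpy as np

def zscore_reference(x, axis=0, ddof=0):
    """Conventional z-score: subtract the mean, divide by the standard deviation."""
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=axis, keepdims=True)
    std = x.std(axis=axis, ddof=ddof, keepdims=True)
    return (x - mean) / std

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=(1000, 4))

z = zscore_reference(data, axis=0)

# Each column of z now has mean ~0 and standard deviation ~1
print(z.mean(axis=0))
print(z.std(axis=0))
```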

Performance Note

While speedystats is designed for performance, the actual speedup depends on your specific use case, data size, and hardware. The package is most effective with:

  • Large arrays (typically > 100,000 elements)
  • Multi-core processors (for parallel execution)

The speedup also varies by function and by axis: some methods gain far more over NumPy than others, and some axis / dimension combinations get huge speedups while others are usually comparable to NumPy.

Note: Benchmarking tools for automatically routing between the speedystats implementation and the default NumPy methods aren't finished, so you are responsible for determining whether speedystats is faster for your workload. Here's an example of how to test it quickly:

from time import time
import numpy as np
import speedystats as fs

# Suppose you want to test if median is faster for arrays of a certain shape and size
array = np.random.randn(1000, 10000)
axis = 1

# Set repeats to get a better estimate
num_repeats = 20

# Numba functions have to be compiled on first use, which takes a moment (about a second),
# but speedystats caches them, so you only have to wait once.
# It's good to do a "warmup" call before timing:
# you'll need to warm up each combination of number of dimensions (2 here) and axis
_ = fs.median(np.zeros((10, 10)), axis=axis)

t = time()
for _ in range(num_repeats):
    _ = np.median(array, axis=axis)
numpy_time = time() - t

t = time()
for _ in range(num_repeats):
    _ = fs.median(array, axis=axis)
speedystat_time = time() - t

print("Speedup: ", numpy_time / speedystat_time)

Comprehensive benchmarking tools are under development. When they're released, this will be automated.
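
Until then, the manual test above can be wrapped in a small reusable helper. This is only a sketch: best_time and speedup are hypothetical names, not part of speedystats, and the example compares np.median against itself so that it runs without speedystats installed.

```python
from time import perf_counter

import numpy as np

def best_time(func, *args, repeats=5, **kwargs):
    """Return the fastest of `repeats` timed calls to func, in seconds."""
    # Untimed warm-up call with the real arguments, so any JIT compilation
    # for this number of dimensions and axis happens before timing starts
    func(*args, **kwargs)
    timings = []
    for _ in range(repeats):
        start = perf_counter()
        func(*args, **kwargs)
        timings.append(perf_counter() - start)
    return min(timings)

def speedup(baseline, candidate, *args, repeats=5, **kwargs):
    """Ratio of baseline time to candidate time; > 1 means candidate is faster."""
    baseline_time = best_time(baseline, *args, repeats=repeats, **kwargs)
    candidate_time = best_time(candidate, *args, repeats=repeats, **kwargs)
    return baseline_time / candidate_time

array = np.random.randn(200, 2000)

# Sanity check: a function compared against itself should give a ratio near 1
ratio = speedup(np.median, np.median, array, axis=1)
print("Speedup: ", ratio)
```

To benchmark speedystats, pass fs.median as the second argument. Taking the minimum over repeats, rather than the total, reduces the influence of unrelated background activity on the machine.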

Development Status

This package is in beta. While the core functionality is stable, we're actively working on:

  • Comprehensive benchmarking suite
  • Performance optimization guides
  • Additional statistical functions
  • Advanced documentation

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the GNU General Public License v3 (GPLv3) - see the LICENSE file for details.

Download files

Source Distribution

speedystats-0.1.0.tar.gz (41.6 kB)

Uploaded Source

Built Distribution

speedystats-0.1.0-py3-none-any.whl (50.4 kB)

Uploaded Python 3

File details

Details for the file speedystats-0.1.0.tar.gz.

File metadata

  • Download URL: speedystats-0.1.0.tar.gz
  • Upload date:
  • Size: 41.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for speedystats-0.1.0.tar.gz
Algorithm Hash digest
SHA256 110d9cc8c3366763c4d8a9bcb854323ee22500d4016f1e784859757cd6330f40
MD5 101d90696b152ef8a2c2012b5fab899a
BLAKE2b-256 0d4a676a7bcd3072005f31834b478a56d476122f91d466fe69b8a92f539ecbf2

File details

Details for the file speedystats-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: speedystats-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 50.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for speedystats-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 36f8943409821fa4dc25c701a156b38ac187816e9f94cdac17f61b390567efb5
MD5 e602321c36d5abb8bec50c84967d44c8
BLAKE2b-256 e5f7f5b8030791bd76861a2d363acca89c16b4c15cee7320c723b2f0a2810c70
