A simple package to support fast statistics.

speedystats

speedystats is a Python package designed to accelerate NumPy statistical operations using Numba. The package maintains a clean API that mirrors NumPy's interface while providing significant performance improvements through parallel processing.

"How can I use numba to speed up my median computation?"

Yep. Just use speedystats.

"Is it possible to use numba to make np.std go faster?"

Yep. Just use speedystats.

Features

  • Drop-in replacement for common NumPy statistical functions
  • Significant performance improvements through Numba optimization
  • Maintains NumPy-like API for easy integration

Installation

pip install speedystats

Quick Start

import numpy as np
import speedystats as fs

# Create some test data
data = np.random.randn(1000, 1000)

# Use just like numpy
mean = fs.mean(data, axis=0)
std = fs.std(data, axis=1)
median = fs.median(data)

# Works with nan values too
data_with_nans = data.copy()
data_with_nans[0, 0] = np.nan
nanmean = fs.nanmean(data_with_nans, axis=0)

Available Functions

  • Basic Statistics: mean, median, std, var, sum
  • Range Statistics: ptp (peak-to-peak)
  • Percentile Functions: percentile, quantile
  • NaN-aware Variants: nanmean, nanmedian, nanstd, nanvar, nansum
  • Additional Functions: average, zscore

Performance Note

While speedystats is designed for performance, the actual speedup depends on your specific use case, data size, and hardware. The package is most effective with:

  • Large arrays (typically > 100,000 elements)
  • Multi-core processors (for parallel execution)

Speedups are also uneven:

  • Some functions are accelerated far more than others relative to NumPy
  • Some axis / dimension combinations get large speedups, while others are roughly comparable to NumPy
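The size and core-count conditions above can be turned into a rough pre-check before you spend time benchmarking. This is only a sketch: `worth_benchmarking` and `MIN_ELEMENTS` are hypothetical names, not part of speedystats, and the threshold is simply the "typically > 100,000 elements" figure from the list.

```python
import math
import os

# Rough threshold taken from the list above; a guide, not a guarantee.
MIN_ELEMENTS = 100_000


def worth_benchmarking(shape, cpu_count=None):
    """Return True if an array of this shape is a plausible speedup candidate.

    Hypothetical helper: it only checks array size and core count, and says
    nothing about which function or axis will actually be faster.
    """
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1  # os.cpu_count() may return None
    n_elements = math.prod(shape)
    return n_elements > MIN_ELEMENTS and cpu_count > 1


print(worth_benchmarking((1000, 10000), cpu_count=8))
print(worth_benchmarking((100, 100), cpu_count=8))
```

A positive answer only means the case is worth timing. Because results vary by function and axis, the decision still has to come from a measurement.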

Note: Tooling that benchmarks each call and automatically routes it to either the speedystats implementation or the NumPy default is not finished, so you are responsible for determining whether speedystats is faster for your workload. Here's an example of how to test it quickly:

from time import time
import numpy as np
import speedystats as fs

# Suppose you want to test if median is faster for arrays of a certain shape and size
array = np.random.randn(1000, 10000)
axis = 1

# Set repeats to get a better estimate
num_repeats = 20

# Untimed warm-up call (assumes Numba compiles on first use, which would skew the timing)
_ = fs.median(array, axis=axis)

# Time the NumPy baseline
t = time()
for _ in range(num_repeats):
    _ = np.median(array, axis=axis)
numpy_time = time() - t

# Time the speedystats version on the same data
t = time()
for _ in range(num_repeats):
    _ = fs.median(array, axis=axis)
speedystat_time = time() - t

# A value above 1 means speedystats was faster
print("Speedup: ", numpy_time / speedystat_time)

Comprehensive benchmarking tools are under development.
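Until then, the timing pattern above can be wrapped in a small reusable helper. The sketch below is not part of speedystats: `time_calls` and `speedup` are hypothetical names, and the two workloads are standard-library stand-ins so the block runs without NumPy or speedystats installed. The untimed warm-up assumes a Numba-compiled function pays a one-time compilation cost on its first call.

```python
import statistics
from time import perf_counter


def time_calls(func, repeats=20, warmup=1):
    """Return mean wall-clock seconds per call of a zero-argument callable."""
    for _ in range(warmup):
        func()  # untimed, so one-time costs such as JIT compilation are excluded
    start = perf_counter()
    for _ in range(repeats):
        func()
    return (perf_counter() - start) / repeats


def speedup(baseline, candidate, repeats=20, warmup=1):
    """Return baseline time / candidate time; above 1 means candidate is faster."""
    baseline_time = time_calls(baseline, repeats, warmup)
    candidate_time = time_calls(candidate, repeats, warmup)
    return baseline_time / candidate_time


# Stand-in workloads (two ways to get a middle value from a list of floats).
values = [float((i * 7919) % 10007) for i in range(20000)]
ratio = speedup(
    lambda: statistics.median(values),
    lambda: sorted(values)[len(values) // 2],
)
print("Speedup:", ratio)
```

For a real comparison, pass `lambda: np.median(array, axis=axis)` as the baseline and `lambda: fs.median(array, axis=axis)` as the candidate, using the same array shape and axis you plan to run in practice.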

Development Status

This package is in beta. While the core functionality is stable, we're actively working on:

  • Comprehensive benchmarking suite
  • Performance optimization guides
  • Additional statistical functions
  • Advanced documentation

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the GNU General Public License v3 (GPLv3) - see the LICENSE file for details.
