Efficient batch statistics computation library for Python.
BatchStats
batchstats is a Python package for computing statistics on data that arrives in batches. It is suited to streaming data and to datasets too large to fit into memory.
For detailed information, please check out the full documentation.
Installation
Install batchstats using pip:
pip install batchstats
Or with conda:
conda install -c conda-forge batchstats
Quick Start
Here's how to compute the mean and variance of a dataset in batches:
import numpy as np
from batchstats import BatchMean, BatchVar
# Simulate a data stream
data_stream = (np.random.randn(100, 10) for _ in range(10))
# Initialize the stat objects
batch_mean = BatchMean()
batch_var = BatchVar()
# Process each batch
for batch in data_stream:
    batch_mean.update_batch(batch)
    batch_var.update_batch(batch)
# Get the final result
mean = batch_mean()
variance = batch_var()
print(f"Mean shape: {mean.shape}")
print(f"Variance shape: {variance.shape}")
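To illustrate what computing a mean and variance "in batches" involves, here is a minimal sketch in plain NumPy. It is not the batchstats implementation: the `RunningMoments` class and its method names are hypothetical, and the merge step is the standard pairwise update attributed to Chan et al., chosen here as an assumption about one reasonable way to do it. Only the row count, the running mean, and the sum of squared deviations are kept between batches.

```python
import numpy as np


class RunningMoments:
    """Hypothetical helper (not part of batchstats): per-column mean and variance."""

    def __init__(self):
        self.n = 0        # number of rows seen so far
        self.mean = None  # per-column running mean
        self.m2 = None    # per-column sum of squared deviations from the mean

    def update(self, batch):
        batch = np.asarray(batch, dtype=float)
        b_n = batch.shape[0]
        if b_n == 0:
            return self
        b_mean = batch.mean(axis=0)
        b_m2 = ((batch - b_mean) ** 2).sum(axis=0)
        if self.n == 0:
            self.n, self.mean, self.m2 = b_n, b_mean, b_m2
            return self
        total = self.n + b_n
        delta = b_mean - self.mean
        # Merge the stored partial result with the new batch's partial result.
        self.mean = self.mean + delta * (b_n / total)
        self.m2 = self.m2 + b_m2 + delta ** 2 * (self.n * b_n / total)
        self.n = total
        return self

    def variance(self, ddof=0):
        return self.m2 / (self.n - ddof)


rng = np.random.default_rng(0)
batches = [rng.standard_normal((100, 10)) for _ in range(10)]

moments = RunningMoments()
for batch in batches:
    moments.update(batch)

# Compare against the statistics of the full dataset held in memory.
full = np.concatenate(batches, axis=0)
print(np.allclose(moments.mean, full.mean(axis=0)))
print(np.allclose(moments.variance(), full.var(axis=0)))
```

Both checks compare the streamed result with the one NumPy computes on the concatenated array, which is only possible here because the toy dataset fits in memory.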
Handling NaN Values
batchstats provides BatchNan* classes that ignore NaN values, similar to NumPy's nan* functions.
import numpy as np
from batchstats import BatchNanMean
# Create data with NaNs
data = np.random.randn(1000, 5)
data[::10] = np.nan
# Compute the mean, ignoring NaNs
nan_mean = BatchNanMean().update_batch(data)()
print(f"NaN-aware mean shape: {nan_mean.shape}")
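As a rough illustration of the idea behind a NaN-aware batch mean, the sketch below accumulates a per-column sum and a per-column count of non-NaN entries, then divides at the end. It uses only NumPy; `nan_mean_in_batches` is a hypothetical helper written for this example, not a batchstats function, and the library may well do this differently.

```python
import numpy as np


def nan_mean_in_batches(batches):
    """Hypothetical helper (not part of batchstats): column means ignoring NaNs."""
    total = None  # per-column sum of the non-NaN values
    count = None  # per-column number of non-NaN values
    for batch in batches:
        batch = np.asarray(batch, dtype=float)
        b_sum = np.nansum(batch, axis=0)
        b_count = (~np.isnan(batch)).sum(axis=0)
        total = b_sum if total is None else total + b_sum
        count = b_count if count is None else count + b_count
    # Columns with no valid value at all yield NaN instead of dividing by zero.
    return np.where(count > 0, total / np.maximum(count, 1), np.nan)


rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 5))
data[::10] = np.nan  # every tenth row is entirely NaN

# Split the array into 8 chunks to simulate batches arriving one at a time.
result = nan_mean_in_batches(np.array_split(data, 8))
print(np.allclose(result, np.nanmean(data, axis=0)))
```

Keeping the count per column, rather than a single row count, matters because each column can have a different number of missing values.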
Available Statistics
batchstats supports a variety of common statistics:
- BatchSum / BatchNanSum
- BatchMean / BatchNanMean
- BatchVar
- BatchStd
- BatchMin
- BatchMax
- BatchPeakToPeak
- BatchCov
For more details on each class, see the API Reference.
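Statistics such as the minimum, maximum, and peak-to-peak range are among the simplest to maintain across batches, since each batch only needs to be reduced and compared with the running value. The sketch below shows that idea in plain NumPy; `running_extrema` is a hypothetical helper for illustration and makes no claim about how BatchMin, BatchMax, or BatchPeakToPeak are implemented.

```python
import numpy as np


def running_extrema(batches):
    """Hypothetical helper (not part of batchstats): per-column min, max, peak-to-peak."""
    lo = hi = None
    for batch in batches:
        batch = np.asarray(batch)
        b_lo, b_hi = batch.min(axis=0), batch.max(axis=0)
        # Element-wise comparison keeps one running value per column.
        lo = b_lo if lo is None else np.minimum(lo, b_lo)
        hi = b_hi if hi is None else np.maximum(hi, b_hi)
    return lo, hi, hi - lo


rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 4))

# Split the array into 7 uneven chunks to simulate batches of varying size.
lo, hi, ptp = running_extrema(np.array_split(data, 7))
print(np.allclose(ptp, np.ptp(data, axis=0)))
```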
File details
Details for the file batchstats-0.5.1.tar.gz.
File metadata
- Download URL: batchstats-0.5.1.tar.gz
- Upload date:
- Size: 12.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a2e5c3b600c8023a49b71b428f54c479fcb54b457a6fe1f5874ca344c56c9c46 |
| MD5 | 3b74d3a641fb9fd5f42ca643d034d6e6 |
| BLAKE2b-256 | 116e342416fea891379364a874eefc90ab803018f363692fc0f9da2e17695fc3 |
File details
Details for the file batchstats-0.5.1-py3-none-any.whl.
File metadata
- Download URL: batchstats-0.5.1-py3-none-any.whl
- Upload date:
- Size: 11.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1d7875b0fd1786e6ae5e702e2517320d9c45d4ad1305061d7a5f0bd646818bed |
| MD5 | 3f0838a2aa7a11ec0cad2da0f86c810c |
| BLAKE2b-256 | faa5dd48017c3cc357946189d4ee34c35e8faf51b19e0a4def3ce4cac41e6797 |