Compute histograms from XArray data.
XArray-Histogram
Compute and manipulate histograms from XArray data using BoostHistogram
This package computes histograms from XArray data and returns them as XArray data.
It relies on the Boost Histogram library, which performs better than numpy.histogram
and the existing xhistogram.
It also brings features such as integer/discrete bins or periodic bins.
Dask arrays are supported.
Vectorized manipulation and analysis of the resulting histogram(s) is provided via an XArray accessor.
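To picture what integer and periodic bins mean, here is a minimal pure-Python sketch of the bin lookup. The helper names `integer_bin` and `periodic_bin` are hypothetical and are not part of this package or of Boost Histogram; they only illustrate the idea.

```python
def integer_bin(value, start, stop):
    """Bin index on an integer axis: one bin per integer in [start, stop)."""
    if not start <= value < stop:
        return None  # outside the axis
    return value - start


def periodic_bin(value, n_bins, start, stop):
    """Bin index on a periodic regular axis covering [start, stop)."""
    period = stop - start
    # Wrap the value back into [0, period) before binning.
    wrapped = (value - start) % period
    return int(wrapped // (period / n_bins))


# 12 bins of 30 degrees each over [0, 360)
print(periodic_bin(10.0, 12, 0.0, 360.0))   # 0
print(periodic_bin(370.0, 12, 0.0, 360.0))  # 0, same bin as 10 degrees
print(periodic_bin(-10.0, 12, 0.0, 360.0))  # 11, the last bin
```

With a periodic axis no value falls outside the range: anything beyond an edge wraps around to the other side.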
Quick examples
Three functions are provided (histogram, histogram2d, and histogramdd), similar to those from NumPy:
import xarray_histogram as xh
hist = xh.histogram(data, bins=100, range=(0, 10))
Bins can be specified directly via Boost axes for finer control. The equivalent of the example above would be:
import boost_histogram.axis as bha
hist = xh.histogram(data, bins=[bha.Regular(100, 0., 10.)])
Multi-dimensional histograms can be computed, here in 2D for instance:
hist = xh.histogramdd(
    temp, chlorophyll,
    bins=[bha.Regular(100, -5., 40.), bha.Regular(100, 1e-3, 10, transform=bha.transform.log)]
)
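The log transform used for the second axis above makes the bins evenly spaced in log(value) rather than in the value itself. The following pure-Python sketch shows that mapping; the helper names `log_regular_edges` and `log_regular_bin` are hypothetical and only illustrate the principle.

```python
import math


def log_regular_edges(n_bins, start, stop):
    """Bin edges that are evenly spaced in log space between start and stop."""
    lo, hi = math.log(start), math.log(stop)
    return [math.exp(lo + i * (hi - lo) / n_bins) for i in range(n_bins + 1)]


def log_regular_bin(value, n_bins, start, stop):
    """Bin index of value on a log-transformed regular axis."""
    if not start <= value < stop:
        return None  # would land in an underflow or overflow bin
    lo, hi = math.log(start), math.log(stop)
    return int(n_bins * (math.log(value) - lo) / (hi - lo))


# 4 bins between 1e-3 and 10: one bin per decade
print(log_regular_edges(4, 1e-3, 10.0))
print(log_regular_bin(0.05, 4, 1e-3, 10.0))  # 1, the decade [1e-2, 1e-1)
print(log_regular_bin(0.5, 4, 1e-3, 10.0))   # 2, the decade [1e-1, 1)
```

This spacing suits quantities such as chlorophyll concentration that span several orders of magnitude.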
Histograms can be computed over the whole flattened arrays or over only some dimensions. For instance, given an array with dimensions (time, lat, lon), we can retrieve the time evolution of its histogram:
hist = xh.histogram(temp, bins=[bha.Regular(100, 0., 10.)], dims=['lat', 'lon'])
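Conceptually, reducing over lat and lon while keeping time amounts to computing one histogram per time step. The pure-Python sketch below illustrates this on a toy nested list; `regular_counts` is a hypothetical helper, and the package itself does not loop in Python like this.

```python
def regular_counts(values, n_bins, start, stop):
    """Count values falling in each of n_bins regular bins over [start, stop)."""
    counts = [0] * n_bins
    width = (stop - start) / n_bins
    for v in values:
        if start <= v < stop:
            counts[int((v - start) / width)] += 1
    return counts


# Toy array with dimensions (time, lat, lon) = (2, 2, 3)
temp = [
    [[0.5, 1.5, 2.5], [0.7, 1.2, 9.9]],
    [[5.5, 5.6, 5.7], [8.1, 8.2, 0.1]],
]

# Flatten (lat, lon) for each time step and keep time as a looping dimension.
hist_over_time = [
    regular_counts([v for row in step for v in row], n_bins=5, start=0.0, stop=10.0)
    for step in temp
]
print(hist_over_time)  # [[4, 1, 0, 0, 1], [1, 0, 3, 0, 2]]
```

The result has dimensions (time, bins): the reduced dimensions disappear and a bins dimension is added.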
Weights can be applied, and the output histogram can be normalized.
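As an illustration of what weighting and normalization do, here is a pure-Python sketch. It assumes the NumPy density convention (bin content divided by the total weight and by the bin width); whether this package normalizes exactly this way is an assumption, so check the documentation. The helper name `weighted_density` is hypothetical.

```python
def weighted_density(values, weights, n_bins, start, stop):
    """Return (weighted sums, density) for regular bins over [start, stop)."""
    width = (stop - start) / n_bins
    sums = [0.0] * n_bins
    for v, w in zip(values, weights):
        if start <= v < stop:
            # Each sample adds its weight instead of 1 to its bin.
            sums[int((v - start) / width)] += w
    total = sum(sums)
    # Normalize so that the density integrates to 1 over the axis.
    density = [s / (total * width) for s in sums]
    return sums, density


sums, density = weighted_density(
    values=[0.5, 1.5, 1.7, 3.2],
    weights=[1.0, 2.0, 1.0, 4.0],
    n_bins=4, start=0.0, stop=4.0,
)
print(sums)     # [1.0, 3.0, 0.0, 4.0]
print(density)  # [0.125, 0.375, 0.0, 0.5]
```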
Accessor
An XArray accessor is provided to do some vectorized manipulations on histogram data. Simply import xarray_histogram.accessor, and all arrays can then access methods through the hist property:
import xarray_histogram.accessor
hist = xh.histogram(temp, ...)
hist.hist.edges()
hist.hist.median()
hist.hist.ppf(q=0.75)
See the documentation for more details.
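To give an idea of what a median or percent-point function computed from a histogram involves, here is a pure-Python sketch that interpolates linearly inside a bin, assuming values are uniformly spread within each bin. The accessor may use a different method; `hist_ppf` is a hypothetical helper, not the package API.

```python
def hist_ppf(counts, edges, q):
    """Quantile q (between 0 and 1) estimated from bin counts and bin edges."""
    total = sum(counts)
    target = q * total
    cumulative = 0.0
    for i, c in enumerate(counts):
        if c > 0 and cumulative + c >= target:
            # Fraction of this bin needed to reach the target count.
            frac = (target - cumulative) / c
            return edges[i] + frac * (edges[i + 1] - edges[i])
        cumulative += c
    return edges[-1]


counts = [1, 3, 0, 4]
edges = [0.0, 1.0, 2.0, 3.0, 4.0]
print(hist_ppf(counts, edges, 0.5))   # 2.0 (median)
print(hist_ppf(counts, edges, 0.75))  # 3.5
```

The estimate is only as precise as the binning: finer bins give quantiles closer to those of the raw data.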
Documentation
Documentation is available at https://xarray-histogram.readthedocs.io
Installation
From PyPI:
pip install xarray-histogram
From source:
git clone https://github.com/Descanonge/xarray-histogram
cd xarray-histogram
pip install -e .
TODO
Some features of Boost are not yet available:
- Growing axes: Dask needs to know the size of output chunks in advance. This could reasonably be supported, at least when applying over the whole array (no looping dimensions).
- Advanced storage/accumulators: they provide additional values on top of the count of samples falling into a bin. They require more than one number per bin, and a more complex sum of two histograms (possibly making histograms along chunked dimensions impossible).
- The Unified Histogram Indexing could be implemented in the accessor to facilitate manipulation of histogram arrays.
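The "more complex sum" mentioned for advanced storage in the list above can be illustrated with a generic sketch. Plain counts from two chunks merge by simple addition, while a per-bin mean accumulator needs a dedicated merge rule. The code below uses the standard pairwise update for count, mean, and sum of squared deviations; it is not taken from this package or from Boost Histogram, and the function names are hypothetical.

```python
def merge_counts(a, b):
    """Merge two count histograms computed on different chunks."""
    return [x + y for x, y in zip(a, b)]


def merge_mean(a, b):
    """Merge two (count, mean, m2) accumulators for one bin.

    m2 is the sum of squared deviations from the mean.
    """
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    if n == 0:
        return (0, 0.0, 0.0)
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return (n, mean, m2)


print(merge_counts([4, 1, 0], [1, 0, 3]))  # [5, 1, 3]

# One bin seen by two chunks: samples [1, 2, 3] and [5, 7]
n, mean, m2 = merge_mean((3, 2.0, 2.0), (2, 6.0, 2.0))
print(n)  # 5; mean is about 3.6 and m2 about 23.2
```

Each bin then carries three numbers instead of one, which is why such storages do not map directly onto a single array of counts.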
Requirements
- Python >= 3.11
- numpy
- xarray
- boost-histogram
- dask (optional)
- scipy (optional, for accessor)
Tests and performance
To compare performance, check this notebook.
Other packages
xhistogram already exists. It relies on NumPy functions (searchsorted) and thus does not benefit from some of the performance improvements brought by Boost (see performance comparisons).
dask-histogram ports Boost-histogram to Dask. It does not support multi-dimensional arrays: one can still reshape the input array, but this can incur performance penalties. Still, as it works directly with Boost objects rather than Dask arrays, all features of Boost should be available.