Utilities to benchmark datacubes with various formats, compressions, and chunking schemes.
datacube-benchmark
Utilities for benchmarking Zarr datacubes: generate synthetic stores with different chunking schemes, compressors, and dtypes, then measure read performance under realistic access patterns.
Companion package to the Datacube Guide, which documents common pitfalls when producing and consuming multi-dimensional data products.
Installation
pip install datacube-benchmark
Python 3.12+ is required.
Quickstart
Create a synthetic Zarr store on local disk and time a few random-access patterns against it:
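from pathlib import Path

import obstore as obs
import zarr

import datacube_benchmark

# Directory that will hold the synthetic store
path = Path.cwd() / "data" / "test.zarr"
path.mkdir(parents=True, exist_ok=True)

# Wrap the directory in an obstore store and write a synthetic Zarr store through it
store = obs.store.LocalStore(str(path))
zarr_store = datacube_benchmark.create_zarr_store(store)

# Open the generated array and time the access patterns with 10 samples each
arr = zarr.open_array(zarr_store, zarr_format=3, path="data")
results = datacube_benchmark.benchmark_access_patterns(arr, num_samples=10)
print(results)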
create_zarr_store takes target sizes and chunk shapes as strings or pint quantities (e.g. "1 GB", "10 MB"), and writes through an obstore store, so the same call works against a local directory, S3, GCS, or Azure by swapping the store.
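To make the size arguments concrete, the sketch below converts strings such as "1 GB" and "10 MB" to bytes and derives how many chunks a target of that size would hold. The helpers parse_size and chunk_count are hypothetical and are not part of datacube-benchmark, which accepts pint quantities; decimal (SI) prefixes are an assumption made here for illustration.

```python
import math
import re

# Decimal (SI) multipliers; an assumption for this sketch.
_UNITS = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_size(text: str) -> int:
    """Convert a string such as "10 MB" to a number of bytes (hypothetical helper)."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"cannot parse size: {text!r}")
    value, unit = match.groups()
    try:
        factor = _UNITS[unit.upper()]
    except KeyError:
        raise ValueError(f"unknown unit: {unit!r}") from None
    return int(float(value) * factor)


def chunk_count(target: str, chunk: str) -> int:
    """Number of chunks needed to hold `target` bytes, rounding up."""
    return math.ceil(parse_size(target) / parse_size(chunk))


print(parse_size("1 GB"))
print(chunk_count("1 GB", "10 MB"))
```

Under these assumptions, a "1 GB" array split into "10 MB" chunks comes to 100 chunks, which is the kind of ratio that drives how many requests a read pattern has to issue.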
What's in the box
- create_zarr_store, create_or_open_zarr_store, create_or_open_zarr_array, create_empty_dataarray: build synthetic Zarr datacubes at a target size, resolution, and chunk shape.
- benchmark_zarr_array: time random reads against one access pattern ("point", "time_series", "spatial_slice", "full") and return summary statistics with units attached.
- benchmark_access_patterns: run all four access patterns and return the combined results as a pandas.DataFrame.
- benchmark_dataset_open: time xarray.open_dataset on a Zarr store.
- Config: a dataclass collecting the common knobs (compressor, target array size, sample counts, concurrency).
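The four access pattern names can be read as different index selections on a three-dimensional array. The sketch below is one plausible interpretation, not the package's implementation: it assumes a (time, y, x) axis order, and the functions selection and elements_read are hypothetical names introduced for illustration. It only builds index tuples and counts how many elements each would touch; no data is read.

```python
import math
import random


def selection(pattern: str, shape: tuple[int, int, int], rng: random.Random) -> tuple:
    """Build an index tuple for one random read, assuming (time, y, x) axes."""
    t, y, x = (rng.randrange(n) for n in shape)
    everything = slice(None)
    if pattern == "point":          # a single value
        return (t, y, x)
    if pattern == "time_series":    # every time step at one pixel
        return (everything, y, x)
    if pattern == "spatial_slice":  # one full 2D field at one time step
        return (t, everything, everything)
    if pattern == "full":           # the whole array
        return (everything, everything, everything)
    raise ValueError(f"unknown access pattern: {pattern!r}")


def elements_read(sel: tuple, shape: tuple[int, int, int]) -> int:
    """Count the elements an index tuple selects from an array of `shape`."""
    return math.prod(
        len(range(*s.indices(n))) if isinstance(s, slice) else 1
        for s, n in zip(sel, shape)
    )


shape = (365, 180, 360)  # illustrative daily, 1-degree global grid
rng = random.Random(0)
for pattern in ("point", "time_series", "spatial_slice", "full"):
    print(pattern, elements_read(selection(pattern, shape, rng), shape))
```

The element counts differ by orders of magnitude between patterns, which is why a chunking scheme that suits one pattern can be slow for another. An index tuple built this way should work with any array that supports NumPy-style basic indexing, for example arr[sel].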
See the API reference for the full signatures and parameter docs.
License