Utilities to benchmark datacubes with various formats, compressions, and chunking schemes.

datacube-benchmark

Utilities for benchmarking Zarr datacubes — generate synthetic stores with different chunking schemes, compressors, and dtypes, then measure read performance under realistic access patterns.

Companion package to the Datacube Guide, which documents common pitfalls when producing and consuming multi-dimensional data products.

Installation

pip install datacube-benchmark

Python 3.12+ is required.

Quickstart

Create a synthetic Zarr store on local disk and time a few random-access patterns against it:

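from pathlib import Path

import obstore as obs
import zarr

import datacube_benchmark

# Create a local directory to hold the store and wrap it in an obstore LocalStore.
path = Path.cwd() / "data" / "test.zarr"
path.mkdir(parents=True, exist_ok=True)
store = obs.store.LocalStore(str(path))

# Write a synthetic datacube through the obstore-backed store.
zarr_store = datacube_benchmark.create_zarr_store(store)

# Open the array stored under the "data" path as Zarr format 3
# (zarr_format replaces the deprecated zarr_version keyword in zarr-python 3).
arr = zarr.open_array(zarr_store, zarr_format=3, path="data")

# Time every access pattern with 10 random samples each.
results = datacube_benchmark.benchmark_access_patterns(arr, num_samples=10)
print(results)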

create_zarr_store takes target sizes and chunk shapes as strings or pint quantities (e.g. "1 GB", "10 MB"), and writes through an obstore store — so the same call works against a local directory, S3, GCS, or Azure by swapping the store.
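
To make the relationship between a target size and a chunk size concrete, here is a minimal standard-library sketch of how size strings such as "1 GB" and "10 MB" translate into a chunk count. It is not how datacube-benchmark or pint parses quantities: parse_size and chunks_needed are hypothetical helpers, and the decimal (SI) multipliers are an assumption of this sketch.

```python
import re

# Decimal (SI) multipliers. This table is an assumption of the sketch,
# not something taken from datacube-benchmark or pint.
_UNITS = {"B": 1, "kB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def parse_size(text: str) -> int:
    """Convert a string such as "10 MB" into a number of bytes (hypothetical helper)."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", text)
    if match is None or match.group(2) not in _UNITS:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(float(match.group(1)) * _UNITS[match.group(2)])


def chunks_needed(target_size: str, chunk_size: str) -> int:
    """Number of chunks required to hold target_size, rounding up."""
    target = parse_size(target_size)
    chunk = parse_size(chunk_size)
    return -(-target // chunk)  # integer ceiling division


print(chunks_needed("1 GB", "10 MB"))
```

Under these assumptions, a smaller chunk size means more chunks (and more requests) for the same target size, which is the trade-off the benchmarks are meant to expose.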

What's in the box

  • create_zarr_store, create_or_open_zarr_store, create_or_open_zarr_array, create_empty_dataarray — build synthetic Zarr datacubes at a target size, resolution, and chunk shape.
  • benchmark_zarr_array — time random reads against one access pattern ("point", "time_series", "spatial_slice", "full") and return summary statistics with units attached.
  • benchmark_access_patterns — run all four access patterns and return the combined results as a pandas.DataFrame.
  • benchmark_dataset_open — time xarray.open_dataset on a Zarr store.
  • Config — a dataclass collecting the common knobs (compressor, target array size, sample counts, concurrency).
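
The four access patterns named above can be pictured as index selections on a three-dimensional array. The sketch below is an illustration only, not the package's implementation: it assumes a (time, y, x) dimension order and one plausible reading of each pattern, and random_selection, selection_size, and time_reads are hypothetical helpers built on the standard library.

```python
import random
import statistics
import time

PATTERNS = ("point", "time_series", "spatial_slice", "full")


def random_selection(pattern, shape, rng):
    """Return an index tuple for one read of a (time, y, x) array."""
    nt, ny, nx = shape
    t, y, x = rng.randrange(nt), rng.randrange(ny), rng.randrange(nx)
    if pattern == "point":
        return (t, y, x)  # a single value
    if pattern == "time_series":
        return (slice(None), y, x)  # every time step at one pixel
    if pattern == "spatial_slice":
        return (t, slice(None), slice(None))  # one full 2-D field
    if pattern == "full":
        return (slice(None), slice(None), slice(None))  # the whole array
    raise ValueError(f"unknown access pattern: {pattern!r}")


def selection_size(selection, shape):
    """Count the elements a selection covers (integer indices count as one)."""
    size = 1
    for index, length in zip(selection, shape):
        if isinstance(index, slice):
            size *= len(range(*index.indices(length)))
    return size


def time_reads(read, pattern, shape, num_samples=10, seed=0):
    """Time num_samples random reads of one pattern and summarise them."""
    rng = random.Random(seed)
    durations = []
    for _ in range(num_samples):
        selection = random_selection(pattern, shape, rng)
        start = time.perf_counter()
        read(selection)
        durations.append(time.perf_counter() - start)
    return {
        "pattern": pattern,
        "samples": num_samples,
        "mean_s": statistics.mean(durations),
        "median_s": statistics.median(durations),
    }


shape = (365, 180, 360)
for pattern in PATTERNS:
    # The stand-in reader only counts elements; no data is actually read.
    print(time_reads(lambda sel: selection_size(sel, shape), pattern, shape))
```

With a real Zarr array, the stand-in reader could be swapped for something like `lambda sel: arr[sel]`. The element counts show why the patterns stress a chunking scheme differently: a point read touches one value, while a time series crosses every chunk along the time axis.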

See the API reference for the full signatures and parameter docs.

License

MIT

Download files

Source distribution: datacube_benchmark-0.1.0.tar.gz (109.5 kB)

  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13
  • Publisher: release.yml on developmentseed/datacube-benchmark
  • SHA256: 50bee51dd6e5373b465b4b3745110c766b9d275af44406a18c95221bb729dc04
  • MD5: 6ebe7cf4d6f1d0ac66ea161b5eb60554
  • BLAKE2b-256: cf6a6bf5599c0644a8498cfd3c0fab67fe566a63bb829b10e91642ca78b69d08

Built distribution: datacube_benchmark-0.1.0-py3-none-any.whl (13.5 kB, Python 3)

  • Publisher: release.yml on developmentseed/datacube-benchmark
  • SHA256: 69572ac5d8c9d900b8734484061b9186c560faddb0be429831382a2c525f185a
  • MD5: 573748f1945cffc28622ba9848f14222
  • BLAKE2b-256: 18861f36215ef253cd3a8d91e6e3a907deb9ac532631ac55553c8ac8d6293fc0

Attestation values reflect the state when the release was signed and may no longer be current.
