
Integrity checks and creation helpers for Zarr v3 stores


xzarrguard

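A Zarr reader returns the fill value (often NaN) for a chunk file that is missing, so a chunk that was never written looks the same as one that intentionally holds no data. xzarrguard removes this ambiguity: it provides a concise API and a CLI to validate the completeness of Zarr v3 stores, create local stores with an explicit no-data policy, and convert between manifest and materialized no-data representations.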
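To make the ambiguity concrete, here is a minimal sketch in plain Python. It is not xzarrguard's implementation: the dictionary-backed store and the names `read_chunk`, `classify`, and `allowed_missing` are hypothetical and exist only for illustration.

```python
import math

FILL = math.nan  # assumed fill value for this toy array


def read_chunk(stored, key):
    """Return the stored chunk, or a fill-value chunk if the key is absent."""
    return stored.get(key, [FILL, FILL])


def classify(stored, expected_keys, allowed_missing):
    """Label each expected chunk as present, declared no-data, or missing."""
    status = {}
    for key in expected_keys:
        if key in stored:
            status[key] = "present"
        elif key in allowed_missing:
            status[key] = "no-data"
        else:
            status[key] = "missing"
    return status


stored = {"c/0": [1.0, 2.0]}      # chunks c/1 and c/2 were never written
expected = ["c/0", "c/1", "c/2"]
allowed_missing = {"c/1"}         # explicitly declared as no-data

# Both absent chunks read back as NaN, so a reader alone cannot tell them apart.
print([math.isnan(v) for v in read_chunk(stored, "c/1")])
print([math.isnan(v) for v in read_chunk(stored, "c/2")])

# With an explicit list of allowed-missing chunks, the two cases differ.
print(classify(stored, expected, allowed_missing))
```

Only the explicit record of which chunks may be absent separates the intentional gap from the incomplete write.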

Install

PyPI: pip install xzarrguard
PyPI + S3 support: pip install "xzarrguard[s3]"
conda: conda install xzarrguard
from source: pip install .

Install-free CLI usage

uv: uvx xzarrguard check /path/to/store.zarr
pixi: pixi exec xzarrguard check /path/to/store.zarr

Remote checks use fsspec backends. For S3-compatible stores:

xzarrguard check "s3://example-bucket/path/to/store.zarr" \
  --profile example-profile \
  --endpoint-url "https://object-store.example.com"

API quickstart

from xzarrguard import check_store, create_store

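# Check a local store; the report is truthy when the store is complete.
report = check_store("store.zarr")
if report:
    print("store is complete")

# Check a remote store; storage_options configure the fsspec backend.
remote_report = check_store(
    "s3://example-bucket/path/to/store.zarr",
    storage_options={
        "profile": "example-profile",
        "client_kwargs": {"endpoint_url": "https://object-store.example.com"},
    },
)

# Create a local store with an explicit no-data policy.
# `dataset` is the dataset to write (for example an xarray.Dataset).
create_store(
    dataset,
    "store.zarr",
    no_data_chunks={"temperature": [(0, 0)]},
    no_data_strategy="manifest",
)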

Write and guard in one step (wrapper around .to_zarr()):

from xzarrguard import guarded_to_zarr

guarded_to_zarr(dataset, "store.zarr")

Recommended distributed-write workflow:

  1. During the distributed write phase, use upstream xarray.Dataset.to_zarr(..., write_empty_chunks=True) so that workers materialize chunk keys deterministically.
  2. Finalize with xzarrguard conversion to derive compact manifests from no-data chunks.

from xzarrguard import convert_store

convert_store("store.zarr", direction="auto")
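The idea behind step 2, finding chunks that hold only no-data values so they can be listed in a manifest, can be sketched in plain Python. This is an illustration under assumptions, not how convert_store works internally: the helper `no_data_chunk_indices` is hypothetical, NaN is assumed to be the no-data value, and the array is a small 2-D list.

```python
import math
from itertools import product


def no_data_chunk_indices(data, chunk_shape):
    """Return grid indices of chunks in a 2-D list whose values are all NaN."""
    n_rows, n_cols = len(data), len(data[0])
    c_rows, c_cols = chunk_shape
    grid = (math.ceil(n_rows / c_rows), math.ceil(n_cols / c_cols))
    empty = []
    for i, j in product(range(grid[0]), range(grid[1])):
        rows = data[i * c_rows:(i + 1) * c_rows]
        values = [v for row in rows for v in row[j * c_cols:(j + 1) * c_cols]]
        if all(math.isnan(v) for v in values):
            empty.append((i, j))
    return empty


nan = math.nan
temperature = [
    [nan, nan, 1.0, 2.0],
    [nan, nan, 3.0, nan],
]

# The result mirrors the shape of the no_data_chunks argument shown earlier.
manifest = {"temperature": no_data_chunk_indices(temperature, (2, 2))}
print(manifest)
```

A chunk with even one real value, such as the second chunk here, stays out of the list; only chunks that are entirely no-data are candidates for a compact manifest entry.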

In-place metadata-only guard update (no chunk rewrite):

create_store(
    None,
    "store.zarr",
    no_data_chunks={"temperature": [(0, 0)]},
    in_place_metadata_only=True,
)

Treat the current store as baseline and derive allowed-missing chunks from what is currently missing:

create_store(
    None,
    "store.zarr",
    in_place_metadata_only=True,
    infer_no_data_from_store=True,
)
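Conceptually, inferring a baseline means comparing the chunk keys an array should have with the chunk files that exist. The sketch below shows that comparison for a local directory. It assumes the default Zarr v3 chunk key encoding (`c/<i>/<j>`), no sharding, and a known shape and chunk shape; the helper names are hypothetical and this is not xzarrguard's own code.

```python
import math
import tempfile
from itertools import product
from pathlib import Path


def expected_chunk_indices(shape, chunk_shape):
    """Enumerate every chunk grid index for the given array and chunk shape."""
    grid = [math.ceil(s / c) for s, c in zip(shape, chunk_shape)]
    return list(product(*(range(n) for n in grid)))


def infer_missing(array_dir, shape, chunk_shape):
    """List chunk indices that have no file at array_dir/c/<i>/<j>/..."""
    missing = []
    for index in expected_chunk_indices(shape, chunk_shape):
        chunk_path = Path(array_dir, "c", *map(str, index))
        if not chunk_path.is_file():
            missing.append(index)
    return missing


with tempfile.TemporaryDirectory() as tmp:
    array_dir = Path(tmp, "store.zarr", "temperature")
    # Write three of the four expected chunks of a 4x4 array with 2x2 chunks.
    for index in [(0, 1), (1, 0), (1, 1)]:
        path = Path(array_dir, "c", *map(str, index))
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(b"\x00")
    baseline = {"temperature": infer_missing(array_dir, (4, 4), (2, 2))}

print(baseline)
```

Because the baseline is taken from whatever is absent right now, it is only meaningful if the store is known to be in a good state at that moment.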

CLI quickstart

xzarrguard check store.zarr
xzarrguard check "s3://example-bucket/path/to/store.zarr" --profile example-profile --endpoint-url "https://object-store.example.com"
xzarrguard create source.zarr target.zarr --no-data no_data.json
xzarrguard create store.zarr --in-place-metadata-only --no-data no_data.json
xzarrguard create store.zarr --in-place-metadata-only --infer-no-data-from-store
xzarrguard convert store.zarr
xzarrguard convert store.zarr --direction manifest_to_materialized

Note: you may see ZarrUserWarning: Object at .xzarrguard is not recognized as a component of a Zarr hierarchy. when tooling walks the store hierarchy. This is expected: .xzarrguard/ is xzarrguard sidecar metadata, not a Zarr array/group node.
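If the warning is noisy in your own scripts, one option is to filter it by message text with the standard library. The sketch below is a suggestion, not part of xzarrguard: the pattern is taken from the warning text quoted above, the helper name `silence_sidecar_warning` is hypothetical, and the filter matches on the message only so that it does not depend on where Zarr defines its warning class.

```python
import warnings

# Regex matched against the start of the warning message.
PATTERN = r"Object at \.xzarrguard is not recognized"


def silence_sidecar_warning():
    """Ignore warnings whose message starts with the sidecar notice."""
    warnings.filterwarnings("ignore", message=PATTERN)


# Demonstration with stand-in warnings; real ones would come from Zarr tooling.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    silence_sidecar_warning()
    warnings.warn(
        "Object at .xzarrguard is not recognized as a component of a Zarr hierarchy.",
        UserWarning,
    )
    warnings.warn("some other warning", UserWarning)

print([str(w.message) for w in caught])
```

Other warnings still surface, so a genuinely unexpected object in the hierarchy is not hidden by the filter.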

Coverage

pytest

pytest prints terminal coverage and writes coverage.xml.

Documentation

Online: https://j-haacker.github.io/xzarrguard/

Preview or build the documentation locally:

zensical serve
zensical build --clean


Release (maintainers)

# bump src/xzarrguard/_version.py first
python -m build
python -m twine check dist/*
python -m twine upload dist/*

Use a PyPI API token for upload (for example TWINE_USERNAME=__token__). For conda-forge, update recipe/recipe.yaml after the PyPI release (fixed version + PyPI sdist URL + sha256), then submit a recipe/feedstock PR.
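The sha256 needed for the recipe can be computed locally with the standard library. This is a small helper sketch, not part of the project's release tooling; the function name `sha256_of_file` is hypothetical and the demo hashes a temporary empty file rather than a real sdist.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, block_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in fixed-size blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


# Demo on a stand-in file; point the function at the built sdist in practice.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp, "demo.tar.gz")
    demo.write_bytes(b"")
    empty_digest = sha256_of_file(demo)

print(empty_digest)
```

After python -m build, the sdist is written to dist/, so the same function can be pointed at the .tar.gz there and its output compared with the digest PyPI shows for the uploaded file.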

Acknowledgement: Initial scaffolding and implementation assistance by OpenAI Codex.


