Tool for robustly creating and updating Zarr datacubes from smaller slices

Project description

 zappend

zappend is a Python tool for robustly creating and updating Zarr datacubes from smaller dataset slices. It is built on top of the awesome Python packages xarray and zarr.

Motivation

The objective of zappend is to address recurring memory issues when generating large geospatial datacubes in the Zarr format by successively concatenating data slices along an append dimension, e.g., time (the default) for geospatial satellite observations. Each append step is atomic: the append operation is a transaction that is rolled back if it fails. This ensures the integrity of the target datacube.
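The idea of an append step that can be rolled back can be illustrated with a minimal, standard-library-only sketch. This is not zappend's implementation: the function name `append_atomically`, its `fail` switch, and the naive full-copy snapshot are assumptions made purely for illustration, and a full copy would be far too expensive for a large datacube.

```python
import shutil
import tempfile
from pathlib import Path


def append_atomically(
    target: Path, slice_name: str, payload: bytes, fail: bool = False
) -> None:
    """Append one slice file to ``target``; restore the previous state on error.

    Hypothetical helper for illustration only, not part of zappend.
    """
    backup = Path(tempfile.mkdtemp(prefix="rollback-")) / "backup"
    existed = target.exists()
    if existed:
        # Snapshot of the target as it was before the append started.
        shutil.copytree(target, backup)
    try:
        target.mkdir(parents=True, exist_ok=True)
        (target / slice_name).write_bytes(payload)
        if fail:
            raise RuntimeError("simulated failure during append")
    except Exception:
        # Roll back: discard the partially written state, restore the snapshot.
        shutil.rmtree(target, ignore_errors=True)
        if existed:
            shutil.copytree(backup, target)
        raise
    finally:
        # The snapshot is no longer needed, whether the append succeeded or not.
        shutil.rmtree(backup.parent, ignore_errors=True)
```

After a failed call, the target directory contains exactly what it contained before the call, which is the property the transaction is meant to guarantee.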

Features

The zappend tool provides the following features:

  • Locking: While the target dataset is being modified, a file lock is created, effectively preventing concurrent dataset modifications.
  • Transaction-based dataset appends: On failure during an append step, the transaction is rolled back, so that the target dataset remains valid and preserves its integrity.
  • Filesystem transparency: The target dataset may be generated and updated on any writable filesystem supported by the fsspec package. The same holds for the slice datasets to be appended.
  • Dataset polling: The tool can be configured to wait for slice datasets to become available.
  • CLI and Python API: The tool can be used in a shell using the zappend command or from Python. When used from Python using the zappend() function, slice datasets can be passed as local file paths, URIs, as datasets of type xarray.Dataset, or as custom zappend.api.SliceSource objects.
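Use from Python might look like the following sketch. The wrapper `append_slices` is a hypothetical helper, and passing the target location as a `target_dir` keyword is an assumption about the configuration settings; the zappend documentation is the authority on the actual parameters.

```python
from typing import Iterable


def append_slices(slice_paths: Iterable[str], target_dir: str) -> None:
    """Append the given slice datasets to the datacube at ``target_dir``.

    Hypothetical wrapper for illustration; requires the zappend package.
    """
    # Imported lazily so this module can be loaded without zappend installed.
    from zappend.api import zappend

    # Assumption: the target location is passed as the ``target_dir`` setting.
    zappend(slice_paths, target_dir=target_dir)
```

A call such as `append_slices(["slice-1.nc", "slice-2.nc"], "target.zarr")` would then append the two slices in order; the file names here are placeholders.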

More about zappend can be found in its documentation.

Download files

Source Distribution

zappend-0.4.0.tar.gz (44.7 kB view details)

Uploaded Source

Built Distribution

zappend-0.4.0-py3-none-any.whl (39.8 kB view details)

Uploaded Python 3

File details

Details for the file zappend-0.4.0.tar.gz.

File metadata

  • Download URL: zappend-0.4.0.tar.gz
  • Upload date:
  • Size: 44.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for zappend-0.4.0.tar.gz
Algorithm Hash digest
SHA256 7679f76627a8e66444e1774a456d30bd1ec1f20c71e9159e6a00a7d295be96e0
MD5 428cc441adcdb05adeb1a515dab4bf9e
BLAKE2b-256 843cdb8d1ad64ab188606f8b24a07d94d2f4b1fb81e854f691a0b09e492603ff

File details

Details for the file zappend-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: zappend-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 39.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for zappend-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 207fa2f93840420e0b2cbafda5104c2d20e4504a7098bffa90ab68103066ee7a
MD5 687e38788a136079a15b1c3522132f4b
BLAKE2b-256 ce49fc50b61429d4d2530bba7269726f1fec375c69bef8a0d1c3eb43d70381c3
