Tool for creating and updating Zarr datacubes from smaller slices

Project description

zappend

zappend is a tool written in Python for robustly creating and updating Zarr datacubes from smaller dataset slices. It is built on top of the awesome Python packages xarray and zarr.

Motivation

The objective of zappend is to address recurring memory issues when generating large geospatial datacubes in the Zarr format by successively concatenating data slices along an append dimension, usually time. Each append step is atomic: the append operation is a transaction that is rolled back if it fails. This ensures the integrity of the target datacube.
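
The idea of an atomic append step can be illustrated with a minimal sketch that uses only the Python standard library. It is not zappend's actual implementation; the helper `append_with_rollback`, its `fail` switch, and the file names are hypothetical and exist only for illustration. The sketch records every file an append step writes and removes those files again if the step fails, so the target directory returns to its previous state.

```python
import tempfile
from pathlib import Path


def append_with_rollback(
    target_dir: Path, slice_files: dict[str, bytes], fail: bool = False
) -> None:
    """Hypothetical helper: write files into target_dir, undoing all on failure."""
    created: list[Path] = []  # rollback log: files touched by this step
    try:
        for name, data in slice_files.items():
            path = target_dir / name
            created.append(path)  # log first, so a partial write is undone too
            path.write_bytes(data)
        if fail:
            # Stand-in for any error that may occur while appending.
            raise RuntimeError("simulated failure during append")
    except Exception:
        # Roll back: remove every file this step created, newest first.
        for path in reversed(created):
            path.unlink(missing_ok=True)
        raise


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp)
    append_with_rollback(target, {"chunk-0": b"a"})  # succeeds
    try:
        append_with_rollback(target, {"chunk-1": b"b"}, fail=True)  # fails
    except RuntimeError:
        pass
    # Only the file from the successful step is left.
    remaining = sorted(p.name for p in target.iterdir())
    print(remaining)
```

A real Zarr append also modifies existing metadata and chunk files, so an actual rollback has to restore changed files as well as delete new ones; the sketch covers only the simplest case of newly created files.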

Features

The zappend tool provides the following features:

  • Locking: While the target dataset is being modified, a file lock is created, effectively preventing concurrent dataset modifications.
  • Transaction-based dataset appends: On failure during an append step, the transaction is rolled back, so that the target dataset remains valid and preserves its integrity.
  • Filesystem transparency: The target dataset may be generated and updated in any writable filesystem supported by the fsspec package. The same holds for the slice datasets to be appended.
  • Dataset polling: The tool can be configured to wait for slice datasets to become available.
  • CLI and Python API: The tool can be used from a shell via the zappend command or from Python. When used from Python via the zappend() function, slice datasets can be passed as local file paths, URIs, or in-memory datasets of type xarray.Dataset. Users can also implement their own slice sources and pass them to the zappend() function; these provide slice dataset objects and are disposed after each slice has been processed.
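
The locking and polling features listed above can be sketched conceptually with the Python standard library. This is an illustration under stated assumptions, not zappend's code: the names `file_lock` and `wait_for_slice`, the lock file name `target.zarr.lock`, and the `interval` and `timeout` values are all hypothetical. The lock relies on exclusive file creation, which fails if the lock file already exists; the poller checks for a slice path until it appears or a timeout expires.

```python
import os
import tempfile
import time
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def file_lock(lock_path: Path):
    """Hypothetical lock: exclusive creation raises if the lock file exists."""
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)  # release the lock


def wait_for_slice(path: Path, interval: float, timeout: float) -> bool:
    """Hypothetical poller: True once path exists, False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        if path.exists():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


with tempfile.TemporaryDirectory() as tmp:
    lock_path = Path(tmp) / "target.zarr.lock"
    with file_lock(lock_path):
        # A second writer cannot acquire the lock while the first holds it.
        try:
            with file_lock(lock_path):
                second_writer_blocked = False
        except FileExistsError:
            second_writer_blocked = True
    lock_released = not lock_path.exists()

    # This slice never appears, so polling gives up after the timeout.
    slice_found = wait_for_slice(
        Path(tmp) / "missing.nc", interval=0.01, timeout=0.05
    )
```

A lock based on exclusive creation is simple, but a crashed process leaves a stale lock file behind that has to be removed by hand.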

More about zappend can be found in its documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

zappend-0.2.0.tar.gz (36.6 kB view details)

Uploaded Source

Built Distribution

zappend-0.2.0-py3-none-any.whl (33.4 kB view details)

Uploaded Python 3

File details

Details for the file zappend-0.2.0.tar.gz.

File metadata

  • Download URL: zappend-0.2.0.tar.gz
  • Upload date:
  • Size: 36.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for zappend-0.2.0.tar.gz
Algorithm Hash digest
SHA256 a58adedf3c8e5272474b1757552a32d4bb0c33bed3027dba8fcc64b1fd757c3c
MD5 e3d3cf0e341783679735c25886ac1cb8
BLAKE2b-256 1f7797470c7edc6046d85e6c4f301083876bc5a3c94aee8ab5e8316f08368f4f

File details

Details for the file zappend-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: zappend-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 33.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for zappend-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 8ec6b324e451a4b726e0e865b7636d98b7204005b231aa3fb165afb1be32ef20
MD5 0aa4ff17977dbcdb73acf94c559e1cdf
BLAKE2b-256 1454b48bd26dc462fc48646d24b29ea2ea8ec9821c18087ef121221b2a627eab
