
Tool for robustly creating and updating Zarr datacubes from smaller slices


 zappend



zappend is a Python tool for robustly creating and updating Zarr datacubes from smaller dataset slices. It is built on top of the awesome Python packages xarray and zarr.

Motivation

The objective of zappend is to enable geodata scientists and developers to robustly create large datacubes. The tool performs transaction-based dataset appends to existing datacubes in the Zarr format. If an error occurs during an append step, typically due to I/O problems or out-of-memory conditions, zappend automatically rolls back the operation so that the existing datacube keeps its structural integrity. The design drivers behind zappend are, first, ease of use and, second, high configurability regarding filesystems, data source types, datacube outline, and encoding.
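The rollback idea described above can be illustrated with a minimal, standard-library-only sketch. It is not zappend's implementation: the function append_with_rollback, its fail flag, and the full-directory backup are hypothetical choices made for illustration, and copying an entire datacube before each append would be impractical for large targets.

```python
import shutil
import tempfile
from pathlib import Path


def append_with_rollback(
    target: Path, slice_name: str, payload: bytes, fail: bool = False
) -> None:
    """Add one slice file to `target`; restore the prior state on any error.

    Hypothetical helper for illustration only, not part of zappend.
    """
    backup = Path(tempfile.mkdtemp()) / "backup"
    shutil.copytree(target, backup)  # record the state before the change
    try:
        (target / slice_name).write_bytes(payload)
        if fail:
            # Simulate an I/O problem occurring in the middle of the append.
            raise OSError("simulated I/O failure")
    except Exception:
        # Roll back: discard the partial write and restore the backup.
        shutil.rmtree(target)
        shutil.copytree(backup, target)
        raise
    finally:
        shutil.rmtree(backup.parent, ignore_errors=True)


cube = Path(tempfile.mkdtemp()) / "cube"
cube.mkdir()
append_with_rollback(cube, "slice-0", b"a")  # succeeds
try:
    append_with_rollback(cube, "slice-1", b"b", fail=True)  # fails, rolled back
except OSError:
    pass
remaining = sorted(p.name for p in cube.iterdir())
```

After the failed second append, the target directory contains only the first slice, which is the property the transaction-based design aims for.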

The tool comprises a command-line interface, a Python API for programmatic control, and comprehensible documentation to guide users. You can install zappend as a plain Python package using either pip install zappend or conda install -c conda-forge zappend.
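As a rough sketch of the Python API route, the snippet below wraps a call to zappend() in a small helper. It assumes that zappend is installed, that the function is importable from zappend.api, and that it accepts a config dictionary with the settings target_dir and append_dim; please check these names against the documentation of your installed version. The helpers build_config and append_slices are hypothetical names introduced here.

```python
from typing import Any, Dict, Iterable


def build_config(target_dir: str, append_dim: str = "time") -> Dict[str, Any]:
    """Assemble a minimal configuration (setting names are assumptions)."""
    return {"target_dir": target_dir, "append_dim": append_dim}


def append_slices(slices: Iterable[Any], target_dir: str) -> None:
    """Append the given slices to the target datacube."""
    # Imported lazily so this module loads even where zappend is not installed.
    from zappend.api import zappend

    zappend(slices, config=build_config(target_dir))
```

A call might then look like append_slices(["slice-1.nc", "slice-2.nc"], "target.zarr"), where the file names are placeholders.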

Features

The zappend tool provides the following features:

  • Locking: While the target dataset is being modified, a file lock is created, effectively preventing concurrent dataset modifications.
  • Transaction-based dataset appends: On failure during an append step, the transaction is rolled back, so that the target dataset remains valid and preserves its integrity.
  • Filesystem transparency: The target dataset may be generated and updated in any writable filesystem supported by the fsspec package. The same holds for the slice datasets to be appended.
  • Dataset polling: The tool can be configured to wait for slice datasets to become available.
  • Dynamic attributes: Use the syntax {{ expression }} to update the target dataset with dynamically computed attribute values.
  • CLI and Python API: The tool can be used in a shell using the zappend command or from Python. When used from Python using the zappend() function, slice datasets can be passed as local file paths, URIs, as datasets of type xarray.Dataset, or as custom slice sources.
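Several of the features listed above are driven by configuration. The following sketch shows what a configuration enabling dataset polling and dynamic attributes might look like, written out as JSON. The setting names slice_polling (with interval and timeout), permit_eval, and attrs, as well as the expressions using ds, are assumptions recalled from the zappend configuration reference and should be verified there; the target path and attribute names are placeholders.

```python
import json
import tempfile
from pathlib import Path

# Setting names below are assumptions; verify them against the zappend docs.
config = {
    "target_dir": "target.zarr",  # placeholder target path
    # Wait for slice datasets to become available (values in seconds).
    "slice_polling": {"interval": 2, "timeout": 60},
    # Assumed to be required before {{ expression }} values are evaluated.
    "permit_eval": True,
    "attrs": {
        "time_coverage_start": "{{ ds.time[0] }}",
        "time_coverage_end": "{{ ds.time[-1] }}",
    },
}

# Write the configuration to a JSON file and read it back.
config_path = Path(tempfile.mkdtemp()) / "config.json"
config_path.write_text(json.dumps(config, indent=2))
loaded = json.loads(config_path.read_text())
```

Keeping such settings in a file rather than in code makes the same configuration usable from both the command line and Python, provided the tool accepts a configuration file in this form.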

More about zappend can be found in its documentation.
