Zarr Collection

Project description

This project is a Python library for manipulating data partitioned into a collection of Zarr groups.

A collection divides a dataset into several partitions, which makes it easier to ingest new products or to update existing data from them. Data can be partitioned by date (hour, day, month, etc.) or by sequence.

On disk, a collection partitioned by date with a monthly resolution may look like this:

collection/
├── year=2022/
│    ├── month=01/
│    │    ├── time/
│    │    │    ├── 0.0
│    │    │    ├── .zarray
│    │    │    └── .zattrs
│    │    ├── var1/
│    │    │    ├── 0.0
│    │    │    ├── .zarray
│    │    │    └── .zattrs
│    │    ├── .zattrs
│    │    ├── .zgroup
│    │    └── .zmetadata
│    └── month=02/
│         ├── time/
│         │    ├── 0.0
│         │    ├── .zarray
│         │    └── .zattrs
│         ├── var1/
│         │    ├── 0.0
│         │    ├── .zarray
│         │    └── .zattrs
│         ├── .zattrs
│         ├── .zgroup
│         └── .zmetadata
└── .zcollection
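
The directory names in this layout follow a key=value pattern. As a minimal sketch of how a timestamp could map to such a path, the helper below builds the relative directory for a given resolution. The function partition_path is hypothetical and only mimics the naming visible in the tree above; it is not part of the library's API.

```python
from datetime import datetime


def partition_path(timestamp: datetime, resolution: str = "month") -> str:
    """Return the relative partition directory for a timestamp.

    Hypothetical helper for illustration only.
    """
    # Ordered from coarsest to finest; each entry is (key, width, value).
    fields = [
        ("year", 4, timestamp.year),
        ("month", 2, timestamp.month),
        ("day", 2, timestamp.day),
        ("hour", 2, timestamp.hour),
    ]
    keys = [key for key, _, _ in fields]
    if resolution not in keys:
        raise ValueError(f"unknown resolution: {resolution!r}")
    # Keep every field down to, and including, the requested resolution.
    depth = keys.index(resolution) + 1
    return "/".join(
        f"{key}={value:0{width}d}" for key, width, value in fields[:depth]
    )


print(partition_path(datetime(2022, 1, 15)))        # year=2022/month=01
print(partition_path(datetime(2022, 2, 3), "day"))  # year=2022/month=02/day=03
```

With a monthly resolution, every sample from January 2022 lands in the same directory, so new data for that month touches only that partition.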

Partition updates can be configured either to overwrite existing data with the new data or to update the existing data using different strategies.
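
To make the difference concrete, here is a minimal sketch in plain Python of two possible behaviours for a single partition: replacing it outright, or merging along the time axis so that new samples win where timestamps collide. The function names and the dictionary layout are illustrative assumptions, not the strategies the library actually provides.

```python
Partition = dict[str, list]


def overwrite(existing: Partition, new: Partition) -> Partition:
    """Discard the existing partition and keep only the new data."""
    return {name: list(values) for name, values in new.items()}


def merge_by_time(existing: Partition, new: Partition) -> Partition:
    """Merge two partitions along "time"; new samples win on ties."""
    # Index each sample by its timestamp. Inserting the new data last
    # lets it replace existing samples that share a timestamp.
    samples = dict(zip(existing["time"], existing["var1"]))
    samples.update(zip(new["time"], new["var1"]))
    times = sorted(samples)
    return {"time": times, "var1": [samples[t] for t in times]}


existing = {"time": [1, 2, 3], "var1": [10.0, 20.0, 30.0]}
new = {"time": [3, 4], "var1": [31.0, 40.0]}

print(overwrite(existing, new))
# {'time': [3, 4], 'var1': [31.0, 40.0]}
print(merge_by_time(existing, new))
# {'time': [1, 2, 3, 4], 'var1': [10.0, 20.0, 31.0, 40.0]}
```

Overwriting is the simpler choice when a new product fully supersedes the old one; a merge keeps samples that the new product does not cover.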

Data is handled with the Dask library so that processing can scale.

It is possible to create views on a reference collection. A view can add to and modify the variables of the reference collection, while the reference collection itself is accessed read-only.
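
The idea of a view can be sketched as an overlay on top of read-only data: reads fall through to the reference, and anything added or changed is stored in the view only. The View class below is a hypothetical, in-memory illustration of that concept using standard-library types; it does not reflect how the library implements views.

```python
from types import MappingProxyType

# Reference collection, exposed read-only (variable name -> values).
reference = MappingProxyType({"time": [1, 2, 3], "var1": [10.0, 20.0, 30.0]})


class View:
    """Hypothetical overlay on a read-only reference mapping."""

    def __init__(self, reference):
        self._reference = reference
        self._overrides = {}

    def __getitem__(self, name):
        # Variables defined in the view shadow those of the reference.
        if name in self._overrides:
            return self._overrides[name]
        return self._reference[name]

    def __setitem__(self, name, values):
        # Writes never reach the reference.
        self._overrides[name] = list(values)

    def variables(self):
        return sorted(set(self._reference) | set(self._overrides))


view = View(reference)
view["var2"] = [v * 2 for v in view["var1"]]  # add a variable
view["var1"] = [0.0, 0.0, 0.0]                # modify one, in the view only

print(view.variables())   # ['time', 'var1', 'var2']
print(reference["var1"])  # [10.0, 20.0, 30.0]
```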

This library can store data on POSIX, S3, or any other file system supported by the Python library fsspec. Note, however, that only POSIX and S3 file systems have been tested.

Download files

Source Distribution

zcollection-2023.11.0.tar.gz (159.2 kB)

File details

Details for the file zcollection-2023.11.0.tar.gz.

File metadata

  • Download URL: zcollection-2023.11.0.tar.gz
  • Upload date:
  • Size: 159.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.12.0

File hashes

Hashes for zcollection-2023.11.0.tar.gz

Algorithm    Hash digest
SHA256       8c8383a64fd8ee615527b2d473fcbae4195331d07b35006110487b3c8ddaeed0
MD5          5ad44dc65bac6e47f5c9cd4cbb06fcc4
BLAKE2b-256  7326864f177a97b47c56bebd1dc6f73dccd4bfdec26cd07f3fce28f84ecffefa
