zarrs-python

This project serves as a bridge between zarrs (Rust) and zarr (zarr-python) via PyO3. The main goal of the project is to speed up I/O (see zarr_benchmarks).

To use the project, simply install our package (which depends on zarr-python>=3.0.0), and run:

import zarr
import zarrs
zarr.config.set({"codec_pipeline.path": "zarrs.ZarrsCodecPipeline"})

You can then use zarr as normal (with some caveats)!

API

We export a ZarrsCodecPipeline class so that zarr-python can use it, but it is not meant to be instantiated, and we do not guarantee the stability of its API beyond what zarr-python requires. Therefore, it is not documented here.

At the moment, we only support a subset of the zarr-python stores. A NotImplementedError will be raised if a store is not supported. We intend to support more stores in the future: https://github.com/ilan-gold/zarrs-python/issues/44.

Configuration

ZarrsCodecPipeline options are exposed through zarr.config.

Standard zarr.config options control some functionality (see the defaults in the config.py of zarr-python):

  • threading.max_workers: the maximum number of threads used internally by the ZarrsCodecPipeline on the Rust side.
  • array.write_empty_chunks: whether or not to store empty chunks.
    • Defaults to false if None. Note that checking for emptiness has some overhead, see here for more info.
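To illustrate why the emptiness check has a cost: in zarr, a chunk counts as empty when every element equals the array's fill value, so deciding whether to skip a chunk means scanning it. The following is a plain-Python sketch of that idea only; `is_empty_chunk` and `chunks_to_store` are hypothetical names and are not part of zarr-python or zarrs-python.

```python
from typing import Iterable, List, Sequence


def is_empty_chunk(chunk: Iterable[float], fill_value: float) -> bool:
    """Return True if every element equals the fill value.

    In the worst case (a truly empty chunk) this scans every element.
    """
    return all(value == fill_value for value in chunk)


def chunks_to_store(
    chunks: Sequence[Sequence[float]],
    fill_value: float,
    write_empty_chunks: bool = False,
) -> List[int]:
    """Return the positions of the chunks that would be written."""
    if write_empty_chunks:
        # No emptiness check at all: every chunk is stored.
        return list(range(len(chunks)))
    # Each chunk is scanned, and chunks made only of the fill value are skipped.
    return [i for i, chunk in enumerate(chunks) if not is_empty_chunk(chunk, fill_value)]


chunks = [[0, 0, 0], [0, 7, 0], [0, 0, 0]]
print(chunks_to_store(chunks, fill_value=0))
print(chunks_to_store(chunks, fill_value=0, write_empty_chunks=True))
```

With the sample data, only the middle chunk is kept when empty chunks are skipped, and all three are kept otherwise.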

The ZarrsCodecPipeline specific options are:

  • codec_pipeline.chunk_concurrent_maximum: the maximum number of chunks stored/retrieved concurrently.
    • Defaults to the number of logical CPUs if None. It is constrained by threading.max_workers as well.
  • codec_pipeline.chunk_concurrent_minimum: the minimum number of chunks retrieved/stored concurrently when balancing chunk/codec concurrency.
    • Defaults to 4 if None. See here for more info.
  • codec_pipeline.validate_checksums: enable checksum validation (e.g. with the CRC32C codec).
    • Defaults to true if None. See here for more info.

For example:

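import zarr

zarr.config.set({
    # Maximum number of threads used on the Rust side.
    "threading.max_workers": None,
    # Do not store empty chunks.
    "array.write_empty_chunks": False,
    "codec_pipeline": {
        "path": "zarrs.ZarrsCodecPipeline",
        # Validate checksums (e.g. with the CRC32C codec).
        "validate_checksums": True,
        "store_empty_chunks": False,
        # None: defaults to the number of logical CPUs.
        "chunk_concurrent_maximum": None,
        "chunk_concurrent_minimum": 4,
    }
})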
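The None defaults listed above can be read as a small resolution step. The sketch below is only an illustration of the documented defaults in plain Python; `resolve_pipeline_options` is a hypothetical helper, not a function provided by zarrs-python, and the real resolution happens inside the package.

```python
import os
from typing import Any, Dict, Mapping


def resolve_pipeline_options(codec_pipeline: Mapping[str, Any]) -> Dict[str, Any]:
    """Apply the documented None-defaults to a codec_pipeline config mapping."""
    maximum = codec_pipeline.get("chunk_concurrent_maximum")
    minimum = codec_pipeline.get("chunk_concurrent_minimum")
    validate = codec_pipeline.get("validate_checksums")
    return {
        # None -> number of logical CPUs (os.cpu_count() may return None, hence the fallback).
        "chunk_concurrent_maximum": maximum if maximum is not None else (os.cpu_count() or 1),
        # None -> 4
        "chunk_concurrent_minimum": minimum if minimum is not None else 4,
        # None -> True
        "validate_checksums": validate if validate is not None else True,
    }


options = resolve_pipeline_options({"chunk_concurrent_maximum": None})
print(options)
```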

If the ZarrsCodecPipeline is pickled, and then un-pickled, and during that time one of store_empty_chunks, chunk_concurrent_minimum, chunk_concurrent_maximum, or num_threads has changed, the newly un-pickled version will pick up the new value. However, once a ZarrsCodecPipeline object has been instantiated, these values are then fixed. This may change in the future as guidance from the zarr community becomes clear.

Concurrency

Concurrency can be classified into two types:

  • chunk (outer) concurrency: the number of chunks retrieved/stored concurrently.
    • This is chosen automatically based on various factors, such as the chunk size and codecs.
    • It is constrained between codec_pipeline.chunk_concurrent_minimum and codec_pipeline.chunk_concurrent_maximum for operations involving multiple chunks.
  • codec (inner) concurrency: the number of threads encoding/decoding a chunk.
    • This is chosen automatically in combination with the chunk concurrency.

The product of the chunk and codec concurrency will approximately match threading.max_workers.

Chunk concurrency is typically favored because:

  • parallel encoding/decoding can have a high overhead with some codecs, especially with small chunks, and
  • it is advantageous to retrieve/store multiple chunks concurrently, especially with high latency stores.

zarrs-python will often favor codec concurrency with sharded arrays, as they are well suited to codec concurrency.
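The relationship between the two kinds of concurrency can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the algorithm zarrs uses: `split_concurrency` is a hypothetical function, and `preferred_chunk_concurrency` stands in for whatever the chunk size and codecs would suggest (low for sharded arrays, high for small chunks).

```python
from typing import Tuple


def split_concurrency(
    max_workers: int,
    num_chunks: int,
    preferred_chunk_concurrency: int,
    chunk_concurrent_maximum: int,
    chunk_concurrent_minimum: int = 4,
) -> Tuple[int, int]:
    """Return (chunk_concurrency, codec_concurrency) for one operation."""
    # The maximum is itself constrained by the thread budget.
    upper = min(chunk_concurrent_maximum, max_workers)
    # Clamp the preference into [minimum, upper]; the upper bound wins on conflict.
    chunk = min(max(preferred_chunk_concurrency, chunk_concurrent_minimum), upper)
    # Never run more chunk tasks than there are chunks (a single chunk gets 1).
    chunk = max(1, min(chunk, num_chunks))
    # Spend the remaining budget inside each chunk, so chunk * codec <= max_workers.
    codec = max(1, max_workers // chunk)
    return chunk, codec


# Many small chunks: chunk concurrency is favored.
print(split_concurrency(16, 100, preferred_chunk_concurrency=16, chunk_concurrent_maximum=16))
# A sharded-style preference for codec concurrency, raised to the minimum of 4.
print(split_concurrency(16, 100, preferred_chunk_concurrency=1, chunk_concurrent_maximum=16))
# A single chunk: all threads go to the codec.
print(split_concurrency(16, 1, preferred_chunk_concurrency=8, chunk_concurrent_maximum=16))
```

In each case the product of the two values stays at or just below the thread budget, which mirrors the "approximately match threading.max_workers" statement above.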

Supported Indexing Methods

The following indexing methods fall back to the default zarr-python codec pipeline:

  1. Any oindex or vindex integer np.ndarray indexing with dimensionality >= 3, e.g.,

    arr[np.array([...]), :, np.array([...])]
    arr[np.array([...]), np.array([...]), np.array([...])]
    arr[np.array([...]), np.array([...]), np.array([...])] = ...
    arr.oindex[np.array([...]), np.array([...]), np.array([...])] = ...
    
  2. Any vindex or oindex non-contiguous integer np.ndarray indexing for writes in 2D, e.g.,

    arr[np.array([0, 5]), :] = ...
    arr.oindex[np.array([0, 5]), :] = ...
    
  3. vindex writes in 2D where both indexers are integer np.ndarray indices, e.g.,

    arr[np.array([...]), np.array([...])] = ...
    
  4. Ellipsis indexing. We have tested some, but others fail even with zarr-python's default codec pipeline. Thus for now we advise proceeding with caution here.

    arr[0:10, ..., 0:5]
    

Furthermore, indexing numeric data with anything other than contiguous indexers (i.e., slices or an np.ndarray of consecutive integers) will fall back to the default zarr-python implementation.
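"Contiguous" here means that the selected positions form one unbroken run. As a user-side sketch (not zarrs-python's internal check), the hypothetical helper `as_contiguous_slice` below tests a sequence of integer indices and, when they are consecutive, returns the equivalent slice.

```python
from typing import Optional, Sequence


def as_contiguous_slice(indices: Sequence[int]) -> Optional[slice]:
    """Return an equivalent slice if `indices` are consecutive integers, else None."""
    values = [int(i) for i in indices]
    if not values:
        return None
    # Negative indices are rejected: a run such as [-2, -1] has no simple slice form.
    if values[0] < 0:
        return None
    # Every step between neighbours must be exactly +1.
    if any(b - a != 1 for a, b in zip(values, values[1:])):
        return None
    return slice(values[0], values[-1] + 1)


print(as_contiguous_slice([3, 4, 5]))  # slice(3, 6, None)
print(as_contiguous_slice([0, 5]))     # None
```

When the helper returns a slice, that slice selects the same elements as the integer indices. When it returns None, the selection is not contiguous and the cases listed above apply.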

Please file an issue if you believe we have more holes in our coverage than we are aware of, or if you wish to contribute! For example, we have an issue in zarrs for integer-array indexing that would unblock much of the use of the Rust pipeline for that use case (perhaps very useful for mini-batch training!).

Further, any codecs not supported by zarrs will also automatically fall back to the Python implementation.
