zarrs-python

This project serves as a bridge between zarrs (Rust) and zarr (zarr-python) via PyO3. Its main goal is to speed up I/O (see zarr_benchmarks).

To use the project, simply install our package (which depends on zarr-python>=3.0.0), and run:

import zarr
import zarrs
zarr.config.set({"codec_pipeline.path": "zarrs.ZarrsCodecPipeline"})

You can then use your zarr as normal (with some caveats)!

API

We export a ZarrsCodecPipeline class so that zarr-python can use it. The class is not meant to be instantiated directly, and we do not guarantee the stability of its API beyond what zarr-python requires. Therefore, it is not documented here.

At the moment, we only support a subset of the zarr-python stores. A NotImplementedError will be raised if a store is not supported. We intend to support more stores in the future: https://github.com/zarrs/zarrs-python/issues/44.

Configuration

ZarrsCodecPipeline options are exposed through zarr.config.

Standard zarr.config options control some functionality (see the defaults in the config.py of zarr-python):

  • threading.max_workers: the maximum number of threads used internally by the ZarrsCodecPipeline on the Rust side.
  • array.write_empty_chunks: whether or not to store empty chunks.
    • Defaults to false if None. Note that checking for emptiness has some overhead, see here for more info.

The ZarrsCodecPipeline specific options are:

  • codec_pipeline.chunk_concurrent_maximum: the maximum number of chunks stored/retrieved concurrently.
    • Defaults to the number of logical CPUs if None. It is constrained by threading.max_workers as well.
  • codec_pipeline.chunk_concurrent_minimum: the minimum number of chunks retrieved/stored concurrently when balancing chunk/codec concurrency.
    • Defaults to 4 if None. See here for more info.
  • codec_pipeline.validate_checksums: enable checksum validation (e.g. with the CRC32C codec).
    • Defaults to true if None. See here for more info.

For example:

import zarr

zarr.config.set({
    "threading.max_workers": None,
    "array.write_empty_chunks": False,
    "codec_pipeline": {
        "path": "zarrs.ZarrsCodecPipeline",
        "validate_checksums": True,  # checksum validation, e.g. with the CRC32C codec
        "store_empty_chunks": False,
        "chunk_concurrent_maximum": None,  # None: number of logical CPUs
        "chunk_concurrent_minimum": 4,
    }
})
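
The "defaults to ... if None" rules listed above can be pictured with a small pure-Python sketch. The helper `resolve_pipeline_options` is hypothetical and is not part of zarrs-python or zarr-python. Two points are assumptions made only for illustration: that an unset threading.max_workers falls back to the CPU count, and that "constrained by threading.max_workers" means the maximum is clamped with `min`. The real resolution happens inside the pipeline.

```python
import os


def resolve_pipeline_options(config: dict) -> dict:
    """Hypothetical helper: fill in the documented defaults for options left as None."""
    cpus = os.cpu_count() or 1
    pipeline = config.get("codec_pipeline", {})

    # Assumption: with threading.max_workers unset, fall back to the CPU count.
    max_workers = config.get("threading.max_workers") or cpus

    chunk_max = pipeline.get("chunk_concurrent_maximum")
    if chunk_max is None:
        chunk_max = cpus  # documented default: number of logical CPUs
    # Assumption: "constrained by threading.max_workers" is modelled as a clamp.
    chunk_max = min(chunk_max, max_workers)

    chunk_min = pipeline.get("chunk_concurrent_minimum")
    if chunk_min is None:
        chunk_min = 4  # documented default

    validate = pipeline.get("validate_checksums")
    if validate is None:
        validate = True  # documented default

    return {
        "chunk_concurrent_maximum": chunk_max,
        "chunk_concurrent_minimum": chunk_min,
        "validate_checksums": validate,
    }


options = resolve_pipeline_options({
    "threading.max_workers": 2,
    "codec_pipeline": {"chunk_concurrent_maximum": 8},
})
print(options)
```

With a budget of 2 workers, the requested maximum of 8 is clamped to 2 in this sketch, while the two unset options take their documented defaults.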

If a ZarrsCodecPipeline is pickled and later un-pickled, and one of store_empty_chunks, chunk_concurrent_minimum, chunk_concurrent_maximum, or num_threads has changed in the meantime, the un-pickled instance picks up the new value. Once a ZarrsCodecPipeline object has been instantiated, however, these values are fixed for that object. This may change in the future as guidance from the zarr community becomes clear.
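
The pickling behaviour can be modelled with a toy class. This is only a sketch of the described semantics: `ToyPipeline` and `GLOBAL_CONFIG` are hypothetical stand-ins for ZarrsCodecPipeline and zarr.config, and nothing here reflects how zarrs-python actually implements pickling.

```python
import pickle

# Hypothetical stand-in for zarr.config; purely illustrative.
GLOBAL_CONFIG = {"chunk_concurrent_minimum": 4}


class ToyPipeline:
    """Toy model (not ZarrsCodecPipeline): options are read once, at construction."""

    def __init__(self):
        # Snapshot the option; later config changes do not affect this object.
        self.chunk_concurrent_minimum = GLOBAL_CONFIG["chunk_concurrent_minimum"]

    def __reduce__(self):
        # Pickle only "how to rebuild" the object, so un-pickling re-runs
        # __init__ and reads whatever the configuration holds at that time.
        return (ToyPipeline, ())


pipeline = ToyPipeline()
payload = pickle.dumps(pipeline)

GLOBAL_CONFIG["chunk_concurrent_minimum"] = 8  # config changes after pickling

restored = pickle.loads(payload)
print(pipeline.chunk_concurrent_minimum)  # 4: the existing object keeps its value
print(restored.chunk_concurrent_minimum)  # 8: the un-pickled object sees the new value
```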

Concurrency

Concurrency can be classified into two types:

  • chunk (outer) concurrency: the number of chunks retrieved/stored concurrently.
    • This is chosen automatically based on various factors, such as the chunk size and codecs.
    • It is constrained between codec_pipeline.chunk_concurrent_minimum and codec_pipeline.chunk_concurrent_maximum for operations involving multiple chunks.
  • codec (inner) concurrency: the number of threads encoding/decoding a chunk.
    • This is chosen automatically in combination with the chunk concurrency.

The product of the chunk and codec concurrency will approximately match threading.max_workers.

Chunk concurrency is typically favored because:

  • parallel encoding/decoding can have a high overhead with some codecs, especially with small chunks, and
  • it is advantageous to retrieve/store multiple chunks concurrently, especially with high latency stores.

zarrs-python will often favor codec concurrency with sharded arrays, as they are well suited to codec concurrency.
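
To make the trade-off concrete, here is a minimal sketch of one way a thread budget could be split between chunk and codec concurrency. The function `split_concurrency` and its `preferred_codec_concurrency` parameter are hypothetical; the heuristic zarrs actually uses depends on chunk size and codecs and is not reproduced here.

```python
def split_concurrency(max_workers, num_chunks, preferred_codec_concurrency=1,
                      chunk_min=4, chunk_max=None):
    """Hypothetical heuristic: return (chunk_concurrency, codec_concurrency)."""
    if chunk_max is None:
        chunk_max = max_workers
    # Start from the codec's preference and give the remaining budget to chunks.
    chunk = max_workers // max(1, preferred_codec_concurrency)
    # Clamp chunk concurrency into [chunk_min, chunk_max] ...
    chunk = max(chunk_min, min(chunk, chunk_max))
    # ... but never exceed the number of chunks or the thread budget.
    chunk = max(1, min(chunk, num_chunks, max_workers))
    # Leftover threads go to encoding/decoding within each chunk, so the
    # product stays close to max_workers.
    codec = max(1, max_workers // chunk)
    return chunk, codec


# Many small chunks, codec with little internal parallelism.
print(split_concurrency(max_workers=16, num_chunks=100))  # (16, 1)
# Sharded-style array whose codec prefers many threads.
print(split_concurrency(max_workers=16, num_chunks=100,
                        preferred_codec_concurrency=16))  # (4, 4)
# Single-chunk operation: the minimum does not apply.
print(split_concurrency(max_workers=16, num_chunks=1))  # (1, 16)
```

In each case the product of the two values equals the budget of 16 threads, which mirrors the "approximately match threading.max_workers" behaviour described above.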

Supported Indexing Methods

The following indexing methods will fall back to the default zarr-python codec pipeline:

  1. Any oindex or vindex integer np.ndarray indexing with dimensionality >= 3, e.g.,

    arr[np.array([...]), :, np.array([...])]
    arr[np.array([...]), np.array([...]), np.array([...])]
    arr[np.array([...]), np.array([...]), np.array([...])] = ...
    arr.oindex[np.array([...]), np.array([...]), np.array([...])] = ...
    
  2. Any vindex or oindex discontinuous integer np.ndarray indexing for writes in 2D

    arr[np.array([0, 5]), :] = ...
    arr.oindex[np.array([0, 5]), :] = ...
    
  3. vindex writes in 2D where both indexers are integer np.ndarray indices i.e.,

    arr[np.array([...]), np.array([...])] = ...
    
  4. Ellipsis indexing. We have tested some, but others fail even with zarr-python's default codec pipeline. Thus for now we advise proceeding with caution here.

    arr[0:10, ..., 0:5]
    

Furthermore, indexing numeric data with anything other than contiguous selections (i.e., slices or consecutive-integer np.ndarray indices) will fall back to the default zarr-python implementation.
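
As an illustration of what "contiguous" means for an integer index, the following sketch checks whether a sequence of non-negative indices is a run of consecutive integers and, if so, converts it to the equivalent slice. The helpers `is_contiguous` and `as_slice` are hypothetical and use plain Python sequences; whether zarrs-python performs a check like this internally is not stated here.

```python
def is_contiguous(indices) -> bool:
    """Return True if `indices` is a non-empty run of consecutive, increasing integers."""
    indices = list(indices)
    return len(indices) > 0 and all(
        b - a == 1 for a, b in zip(indices, indices[1:])
    )


def as_slice(indices) -> slice:
    """Convert a contiguous run of non-negative integers to the equivalent slice."""
    indices = list(indices)
    if not is_contiguous(indices):
        raise ValueError("indices are not consecutive integers")
    return slice(indices[0], indices[-1] + 1)


print(is_contiguous([3, 4, 5]))  # True
print(is_contiguous([0, 5]))     # False: discontinuous, like arr[np.array([0, 5]), :]
print(as_slice([3, 4, 5]))       # slice(3, 6, None)
```

Rewriting a consecutive-integer index as a slice before indexing is one way to keep a selection in the contiguous form described above.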

Please file an issue if you believe we have more holes in our coverage than we are aware of, or if you wish to contribute! For example, we have an issue in zarrs for integer-array indexing that would largely unblock the use of the Rust pipeline for that use case (perhaps very useful for mini-batch training!).

Further, any codecs not supported by zarrs will also automatically fall back to the Python implementation.
