
Yet another ome-zarr model.


yaozarrs ‼️


Yet Another Ome-ZARR Schema!

Oh no, not another one 🤦

First, let me apologize. The last thing the world needs is yet another ome-zarr model. However, I was unable to find a minimal ome-zarr model that simply represents the spec, without introducing additional I/O features or dependencies. Please read the Existing Projects section for more context.

Installation

pip install yaozarrs

# or, to load/validate local/remote zarr stores:
pip install yaozarrs[io]
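
To confirm from Python which optional pieces are present, a small standard-library sketch like the one below may help. It assumes only what this README states: the `io` extra needs fsspec, and opening arrays needs zarr or tensorstore. The helper name `has_module` is hypothetical and not part of yaozarrs.

```python
from importlib.util import find_spec


def has_module(name: str) -> bool:
    """Return True if a top-level module is importable (hypothetical helper)."""
    return find_spec(name) is not None


# fsspec is what the [io] extra is documented to require;
# zarr and tensorstore are only needed to actually open arrays.
for module in ("fsspec", "zarr", "tensorstore"):
    status = "available" if has_module(module) else "missing"
    print(f"{module}: {status}")
```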

Usage

Here are some things you can do with yaozarrs.

  1. Construct valid ome-zarr JSON documents for creating ome-zarr groups
  2. Validate & load existing JSON documents
  3. Validate arbitrary python objects as an OME-NGFF object
  4. Validate any zarr store using the CLI
  5. Validate any zarr store programmatically
  6. Open zarr arrays using zarr-python or tensorstore

Construct valid ome-zarr JSON documents for creating ome-zarr groups

This is useful if you are creating OME-Zarr files directly. Since this package has no dependencies beyond pydantic, it allows downstream projects to use a common model without enforcing a specific mechanism for data I/O (e.g. zarr, tensorstore, acquire-zarr, etc.).

from yaozarrs import v05
from pathlib import Path

scale = v05.Multiscale(
    name="scale0",
    axes=[v05.SpaceAxis(name="x", type="space"), v05.SpaceAxis(name="y", type="space")],
    datasets=[
        v05.Dataset(
            path="0",
            coordinateTransformations=[v05.ScaleTransformation(scale=[1, 1])],
        ),
        v05.Dataset(
            path="1",
            coordinateTransformations=[v05.ScaleTransformation(scale=[1, 1])],
        ),
    ],
)

img = v05.Image(multiscales=[scale])
zarr_json = v05.OMEZarrGroupJSON(attributes={"ome": img})
json_data = zarr_json.model_dump_json(exclude_unset=True)
Path("zarr.json").write_text(json_data)
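
For orientation, the group document produced this way has roughly the shape below. This is a hand-built sketch using only the standard library, with field names taken from the model repr printed in the next section. It is not the exact output of model_dump_json (for example, exclude_unset may omit defaulted fields).

```python
import json

# Hand-built approximation of the zarr.json group document (assumption:
# keys mirror the field names shown in the yaozarrs model repr).
doc = {
    "zarr_format": 3,
    "node_type": "group",
    "attributes": {
        "ome": {
            "version": "0.5",
            "multiscales": [
                {
                    "name": "scale0",
                    "axes": [
                        {"name": "x", "type": "space"},
                        {"name": "y", "type": "space"},
                    ],
                    "datasets": [
                        {
                            "path": path,
                            "coordinateTransformations": [
                                {"type": "scale", "scale": [1.0, 1.0]}
                            ],
                        }
                        for path in ("0", "1")
                    ],
                }
            ],
        }
    },
}

print(json.dumps(doc, indent=2))
```

All OME metadata lives under the "ome" key of the group attributes, which is why the example above wraps the Image in OMEZarrGroupJSON(attributes={"ome": img}).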

Validate & load existing JSON documents

If you have an existing JSON document, you can validate and load it, and benefit from IDE autocompletion and type hints.

from pathlib import Path
import yaozarrs

json_string = Path("zarr.json").read_text()
obj = yaozarrs.validate_ome_json(json_string)

# OMEZarrGroupJSON(
#     zarr_format=3,
#     node_type='group',
#     attributes=OMEAttributes(
#         ome=Image(
#             version='0.5',
#             multiscales=[
#                 Multiscale(
#                     name='scale0',
#                     axes=[SpaceAxis(name='x', type='space', unit=None), SpaceAxis(name='y', type='space', unit=None)],
#                     datasets=[
#                         Dataset(path='0', coordinateTransformations=[ScaleTransformation(type='scale', scale=[1.0, 1.0])]),
#                         Dataset(path='1', coordinateTransformations=[ScaleTransformation(type='scale', scale=[1.0, 1.0])])
#                     ],
#                     coordinateTransformations=None,
#                     type=None,
#                     metadata=None
#                 )
#             ],
#             omero=None
#         )
#     )
# )

Validate arbitrary python objects as an OME-NGFF object

validate_ome_object and validate_ome_json accept a broad range of inputs, and will cast to an appropriate model if possible.

import yaozarrs

obj = yaozarrs.validate_ome_object(
  {'version': '0.5', 'series': ["0", "1"]}
)
print(obj)
# Series(version='0.5', series=['0', '1'])

Validate any zarr store using the CLI

[!IMPORTANT]
Requires fsspec. Install with pip install yaozarrs[io]

The CLI command provides a quick way to validate any zarr store as an OME-Zarr store. Here, "store" refers to any URI (local path, http(s) URL, s3 URL, etc.) or a zarr-python zarr.Group.

$ yaozarrs validate https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.5/idr0062A/6001240_labels.zarr
✓ Valid OME-Zarr store
  Version: 0.5
  Type: Image

[!TIP]
Use uvx for quick validation of any URI, without pip installing the package.

uvx "yaozarrs[io]" validate https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.5/idr0062A/6001240_labels.zarr

Storage Validation Errors

Validation errors that relate to the structure of the OME-Zarr itself (as opposed to metadata) are collected and presented similarly to pydantic validation errors for the metadata:

location
  description [context]

An example validation error (for a file that has many problems):

uvx "yaozarrs[io]" validate https://raw.githubusercontent.com/tlambert03/yaozarrs/refs/heads/main/tests/data/broken/broken_v05.ome.zarr/
yaozarrs._storage.StorageValidationError: 14 validation error(s) for StorageValidationError
ome.plate.wells.0.well.images.0.multiscales.0.datasets.0.path
  Dataset path '0' not found in zarr group [type=dataset_path_not_found, fs_path='broken_v05.ome.zarr/A/1/0/0', expected='zarr array']

ome.plate.wells.0.well.images.0.labels.labels.0
  Label path 'annotations' not found in labels group [type=label_path_not_found, fs_path='broken_v05.ome.zarr/A/1/0/labels/annotations', expected='zarr group']

ome.plate.wells.0.well.images.1.labels.labels.0
  Label path 'annotations' is not a zarr group [type=label_path_not_group, fs_path='broken_v05.ome.zarr/A/1/1/labels/annotations', expected='group', found='array']

ome.plate.wells.1.path
  Well path 'A/2' is not a zarr group [type=well_path_not_group, fs_path='broken_v05.ome.zarr/A/2', expected='group', found='array']

ome.plate.wells.2.well.images.0.labels.labels.0
  Label path 'annotations' does not contain valid Image ('multiscales') metadata [type=label_image_invalid, path='annotations']
  1 validation error for tagged-union[LabelImage,Image,Plate,Bf2Raw,Well,LabelsGroup,Series]
    Unable to extract tag using discriminator _discriminate_ome_v05_metadata() [type=union_tag_not_found, input_value={}, input_type=dict]
      For further information visit https://errors.pydantic.dev/2.12/v/union_tag_not_found

ome.plate.wells.3.well.images.0.multiscales.0.datasets.0.path
  Dataset '0' has 5 dimensions but axes specify 3 [type=dataset_dimension_mismatch, fs_path='broken_v05.ome.zarr/B/1/0/0', actual_ndim=5, expected_ndim=3, axes=['c', 'y', 'x']]

ome.plate.wells.3.well.images.0.labels.labels.0.multiscales.0.datasets.0.path
  Label array '0' has non-integer dtype 'float32'. Labels must use integer types. [type=label_non_integer_dtype, path='0', dtype='float32']

ome.plate.wells.4.well.images.0.multiscales.0.datasets.0.path
  Dataset path '0' exists but is not a zarr array [type=dataset_not_array, fs_path='broken_v05.ome.zarr/B/2/0/0', expected='array', found='group']

ome.plate.wells.4.well.images.1.multiscales.0.datasets.0.path.dimension_names
  Array dimension_names ['wrong', 'names', 'here'] don't match axes names ['c', 'y', 'x'] [type=dimension_names_mismatch, expected=['c', 'y', 'x'], actual=['wrong', 'names', 'here']]

ome.plate.wells.5.well.images.0.labels
  Found 'labels' path but it is a <class 'yaozarrs._zarr.ZarrArray'>, not a zarr group [type=labels_not_group, expected='group', found='ZarrArray']

ome.plate.wells.5.well.images.1.path
  Field path '1' is not a zarr group [type=field_path_not_group, fs_path='broken_v05.ome.zarr/B/3/1', expected='group', found='array']

ome.plate.wells.6.well.images.1.path
  Field path '1' not found in well group [type=field_path_not_found, fs_path='broken_v05.ome.zarr/C/1/1', expected='zarr group']

ome.plate.wells.7.well.images.0
  Field path '0' does not contain valid Image metadata [type=field_image_invalid, fs_path='broken_v05.ome.zarr/C/2/0']
  1 validation error for tagged-union[LabelImage,Image,Plate,Bf2Raw,Well,LabelsGroup,Series]
  image.multiscales
    Value should have at least 1 item after validation, not 0 [type=too_short, input_value=[], input_type=list]
      For further information visit https://errors.pydantic.dev/2.12/v/too_short

ome.plate.wells.8
  Well path 'C/3' does not contain valid Well metadata [type=well_invalid, path='C/3']
  1 validation error for tagged-union[LabelImage,Image,Plate,Bf2Raw,Well,LabelsGroup,Series]
    Unable to extract tag using discriminator _discriminate_ome_v05_metadata() [type=union_tag_not_found, input_value={}, input_type=dict]
      For further information visit https://errors.pydantic.dev/2.12/v/union_tag_not_found
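
If you only have the printed report (for example, captured from CI logs), the location / description [context] layout can be picked apart with a few lines of standard-library code. The sketch below assumes the textual layout shown above stays as-is; the names `ParsedError` and `parse_report` are hypothetical and not part of yaozarrs. It extracts only the location, the description, and the `type=` tag, and ignores the nested pydantic detail lines.

```python
import re
from typing import NamedTuple


class ParsedError(NamedTuple):
    location: str
    description: str
    error_type: str


# Finds the "[type=some_name" that opens the context block on a description line.
_TYPE_RE = re.compile(r"\s\[type=(\w+)")


def parse_report(text: str) -> list[ParsedError]:
    """Collect (location, description, type) triples from a printed report."""
    errors = []
    lines = text.splitlines()
    for i, line in enumerate(lines[:-1]):
        # A location line is unindented; its description follows, indented.
        if not line or line[0].isspace():
            continue
        nxt = lines[i + 1]
        match = _TYPE_RE.search(nxt)
        if nxt.startswith("  ") and match:
            description = nxt[: match.start()].strip()
            errors.append(ParsedError(line.strip(), description, match.group(1)))
    return errors


# Two entries copied from the example report above.
report = """\
ome.plate.wells.1.path
  Well path 'A/2' is not a zarr group [type=well_path_not_group, fs_path='broken_v05.ome.zarr/A/2', expected='group', found='array']

ome.plate.wells.4.well.images.1.multiscales.0.datasets.0.path.dimension_names
  Array dimension_names ['wrong', 'names', 'here'] don't match axes names ['c', 'y', 'x'] [type=dimension_names_mismatch, expected=['c', 'y', 'x'], actual=['wrong', 'names', 'here']]
"""

for err in parse_report(report):
    print(err.error_type, "at", err.location)
```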

Validate any zarr store programmatically

[!IMPORTANT]
Requires fsspec. Install with pip install yaozarrs[io]

import yaozarrs

yaozarrs.validate_zarr_store("https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.5/idr0062A/6001240_labels.zarr")
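
When validating many stores in a script, you may want a pass/fail result rather than an exception. The sketch below assumes that validate_zarr_store raises on failure, as the CLI traceback above suggests. The validator is passed in as an argument so the example runs without yaozarrs or network access; `check_store` and `fake_validate` are hypothetical names used only for illustration.

```python
from typing import Callable


def check_store(uri: str, validate: Callable[[str], object]) -> tuple[bool, str]:
    """Run a validator on a URI and report success or the error text.

    `validate` is injected so this sketch runs standalone; in practice
    you would pass `yaozarrs.validate_zarr_store`.
    """
    try:
        validate(uri)
    except Exception as exc:  # broad on purpose: the exact exception type is an assumption
        return False, f"{type(exc).__name__}: {exc}"
    return True, "valid"


# Stand-in validator for demonstration only (not part of yaozarrs).
def fake_validate(uri: str) -> None:
    if not uri.endswith(".zarr"):
        raise ValueError(f"{uri!r} does not look like a zarr store")


print(check_store("example.zarr", fake_validate))
print(check_store("example.txt", fake_validate))
```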

Open zarr arrays using zarr-python or tensorstore

[!IMPORTANT]

  • to_tensorstore() requires tensorstore
  • to_zarr_python() requires zarr

This package does not depend on zarr or tensorstore, even for validating OME-Zarr stores. (It uses a minimal representation of a zarr group internally, backed by fsspec.) If you would like to actually open arrays, you can use either zarr or tensorstore directly.

from yaozarrs import open_group

group = open_group("https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.5/idr0062A/6001240_labels.zarr")
array = group['0']
# <ZarrArray https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.5/idr0062A/6001240_labels.zarr/0>

# read bytes using tensorstore or zarr-python:
ts_array = array.to_tensorstore() # isinstance(ts_array, tensorstore.TensorStore)
zarr_array = array.to_zarr_python() # isinstance(zarr_array, zarr.Array)

# inspect the OME metadata associated with the group:
print(group.ome_metadata())
# Image(
#     version='0.5',
#     multiscales=[
#         Multiscale(
#             name=None,
#             axes=[
#                 ChannelAxis(name='c', type='channel', unit=None),
#                 SpaceAxis(
#                     name='z',
#                     type='space',
#                     unit='micrometer'
#                 ),
#                 SpaceAxis(
#                     name='y',
#                     type='space',
#                     unit='micrometer'
#                 ),
#                 SpaceAxis(
#                     name='x',
#                     type='space',
#                     unit='micrometer'
#                 )
#             ],
#             datasets=[
#                 Dataset(
#                     path='0',
#                     coordinateTransformations=[
#                         ScaleTransformation(
#                             type='scale',
#                             scale=[
#                                 1.0,
#                                 0.5002025531914894,
#                                 0.3603981534640209,
#                                 0.3603981534640209
#                             ]
#                         )
#                     ]
#                 ),
#                 Dataset(
#                     path='1',
#                     coordinateTransformations=[
#                         ScaleTransformation(
#                             type='scale',
#                             scale=[
#                                 1.0,
#                                 0.5002025531914894,
#                                 0.7207963069280418,
#                                 0.7207963069280418
#                             ]
#                         )
#                     ]
#                 ),
#                 Dataset(
#                     path='2',
#                     coordinateTransformations=[
#                         ScaleTransformation(
#                             type='scale',
#                             scale=[
#                                 1.0,
#                                 0.5002025531914894,
#                                 1.4415926138560835,
#                                 1.4415926138560835
#                             ]
#                         )
#                     ]
#                 )
#             ],
#             coordinateTransformations=None,
#             type=None,
#             metadata=None
#         )
#     ],
#     omero=Omero(
#         channels=[
#             OmeroChannel(
#                 window=OmeroWindow(
#                     start=0.0,
#                     min=0.0,
#                     end=1500.0,
#                     max=65535.0
#                 ),
#                 label='LaminB1',
#                 family='linear',
#                 color='0000FF',
#                 active=True,
#                 inverted=False,
#                 coefficient=1.0
#             ),
#             OmeroChannel(
#                 window=OmeroWindow(
#                     start=0.0,
#                     min=0.0,
#                     end=1500.0,
#                     max=65535.0
#                 ),
#                 label='Dapi',
#                 family='linear',
#                 color='FFFF00',
#                 active=True,
#                 inverted=False,
#                 coefficient=1.0
#             )
#         ],
#         id=1
#     )
# )
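
The scale vectors in this metadata describe the pyramid: comparing each level's scale to level 0 gives the downsampling factor per axis. The sketch below copies the values printed above and uses only the standard library; reading the scale values as per-axis pixel sizes follows the usual OME-NGFF interpretation and is the only assumption here.

```python
# Scale vectors copied from the metadata printed above, in (c, z, y, x) order.
axes = ["c", "z", "y", "x"]
scales = {
    "0": [1.0, 0.5002025531914894, 0.3603981534640209, 0.3603981534640209],
    "1": [1.0, 0.5002025531914894, 0.7207963069280418, 0.7207963069280418],
    "2": [1.0, 0.5002025531914894, 1.4415926138560835, 1.4415926138560835],
}

# Downsampling factor of each level relative to the full-resolution level "0".
base = scales["0"]
factors = {
    path: [round(s / b, 6) for s, b in zip(scale, base)]
    for path, scale in scales.items()
}

for path, factor in factors.items():
    print(path, dict(zip(axes, factor)))
```

For this store, the c and z axes keep a factor of 1 at every level, while y and x are downsampled by 2 and then 4.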

Existing Projects

You should first check these existing packages to see if they meet your needs:

  • ome-zarr-models-py.
    This project has garnered strong community attention and aligns well with many use cases.
    For my particular goals, I found a few things diverged from what I need.

    1. It offers convenient I/O helpers (based on and requiring zarr-python) that are great in many contexts, but I wanted to explore a version with no I/O assumptions – just classes mirroring the schema – without the zarr dep.

      There are issues and PRs to this effect, but since ome-zarr-models-py also depends on pydantic-zarr, that library would also need to be modified to remove the zarr dependency.

    2. It currently pins to Python 3.11+ (presumably following NEP-29/SPEC-0), whereas I prefer to match the official Python EOL schedule (supporting 3.10 until October 2026).

    3. Its inheritance and generics provide powerful abstractions, though for my experiments I wanted something simpler that just mirrors the spec.

    Ideally, this kind of minimal approach could help inform future directions for ome-zarr-models-py, and I’d be glad to see ideas converge over time.

  • pydantic-ome-ngff. Deprecated.

  • ngff-zarr. This also contains models, but brings along far more dependencies and assumptions (and functionality) than ome-zarr-models-py.

  • ome-zarr. This is a general toolkit that provides functions for reading and writing OME-Zarr, among other things, but it brings in many dependencies (zarr, scikit-image, dask, ...) and doesn't export metadata models.

In the meantime:

This is an experimental package, where I can develop minimal models for my applications. The hope would be some future unification, provided the community can agree on a common denominator of features.

Ultimately, I want a schema-first, I/O-second library.
