Open bioimage dataset catalog for benchmarking image IO, transformations, metadata management, and bioimage-linked workflows.

These details have not been verified by PyPI

Project description

OME-IRIS

OME-IRIS is an open bioimage dataset catalog for benchmarking image input/output (IO), transformations, metadata management, and bioimage-linked workflows.

We also provide a small Python package by the same name (ome_iris) to help fetch and validate the datasets in the catalog.

Inspired by both the classic iris.csv dataset and the iris of the eye that brings images into focus, OME-IRIS aims to provide a collection of reference datasets for evaluating interoperable bioimage data formats, tools, and workflows.

What this is

A lightweight manifest catalog for small benchmark datasets
A fetch + verify workflow with a single CLI
LinkML-based schema definitions for dataset manifests

What this is not

Not a data portal
Not DVC-based
Not a large-file git storage approach
Not a full ontology or end-to-end benchmark system yet

Quick start

uv run ome-iris fetch --tier small
uv run ome-iris verify
uv run ome-iris export-rocrate --dataset nf1-cellpainting-shrunken

Download a reproducible subset for local development or benchmarking:

uv run ome-iris download nf1 \
  --output .benchmark-data/ome-iris/nf1 \
  --preset tiny \
  --channel DAPI

Python API:

from ome_iris import datasets

datasets.download(
    "nf1",
    output_dir=".benchmark-data/ome-iris/nf1",
    subset={"images": 20, "channels": ["DAPI"]},
)

Fetch output modes:

uv run ome-iris fetch --tier small --verbose  # show per-file labels + downloader progress
uv run ome-iris fetch --tier small --silent   # suppress downloader progress output

What `fetch` does

High-level flow when you run ome-iris fetch:

Loads dataset manifests from --manifests-dir.
Applies optional filters (--dataset, --tier).
Creates local dataset roots under --data-dir/<source_identifier>/.
Writes ro-crate-metadata.json into each dataset root.
Iterates over each files entry:
- for kind: file: downloads the file URL (or skips if already present)
- for kind: directory: traverses/downloads directory contents (or extracts archive sources)
Reports a summary:
- downloaded count + item list
- skipped count + item list
- missing URLs
- failed downloads

Output layout example:

data/
  NF1_cellpainting_data_shrunken/
    ro-crate-metadata.json
    profiles.parquet
    images/
    masks/

Local files are stored under ./data/ by default. Each dataset directory also receives a ro-crate-metadata.json file carrying the source and provenance metadata from the manifest.
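
For orientation, here is a minimal sketch of what such a file could look like, assuming the general RO-Crate 1.1 layout (a metadata descriptor plus a root Dataset entity). The function name `build_rocrate_metadata` and the mapping from manifest fields to crate properties are assumptions for illustration; the fields OME-IRIS actually writes may differ.

```python
import json


def build_rocrate_metadata(manifest: dict) -> dict:
    """Build a minimal RO-Crate 1.1 style document from a manifest dict."""
    # Root Dataset entity; which manifest fields are copied is an assumption.
    root = {"@id": "./", "@type": "Dataset", "name": manifest["name"]}
    for key in ("description", "license"):
        if key in manifest:
            root[key] = manifest[key]  # license kept as a plain string for brevity
    source_url = manifest.get("source", {}).get("url")
    if source_url:
        root["url"] = source_url

    # Metadata descriptor that points at the root entity.
    descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        "about": {"@id": "./"},
    }
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [descriptor, root],
    }


manifest = {
    "name": "JUMP plate BR00117006 (JUMP_plate_BR00117006) example",
    "description": "Plate-level cell painting benchmark subset.",
    "license": "CC-BY-4.0",
    "source": {"url": "https://example.org/repo/tree/main/datasets/JUMP_plate_BR00117006"},
}
crate = build_rocrate_metadata(manifest)
print(json.dumps(crate, indent=2))
```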

To use another data directory:

uv run ome-iris fetch --data-dir /tmp/ome-iris-data
uv run ome-iris verify --data-dir /tmp/ome-iris-data

What `download` does

ome-iris download creates a small, reproducible subset under the exact --output directory. It supports named dataset aliases such as nf1, preset sizes (tiny, small, benchmark), image limits, channel filters, plate/well/site filters, and Z/T/C ranges where filenames expose those values.

Each downloaded subset includes a manifest.json that records the source dataset, selected subset options, downloaded file paths, source URLs, SHA-256 checksums, file sizes, image shapes, dtypes, and file metadata. Existing files are reused and included in the manifest. Use --validate-only to verify an existing subset cache against its manifest without downloading data:

uv run ome-iris download nf1 \
  --output .benchmark-data/ome-iris/nf1 \
  --validate-only
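
Conceptually, this kind of validation compares each file on disk with the checksum recorded in the manifest. The sketch below illustrates the idea with the standard library only. The manifest key names (`files`, `path`, `sha256`) and the function names are assumptions for illustration, not the documented manifest.json layout.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_subset(output_dir: Path) -> list[str]:
    """Return a list of problems; an empty list means the cache matches."""
    manifest = json.loads((output_dir / "manifest.json").read_text())
    problems = []
    for record in manifest["files"]:  # assumed key name
        target = output_dir / record["path"]
        if not target.is_file():
            problems.append(f"missing: {record['path']}")
        elif sha256_of(target) != record["sha256"]:
            problems.append(f"checksum mismatch: {record['path']}")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    (out / "image.tiff").write_bytes(b"pixels")
    manifest = {
        "files": [{"path": "image.tiff", "sha256": sha256_of(out / "image.tiff")}]
    }
    (out / "manifest.json").write_text(json.dumps(manifest))

    clean = validate_subset(out)          # file matches its recorded checksum
    (out / "image.tiff").write_bytes(b"tampered")
    tampered = validate_subset(out)       # file changed after the manifest was written

print(clean)
print(tampered)
```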

Add a dataset

Add or update a dataset manifest and catalog metadata.
Include source, formats, and file-level metadata.
Run:

uv run ome-iris verify

Starter scaffolding command:

uv run ome-iris scaffold --source-path /path/to/JUMP_plate_BR00117006
uv run ome-iris scaffold --source-path /path/to/JUMP_plate_BR00117006 --append-csv
uv run ome-iris scaffold --source-path /path/to/JUMP_plate_BR00117006 --include-directory-entry --directory-path images --archive-format zip

The command guesses the dataset id, name, and formats, writes a starter YAML manifest, and prints a suggested datasets.csv row.

File entry patterns

source_identifier is required at the top level of each manifest.
All files[].path values are relative to data/<source_identifier>/.
sha256 is optional for file entries.
Use kind: directory to fetch everything under a directory source.
- For GitHub tree URLs (https://github.com/<owner>/<repo>/tree/<ref>/<path>), OME-IRIS traverses files under that subtree.
- For local directory paths, OME-IRIS recursively copies files.
- For archive URLs, set archive_format (zip or tar) to extract an archive into the destination directory.
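
To make the GitHub tree URL shape concrete, the following sketch splits such a URL into its parts. It is an illustration only: `parse_github_tree_url` and `GitHubTree` are hypothetical names, and the pattern assumes the ref contains no slash (a branch such as feature/x cannot be told apart from a path segment by the URL alone).

```python
import re
from typing import NamedTuple, Optional


class GitHubTree(NamedTuple):
    owner: str
    repo: str
    ref: str
    path: str


# Matches https://github.com/<owner>/<repo>/tree/<ref>/<path>
_TREE_URL = re.compile(
    r"^https://github\.com/(?P<owner>[^/]+)/(?P<repo>[^/]+)"
    r"/tree/(?P<ref>[^/]+)/(?P<path>.+)$"
)


def parse_github_tree_url(url: str) -> Optional[GitHubTree]:
    """Return the URL parts, or None when the URL is not a GitHub tree URL."""
    match = _TREE_URL.match(url)
    if match is None:
        return None
    return GitHubTree(
        owner=match["owner"],
        repo=match["repo"],
        ref=match["ref"],
        path=match["path"].rstrip("/"),
    )


tree = parse_github_tree_url(
    "https://github.com/example-owner/example-repo/tree/main/datasets/images/"
)
not_a_tree = parse_github_tree_url("https://example.org/jump-plate-images.zip")
print(tree)
print(not_a_tree)
```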

Relationships

Use an optional top-level relationships list to describe links between dataset components.

from: source file path (must match a files[].path)
to: target file path (must match a files[].path)
type: relationship label (for example links_to_images_by, links_to_masks_by, references_metadata)
rocrate_predicate: explicit RO-Crate/JSON-LD predicate URI for export (required)
via_columns (optional): explicit table columns used for linking
filename_patterns (optional): standardized filename templates used by the relationship
derived_from_columns (optional): columns used when deriving one component from another (for example images -> masks)

Example:

files:
  - path: profiles.parquet
  - path: images
    kind: directory

relationships:
  - from: profiles.parquet
    to: images
    type: links_to_images_by
    rocrate_predicate: http://schema.org/associatedMedia
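
The constraints above (from and to must match a files[].path, and rocrate_predicate is required) can be checked with a few lines of code. This is a minimal sketch over an already-parsed manifest dict; `validate_relationships` is a hypothetical helper, not part of the documented ome_iris API.

```python
def validate_relationships(manifest: dict) -> list[str]:
    """Return human-readable errors; an empty list means the relationships pass."""
    known_paths = {entry["path"] for entry in manifest.get("files", [])}
    errors = []
    for index, rel in enumerate(manifest.get("relationships", [])):
        # Both endpoints must refer to a declared file entry.
        for end in ("from", "to"):
            if rel.get(end) not in known_paths:
                errors.append(
                    f"relationships[{index}].{end} does not match a files[].path: "
                    f"{rel.get(end)!r}"
                )
        # The export predicate is required.
        if not rel.get("rocrate_predicate"):
            errors.append(f"relationships[{index}] is missing rocrate_predicate")
    return errors


good = {
    "files": [{"path": "profiles.parquet"}, {"path": "images", "kind": "directory"}],
    "relationships": [
        {
            "from": "profiles.parquet",
            "to": "images",
            "type": "links_to_images_by",
            "rocrate_predicate": "http://schema.org/associatedMedia",
        }
    ],
}
bad = {
    "files": [{"path": "profiles.parquet"}],
    "relationships": [
        {"from": "profiles.parquet", "to": "masks", "type": "links_to_masks_by"}
    ],
}

good_errors = validate_relationships(good)
bad_errors = validate_relationships(bad)
print(good_errors)
print(bad_errors)
```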

Example directory entry:

files:
  - path: jump-plate/images
    kind: directory
    archive_format: zip
    url: https://example.org/jump-plate-images.zip
    sha256: ""  # optional

Custom metadata (first-class)

OME-IRIS supports custom metadata as a first-class field via custom_metadata objects at manifest, source, and file levels.

Rules:

custom_metadata must be an object/map.
Keys must be strings.
Values may be strings, numbers, booleans, null, lists, or nested objects.

Example:

id: jump-plate
source_identifier: JUMP_plate_BR00117006
name: JUMP plate BR00117006 (JUMP_plate_BR00117006) example
description: Plate-level cell painting benchmark subset.
tier: small
license: CC-BY-4.0
custom_metadata:
  study: jump-cp
  species: human
source:
  repository: https://example.org/repo
  path: datasets/JUMP_plate_BR00117006
  url: https://example.org/repo/tree/main/datasets/JUMP_plate_BR00117006
formats: [csv, tiff]
files:
  - path: profiles.csv
    url: https://example.org/files/profiles.csv
    sha256: "..."
    custom_metadata:
      role: profile_table
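
The three rules above translate into a short recursive check. The sketch below works on an already-parsed value (for example, the result of loading the YAML); `custom_metadata_errors` is a hypothetical helper name, and the package's own validation may behave differently.

```python
def custom_metadata_errors(metadata: object) -> list[str]:
    """Check a parsed custom_metadata value against the rules listed above."""
    if not isinstance(metadata, dict):
        return ["custom_metadata must be an object/map"]
    errors: list[str] = []
    _walk(metadata, "custom_metadata", errors)
    return errors


def _walk(value: object, where: str, errors: list[str]) -> None:
    # Scalars: strings, numbers, booleans, and null are always allowed.
    if value is None or isinstance(value, (str, int, float, bool)):
        return
    if isinstance(value, list):
        for index, item in enumerate(value):
            _walk(item, f"{where}[{index}]", errors)
    elif isinstance(value, dict):
        for key, item in value.items():
            if isinstance(key, str):
                _walk(item, f"{where}.{key}", errors)
            else:
                errors.append(f"{where}: key {key!r} is not a string")
    else:
        errors.append(f"{where}: unsupported value type {type(value).__name__}")


ok = custom_metadata_errors({"study": "jump-cp", "species": "human", "tags": ["a", 1, None]})
bad_key = custom_metadata_errors({1: "x"})
bad_value = custom_metadata_errors({"when": {1, 2}})
not_a_map = custom_metadata_errors("jump-cp")
print(ok, bad_key, bad_value, not_a_map)
```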

Why large files are not committed

Large image/profile files make repositories slow and fragile for contributors and CI. OME-IRIS tracks metadata and download locations, while actual data is fetched locally when needed.

Documentation

Build docs locally:

uv sync --group docs
uv run --frozen sphinx-build docs/src docs/build

Project details

Release history

0.0.5 (this version): Jun 4, 2026
0.0.4: May 30, 2026
0.0.3: May 30, 2026

Download files

Source Distribution

ome_iris-0.0.5.tar.gz (113.0 kB)

Uploaded Jun 4, 2026 Source

Built Distribution

ome_iris-0.0.5-py3-none-any.whl (43.2 kB)

Uploaded Jun 4, 2026 Python 3

File details

Details for the file ome_iris-0.0.5.tar.gz.

File metadata

Download URL: ome_iris-0.0.5.tar.gz
Upload date: Jun 4, 2026
Size: 113.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for ome_iris-0.0.5.tar.gz
Algorithm	Hash digest
SHA256	`34faf69627c4acef3f1134cea93d9ad8f12b34a288c6cd00fed4d962bd98979a`
MD5	`d70d733ec8ba1d067d1ffd07f7867452`
BLAKE2b-256	`9572b4d934c39c874a26616f50685eeb77c13f5085c2d0e451c861053e31182f`

Provenance

The following attestation bundles were made for ome_iris-0.0.5.tar.gz:

Publisher: publish-pypi.yml on d33bs/OME-IRIS

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ome_iris-0.0.5.tar.gz
- Subject digest: 34faf69627c4acef3f1134cea93d9ad8f12b34a288c6cd00fed4d962bd98979a
- Sigstore transparency entry: 1722357866
- Sigstore integration time: Jun 4, 2026
Source repository:
- Permalink: d33bs/OME-IRIS@93fb3ff248d1e96f770ae50573d32c45cdc11583
- Branch / Tag: refs/tags/v0.0.5
- Owner: https://github.com/d33bs
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@93fb3ff248d1e96f770ae50573d32c45cdc11583
- Trigger Event: release

File details

Details for the file ome_iris-0.0.5-py3-none-any.whl.

File metadata

Download URL: ome_iris-0.0.5-py3-none-any.whl
Upload date: Jun 4, 2026
Size: 43.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for ome_iris-0.0.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6baaa956f0558d32bf4ad821768dccb1de8e4bed549d8767b5b758cca778ec04`
MD5	`5e1d72c9cefd7b7015a4e6fa771926d4`
BLAKE2b-256	`d3f18498d39e492c54e50c19a1ec75d9beb77b022bc760032e2749ec3350a24f`

Provenance

The following attestation bundles were made for ome_iris-0.0.5-py3-none-any.whl:

Publisher: publish-pypi.yml on d33bs/OME-IRIS

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ome_iris-0.0.5-py3-none-any.whl
- Subject digest: 6baaa956f0558d32bf4ad821768dccb1de8e4bed549d8767b5b758cca778ec04
- Sigstore transparency entry: 1722357998
- Sigstore integration time: Jun 4, 2026
Source repository:
- Permalink: d33bs/OME-IRIS@93fb3ff248d1e96f770ae50573d32c45cdc11583
- Branch / Tag: refs/tags/v0.0.5
- Owner: https://github.com/d33bs
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-pypi.yml@93fb3ff248d1e96f770ae50573d32c45cdc11583
- Trigger Event: release
