A package for converting datasets to OME-Zarr format.
EuBI-Bridge
EuBI-Bridge is a tool for distributed conversion of microscopy image collections into the OME-Zarr format. It can run on the command line or as part of a Python script.
A key feature of EuBI-Bridge is aggregative conversion, which concatenates multiple images along specified dimensions. This is particularly useful for large datasets stored as collections of TIFF files.
EuBI-Bridge is built on several powerful libraries, including zarr, bioio, dask and tensorstore, among others.
Relying on bioio plugins for reading, EuBI-Bridge supports a wide range of input file formats.
Installation
The recommended way to install EuBI-Bridge is via pip. Create a virtual environment with Python 3.11 or 3.12 and use pip to install EuBI-Bridge as shown below:
python -m venv venv # Python must be either version 3.11 or 3.12.
source venv/bin/activate
pip install 'eubi-bridge[all]==0.1.0c7' # installs both GUI and CLI
# OR
# pip install 'eubi-bridge[cli]==0.1.0c7' # installs only CLI
# pip install 'eubi-bridge[gui]==0.1.0c7' # installs only GUI
# pip install 'eubi-bridge==0.1.0c7' # installs as a Python library, without GUI or CLI utilities.
#
# If a previous version of eubi-bridge was installed before, reset the configuration:
eubi reset_config
Important: EuBI-Bridge is currently only compatible with Python 3.11 or 3.12 due to conflicting dependencies. We are working on supporting a wider range of Python versions in future releases.
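Before installing, it can help to confirm that the interpreter in the active environment is one of the supported versions. The following is a minimal sketch using only the standard library; the helper name `is_supported_python` is illustrative and is not part of EuBI-Bridge.

```python
import sys

# Versions named as supported in the installation notes above.
SUPPORTED_VERSIONS = {(3, 11), (3, 12)}


def is_supported_python(version_info=sys.version_info):
    """Return True if the (major, minor) pair is 3.11 or 3.12."""
    return (version_info[0], version_info[1]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    status = "supported" if is_supported_python() else "not supported"
    print(f"Python {major}.{minor} is {status} by this EuBI-Bridge release.")
```

Running the script inside the freshly activated environment, before calling pip, shows whether a conda environment with a different Python is needed.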
If your default Python is neither 3.11 nor 3.12, create a conda environment with one of these versions:
mamba create -n eubizarr openjdk=11.* maven python=3.12
Then install EuBI-Bridge via pip in the conda environment:
conda activate eubizarr
pip install --no-cache-dir 'eubi-bridge[all]==0.1.0c7'
# If a previous version of eubi-bridge was installed before, reset the configuration:
eubi reset_config
Troubleshooting
If you receive a Building wheel error such as:
Building wheel for ... error
error: subprocess-exited-with-error
× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
then try the following:
# In the `eubizarr` environment
mamba install cmake zlib boost # preinstall dependencies that can help build from source
pip install --no-cache-dir "eubi-bridge[all]==0.1.0c7" # try installing again with the dependencies available
# If a previous version of eubi-bridge was installed before, reset the configuration:
eubi reset_config
Documentation
Find the documentation for EuBI-Bridge here
Basic Usage
Unary Conversion
Given a dataset structured as follows:
multichannel_timeseries
├── Channel1-T0001.tif
├── Channel1-T0002.tif
├── Channel1-T0003.tif
├── Channel1-T0004.tif
├── Channel2-T0001.tif
├── Channel2-T0002.tif
├── Channel2-T0003.tif
└── Channel2-T0004.tif
To convert each TIFF into a separate OME-Zarr container (unary conversion):
eubi to_zarr multichannel_timeseries multichannel_timeseries_zarr
Use the --zarr_format argument to specify the Zarr format version.
To create a Zarr version 3 dataset, use --zarr_format 3:
eubi to_zarr multichannel_timeseries multichannel_timeseries_zarr --zarr_format 3
Both of these commands will perform unary conversion, resulting in the following output:
multichannel_timeseries_zarr
├── Channel1-T0001.zarr
├── Channel1-T0002.zarr
├── Channel1-T0003.zarr
├── Channel1-T0004.zarr
├── Channel2-T0001.zarr
├── Channel2-T0002.zarr
├── Channel2-T0003.zarr
└── Channel2-T0004.zarr
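To check which Zarr format version a converted container was written in, one can look at the metadata files at its root: Zarr version 2 stores `.zgroup`, `.zarray` and `.zattrs` files, whereas version 3 stores a `zarr.json` file per node. The sketch below uses only the standard library; `detect_zarr_format` is a hypothetical helper, not a EuBI-Bridge function.

```python
from pathlib import Path


def detect_zarr_format(container):
    """Guess the Zarr format version from metadata files at the container root.

    Returns 3 if `zarr.json` is present, 2 if `.zgroup` or `.zarray` is
    present, and None if neither is found.
    """
    root = Path(container)
    if (root / "zarr.json").is_file():
        return 3
    if (root / ".zgroup").is_file() or (root / ".zarray").is_file():
        return 2
    return None


if __name__ == "__main__":
    # Path taken from the example output above; adjust it to your own data.
    path = "multichannel_timeseries_zarr/Channel1-T0001.zarr"
    print(path, "-> Zarr format", detect_zarr_format(path))
```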
Use wildcards to specifically convert the images belonging to Channel1:
eubi to_zarr "multichannel_timeseries/Channel1*" multichannel_timeseries_channel1_zarr
Aggregative Conversion (Concatenation Along Dimensions)
To concatenate images along specific dimensions, EuBI-Bridge needs to be informed
of file patterns that specify image dimensions. For this example,
the file pattern for the channel dimension is Channel, which is followed by the channel index,
and the file pattern for the time dimension is T, which is followed by the time index.
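To make the idea of a dimension tag concrete, the sketch below extracts the index that follows each tag in a file name. It is an illustration only: the regular expression and the helper name `parse_indices` are assumptions and do not reflect how EuBI-Bridge parses file names internally.

```python
import re


def parse_indices(filename, tags):
    """Map each axis to the integer that directly follows its tag.

    `tags` maps an axis letter to its tag, e.g. {"c": "Channel", "t": "T"}.
    Axes whose tag is not found in the file name are omitted.
    """
    indices = {}
    for axis, tag in tags.items():
        # The tag must not be preceded by a letter, so that "T" is not
        # matched in the middle of another word.
        match = re.search(rf"(?<![A-Za-z]){re.escape(tag)}(\d+)", filename)
        if match:
            indices[axis] = int(match.group(1))
    return indices


if __name__ == "__main__":
    tags = {"c": "Channel", "t": "T"}
    print(parse_indices("Channel1-T0003.tif", tags))
```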
To concatenate along the time dimension:
eubi to_zarr multichannel_timeseries multichannel_timeseries_concat_zarr \
--channel_tag Channel \
--time_tag T \
--concatenation_axes t
Output:
multichannel_timeseries_concat_zarr
├── Channel1-T_tset.zarr
└── Channel2-T_tset.zarr
Important note: if --channel_tag were omitted, the tool would be unaware of the multiple channels
and would try to concatenate all images into a single one-channel OME-Zarr. Therefore,
when performing an aggregative conversion, specify every dimension present in the input files via its respective tag.
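The effect of omitting a tag can be illustrated with a small grouping sketch: files are grouped by the indices of every tagged axis that is not being concatenated, and each group becomes one output. This is a simplified model under stated assumptions (the regular expression and the `group_for_concatenation` helper are hypothetical), not EuBI-Bridge's actual implementation.

```python
import re
from collections import defaultdict


def group_for_concatenation(filenames, tags, concatenation_axes):
    """Group file names by the indices of tagged axes that are NOT concatenated.

    Each resulting group corresponds to one concatenated output.
    """
    groups = defaultdict(list)
    for name in sorted(filenames):
        key = []
        for axis, tag in sorted(tags.items()):
            if axis in concatenation_axes:
                continue  # this axis is merged, so it does not split groups
            match = re.search(rf"(?<![A-Za-z]){re.escape(tag)}(\d+)", name)
            key.append((axis, int(match.group(1)) if match else None))
        groups[tuple(key)].append(name)
    return dict(groups)


if __name__ == "__main__":
    files = [f"Channel{c}-T{t:04d}.tif" for c in (1, 2) for t in (1, 2, 3, 4)]

    # With both tags declared, concatenating along t gives one group per channel.
    with_tag = group_for_concatenation(files, {"c": "Channel", "t": "T"}, "t")
    print("groups with channel tag:", len(with_tag))

    # Without the channel tag, every file falls into a single group.
    without_tag = group_for_concatenation(files, {"t": "T"}, "t")
    print("groups without channel tag:", len(without_tag))
```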
For multidimensional concatenation (channel + time):
eubi to_zarr multichannel_timeseries multichannel_timeseries_concat_zarr \
--channel_tag Channel \
--time_tag T \
--concatenation_axes ct
Note that both axes are specified via the argument --concatenation_axes ct.
Output:
multichannel_timeseries_concat_zarr
└── Channel_cset-T_tset.zarr
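Because the examples above are plain command lines, they can also be driven from a Python script through `subprocess`. The sketch below only assembles and runs the `eubi to_zarr` command with the flags shown in this section; the helpers `build_to_zarr_command` and `run_conversion` are hypothetical, and the sketch does not use EuBI-Bridge's own Python interface.

```python
import shlex
import subprocess


def build_to_zarr_command(src, dst, channel_tag=None, time_tag=None,
                          concatenation_axes=None, zarr_format=None):
    """Assemble an `eubi to_zarr` argument list from the flags shown above."""
    cmd = ["eubi", "to_zarr", str(src), str(dst)]
    if channel_tag is not None:
        cmd += ["--channel_tag", channel_tag]
    if time_tag is not None:
        cmd += ["--time_tag", time_tag]
    if concatenation_axes is not None:
        cmd += ["--concatenation_axes", concatenation_axes]
    if zarr_format is not None:
        cmd += ["--zarr_format", str(zarr_format)]
    return cmd


def run_conversion(cmd):
    """Run the command; requires the `eubi` executable on PATH."""
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmd = build_to_zarr_command(
        "multichannel_timeseries",
        "multichannel_timeseries_concat_zarr",
        channel_tag="Channel",
        time_tag="T",
        concatenation_axes="ct",
    )
    # Print the command for inspection; call run_conversion(cmd) to execute it.
    print(shlex.join(cmd))
```

Since the argument list is passed without a shell, wildcard patterns in `src` reach the program unexpanded and need no extra quoting.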
Handling Nested Directories
For datasets stored in nested directories such as:
multichannel_timeseries_nested
├── Channel1
│   ├── T0001.tif
│   ├── T0002.tif
│   ├── T0003.tif
│   └── T0004.tif
└── Channel2
    ├── T0001.tif
    ├── T0002.tif
    ├── T0003.tif
    └── T0004.tif
EuBI-Bridge automatically detects the nested structure. To concatenate along both channel and time dimensions:
eubi to_zarr \
multichannel_timeseries_nested \
multichannel_timeseries_nested_concat_zarr \
--channel_tag Channel \
--time_tag T \
--concatenation_axes ct
Output:
multichannel_timeseries_nested_concat_zarr
└── Channel_cset-T_tset.zarr
To concatenate along the channel dimension only:
eubi to_zarr \
multichannel_timeseries_nested \
multichannel_timeseries_nested_concat_zarr \
--channel_tag Channel \
--time_tag T \
--concatenation_axes c
Output:
multichannel_timeseries_nested_concat_zarr
├── Channel_cset-T0001.zarr
├── Channel_cset-T0002.zarr
├── Channel_cset-T0003.zarr
└── Channel_cset-T0004.zarr
Selective Data Conversion
To recursively select specific files for conversion, wildcard patterns can be used. For example, to concatenate only timepoint 3 along the channel dimension:
eubi to_zarr \
"multichannel_timeseries_nested/**/*T0003*" \
multichannel_timeseries_nested_concat_zarr \
--channel_tag Channel \
--time_tag T \
--concatenation_axes c
Output:
multichannel_timeseries_nested_concat_zarr
└── Channel_cset-T0003.zarr
Note: When using wildcards, enclose the input path in quotes, as in the example above, so that the shell does not expand the pattern before it reaches EuBI-Bridge.
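To preview which files a recursive pattern would select before starting a conversion, Python's standard `glob` module can expand the same kind of pattern. This is a sketch only: EuBI-Bridge's own pattern matching may differ in detail from `glob`, and `preview_matches` is a hypothetical helper.

```python
import glob
import os


def preview_matches(pattern):
    """Return the sorted regular files matched by a wildcard pattern.

    With recursive=True, `**` matches any number of nested directories.
    """
    matches = glob.glob(pattern, recursive=True)
    return sorted(path for path in matches if os.path.isfile(path))


if __name__ == "__main__":
    # Pattern taken from the example above.
    for path in preview_matches("multichannel_timeseries_nested/**/*T0003*"):
        print(path)
```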
Handling Categorical Dimension Patterns
For datasets where channel names are categorical, such as:
blueredchannels_timeseries
├── Blue-T0001.tif
├── Blue-T0002.tif
├── Blue-T0003.tif
├── Blue-T0004.tif
├── Red-T0001.tif
├── Red-T0002.tif
├── Red-T0003.tif
└── Red-T0004.tif
Specify categorical names as a comma-separated list:
eubi to_zarr \
blueredchannels_timeseries \
blueredchannels_timeseries_concat_zarr \
--channel_tag Blue,Red \
--time_tag T \
--concatenation_axes ct
Output:
blueredchannels_timeseries_concat_zarr
└── BlueRed_cset-T_tset.zarr
Note that the categorical names are aggregated in the output OME-Zarr name.
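A comma-separated tag such as `Blue,Red` can be read as a list of category names, each of which identifies one position along the axis. The sketch below illustrates this reading; assigning indices by list order is an assumption made for the example, and `categorical_index` is a hypothetical helper rather than part of EuBI-Bridge.

```python
def categorical_index(path, tag):
    """Return (index, name) of the first category from `tag` found in `path`.

    `tag` is a comma-separated string such as "Blue,Red". The index is the
    category's position in that list (an assumption for this illustration).
    Returns None when no category name occurs in the path.
    """
    names = [name.strip() for name in tag.split(",") if name.strip()]
    for index, name in enumerate(names):
        if name in path:
            return index, name
    return None


if __name__ == "__main__":
    for path in ("Blue-T0001.tif", "Red/T0004.tif"):
        print(path, "->", categorical_index(path, "Blue,Red"))
```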
With a nested input structure such as:
blueredchannels_timeseries_nested
├── Blue
│   ├── T0001.tif
│   ├── T0002.tif
│   ├── T0003.tif
│   └── T0004.tif
└── Red
    ├── T0001.tif
    ├── T0002.tif
    ├── T0003.tif
    └── T0004.tif
The same command works, with only the input and output paths changed:
eubi to_zarr \
blueredchannels_timeseries_nested \
blueredchannels_timeseries_nested_concat_zarr \
--channel_tag Blue,Red \
--time_tag T \
--concatenation_axes ct
Output:
blueredchannels_timeseries_nested_concat_zarr
└── BlueRed_cset-T_tset.zarr
Additional Notes
- EuBI-Bridge is in beta, and significant changes should be expected.
- Community support: Questions and contributions are welcome! Please report any issues.
Source Distribution
File details
Details for the file eubi_bridge-0.1.0rc8.tar.gz.
File metadata
- Download URL: eubi_bridge-0.1.0rc8.tar.gz
- Upload date:
- Size: 42.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a0bb03110f689f8b168da478c44bc4bf168f22b452239311dad620f6b03e5982 |
| MD5 | e42366f46e21617bc3a9a7afba048798 |
| BLAKE2b-256 | 72e3dc0b346355d6f3306f863294620598b5e42319cdcc056b6f0e3d92a0c4da |