

multiscale-spatial-image


Generate a multiscale, chunked, multi-dimensional spatial image data structure that can be serialized to OME-NGFF.

Each scale is a scientific Python Xarray spatial-image Dataset, organized into the nodes of an Xarray DataTree.

Installation

pip install multiscale_spatial_image

Usage

import numpy as np
from spatial_image import to_spatial_image
from multiscale_spatial_image import to_multiscale
import zarr

# Image pixels
array = np.random.randint(0, 256, size=(128,128), dtype=np.uint8)

image = to_spatial_image(array)
print(image)

An Xarray spatial-image DataArray. Spatial metadata can also be passed during construction; a sketch follows the printed array below.

<xarray.DataArray 'image' (y: 128, x: 128)> Size: 16kB
array([[170,  79, 215, ...,  31, 151, 150],
       [ 77, 181,   1, ..., 217, 176, 228],
       [193,  91, 240, ..., 132, 152,  41],
       ...,
       [ 50, 140, 231, ...,  80, 236,  28],
       [ 89,  46, 180, ...,  84,  42, 140],
       [ 96, 148, 240, ...,  61,  43, 255]], dtype=uint8)
Coordinates:
  * y        (y) float64 1kB 0.0 1.0 2.0 3.0 4.0 ... 124.0 125.0 126.0 127.0
  * x        (x) float64 1kB 0.0 1.0 2.0 3.0 4.0 ... 124.0 125.0 126.0 127.0
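
For example, pixel spacing, an origin offset, and a name for the data variable can be provided at construction time. A minimal sketch, assuming the dims, scale, translation, and name keyword arguments of to_spatial_image (the exact signature may differ; see the spatial-image documentation):

import numpy as np
from spatial_image import to_spatial_image

array = np.random.randint(0, 256, size=(128, 128), dtype=np.uint8)

# Illustrative physical pixel spacing and origin offset for the first pixel
image_with_metadata = to_spatial_image(
    array,
    dims=("y", "x"),
    scale={"y": 2.0, "x": 2.0},
    translation={"y": 10.0, "x": 20.0},
    name="image",
)
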
# Create multiscale pyramid, downscaling by a factor of 2, then 4
multiscale = to_multiscale(image, [2, 4])
print(multiscale)

A chunked, multiscale Xarray DataTree backed by Dask arrays.

<xarray.DataTree>
Group: /
├── Group: /scale0
│       Dimensions:  (y: 128, x: 128)
│       Coordinates:
│         * y        (y) float64 1kB 0.0 1.0 2.0 3.0 4.0 ... 124.0 125.0 126.0 127.0
│         * x        (x) float64 1kB 0.0 1.0 2.0 3.0 4.0 ... 124.0 125.0 126.0 127.0
│       Data variables:
│           image    (y, x) uint8 16kB dask.array<chunksize=(128, 128), meta=np.ndarray>
├── Group: /scale1
│       Dimensions:  (y: 64, x: 64)
│       Coordinates:
│         * y        (y) float64 512B 0.5 2.5 4.5 6.5 8.5 ... 120.5 122.5 124.5 126.5
│         * x        (x) float64 512B 0.5 2.5 4.5 6.5 8.5 ... 120.5 122.5 124.5 126.5
│       Data variables:
│           image    (y, x) uint8 4kB dask.array<chunksize=(64, 64), meta=np.ndarray>
└── Group: /scale2
        Dimensions:  (y: 16, x: 16)
        Coordinates:
          * y        (y) float64 128B 3.5 11.5 19.5 27.5 35.5 ... 99.5 107.5 115.5 123.5
          * x        (x) float64 128B 3.5 11.5 19.5 27.5 35.5 ... 99.5 107.5 115.5 123.5
        Data variables:
            image    (y, x) uint8 256B dask.array<chunksize=(16, 16), meta=np.ndarray>
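
Each node of the DataTree holds an ordinary Xarray Dataset, so a single scale level can be pulled out and computed like any other Dask-backed data. A minimal sketch, assuming the node and variable names printed above and the standard DataTree .ds accessor:

scale1 = multiscale["scale1"].ds       # Dataset for the second scale level
image_scale1 = scale1["image"]         # 64 x 64 Dask-backed DataArray
print(image_scale1.shape)              # (64, 64)
in_memory = image_scale1.compute()     # materialize the chunks in memory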

Map a function over datasets while skipping nodes that do not contain dimensions

import numpy as np
from spatial_image import to_spatial_image
from multiscale_spatial_image import skip_non_dimension_nodes, to_multiscale

data = np.zeros((2, 200, 200))
dims = ("c", "y", "x")
scale_factors = [2, 2]
image = to_spatial_image(array_like=data, dims=dims)
multiscale = to_multiscale(image, scale_factors=scale_factors)

@skip_non_dimension_nodes
def transpose(ds, *args, **kwargs):
    return ds.transpose(*args, **kwargs)

multiscale = multiscale.map_over_datasets(transpose, "y", "x", "c")
print(multiscale)

A transposed MultiscaleSpatialImage.

<xarray.DataTree>
Group: /
├── Group: /scale0
│       Dimensions:  (c: 2, y: 200, x: 200)
│       Coordinates:
│         * c        (c) int32 8B 0 1
│         * y        (y) float64 2kB 0.0 1.0 2.0 3.0 4.0 ... 196.0 197.0 198.0 199.0
│         * x        (x) float64 2kB 0.0 1.0 2.0 3.0 4.0 ... 196.0 197.0 198.0 199.0
│       Data variables:
│           image    (y, x, c) float64 640kB dask.array<chunksize=(200, 200, 2), meta=np.ndarray>
├── Group: /scale1
│       Dimensions:  (c: 2, y: 100, x: 100)
│       Coordinates:
│         * c        (c) int32 8B 0 1
│         * y        (y) float64 800B 0.5 2.5 4.5 6.5 8.5 ... 192.5 194.5 196.5 198.5
│         * x        (x) float64 800B 0.5 2.5 4.5 6.5 8.5 ... 192.5 194.5 196.5 198.5
│       Data variables:
│           image    (y, x, c) float64 160kB dask.array<chunksize=(100, 100, 2), meta=np.ndarray>
└── Group: /scale2
        Dimensions:  (c: 2, y: 50, x: 50)
        Coordinates:
          * c        (c) int32 8B 0 1
          * y        (y) float64 400B 1.5 5.5 9.5 13.5 17.5 ... 185.5 189.5 193.5 197.5
          * x        (x) float64 400B 1.5 5.5 9.5 13.5 17.5 ... 185.5 189.5 193.5 197.5
        Data variables:
            image    (y, x, c) float64 40kB dask.array<chunksize=(50, 50, 2), meta=np.ndarray>

The decorator lets you map your own functions over the datasets in the DataTree while skipping nodes whose datasets have no dimensions, but this library also provides a few convenience methods. For example, the transpose method we saw earlier can also be applied as follows:

multiscale = multiscale.msi.transpose("y", "x", "c")

Other methods implemented this way are reindex, equivalent to the xr.DataArray reindex method, and assign_coords, equivalent to the xr.Dataset assign_coords method.
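
For example, the channel coordinate of the multiscale image above could be relabeled on every scale at once. A sketch; the channel names are made up for illustration, and the accessor is assumed to forward keyword arguments the same way xr.Dataset.assign_coords does:

# Relabel the 'c' coordinate across all scale levels
multiscale_labeled = multiscale.msi.assign_coords(c=["DAPI", "GFP"])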

Store as an Open Microscopy Environment-Next Generation File Format (OME-NGFF) / netCDF Zarr store.

It is highly recommended to use dimension_separator='/' in the construction of the Zarr stores.

store = zarr.storage.DirectoryStore('multiscale.zarr', dimension_separator='/')
multiscale.to_zarr(store)
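
The serialized store can be read back into a DataTree for further processing. A minimal sketch, assuming a recent xarray release that provides open_datatree with the zarr engine:

import xarray as xr

# Lazily open the multiscale pyramid written above
multiscale_roundtrip = xr.open_datatree("multiscale.zarr", engine="zarr")
print(multiscale_roundtrip)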

Note: The API is under development, and it may change until 1.0.0 is released. We mean it :-).

Examples

Development

Contributions are welcome and appreciated.

Get the source code

git clone https://github.com/spatial-image/multiscale-spatial-image
cd multiscale-spatial-image

Install dependencies

First install pixi. Then, install project dependencies:

pixi install -a
pixi run pre-commit-install

Run the test suite

The unit tests:

pixi run -e test test

The notebooks tests:

pixi run test-notebooks

Update test data

To add new testing data or update existing data, such as a new baseline for this block:

dataset_name = "cthead1"
image = input_images[dataset_name]
baseline_name = "2_4/XARRAY_COARSEN"
multiscale = to_multiscale(image, [2, 4], method=Methods.XARRAY_COARSEN)
verify_against_baseline(test_data_dir, dataset_name, baseline_name, multiscale)

Add a store_new_image call in your test block:

dataset_name = "cthead1"
image = input_images[dataset_name]
baseline_name = "2_4/XARRAY_COARSEN"
multiscale = to_multiscale(image, [2, 4], method=Methods.XARRAY_COARSEN)

store_new_image(dataset_name, baseline_name, multiscale)

verify_against_baseline(dataset_name, baseline_name, multiscale)

Run the tests to generate the output. Remove the store_new_image call.

Then, create a tarball of the current testing data:

cd test/data
tar cvf ../data.tar *
gzip -9 ../data.tar
python3 -c 'import pooch; print(pooch.file_hash("../data.tar.gz"))'

Update the test_data_sha256 variable in the test/_data.py file. Upload the data to web3.storage, and update the test_data_ipfs_cid Content Identifier (CID) variable, which is available in the web3.storage web page interface.
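
The end result in test/_data.py is roughly of this shape (the hash and CID below are placeholders, not real values):

# test/_data.py (excerpt)
test_data_ipfs_cid = "bafybe..."  # CID reported by web3.storage after upload
test_data_sha256 = "abc123..."    # output of pooch.file_hash("../data.tar.gz")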

Submit the patch

We use the standard GitHub flow.

Create a release

This section is relevant only for maintainers.

  1. Pull the upstream main branch.
  2. pixi install -a
  3. pixi run pre-commit-install
  4. pixi run -e test test
  5. pixi shell
  6. hatch version <new-version>
  7. git add .
  8. git commit -m "ENH: Bump version to <version>"
  9. hatch build
  10. hatch publish
  11. git push upstream main
  12. Create a new tag and Release via the GitHub UI. Auto-generate release notes and add additional notes as needed.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages in the Python Packaging User Guide.

Source Distribution

multiscale_spatial_image-2.0.2.tar.gz (1.3 MB)

Built Distribution

multiscale_spatial_image-2.0.2-py3-none-any.whl (29.1 kB)

File details

Details for the file multiscale_spatial_image-2.0.2.tar.gz.

File hashes

Hashes for multiscale_spatial_image-2.0.2.tar.gz:

SHA256: 3b0ef86559fd9369ca384b657a740edfaf567b308021ac51281a3efdb6ccdcfc
MD5: 7a83f8776e68f72faaee10171474e739
BLAKE2b-256: dd192e00ede43955912951b62d6551498d4c53baed3504e00e954cc444c52a40
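
To check a downloaded file against the SHA256 digest above, a small sketch using only the standard library (the local filename is assumed to match the sdist name):

import hashlib

with open("multiscale_spatial_image-2.0.2.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()
print(digest)  # compare with the SHA256 value listed above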


File details

Details for the file multiscale_spatial_image-2.0.2-py3-none-any.whl.

File hashes

Hashes for multiscale_spatial_image-2.0.2-py3-none-any.whl:

SHA256: 0c25fe7d743293bec06811ccd19842c0998dcdb99b58a0102116340d97d7dfcc
MD5: 169f0ce7bc577cbf22db6b13365a79b2
BLAKE2b-256: 4f02eff2e0a291201b70f5c342a8dac0ec67c75aa912af50a999054b99755832

