Compute region properties using dask delayed and dataframe

dask-regionprops

About

This is a small library that uses dask to compute regionprops in parallel.

In addition to parallelization, it adds a few features/specializations on top of the scikit-image regionprops implementation.

  1. dask_regionprops returns a dask dataframe containing the region properties as columns.
  2. Arrays can be numpy or dask arrays, as well as xarray DataArrays backed by either array library.
  3. ND arrays are processed as a sequence of 2D arrays. Typically, the last two dimensions are assumed to contain the images and the leading dimensions are looped over.
  4. In the ND case, the result dataframe has columns that map each label to its position along the leading dimensions.
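Points 3 and 4 can be sketched with plain numpy. This is only an illustration of the looping idea, not the library's implementation: the helper name label_areas_nd is hypothetical, the dim-N column naming is assumed from the usage example in this README, and pixel count stands in for the full set of region properties.

```python
import numpy as np


def label_areas_nd(labels):
    """Loop over the leading dimensions and measure each 2D label image.

    Hypothetical helper: returns one row per (leading index, label) pair,
    with the pixel count as a stand-in for real region properties.
    """
    rows = []
    lead_shape = labels.shape[:-2]  # everything except the trailing (Y, X)
    for idx in np.ndindex(*lead_shape):
        frame = labels[idx]  # a single 2D label image
        # Ignore background (label 0) and count pixels per remaining label
        values, counts = np.unique(frame[frame > 0], return_counts=True)
        for value, count in zip(values, counts):
            # Record where along the leading dimensions this label was found
            row = {f"dim-{axis}": int(i) for axis, i in enumerate(idx)}
            row["label"] = int(value)
            row["area"] = int(count)
            rows.append(row)
    return rows


# Toy 4D label array with dims (S, T, Y, X)
labels = np.zeros((2, 3, 4, 4), dtype=int)
labels[0, 0, :2, :2] = 1
labels[1, 2, 1:, 1:] = 5

rows = label_areas_nd(labels)
```

Each row carries the leading indices alongside the label, which is what makes it possible to select a particular cell at a particular position later on.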

Intended Use Case

I wrote this library to help analyze microscopy datasets. After segmentation I typically have a 4D xarray DataArray whose dimensions are (Position, Time, Y, X). Importantly, label values are reused between positions but are consistent across time, so for all of the time points in position S, the region labelled r should refer to the same cell. Hopefully this motivates the decision to return the leading dimensions in the dataframe. For instance, if you want the properties of cell 5 in position 2, you could do something like:

from dask_regionprops import regionprops

# Assume data is a numpy/dask array that has dims corresponding to (S, T, Y, X)
props = regionprops(data)

# Select the rows for label 5 in position 2 (dim-0 is the first leading dimension)
single_cell_props = props.loc[(props["dim-0"] == 2) & (props["label"] == 5)]

If you are a more advanced pandas user and want to do this sort of analysis for many cells, consider using the leading dimensions and region labels as a MultiIndex so that individual cells can be looked up more efficiently.
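A minimal sketch of the MultiIndex approach follows. Dask dataframes do not support a MultiIndex, so this assumes the result has already been computed into a pandas DataFrame. The toy frame below is a hypothetical stand-in for that result: the dim-1 column name is assumed by analogy with dim-0, and area is used as a representative property.

```python
import pandas as pd

# Hypothetical stand-in for the computed (pandas) result of regionprops(data)
props = pd.DataFrame(
    {
        "dim-0": [0, 0, 2, 2, 2, 2],  # position
        "dim-1": [0, 1, 0, 0, 1, 1],  # time (assumed column name)
        "label": [5, 5, 4, 5, 4, 5],
        "area": [10.0, 11.0, 20.0, 30.0, 21.0, 31.0],
    }
)

# Index by the leading dimensions and the label; sorting keeps lookups fast
indexed = props.set_index(["dim-0", "dim-1", "label"]).sort_index()

# All time points for cell 5 in position 2; the remaining index level is time
single_cell_props = indexed.xs((2, 5), level=["dim-0", "label"])
```

With the index in place, repeating the lookup for many (position, label) pairs avoids rebuilding a boolean mask over the whole table each time.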

Finally, a useful downstream application is to use the region properties as features for a classifier or a clustering algorithm. I have personally used labelled regions and the corresponding fluorescence images as input to identify progression through the cell cycle.

Contributions

Please feel free to open an issue or pull request if you have questions about or improvements for this library.
