Magnify

A Python toolkit for processing microscopy images. Magnify makes it easy to work with terabyte-scale imaging datasets on your laptop. It provides a unified interface for any task that involves finding and processing markers of interest such as beads, droplets, cells, and microfluidic device components.

Magnify comes with predefined processing pipelines that support file-parsing, image stitching, flat-field correction, image segmentation, tag identification, and marker filtering across many different marker types. Magnify's pipelines allow you to process your images in just a few lines of code, while also being easy to extend with your own custom pipeline components.

Setup

pip install magnify

Usage

Here's a minimal example of how to use magnify to find, analyze, and visualize lanthanide-encoded beads given a microscopy image.

import magnify as mg
import magnify.plot as mp

# Process the experiment and get the output as an xarray dataset.
pipe = mg.mrbles(search_channel="620", min_bead_radius=10)
xp = pipe("example.ome.tif")

# Get the mean bead area and intensity.
print("Mean Bead Area:", xp.fg.sum(dim=["roi_x", "roi_y"]).mean())
print("Mean Bead Intensity:", xp.where(xp.fg).roi.mean())

# Show all the beads and how they were segmented.
mp.imshow(xp, animation_frame="channel")

Core Concepts

Output Format

Magnify outputs its results as xarray datasets. If you are unfamiliar with xarray you might want to read this quick overview after you're set up with magnify. An xarray dataset is essentially a dictionary of arrays with named dimensions. Let's look at the Jupyter notebook output for a simple example where we've only used magnify to segment beads.

In most cases the actual data only consists of the processed images and regions of interest (ROI) around segmented markers. We also have coordinates which are arrays that represent metadata in our dataset, such as the location of the foreground (fg) and background (bg) in each ROI. The image below shows a graphical illustration of these concepts.

In this example the image array was 2-dimensional (image_height x image_width) and the ROI array was 3-dimensional (num_markers x roi_height x roi_width). However, magnify can also process stacks of images that were collected across multiple timepoints and color channels, so the image array can have up to 4 dimensions (num_timepoints x num_channels x image_height x image_width) and the ROI array can have up to 5 dimensions (num_markers x num_timepoints x num_channels x roi_height x roi_width).
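As a rough illustration of this layout, the sketch below builds a tiny synthetic dataset by hand with xarray and runs the same kind of reductions used in the usage example. It is not output from magnify: the pixel values are made up, and the dimension name marker is an assumption (roi_x, roi_y, roi, and fg follow the names used above).

```python
import numpy as np
import xarray as xr

# Two hypothetical markers, each with a 3x3 ROI (pixel values are made up).
roi = np.stack([np.full((3, 3), 10.0), np.full((3, 3), 20.0)])

# Foreground masks: a single pixel for marker 0, a 2x2 block for marker 1.
fg = np.zeros((2, 3, 3), dtype=bool)
fg[0, 1, 1] = True
fg[1, :2, :2] = True

# "marker" is an assumed dimension name; roi_y/roi_x mirror the usage example.
dims = ("marker", "roi_y", "roi_x")
ds = xr.Dataset(
    data_vars={"roi": (dims, roi)},
    coords={"fg": (dims, fg), "bg": (dims, ~fg)},
)

# Foreground area per marker (pixel count), then the mean across markers.
area_per_marker = ds.fg.sum(dim=["roi_y", "roi_x"])
mean_area = float(area_per_marker.mean())

# Mean intensity over foreground pixels only (background becomes NaN).
mean_intensity = float(ds.where(ds.fg).roi.mean())

print(ds.roi.dims, mean_area, mean_intensity)
```

Because fg and bg are stored as coordinates with the same dimensions as roi, masking with where keeps every array aligned by name rather than by axis position.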

How the data is stored also matters for large datasets. The fg, bg, roi, and image arrays are stored on your hard drive rather than in RAM, using Dask. This lets you work with datasets much larger than memory at the cost of slower access times. You usually don't need to worry about this, since Dask operates on subsets of the array in an intelligent way. But if your analysis takes too long, you might want to compute some summary information that fits in RAM (e.g. the mean intensity of each marker), load that array into memory with compute, and interact primarily with that array moving forward.
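The pattern of reducing first and then calling compute can be sketched as follows. This is a minimal example that assumes xarray and Dask are installed; the array is a small synthetic stand-in that lives in memory, so it only demonstrates the lazy evaluation, and the dimension name marker and the chunk size are assumptions rather than what magnify uses.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for an ROI array: 2 markers, 4x4 pixels each.
data = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)

# Chunking turns the array into a Dask-backed (lazy) array.
roi = xr.DataArray(data, dims=("marker", "roi_y", "roi_x")).chunk({"marker": 1})

# The reduction is only recorded here, not yet executed.
lazy_mean = roi.mean(dim=["roi_y", "roi_x"])
is_lazy = lazy_mean.chunks is not None

# compute() runs the reduction and returns a small in-memory array.
marker_mean = lazy_mean.compute()

print(is_lazy, marker_mean.values)
```

After this step, further analysis on marker_mean no longer touches the large array, which is the point of summarizing before loading.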

File Parsing

Since a single experiment can consist of many files spread out across many folders, magnify allows you to retrieve many files using a single string. For example, let's say you've acquired an image across multiple channels stored in the following folder structure:

.
├── egfp/
│   └── image1.tif
├── cy5/
│   └── image2.tif
└── dapi/
    └── image3.tif

You can load all these images into magnify with

xp = pipe("(channel)/*.tif")

The search string supports globs, so * expands to match any text that fits the pattern. (channel) expands like * but also saves the segment of the file path it matches as the channel name in the resulting dataset. The specifiers that allow you to read metadata from the file path are:

  • (assay): The name of distinct experiments. If this is provided, magnify returns a list of datasets (rather than a single dataset).
  • (time): The time at which the image was acquired in format YYYYmmDD-HHMMSS. If your files specify acquisition times in a different format you can write (time|FORMAT) where FORMAT is a legal format code for Python's strptime (e.g. (time|%H:%M:%S)).
  • (channel): The channel in which the image was acquired.
  • (row) and (col): In the case of a tiled image these two specifiers indicate the row and column of the subimages. Magnify will stitch all these tiles into one large image.
  • Alternate coordinates: You can also attach additional information to each coordinate using a specifier that looks like (INFO_COORD), where COORD is the name of the original coordinate and INFO is the name for the attached information; for example, (concentration_time). By default magnify encodes the information as strings, but you can specify alternate formats using (INFO__COORD|FORMAT) where FORMAT can be: time, int, or float.
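To make the idea of specifiers concrete, here is a minimal standard-library sketch of how a search string could be turned into a regular expression that extracts metadata from a path. It is not magnify's parser: compile_search_string and parse_path are hypothetical helpers, only the (name) and (name|FORMAT) forms are handled, and DEFAULT_TIME_FORMAT is an assumed strptime translation of YYYYmmDD-HHMMSS.

```python
import re
from datetime import datetime

# Assumed strptime equivalent of the default YYYYmmDD-HHMMSS format.
DEFAULT_TIME_FORMAT = "%Y%m%d-%H%M%S"

# Matches either a specifier such as (channel) or (time|%H:%M:%S), or a * glob.
TOKEN = re.compile(r"\(([a-z_]+)(?:\|([^)]*))?\)|\*")


def compile_search_string(search):
    """Hypothetical helper: build a regex and a {name: format} map."""
    parts, formats, pos = [], {}, 0
    for m in TOKEN.finditer(search):
        # Literal text between tokens must match exactly.
        parts.append(re.escape(search[pos:m.start()]))
        if m.group(0) == "*":
            parts.append(r"[^/]*")  # glob: anything within one path segment
        else:
            name, fmt = m.group(1), m.group(2)
            parts.append(rf"(?P<{name}>[^/]+)")  # capture the matched segment
            formats[name] = fmt
        pos = m.end()
    parts.append(re.escape(search[pos:]))
    return re.compile("".join(parts)), formats


def parse_path(search, path):
    """Hypothetical helper: return metadata for a path, or None if no match."""
    pattern, formats = compile_search_string(search)
    match = pattern.fullmatch(path)
    if match is None:
        return None
    meta = match.groupdict()
    if "time" in meta:
        fmt = formats.get("time") or DEFAULT_TIME_FORMAT
        meta["time"] = datetime.strptime(meta["time"], fmt)
    return meta


print(parse_path("(channel)/*.tif", "egfp/image1.tif"))
print(parse_path("(channel)/(time).tif", "cy5/20230115-143000.tif"))
print(parse_path("(time|%H:%M:%S)_*.tif", "14:30:05_a.tif"))
```

The sketch ignores tile stitching, assays, and alternate coordinates; it only shows how a single string can both select files and name the pieces of the path it matched.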

Magnify can read any TIFF image file. It can also read OME-TIFF files that were generated by Micro-Manager. We plan to add support for other input formats as needed.

Pipelines

If you don't need customized pipelines and just want to use the predefined pipelines you can skip this section. Magnify's pipeline system is heavily inspired by spaCy's pipelines, so if you're familiar with that library you might only need to skim this section.

TODO: Write this. For now you can read the spaCy pipeline docs to get an idea of the design philosophy in magnify.

Plotting

Magnify includes a plotting sublibrary which you can import with import magnify.plot as mp. It is designed to enable rapid prototyping of interactive visualizations, primarily for troubleshooting experiments rather than creating publication-ready figures. The plotting library is still under development and isn't currently stable.
