BioTuring cell segmentation inference framework

Project description

Turing Segment: High-performance Cellpose Algorithm

Turing Segment is a high-performance package for cell segmentation based on the popular Cellpose algorithm. It is designed to provide lightning-fast performance while maintaining the accuracy and robustness of the original Cellpose model.

Explore our blog post for in-depth details.

Features

  • Built on top of the Cellpose framework, leveraging its proven segmentation capabilities.
  • Our new post-processing algorithm is significantly faster than the original Cellpose algorithm, reducing computational overhead and enabling faster processing times.
  • Supports tiled processing optimized for large images, so high-resolution or whole-slide images can be segmented without running into memory constraints.
  • Turing Segment is highly parallelized, leveraging both CPU and GPU resources to achieve accelerated processing speeds.
  • The package is designed to be easy to use. It provides a simple CLI for quick integration into your image analysis workflows.

Requirements

  • NVIDIA GPU (40xx series recommended; older series should also work)
  • CUDA Runtime 11.7 or later
  • Python 3.8 or later
  • PyTorch 2.0 or later
  • geopandas 1.0 or later

Installation

We recommend using conda to install Turing Segment. Run the following commands to create and activate a new conda environment:

conda create -n turing_segment python=3.10
conda activate turing_segment

Make sure your PyTorch version is compatible with your CUDA Runtime version, and select the matching build on the PyTorch website. For example, if you have CUDA 12.1, install PyTorch with:

conda install pytorch pytorch-cuda=12.1 -c pytorch -c nvidia

Install Turing Segment using pip:

pip install -U turing_segment
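
After installation, it can help to confirm what actually landed in the environment. The snippet below is a minimal sketch that only reads package metadata through the standard library; it does not call any Turing Segment API. The distribution names are assumptions: "torch" is the name PyTorch is published under on PyPI, and "turing_segment" is taken from the pip command above.

```python
import sys
from importlib import metadata

# Assumed distribution names (see the note above); adjust if yours differ.
PACKAGES = ("torch", "geopandas", "turing_segment")


def installed_versions(packages=PACKAGES):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {"python": "{}.{}.{}".format(*sys.version_info[:3])}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Compare the printed versions against the Requirements list; a missing entry usually means the package was installed into a different environment.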

Usage

We provide three commands:

  • infer: segment an image
  • poly2mask: convert polygons to a zarr labeled mask
  • stitch: stitch 2D cell segmentation results into 3D cells

To see the full list of available options, use the --help flag.

turing_segment --help

Inference

1. Single image segmentation

turing_segment infer --image-path /path/to/image --output-dir /path/to/output_dir

By default, the cyto3 model is used. To use the H&E model, pass --model-type he2, or --model-type he for the older H&E version.

turing_segment infer --image-path /path/to/image --model-type he2 --output-dir /path/to/output_dir

Output format:

/path/to/output_dir/
├── metadata.json
└── polygons.parquet

  • metadata.json contains information about the image, including image-shape, scale, model-type, ...
  • polygons.parquet contains the polygons of the segmented cells in Shapely geometry format.

If --output-dir is not provided, an output directory named {image_name}_{page}_{channels}_{model_type} is created.


2. Segmentation for multiple images

To segment multiple images, replace the --image-path with --image-dir to specify the folder containing multiple images.

turing_segment infer --image-dir /path/to/image_dir --output-dir /path/to/output_dir

If --output-dir is not provided, an output directory named {image_dir}_results is created.

Output format:

/path/to/output_dir/
├── configs.yml
├── image_name1/
│   ├── metadata.json
│   └── polygons.parquet
├── image_name2/
│   ├── metadata.json
│   └── polygons.parquet
└── ...

configs.yml is generated automatically to store the post-processing config for each image. Currently, only the config for 3D stitching is stored. To run 3D stitching, you must set z_index for each image name in the config file.

Example of configs.yml:

stitch:
- image_name: image_name1
  z_index: null
- image_name: image_name2
  z_index: null
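
Filling in z_index by hand gets tedious for long stacks. The following is a minimal sketch that renders the same layout as the example above from a list of image names, numbering them in list order. The helper name render_stitch_config is hypothetical, and zero-based consecutive integers are an assumption: the source does not state which z_index values the tool expects. Image names containing YAML special characters would need quoting, which this sketch does not handle.

```python
def render_stitch_config(image_names):
    """Build configs.yml text, assigning z_index 0, 1, 2, ... in list order."""
    lines = ["stitch:"]
    for z_index, name in enumerate(image_names):
        lines.append(f"- image_name: {name}")
        lines.append(f"  z_index: {z_index}")
    return "\n".join(lines) + "\n"


# Image names listed from the bottom slice to the top slice (assumed order).
print(render_stitch_config(["image_name1", "image_name2"]), end="")
```

Write the returned text to configs.yml in the output directory only after checking that the order of the names matches the physical order of the slices.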

3. Specify the channels

To specify the channels to segment, use the --channels flag. Channels are given as a comma-separated list of channel indices. You must pass the channel indices in the order the model expects its input.

  • Fluorescence image: The first channel is for membrane and the second channel is for nucleus. If the nucleus channel is not specified, a zero channel is used.

    turing_segment infer --image-path /path/to/image --model-type cyto3 --channels 0,1
    
  • Hematoxylin and eosin (H&E) image: The first channel is for red, the second channel is for green, and the third channel is for blue (RGB).

    turing_segment infer --image-path /path/to/image --model-type he2 --channels 0,1,2
    

If the channel axis is the last dimension of the image, use the --channel-last flag.


4. Specify the image-type

You can specify the image type explicitly using the --image-type flag. If the flag is not specified, the image type is inferred from the input file. Currently, tiff, zarr and cv2 are supported image types.

turing_segment infer --image-path /path/to/image --image-type tiff

5. Specify the config-file

The tool also supports modifying some parameters of the segmentation process and post-processing. This can be done by using --config-file to specify a YAML configuration file containing the parameters:

turing_segment infer --image-path /path/to/image --config-file /path/to/config.yaml

A sample config file can be found at configs/config.yaml. The parameters in that file are also the defaults for the segmentation process and post-processing:

pipeline:
  infer_size: 1024              # Tile size (in pixels) to feed the model. The size is of the scaled tiles, not the original tiles from dividing the input image.
  overlap_margin: 32            # Margin (in pixels) to overlap between tiles during inference to reduce edge artifacts. Also of the scaled tiles.
  n_postprocess_processes: 32   # Number of parallel processes used for post-processing
  postprocess_queue_size: 128   # Size of the queue to store model output before post-processing
  n_merge_tile_processes: 16    # Number of parallel processes used to merge tiles into the final output

postprocess:
  niter: 200                    # Number of iterations for following the flow
  cellprob_threshold: 0         # Threshold for the probability of a pixel being part of a cell (in logit; 0 corresponds to 0.5 probability)
  flow_threshold: 0.4           # Threshold for the flow magnitude to filter out low-confidence regions
  min_size: 15                  # Minimum size (in pixels) for objects to be considered as cells
  resample: false               # Whether to resample the output to match the original image size when postprocessing (false keeps the inferred size). When scale > 1, setting this to true improves postprocessing performance
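
Two of these values are easier to reason about with a little arithmetic. The sketch below converts a cellprob_threshold logit to a probability with the logistic sigmoid, and estimates where tiles start along one axis of the scaled image for a given infer_size and overlap_margin. The tile layout is an assumption for illustration (consecutive tiles overlap by overlap_margin pixels, and the last tile is shifted back to end at the image border); the source does not describe the actual tiling scheme, and both function names are hypothetical.

```python
import math


def logit_to_probability(logit):
    """Logistic sigmoid: map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


def tile_starts(length, infer_size=1024, overlap_margin=32):
    """Start offsets of tiles along one axis of the scaled image.

    Assumed layout, not confirmed by the source: neighbouring tiles
    overlap by overlap_margin pixels, and the final tile is shifted
    back so that it ends exactly at the image border.
    """
    if length <= infer_size:
        return [0]
    stride = infer_size - overlap_margin
    starts = list(range(0, length - infer_size, stride))
    starts.append(length - infer_size)
    return starts


print(logit_to_probability(0))   # the default cellprob_threshold
print(tile_starts(2048))         # tile offsets for a 2048-pixel axis
```

Under these assumptions, the number of tiles per axis grows roughly as length / (infer_size - overlap_margin), which gives a rough idea of how much work a larger overlap_margin adds.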

6. Checkpoint download

By default, the model checkpoints are downloaded if they are not present in the cache directory. To use a custom checkpoint from the Cellpose training pipeline, specify the paths to the model and size model checkpoints with the --model-path and --size-model-path flags, respectively. You may omit --size-model-path if you specify the scale with --scale.

turing_segment infer \
--image-path /path/to/image \
--model-type <cyto_or_he> \
--model-path /path/to/model_checkpoint \
--size-model-path /path/to/size_model_checkpoint

Poly2mask

We provide a command to convert polygons to a zarr labeled mask. By default, the mask shape is read from the metadata.json in the same folder as the polygons file. If metadata.json is not found, the mask shape defaults to the maximum of the polygon coordinates. If your polygons were not generated by our segmentation, make sure the input polygon_dir contains polygons.parquet.

turing_segment poly2mask /path/to/polygons --output-dir /path/to/output_dir

If --output-dir is not provided, the output is created in the same folder as the polygons file with the name mask.zarr.
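
The fallback used when metadata.json is missing can be pictured as taking the largest x and y over all polygon vertices. The sketch below does this on plain Python lists of (x, y) tuples; it does not read polygons.parquet and is not the tool's implementation. The function name and the rounding rule (floor plus one, so that the largest vertex still falls inside the mask) are assumptions.

```python
import math


def fallback_mask_shape(polygons):
    """Return (height, width) large enough to contain every vertex.

    `polygons` is a list of polygons, each a list of (x, y) vertices.
    """
    max_x = max(x for polygon in polygons for x, _ in polygon)
    max_y = max(y for polygon in polygons for _, y in polygon)
    # Assumption: a vertex at coordinate c lands in pixel floor(c),
    # so the mask needs floor(max) + 1 pixels along that axis.
    return (math.floor(max_y) + 1, math.floor(max_x) + 1)


cells = [
    [(0, 0), (10, 0), (10, 5)],
    [(3, 7.5), (4, 8), (2, 9)],
]
print(fallback_mask_shape(cells))
```

A mask sized this way is cropped to the cells and can be smaller than the original image, so keeping metadata.json next to the polygons is preferable when the mask must align with the image.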


3D Stitch

We provide a command to stitch 2D segmentation results into 3D segmentation results. Each polygon in the $i$-th image is matched with the polygon in the $(i+1)$-th image that has the highest IoU (intersection over union). If the highest IoU is below the threshold, the polygon is treated as a new object. If --iou-threshold is not provided, it defaults to 0.5.

turing_segment stitch /path/to/segmentation_results --output-dir /path/to/output_dir --iou-threshold 0.5 --num-process 8

/path/to/segmentation_results is the folder that contains one subfolder per segmented image. Each subfolder must contain the polygons.parquet and metadata.json files, and metadata.json must contain a z_index field giving the z-index of the image.

If --output-dir is not provided, the output is saved in the segmentation results folder with the name 3d_polygons.parquet.

By default, stitching runs in 8 processes. To disable multiprocessing (run in the main process only), set --num-process to 0.
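
The IoU matching rule described in this section can be sketched in a few lines. To stay dependency-free, the example uses axis-aligned boxes (x0, y0, x1, y1) as stand-ins for polygons; the real command works on the polygons in polygons.parquet. The function names, the tie-breaking (the first polygon to claim a match keeps it), and the choice to give unmatched shapes in the next slice a fresh id are assumptions, not the tool's documented behaviour.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x0, y0, x1, y1)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def stitch_slices(slices, iou_threshold=0.5):
    """Assign a 3D object id to every box in every z-slice.

    `slices` is ordered by z_index; each slice is a list of boxes.
    Returns a list of id lists, parallel to `slices`.
    """
    next_id = 0
    labels = []
    for z, boxes in enumerate(slices):
        current = [None] * len(boxes)
        if z > 0:
            for i, prev_box in enumerate(slices[z - 1]):
                # Find the box in this slice with the highest IoU.
                best_j, best_iou = None, 0.0
                for j, box in enumerate(boxes):
                    iou = box_iou(prev_box, box)
                    if iou > best_iou:
                        best_j, best_iou = j, iou
                matched = best_j is not None and best_iou >= iou_threshold
                if matched and current[best_j] is None:
                    current[best_j] = labels[z - 1][i]
        # Anything left unmatched starts a new 3D object.
        for j in range(len(boxes)):
            if current[j] is None:
                current[j] = next_id
                next_id += 1
        labels.append(current)
    return labels


stack = [
    [(0, 0, 10, 10), (20, 20, 30, 30)],
    [(1, 1, 11, 11), (50, 50, 60, 60)],
]
print(stitch_slices(stack))
```

In this toy stack the first box of each slice overlaps with an IoU of 81/119, about 0.68, so the two are linked at the default threshold of 0.5 and would be split into separate objects at a threshold of 0.7.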

For examples of each command, please refer to the EXAMPLE.md file.

Benchmark

  1. Processing Time:

    • Turing Segment significantly outperforms the original Cellpose, especially for larger images.
    • For a 40,000 x 40,000 pixel image, Turing Segment is 294 times faster, reducing processing time from hours to less than 1 minute.

    [Figures: Processing Time; Processing Time Ratio]

  2. Memory Consumption:

    • Turing Segment uses considerably less memory than the original Cellpose.
    • For a 40,000 x 40,000 pixel image, Turing Segment consumes 23 times less memory.

    [Figures: Memory Consumption; Memory Consumption Ratio]

  3. Accuracy:

    • Turing Segment maintains comparable accuracy to the original Cellpose algorithm.

    [Figure: Accuracy]

These improvements allow Turing Segment to process larger images more efficiently while maintaining accuracy.

Feedback

If you encounter any issues or bugs while using Turing Segment, please let us know by submitting an issue on this GitHub repository.

For other feedback or support, you can reach out to our dedicated support team at support@bioturing.com.

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

  • turing_segment-0.3.9-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (446.5 kB): CPython 3.11, manylinux (glibc 2.17+), x86-64
  • turing_segment-0.3.9-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (401.4 kB): CPython 3.10, manylinux (glibc 2.17+), x86-64
  • turing_segment-0.3.9-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (402.4 kB): CPython 3.9, manylinux (glibc 2.17+), x86-64
  • turing_segment-0.3.9-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (402.9 kB): CPython 3.8, manylinux (glibc 2.17+), x86-64

File hashes

turing_segment-0.3.9-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

  • SHA256: a9cd9c33197d5a33834e82e14d9fb2c50d322f4ca895bb603008920b58b5b410
  • MD5: 1499c0009b8c1ff516b1de781035b520
  • BLAKE2b-256: a841aa2b8cff551ff35bb681a8a724b3e03f54709df8fca3e27e0cf1c012b304

turing_segment-0.3.9-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

  • SHA256: a9b4e3987261754c9b48385730259d7217ebdfc34c918aeca5fe68bb776ed566
  • MD5: 094cf1eadb71a03f9cd06f656edbd87a
  • BLAKE2b-256: 87338a2951f47ab6315b22bcc4dbf806b0fa1ecd4de8c531b9d462f4a4f88234

turing_segment-0.3.9-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

  • SHA256: ca2d0db33aba24982b1af2848886d33d056bc3a01420f1145100b5a8b4c51c8e
  • MD5: 93e0e61be52df2d4f498d36087f0eec2
  • BLAKE2b-256: 5bf645bbcf71a905af7a4359c6c340d9330fadc422aa8e5483dd8d20685f6d2c

turing_segment-0.3.9-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

  • SHA256: 81e56c7e8dc80e6a937b6f909cb66bd056de5ac6306f6c94ba4f8c73c61f2b52
  • MD5: d1b99a56ca33cee0c3498d9d5f91f5a4
  • BLAKE2b-256: 0c3bc8c3134565538a091aa9896d510a2213a40f8571d1c4386c9270cd6eb726
