Code to perform analysis on segmentations like those produced by CellMap

cellmap-analyze

A suite of Dask-powered tools for processing and analyzing terabyte-scale 3D segmentation datasets. Supports both isotropic and anisotropic voxel sizes.


Features

Processing Tools

| Tool | CLI Command | Description |
| --- | --- | --- |
| Connected Components | connected-components | Threshold predictions, apply masks, and extract connected components. Volume thresholds are in physical units (nm³). |
| Clean Components | clean-connected-components | Refine existing segmentations by removing small/large components. |
| Contact Sites | contact-sites | Identify regions where two segmentations are within a configurable physical distance. Handles mismatched voxel sizes by resampling to a common resolution. |
| Fill Holes | fill-holes | Fill interior gaps in segmented volumes. |
| Filter IDs | filter-ids | Exclude unwanted segmentation IDs. |
| Mutex Watershed | mws | Mutex watershed agglomeration from affinities. |
| Label With Mask | label-with-mask | Label one dataset with IDs from another. |
| Morphological Operations | morphological-operations | Erosion and dilation of segmented datasets. Processing order across blocks is not guaranteed. |
| Skeletonize | skeletonize | Generate skeletons from segmented objects with optional pruning and simplification. Automatically resamples to isotropic resolution before skeletonization. |

Analysis Tools

| Tool | CLI Command | Description |
| --- | --- | --- |
| Measurement | measure | Compute metrics (volume, surface area, radius of gyration, bounding box) for objects and contact sites. Supports raw intensity statistics when a raw dataset is provided. |
| Fit Lines | fit_lines_to_segmentations | Fit geometric lines to elongated/cylindrical structures. |
| Assign to Organelles | assign_to_organelles | Map segmented objects to organelles based on centers of mass. |
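
To make the measurement metrics concrete, here is a minimal pure-Python sketch of how volume, center of mass, radius of gyration, and bounding box can be computed in physical units from voxel coordinates. The function `measure_object` and its return keys are hypothetical names for illustration and are not part of the cellmap-analyze API; the package's exact definitions (for example, whether voxel centers or corners are used) may differ. Surface area is omitted.

```python
import math

def measure_object(voxel_coords, voxel_size_nm):
    """Illustrative metrics for one object given its voxel indices.

    voxel_coords: list of integer index triples, in the same axis order as voxel_size_nm.
    voxel_size_nm: per-axis voxel size in nm (may be anisotropic).
    Assumption: physical positions are taken at voxel centers.
    """
    n = len(voxel_coords)
    # Physical position (nm) of each voxel center.
    points = [
        tuple((index + 0.5) * size for index, size in zip(coord, voxel_size_nm))
        for coord in voxel_coords
    ]
    volume_nm_3 = n * math.prod(voxel_size_nm)
    center_of_mass = tuple(sum(p[axis] for p in points) / n for axis in range(3))
    # Radius of gyration: root-mean-square distance of voxels from the center of mass.
    mean_squared_distance = sum(
        sum((p[axis] - center_of_mass[axis]) ** 2 for axis in range(3)) for p in points
    ) / n
    radius_of_gyration_nm = math.sqrt(mean_squared_distance)
    # Bounding box spans from the low edge of the first voxel to the high edge of the last.
    bbox_min_nm = tuple(
        min(c[axis] for c in voxel_coords) * voxel_size_nm[axis] for axis in range(3)
    )
    bbox_max_nm = tuple(
        (max(c[axis] for c in voxel_coords) + 1) * voxel_size_nm[axis] for axis in range(3)
    )
    return {
        "volume_nm_3": volume_nm_3,
        "center_of_mass_nm": center_of_mass,
        "radius_of_gyration_nm": radius_of_gyration_nm,
        "bbox_min_nm": bbox_min_nm,
        "bbox_max_nm": bbox_max_nm,
    }

# Two voxels adjacent along the last axis, with an anisotropic voxel size.
result = measure_object([(0, 0, 0), (0, 0, 1)], voxel_size_nm=(8, 8, 32))
print(result["volume_nm_3"])            # 4096
print(result["radius_of_gyration_nm"])  # 16.0
```

Working in physical coordinates from the start means the same code gives comparable numbers for isotropic and anisotropic datasets.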

Anisotropic data

All operations handle anisotropic voxel sizes (e.g. (8, 8, 32) nm in ZYX). Physical-unit parameters like minimum_volume_nm_3, contact_distance_nm, and gaussian_smoothing_sigma_nm are automatically converted to the appropriate per-axis voxel units. When two datasets have different voxel sizes, they are resampled to a common resolution using nearest-neighbor interpolation.
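
The arithmetic behind these conversions can be sketched as follows. This is an illustration of the unit conversion only, not the package's internal code; the helper names `volume_nm3_to_voxels` and `length_nm_to_voxels` are hypothetical.

```python
import math

def volume_nm3_to_voxels(volume_nm3, voxel_size_nm):
    """Convert a physical volume (nm^3) to an equivalent voxel count."""
    voxel_volume_nm3 = math.prod(voxel_size_nm)
    return volume_nm3 / voxel_volume_nm3

def length_nm_to_voxels(length_nm, voxel_size_nm):
    """Convert a physical length (nm) to a per-axis length in voxels."""
    return tuple(length_nm / size for size in voxel_size_nm)

voxel_size_nm = (8, 8, 32)  # the example voxel size from the text above

# A minimum volume of 1e7 nm^3 corresponds to this many voxels of 8 x 8 x 32 nm.
min_voxels = volume_nm3_to_voxels(1e7, voxel_size_nm)
print(min_voxels)  # 4882.8125

# A 16 nm smoothing sigma spans fewer voxels along the coarse axis.
sigma_voxels = length_nm_to_voxels(16, voxel_size_nm)
print(sigma_voxels)  # (2.0, 2.0, 0.5)
```

The second result shows why a single voxel-unit parameter is not enough for anisotropic data: the same physical distance covers a different number of voxels along each axis.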


Installation

pip install cellmap-analyze

Usage

All commands share the same basic interface:

<command> [options] <config_path>
  • <command>: One of the processing or analysis tools listed above.

  • <config_path>: Directory containing:

    • run-config.yaml (parameters for your chosen command)
    • dask-config.yaml (Dask cluster settings)

Options:

  • -n, --num-workers N: Number of Dask workers to launch.

Output: A new directory named <config_path>-<YYYYMMDDHHMMSS> is created, containing copies of your configs and an output.log file for monitoring progress.
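
As an illustration of the naming pattern, the following sketch builds such a timestamped path with the standard library. The function `timestamped_run_dir` is hypothetical, and it assumes the output directory is created next to the config directory; the tool's own logic may differ.

```python
from datetime import datetime
from pathlib import Path

def timestamped_run_dir(config_path, now=None):
    """Return <config_path>-<YYYYMMDDHHMMSS> as a sibling of config_path."""
    now = now or datetime.now()
    config_path = Path(config_path)
    return config_path.with_name(f"{config_path.name}-{now:%Y%m%d%H%M%S}")

# A fixed timestamp is passed here so the result is reproducible.
run_dir = timestamped_run_dir("configs/mito-cc", datetime(2024, 1, 31, 9, 5, 7))
print(run_dir.name)  # mito-cc-20240131090507
```

Because the timestamp sorts lexicographically, listing the parent directory shows runs in chronological order.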


Configuration Examples

The following run-config.yaml could be used to run connected-components.

run-config.yaml

input_path: /path/to/predictions.zarr/mito/s0
output_path: /path/to/segmentations.zarr/mito
intensity_threshold_minimum: 0.71
minimum_volume_nm_3: 1E7
delete_tmp: true
connectivity: 1
mask_config:
  cell:
    path: /path/to/masks.zarr/cell/s0
    mask_type: inclusive
fill_holes: true
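
To show what `intensity_threshold_minimum`, `connectivity: 1`, and `minimum_volume_nm_3` mean in practice, here is a minimal single-machine sketch on a nested-list volume. It is not the package's Dask implementation: the function `label_components` is hypothetical, masking and hole filling are omitted, and it assumes the threshold and minimum volume are inclusive and that connectivity 1 means face-connected neighbors.

```python
from collections import deque
import math

# Connectivity 1 (assumed): voxels are neighbors only if they share a face.
FACE_OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def label_components(volume, threshold, voxel_size_nm, minimum_volume_nm_3):
    """Label face-connected voxels with value >= threshold; drop small components."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    voxel_volume_nm_3 = math.prod(voxel_size_nm)
    labels = [[[0] * nx for _ in range(ny)] for _ in range(nz)]
    visited = [[[False] * nx for _ in range(ny)] for _ in range(nz)]
    next_label = 1
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if visited[z][y][x] or volume[z][y][x] < threshold:
                    continue
                # Breadth-first search collects every voxel of this component.
                visited[z][y][x] = True
                component = [(z, y, x)]
                queue = deque(component)
                while queue:
                    cz, cy, cx = queue.popleft()
                    for dz, dy, dx in FACE_OFFSETS:
                        pz, py, px = cz + dz, cy + dy, cx + dx
                        inside = 0 <= pz < nz and 0 <= py < ny and 0 <= px < nx
                        if inside and not visited[pz][py][px] and volume[pz][py][px] >= threshold:
                            visited[pz][py][px] = True
                            component.append((pz, py, px))
                            queue.append((pz, py, px))
                # Keep the component only if its physical volume is large enough.
                if len(component) * voxel_volume_nm_3 >= minimum_volume_nm_3:
                    for pz, py, px in component:
                        labels[pz][py][px] = next_label
                    next_label += 1
    return labels

# Toy 1 x 2 x 5 prediction volume with 8 nm isotropic voxels (512 nm^3 each).
predictions = [[[0.9, 0.8, 0.1, 0.75, 0.2],
                [0.1, 0.9, 0.1, 0.10, 0.2]]]
labels = label_components(predictions, 0.71, (8, 8, 8), minimum_volume_nm_3=1000)
print(labels)  # [[[1, 1, 0, 0, 0], [0, 1, 0, 0, 0]]]
```

The three-voxel component (1536 nm³) is kept, while the isolated voxel above the threshold (512 nm³) is removed by the volume filter.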

dask-config.yaml

The following dask-config.yaml files can be used for a variety of tasks.

Local

jobqueue:
  local:
    ncpus: 1
    processes: 1
    cores: 1
    log-directory: job-logs
    name: dask-worker

distributed:
  scheduler:
    work-stealing: true

LSF Cluster

jobqueue:
  lsf:
    ncpus: 8        # CPU slots requested from LSF per job
    processes: 12  # worker processes per job
    cores: 12      # total threads per job, split across processes (1 thread each)
    memory: 120GB  # total memory per job (15 GB per slot)
    walltime: 08:00
    mem: 12000000000
    use-stdin: true
    log-directory: job-logs
    name: cellmap-analyze
    project: charge_group

distributed:
  scheduler:
    work-stealing: true
  admin:
    log-format: '[%(asctime)s] %(levelname)s %(message)s'
    tick:
      interval: 20ms
      limit: 3h

Submission

To run with 12 Dask workers:

Local run example:

connected-components -n 12 config_path

Cluster submit example (LSF):

bsub -n 4 -P chargegroup connected-components -n 12 config_path

Acknowledgements

The center-finding implementation is taken from funlib.evaluate.
