
Cell and nucleus segmentation for whole slide images (H&E and MIF)

Project description

Vitamin-P: a vision transformer-assisted multimodal integration network for pathology cell segmentation

Python 3.8+ · PyTorch · PyPI · Docker · MIT License

Whole-cell segmentation from H&E — no immunofluorescence required at inference.




Overview

VitaminP is a ViT-based (vision transformer) segmentation framework that learns cytoplasmic boundaries from paired H&E–MIF (multiplex immunofluorescence) training data. This enables whole-cell segmentation directly from routine H&E slides, with no immunofluorescence required at inference time.

Trained on 14 public datasets · 34 cancer types · 7M+ annotated instances.


Installation

pip (Python API):

pip install vitaminp

Docker (recommended for servers/HPC — includes CUDA 12.1 + pretrained weights):

docker pull ghcr.io/idso-fa1-pathology/vitaminp:latest

⚠️ NumPy/OpenCV conflict? Run: pip install "numpy<2" --force-reinstall
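
The pin above suggests the conflict appears when NumPy 2.x is installed; that reading is an assumption, since the exact error is not described here. A minimal sketch for checking the installed version before reinstalling (the helper names `needs_numpy_downgrade` and `installed_numpy_version` are illustrative and not part of vitaminp):

```python
from importlib import metadata
from typing import Optional


def needs_numpy_downgrade(version: str) -> bool:
    """Return True when a NumPy version string is 2.x or newer."""
    major = int(version.split(".")[0])
    return major >= 2


def installed_numpy_version() -> Optional[str]:
    """Return the installed NumPy version, or None if NumPy is absent."""
    try:
        return metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_numpy_version()
    if found is None:
        print("NumPy is not installed")
    elif needs_numpy_downgrade(found):
        print(f'NumPy {found} found; consider: pip install "numpy<2" --force-reinstall')
    else:
        print(f"NumPy {found} already satisfies numpy<2")
```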


Models

Model   Input                 Use case                                 Size
flex    H&E or MIF or IHC     General purpose (start here)             large
dual    H&E + MIF (paired)    Maximum accuracy with both modalities    base
syn     H&E only              Whole-cell when no MIF available         base

Available branches:

Branch        Description
he_nuclei     H&E nuclei segmentation
he_cell       H&E whole-cell segmentation
mif_nuclei    MIF nuclei segmentation
mif_cell      MIF whole-cell segmentation

💡 Tip: Running he_nuclei + he_cell together activates joint inference — nuclei predictions constrain cell boundaries for better accuracy.


Quick Start

import vitaminp

# Load pretrained model — downloads once, cached forever
model = vitaminp.load_model('flex', device='cuda')

# See all available models
vitaminp.available_models()

Python API

H&E Segmentation (most common)

import vitaminp
from vitaminp.inference import WSIPredictor

model = vitaminp.load_model('flex', device='cuda')

predictor = WSIPredictor(
    model=model,
    device='cuda',
    patch_size=512,
    overlap=64,
    target_mpp=0.4250,   # microns per pixel
    magnification=20,
    batch_size=32,        # lower to 4–8 if out of GPU memory
    tissue_dilation=1,
)

results = predictor.predict(
    wsi_path='slide.svs',
    output_dir='results/',
    branches=['he_nuclei', 'he_cell'],
    filter_tissue=True,
    tissue_threshold=0.10,
    clean_overlaps=True,
    save_geojson=True,    # QuPath-compatible output
    min_area_um=10.0,
)

print(f"Nuclei: {results['he_nuclei']['num_detections']}")
print(f"Cells:  {results['he_cell']['num_detections']}")
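
To give a feel for what patch_size=512 and overlap=64 imply, the sketch below computes patch origins along one axis with a stride of patch_size - overlap and clamps the final patch to the image edge. This is a generic sliding-window scheme written for illustration; it is an assumption, not WSIPredictor's actual tiling code, and `tile_origins` is a hypothetical helper.

```python
from typing import List


def tile_origins(length: int, patch_size: int = 512, overlap: int = 64) -> List[int]:
    """Start offsets of overlapping patches that cover `length` pixels."""
    if overlap >= patch_size:
        raise ValueError("overlap must be smaller than patch_size")
    if length <= patch_size:
        return [0]  # a single patch already covers the axis
    stride = patch_size - overlap
    origins = list(range(0, length - patch_size + 1, stride))
    # Add one edge-aligned patch if the regular grid stops short of the border.
    if origins[-1] + patch_size < length:
        origins.append(length - patch_size)
    return origins


def patch_count(width: int, height: int, patch_size: int = 512, overlap: int = 64) -> int:
    """Number of patches needed to cover a width x height region."""
    cols = len(tile_origins(width, patch_size, overlap))
    rows = len(tile_origins(height, patch_size, overlap))
    return cols * rows


if __name__ == "__main__":
    print(tile_origins(1024))       # origins along a 1024-pixel axis
    print(patch_count(1024, 1024))  # patches for a 1024 x 1024 region
```

Under this scheme a larger overlap lowers the stride and raises the patch count, which is one reason batch_size and GPU memory interact with these settings.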

MIF Segmentation

import vitaminp
from vitaminp.inference import WSIPredictor
from vitaminp.inference.channel_config import ChannelConfig

model = vitaminp.load_model('flex', device='cuda')

# Define which channels correspond to nucleus and membrane
config = ChannelConfig(
    nuclear_channel=2,           # e.g. DAPI
    membrane_channel=[0, 1],     # e.g. cell markers
    membrane_combination='max',
    channel_names={0: 'CellMarker1', 1: 'CellMarker2', 2: 'DAPI'}
)

predictor = WSIPredictor(
    model=model,
    device='cuda',
    patch_size=512,
    overlap=64,
    target_mpp=0.4250,
    magnification=20,
    mif_channel_config=config,   # required for MIF input
    batch_size=16,
)

results = predictor.predict(
    wsi_path='mif_image.tif',
    output_dir='results/',
    branches=['mif_nuclei', 'mif_cell'],
    filter_tissue=True,
    clean_overlaps=True,
    save_geojson=True,
    detection_threshold=0.2,
    min_area_um=5.0,
)

print(f"MIF Nuclei: {results['mif_nuclei']['num_detections']}")
print(f"MIF Cells:  {results['mif_cell']['num_detections']}")
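
The option membrane_combination='max' is not spelled out here. A plausible reading, and only an assumption, is that the listed membrane channels are merged into one image by taking the per-pixel maximum. The sketch below shows that operation on plain nested lists so it runs without dependencies; `combine_membrane_max` is a hypothetical helper, not a vitaminp function.

```python
from typing import List, Sequence

Image = List[List[float]]  # one channel as a list of pixel rows


def combine_membrane_max(channels: Sequence[Image]) -> Image:
    """Merge same-shaped channels by taking the per-pixel maximum."""
    if not channels:
        raise ValueError("at least one channel is required")
    shape = (len(channels[0]), len(channels[0][0]))
    for ch in channels:
        if (len(ch), len(ch[0])) != shape:
            raise ValueError("all channels must share the same shape")
    # zip(*channels) pairs up rows; zip(*rows) pairs up pixels within a row.
    return [[max(pixels) for pixels in zip(*rows)] for rows in zip(*channels)]


if __name__ == "__main__":
    marker1 = [[1, 5], [0, 2]]
    marker2 = [[3, 4], [7, 1]]
    print(combine_membrane_max([marker1, marker2]))
```

With NumPy arrays the same result comes from np.max(np.stack(channels), axis=0).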

Dual Model — Paired H&E + MIF

import vitaminp
from vitaminp.inference import WSIPredictor
from vitaminp.inference.channel_config import ChannelConfig

model = vitaminp.load_model('dual', device='cuda')

config = ChannelConfig(
    nuclear_channel=2,
    membrane_channel=[0, 1],
    membrane_combination='max',
    channel_names={0: 'CellMarker1', 1: 'CellMarker2', 2: 'DAPI'}
)

predictor = WSIPredictor(
    model=model,
    device='cuda',
    patch_size=512,
    overlap=64,
    target_mpp=0.4250,
    magnification=20,
    mif_channel_config=config,
    batch_size=4,
)

results = predictor.predict(
    wsi_path='he_image.png',
    wsi_path_mif='mif_image.png',   # co-registered MIF
    output_dir='results/',
    branches=['he_nuclei', 'he_cell', 'mif_nuclei', 'mif_cell'],
    filter_tissue=True,
    clean_overlaps=True,
    save_geojson=True,
    min_area_um=5.0,
)

print(f"H&E nuclei:  {results['he_nuclei']['num_detections']}")
print(f"H&E cells:   {results['he_cell']['num_detections']}")
print(f"MIF nuclei:  {results['mif_nuclei']['num_detections']}")
print(f"MIF cells:   {results['mif_cell']['num_detections']}")

🐳 Docker

The Docker image has CUDA 12.1, all dependencies, and pretrained weights pre-installed at /workspace/checkpoints/. No setup needed.

H&E inference

docker run --gpus all --rm \
  -v /your/images:/data \
  -v /your/results:/results \
  ghcr.io/idso-fa1-pathology/vitaminp:latest \
  python3 /workspace/scripts/run_wsi_inference.py \
    --model_type flex \
    --model_size large \
    --checkpoint /workspace/checkpoints/vitamin_p_flex.pth \
    --wsi_path /data/slide.svs \
    --output_dir /results \
    --branches he_nuclei he_cell \
    --target_mpp 0.4250 \
    --magnification 20 \
    --batch_size 32 \
    --filter_tissue \
    --save_geojson \
    --min_area_um 10.0

Batch folder inference

docker run --gpus all --rm \
  -v /your/images:/data \
  -v /your/results:/results \
  ghcr.io/idso-fa1-pathology/vitaminp:latest \
  python3 /workspace/scripts/run_wsi_inference.py \
    --model_type flex \
    --model_size large \
    --checkpoint /workspace/checkpoints/vitamin_p_flex.pth \
    --wsi_folder /data \
    --wsi_extension svs \
    --output_dir /results \
    --branches he_nuclei he_cell \
    --target_mpp 0.4250 \
    --magnification 20 \
    --batch_size 32 \
    --filter_tissue \
    --save_geojson \
    --min_area_um 10.0
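
Before launching a batch run it can help to preview which files a folder plus extension would select. The sketch below lists matching files with pathlib. How run_wsi_inference.py actually matches files (case sensitivity, recursion into subfolders) is not documented here, so the exact-suffix, non-recursive behaviour is an assumption, and `find_slides` is a hypothetical helper.

```python
from pathlib import Path
from typing import List


def find_slides(folder: str, extension: str = "svs") -> List[Path]:
    """Return files directly inside `folder` whose suffix equals the extension."""
    suffix = "." + extension.lstrip(".")
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix == suffix
    )


if __name__ == "__main__":
    for slide in find_slides(".", "svs"):
        print(slide.name)
```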

MIF inference

docker run --gpus all --rm \
  -v /your/images:/data \
  -v /your/results:/results \
  ghcr.io/idso-fa1-pathology/vitaminp:latest \
  python3 /workspace/scripts/run_wsi_inference.py \
    --model_type flex \
    --model_size large \
    --checkpoint /workspace/checkpoints/vitamin_p_flex.pth \
    --wsi_path /data/mif_image.tif \
    --output_dir /results \
    --branches mif_nuclei mif_cell \
    --mif_nuclear_channel 2 \
    --mif_membrane_channels 0,1 \
    --target_mpp 0.4250 \
    --magnification 20 \
    --batch_size 16 \
    --filter_tissue \
    --save_geojson \
    --min_area_um 5.0
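
The Docker flags --mif_nuclear_channel 2 and --mif_membrane_channels 0,1 carry the same information as nuclear_channel=2 and membrane_channel=[0, 1] in the Python ChannelConfig example. The sketch below shows that correspondence; how the script parses these flags internally is an assumption, and both helper names are hypothetical.

```python
from typing import Dict, List, Union


def parse_channel_list(text: str) -> List[int]:
    """Turn a comma-separated string such as '0,1' into [0, 1]."""
    return [int(part) for part in text.split(",") if part.strip()]


def cli_to_channel_kwargs(nuclear: str, membrane: str) -> Dict[str, Union[int, List[int]]]:
    """Map the two CLI values onto ChannelConfig-style keyword arguments."""
    return {
        "nuclear_channel": int(nuclear),
        "membrane_channel": parse_channel_list(membrane),
    }


if __name__ == "__main__":
    print(cli_to_channel_kwargs("2", "0,1"))
```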

Key CLI arguments

Argument                   Description
--model_type               flex or dual
--model_size               base or large
--checkpoint               Path to .pth weights file
--wsi_path                 Single image
--wsi_folder               Folder of images (use with --wsi_extension)
--branches                 Space-separated: he_nuclei he_cell mif_nuclei mif_cell
--target_mpp               Microns per pixel (default: 0.25)
--magnification            20 or 40
--batch_size               Lower if CUDA out of memory
--mif_nuclear_channel      Channel index for nucleus (MIF only)
--mif_membrane_channels    Comma-separated channel indices, e.g. 0,1
--detection_threshold      0.5–0.8 (higher = fewer false positives)
--min_area_um              Minimum object area in μm²
--save_geojson             QuPath-compatible output
--save_parquet             Fast binary format
--save_visualization       PNG overlay images
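
Several of these arguments are physical units tied together by the microns-per-pixel value. The arithmetic below is a sketch of those relationships: an area in μm² divided by mpp² gives pixels, a patch edge in pixels times mpp gives microns, and the ratio of slide mpp to target mpp gives a resampling factor. How the tool applies these internally is not stated here, so treat the functions as illustrative; all three names are hypothetical.

```python
def area_um2_to_pixels(area_um2: float, mpp: float) -> float:
    """Convert an area in square microns to an area in pixels at `mpp`."""
    return area_um2 / (mpp * mpp)


def patch_extent_um(patch_size_px: int, mpp: float) -> float:
    """Physical edge length, in microns, of a square patch."""
    return patch_size_px * mpp


def resample_factor(slide_mpp: float, target_mpp: float) -> float:
    """Scale applied to slide pixels to reach the target resolution."""
    return slide_mpp / target_mpp


if __name__ == "__main__":
    mpp = 0.4250
    print(f"min_area_um=10.0 is about {area_um2_to_pixels(10.0, mpp):.1f} px")
    print(f"a 512 px patch spans about {patch_extent_um(512, mpp):.1f} um")
    print(f"a 0.25 mpp slide is scaled by about {resample_factor(0.25, mpp):.3f}")
```

At 0.4250 microns per pixel, a 10 μm² minimum area works out to roughly 55 pixels, so small changes to min_area_um remove or keep fairly small objects.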

Output Files

results/
├── he_nuclei_detections.geojson     # QuPath-compatible nuclei annotations
├── he_cell_detections.geojson       # QuPath-compatible cell annotations
├── mif_nuclei_detections.geojson    # (if MIF branches used)
├── mif_cell_detections.geojson
├── he_nuclei_boundaries.png         # Visualization overlay
└── inference.log                    # Full run log

GeoJSON output loads directly into QuPath for interactive review.
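
The GeoJSON files can also be inspected without QuPath. The sketch below counts features using only the standard library. The file layout is an assumption: QuPath-style exports are usually either a FeatureCollection or a bare list of Feature objects, so the helper accepts both. `load_features` is a hypothetical name, not part of vitaminp.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def load_features(path: str) -> List[Dict[str, Any]]:
    """Return the list of GeoJSON features stored in `path`."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, list):
        return data  # bare list of Feature objects
    if isinstance(data, dict) and data.get("type") == "FeatureCollection":
        return data.get("features", [])
    if isinstance(data, dict) and data.get("type") == "Feature":
        return [data]
    raise ValueError(f"unrecognised GeoJSON layout in {path}")


if __name__ == "__main__":
    target = Path("results/he_nuclei_detections.geojson")
    if target.exists():
        print(f"{len(load_features(str(target)))} nuclei polygons")
    else:
        print(f"{target} not found; run inference first")
```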


Troubleshooting

Problem                     Fix
CUDA out of memory          Lower batch_size to 4–8
No MPP metadata in image    Add mpp_override=0.4250 (Python) or --wsi_properties '{"slide_mpp":0.4250}' (Docker)
Too many false positives    Raise detection_threshold (e.g. 0.7) and min_area_um (e.g. 10.0)
NumPy / OpenCV error        pip install "numpy<2" --force-reinstall
Wrong MIF channels          Set mif_channel_config (Python) or --mif_nuclear_channel + --mif_membrane_channels (Docker)
Stale file handle on HPC    Use --output_dir /tmp/results, then copy the results out after inference
No internet for backbone    Mount cache: -v ~/.cache/huggingface:/root/.cache/huggingface
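
For the stale file handle case, the same write-locally-then-copy pattern can be used from Python. The sketch below runs a job against a temporary local directory and copies the finished results to their final location. `run_with_local_scratch` and its `run` callback are hypothetical; in practice `run` would wrap a call such as predictor.predict(..., output_dir=scratch).

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Union


def run_with_local_scratch(run: Callable[[str], None], final_dir: Union[str, Path]) -> Path:
    """Run `run(scratch_dir)` on local disk, then copy its output to `final_dir`."""
    final_path = Path(final_dir)
    with tempfile.TemporaryDirectory() as scratch:
        run(scratch)  # all writes happen on local temporary storage
        # dirs_exist_ok requires Python 3.8+, matching the package's minimum.
        shutil.copytree(scratch, final_path, dirs_exist_ok=True)
    return final_path


if __name__ == "__main__":
    def fake_job(output_dir: str) -> None:
        # Stand-in for the real inference call.
        Path(output_dir, "inference.log").write_text("done\n")

    with tempfile.TemporaryDirectory() as demo_root:
        copied = run_with_local_scratch(fake_job, Path(demo_root) / "results")
        print(sorted(p.name for p in copied.iterdir()))
```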

Citation

If you use VitaminP in your research, please cite:

@article{shokrollahi2025vitaminp,
  title   = {Vitamin-P: vision transformer assisted multi-modality integration
             network for pathology cell segmentation},
  author  = {Shokrollahi, Yasin and Pinao Gonzales, Karina and Barrientos Toro, Elizve
             and Acosta, Paul and Chen, Pingjun and Yuan, Yinyin and Pan, Xiaoxi},
  journal = {},
  year    = {2025}
}

License

MIT License — see LICENSE.



Department of Translational Molecular Pathology · Institute for Data Science in Oncology
