Cell and nucleus segmentation for whole slide images (H&E and MIF)
VitaminP: Cell & Nuclei Segmentation for H&E and Multiplex IF
VitaminP is a cross-modal deep learning framework for cell and nuclei segmentation in H&E and multiplex immunofluorescence (MIF) whole slide images. Built on DINOv2 vision transformers, it learns from paired H&E–MIF data to infer cytoplasmic boundaries that are invisible in standard brightfield microscopy — enabling whole-cell segmentation directly from H&E.
Trained on 14 public datasets covering 34 cancer types and more than 7 million annotated instances.
📦 Installation
pip install vitaminp
⚠️ If you see a NumPy/OpenCV conflict:
pip install "numpy<2" --force-reinstall
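To see ahead of time whether the pin is likely to apply, the installed NumPy version can be inspected. The sketch below assumes the conflict comes from NumPy 2.x, which is what the `numpy<2` pin suggests; `needs_numpy_pin` and `installed_numpy_version` are hypothetical helpers, not part of vitaminp.

```python
from importlib import metadata
from typing import Optional


def needs_numpy_pin(version: str) -> bool:
    """Return True when the given NumPy version string is 2.x or newer."""
    major = int(version.split(".")[0])
    return major >= 2


def installed_numpy_version() -> Optional[str]:
    """Return the installed NumPy version, or None if NumPy is absent."""
    try:
        return metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_numpy_version()
    if found is None:
        print("NumPy is not installed")
    elif needs_numpy_pin(found):
        print(f"NumPy {found} found; the numpy<2 pin may be needed")
    else:
        print(f"NumPy {found} found; no pin needed")
```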
🗺️ Which Model Should I Use?
| Model | Input | Best For | Speed |
|---|---|---|---|
| flex | H&E or MIF (any channel) | General purpose; most users start here | ⚡⚡⚡ Fastest |
| dual | H&E + MIF (paired) | Best whole-cell accuracy when both modalities available | ⚡⚡ |
| syn | H&E only | H&E whole-cell when no MIF available | ⚡⚡ |
Which Branch Should I Run?
| Goal | branches= |
|---|---|
| Nuclei only | ['he_nuclei'] |
| Cells only | ['he_cell'] |
| Both (recommended) | ['he_nuclei', 'he_cell'] — nuclei constrain cells for better accuracy |
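A typo in a branch name is easy to miss until a long run has started, so it can help to validate the list first. This is a minimal sketch that only knows the two names listed in the table; the package may accept other branch names, and `normalize_branches` is a hypothetical helper rather than a vitaminp function.

```python
# Only the branch names shown in this README; treat the tuple as an assumption.
KNOWN_BRANCHES = ("he_nuclei", "he_cell")


def normalize_branches(branches):
    """Validate branch names and return them de-duplicated, nuclei first."""
    if not branches:
        raise ValueError("at least one branch is required")
    unknown = [b for b in branches if b not in KNOWN_BRANCHES]
    if unknown:
        raise ValueError(f"unknown branch name(s): {unknown}")
    return [b for b in KNOWN_BRANCHES if b in branches]
```

The returned list can be passed straight to `branches=` in `predictor.predict(...)`.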
🚀 Quick Start
import vitaminp
model = vitaminp.load_model('flex') # downloaded on first use, then cached locally
vitaminp.available_models() # list all models
📖 Usage
1. Flex — General Purpose (H&E or MIF)
H&E input (most common):
import vitaminp
from vitaminp.inference import WSIPredictor
model = vitaminp.load_model('flex', device='cuda')
predictor = WSIPredictor(
    model=model,
    device='cuda',
    patch_size=512,
    overlap=64,
    target_mpp=0.4250,
    magnification=20,
    batch_size=32,  # lower to 4-8 if out of memory
    tissue_dilation=1,
)
results = predictor.predict(
    wsi_path='slide.svs',
    output_dir='results/',
    branches=['he_nuclei', 'he_cell'],  # or just ['he_nuclei'] or ['he_cell']
    filter_tissue=True,
    tissue_threshold=0.10,
    clean_overlaps=True,
    save_geojson=True,
    min_area_um=10.0,
)
print(f"✅ Nuclei: {results['he_nuclei']['num_detections']}")
print(f"✅ Cells: {results['he_cell']['num_detections']}")
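With `patch_size=512` and `overlap=64`, neighbouring patches start 448 pixels apart. The sketch below shows one way such patch origins could be laid out along a single axis. It is illustrative only and is not taken from `WSIPredictor`; `tile_origins` is a hypothetical helper, and clamping the last patch to the image edge is an assumption.

```python
from typing import List


def tile_origins(length: int, patch_size: int = 512, overlap: int = 64) -> List[int]:
    """Return start offsets of overlapping patches covering [0, length)."""
    if overlap >= patch_size:
        raise ValueError("overlap must be smaller than patch_size")
    if length <= patch_size:
        return [0]
    stride = patch_size - overlap
    origins = list(range(0, length - patch_size, stride))
    # Shift the final patch back so it ends exactly at the image edge.
    origins.append(length - patch_size)
    return origins
```

For a 1000-pixel axis this gives origins 0, 448 and 488, so every pixel is covered and the last patch simply overlaps its neighbour more than the others do.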
MIF input: set a channel config so the model knows which channels are nuclear and which are membrane:
import vitaminp
from vitaminp.inference import WSIPredictor
from vitaminp.inference.channel_config import ChannelConfig
model = vitaminp.load_model('flex', device='cuda')
config = ChannelConfig(
    nuclear_channel=2,  # e.g. DAPI
    membrane_channel=[0, 1],  # e.g. cell markers
    membrane_combination='max',
    channel_names={0: 'CellMarker1', 1: 'CellMarker2', 2: 'DAPI'}
)
predictor = WSIPredictor(
    model=model,
    device='cuda',
    patch_size=512,
    overlap=64,
    target_mpp=0.4250,
    magnification=20,
    mif_channel_config=config,  # required for MIF input
    batch_size=16,
)
results = predictor.predict(
    wsi_path='mif_image.tif',
    output_dir='results/',
    branches=['he_nuclei', 'he_cell'],
    filter_tissue=True,
    clean_overlaps=True,
    save_geojson=True,
    save_visualization=True,
    detection_threshold=0.2,
    min_area_um=5.0,
)
print(f"✅ Nuclei: {results['he_nuclei']['num_detections']}")
print(f"✅ Cells: {results['he_cell']['num_detections']}")
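One plausible reading of `membrane_combination='max'` is a pixel-wise maximum over the listed membrane channels. The NumPy sketch below illustrates that idea on a tiny array, assuming a channels-first `(C, H, W)` layout. `combine_membrane` is a hypothetical helper, and the package's actual behaviour may differ.

```python
import numpy as np


def combine_membrane(image, membrane_channels, how="max"):
    """Collapse several membrane channels of a (C, H, W) array into one (H, W) map."""
    stack = image[list(membrane_channels)]
    if how == "max":
        return stack.max(axis=0)
    raise ValueError(f"combination not covered by this sketch: {how}")


# Tiny 3-channel example: channels 0 and 1 are membrane markers, 2 is nuclear.
image = np.array([
    [[1, 5], [0, 2]],  # channel 0
    [[4, 3], [7, 2]],  # channel 1
    [[9, 9], [9, 9]],  # channel 2 (nuclear, not combined)
])
membrane = combine_membrane(image, [0, 1])
nuclear = image[2]
```

Taking the maximum keeps the strongest membrane signal at each pixel, so a cell stained by only one of the markers still gets an outline.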
2. Dual — Paired H&E + MIF (best whole-cell accuracy)
Use this when you have co-registered H&E and MIF from the same tissue section. The model fuses both signals to resolve cytoplasmic boundaries that are ambiguous in H&E alone.
import vitaminp
from vitaminp.inference import WSIPredictor
from vitaminp.inference.channel_config import ChannelConfig
model = vitaminp.load_model('dual', device='cuda')
config = ChannelConfig(
    nuclear_channel=2,
    membrane_channel=[0, 1],
    membrane_combination='max',
    channel_names={0: 'CellMarker1', 1: 'CellMarker2', 2: 'DAPI'}
)
predictor = WSIPredictor(
    model=model,
    device='cuda',
    patch_size=512,
    overlap=64,
    target_mpp=0.4250,
    magnification=20,
    mif_channel_config=config,
    batch_size=4,
)
results = predictor.predict(
    wsi_path='he_image.png',  # H&E
    wsi_path_mif='mif_image.png',  # co-registered MIF
    output_dir='results/',
    branches=['he_nuclei', 'he_cell'],
    filter_tissue=True,
    clean_overlaps=True,
    save_geojson=True,
    save_visualization=True,
    detection_threshold=0.2,
    min_area_um=5.0,
)
print(f"✅ H&E nuclei: {results['he_nuclei']['num_detections']}")
print(f"✅ H&E cells: {results['he_cell']['num_detections']}")
📊 Output Files
results/
├── he_nuclei_detections.geojson # QuPath-compatible annotations
├── he_cell_detections.geojson
├── he_nuclei_boundaries.png # Visualization overlay
└── he_nuclei_centroids.csv # Centroid coordinates
GeoJSON output is directly compatible with QuPath.
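Because the detections are plain GeoJSON, they can also be inspected with the standard library alone. The sketch below assumes the file holds a FeatureCollection (or a bare list of Feature objects) with Polygon geometries; the feature properties are not documented here, so only the geometry is read. `load_features` and `polygon_area` are hypothetical helpers.

```python
import json


def load_features(text: str) -> list:
    """Parse GeoJSON text and return its list of Feature objects."""
    data = json.loads(text)
    if isinstance(data, dict) and data.get("type") == "FeatureCollection":
        return data["features"]
    if isinstance(data, dict) and data.get("type") == "Feature":
        return [data]
    return list(data)  # assume a bare list of Features


def polygon_area(feature: dict) -> float:
    """Area of a Polygon feature's exterior ring via the shoelace formula."""
    ring = feature["geometry"]["coordinates"][0]
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# In-memory stand-in for a detections file: one 4 x 3 rectangle.
sample = json.dumps({
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "geometry": {
            "type": "Polygon",
            "coordinates": [[[0, 0], [4, 0], [4, 3], [0, 3], [0, 0]]],
        },
        "properties": {},
    }],
})
features = load_features(sample)
areas = [polygon_area(f) for f in features]
```

To read a real output, pass the text of `results/he_nuclei_detections.geojson` to `load_features` instead of `sample`. The areas come out in whatever coordinate units the file uses.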
🎯 Common Recipes
Batch Processing
import glob
from pathlib import Path
import vitaminp
from vitaminp.inference import WSIPredictor
model = vitaminp.load_model('flex', device='cuda')
predictor = WSIPredictor(model=model, device='cuda', batch_size=32)
for slide_path in glob.glob('slides/*.svs'):
    name = Path(slide_path).stem
    results = predictor.predict(
        wsi_path=slide_path,
        output_dir=f'results/{name}',
        branches=['he_nuclei', 'he_cell'],
        save_geojson=True,
        min_area_um=10.0,
    )
    print(f"{name}: {results['he_nuclei']['num_detections']} nuclei")
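For larger batches it may be handier to collect the counts into one CSV than to read them from the console. The sketch below assumes each `results` dictionary has the shape used in the loop (`results[branch]['num_detections']`) and tests it on stand-in data; `write_summary` is a hypothetical helper.

```python
import csv
import io


def write_summary(results_by_slide, handle, branches=("he_nuclei", "he_cell")):
    """Write one CSV row per slide with the detection count of each branch."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["slide", *branches])
    for name, res in sorted(results_by_slide.items()):
        # A branch that was not run for a slide is left as an empty cell.
        counts = [res.get(b, {}).get("num_detections", "") for b in branches]
        writer.writerow([name, *counts])


# Stand-in results with the same shape as the dictionaries printed in the loop.
fake = {
    "slide_b": {"he_nuclei": {"num_detections": 120}, "he_cell": {"num_detections": 110}},
    "slide_a": {"he_nuclei": {"num_detections": 42}},
}
buffer = io.StringIO()
write_summary(fake, buffer)
summary_csv = buffer.getvalue()
```

In the batch loop, store each `results` under its slide name in a dictionary and call `write_summary` once at the end with an open file handle.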
Image Without MPP Metadata
results = predictor.predict(
    wsi_path='image.png',
    mpp_override=0.4250,
    branches=['he_nuclei'],
)
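The microns-per-pixel value matters because it sets both the resampling scale and the meaning of area thresholds. The sketch below works through that arithmetic under two assumptions that this README does not confirm: images are resampled by `native_mpp / target_mpp`, and `min_area_um` is an area in square micrometres. Both function names are hypothetical.

```python
def rescale_factor(native_mpp: float, target_mpp: float = 0.425) -> float:
    """Factor by which pixel dimensions change when resampling to target_mpp."""
    return native_mpp / target_mpp


def min_area_pixels(min_area_um2: float, mpp: float) -> float:
    """Convert an area in square micrometres to square pixels at a given mpp."""
    return min_area_um2 / (mpp * mpp)


# An image scanned at 0.2125 um/px is halved in each dimension to reach 0.425.
half = rescale_factor(0.2125)
# At 0.5 um/px, each pixel covers 0.25 um^2, so 10 um^2 is 40 pixels.
threshold_px = min_area_pixels(10.0, 0.5)
```

An `mpp_override` that is off by a factor of two would therefore shift area thresholds by a factor of four, so it is worth taking the value from the scanner or acquisition settings.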
🔧 Troubleshooting
| Problem | Fix |
|---|---|
| CUDA out of memory | Lower batch_size to 4–8 |
| No MPP in metadata | Pass mpp_override with the image's microns per pixel, e.g. mpp_override=0.4250 |
| Too many false positives | Raise detection_threshold (e.g. 0.7) and min_area_um (e.g. 10.0) |
| NumPy/OpenCV error | pip install "numpy<2" --force-reinstall |
| MIF channels wrong | Set mif_channel_config with correct channel indices |
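The out-of-memory fix can be automated by retrying with a smaller batch. This is a minimal sketch, assuming the failure surfaces as a `RuntimeError` whose message contains "out of memory", which is how PyTorch typically reports CUDA OOM. `run_with_backoff` is a hypothetical helper, and `run` stands for any function that builds a predictor with the given `batch_size` and runs it.

```python
def run_with_backoff(run, batch_size: int = 32, min_batch_size: int = 4):
    """Call run(batch_size), halving batch_size after each out-of-memory error."""
    while True:
        try:
            return run(batch_size)
        except RuntimeError as err:
            # Re-raise anything that is not an OOM error, and give up once
            # the smallest allowed batch size has also failed.
            is_oom = "out of memory" in str(err).lower()
            if not is_oom or batch_size <= min_batch_size:
                raise
            batch_size = max(min_batch_size, batch_size // 2)


def fake_run(batch_size):
    """Stand-in for a prediction call that only fits in memory at batch_size <= 8."""
    if batch_size > 8:
        raise RuntimeError("CUDA out of memory")
    return batch_size


used = run_with_backoff(fake_run, batch_size=32)
```

In real use it may also help to call `torch.cuda.empty_cache()` between attempts so that memory held by the failed run is released.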
📚 Citation
If you use VitaminP in your research, please cite:
@article{shokrollahi2025vitaminp,
  title   = {Vitamin-P: vision transformer assisted multi-modality integration network for pathology cell segmentation},
  author  = {Shokrollahi, Yasin and Pinao Gonzales, Karina and Barrientos Toro, Elizve and Acosta, Paul and Chen, Pingjun and Yuan, Yinyin and Pan, Xiaoxi},
  journal = {arXiv},
  year    = {2025}
}
📄 License
MIT License — see LICENSE file.
🙋 Support
- 🐛 Issues: GitHub Issues
- 📧 Contact: MD Anderson Cancer Center — Department of Translational Molecular Pathology
Made with ❤️ at MD Anderson Cancer Center