Cell and nucleus segmentation for whole slide images (H&E and MIF)

VitaminP: Cell & Nuclei Segmentation for H&E and Multiplex IF

VitaminP is a deep learning model for robust cell and nuclei segmentation in H&E and multiplex immunofluorescence (MIF) images. It supports whole slide images (WSI), with automatic resolution matching and tissue detection.


🚀 Quick Start (30 seconds)

import torch
from vitaminp import VitaminPFlex
from vitaminp.inference import WSIPredictor

# Load model
model = VitaminPFlex(model_size='large').to('cuda')
model.load_state_dict(torch.load("checkpoints/vitamin_p_flex_large.pth"))
model.eval()

# Run inference
predictor = WSIPredictor(model=model, device='cuda')
results = predictor.predict(
    wsi_path='slide.svs',
    output_dir='results',
    branch='he_nuclei',
    save_geojson=True
)

print(f"✅ Found {results['num_detections']} nuclei in {results['processing_time']:.2f}s")

That's it! Results are saved to results/ with GeoJSON annotations and visualizations.


📦 Installation

# Clone repository
git clone https://github.com/yourusername/vitaminp.git
cd vitaminp

# Install the package and its dependencies (editable mode)
pip install -e .

Requirements: Python 3.8+, PyTorch 2.0+, CUDA 11.8+ (for GPU)
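
Before loading a model it can help to confirm that the interpreter and packages are in place. The following is a minimal sketch using only the standard library; the helper name `check_environment` is hypothetical and not part of VitaminP, and it only checks that packages are importable, not their versions or CUDA availability.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # taken from the Requirements line above


def check_environment(min_python=MIN_PYTHON, packages=("torch", "vitaminp")):
    """Return a dict describing whether the interpreter and packages look usable."""
    report = {"python_ok": sys.version_info[:2] >= min_python}
    for name in packages:
        # find_spec returns None when a top-level package cannot be imported
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'MISSING'}")
```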


📖 Basic Usage

H&E Nuclei Detection

import torch
from vitaminp import VitaminPFlex
from vitaminp.inference import WSIPredictor

# Setup model
device = 'cuda'
model = VitaminPFlex(model_size='large').to(device)
model.load_state_dict(torch.load("checkpoints/vitamin_p_flex_large_fold2_best.pth"))
model.eval()

# Create predictor
predictor = WSIPredictor(
    model=model,
    device=device,
    patch_size=512,
    overlap=64,
    target_mpp=0.25,      # Auto-detected from file if available
    magnification=40
)

# Run inference
results = predictor.predict(
    wsi_path='slide.svs',
    output_dir='results',
    branch='he_nuclei',
    filter_tissue=True,           # Skip background tiles
    tissue_threshold=0.1,         # 10% minimum tissue
    clean_overlaps=True,          # Remove duplicates at tile boundaries
    save_geojson=True,            # Save annotations
    save_visualization=True,      # Save overlay images
    detection_threshold=0.5,      # Binary threshold (0.5-0.8)
    min_area_um=3.0,             # Filter small artifacts (μm²)
)

print(f"✅ Found {results['num_detections']} nuclei")
print(f"   Output: {results['output_dir']}")
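
To get a feel for what patch_size and overlap mean for tile count, here is a rough sketch of overlapping tiling along each axis. It assumes that overlap is the number of pixels shared by neighbouring tiles (so the stride is patch_size - overlap) and that the last tile is clamped to the image edge; the actual tiling inside WSIPredictor may differ. `tile_origins` and `tile_grid` are hypothetical helper names.

```python
def tile_origins(length, patch_size=512, overlap=64):
    """Top-left offsets along one axis so that tiles cover [0, length)."""
    if not 0 <= overlap < patch_size:
        raise ValueError("overlap must be in [0, patch_size)")
    if length <= patch_size:
        return [0]
    stride = patch_size - overlap
    origins = list(range(0, length - patch_size, stride))
    # Clamp the final tile to the image edge instead of reading past it.
    origins.append(length - patch_size)
    return origins


def tile_grid(width, height, patch_size=512, overlap=64):
    """All (x, y) tile origins for an image of the given size, row by row."""
    xs = tile_origins(width, patch_size, overlap)
    ys = tile_origins(height, patch_size, overlap)
    return [(x, y) for y in ys for x in xs]


print(tile_origins(1000))          # [0, 448, 488]
print(len(tile_grid(1000, 512)))   # 3
```

Under these assumptions, a smaller overlap gives a larger stride and therefore fewer tiles per slide, at the cost of less context for objects that straddle tile boundaries.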

Multiplex IF (MIF) Segmentation

from vitaminp.inference import ChannelConfig

# Define channel mapping
config = ChannelConfig(
    nuclear_channel=0,           # DAPI/SYTO channel
    membrane_channel=[1, 2],     # Membrane markers
    membrane_combination='max',  # Combine channels via max projection
    channel_names={0: 'SYTO13', 1: 'Cy3', 2: 'TexasRed'}
)

# Create predictor with MIF config
predictor = WSIPredictor(
    model=model,
    device='cuda',
    mif_channel_config=config,
    target_mpp=0.5,
    magnification=20
)

# Run MIF inference
results = predictor.predict(
    wsi_path='mif_image.tif',
    output_dir='results_mif',
    branch='he_nuclei',          # Uses same model weights
    save_geojson=True,
    min_area_um=5.0
)
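
The membrane_combination='max' setting is described above as a max projection over the membrane channels. As an illustration only, the sketch below shows what such a projection computes with NumPy. It assumes a channel-first (C, H, W) array, which the source does not state, and `combine_membrane` is a hypothetical helper, not the library's implementation.

```python
import numpy as np


def combine_membrane(image, membrane_channels, how="max"):
    """Collapse the selected channels of a (C, H, W) array into one (H, W) plane."""
    stack = image[list(membrane_channels)]  # integer-list indexing -> (n, H, W)
    if how == "max":
        # Per-pixel maximum across the selected channels
        return stack.max(axis=0)
    raise ValueError(f"unsupported combination: {how!r}")


# Tiny example: channel 0 is nuclear, channels 1 and 2 are membrane markers.
image = np.array([
    [[9, 9], [9, 9]],   # channel 0 (nuclear)
    [[1, 5], [0, 2]],   # channel 1
    [[4, 3], [7, 2]],   # channel 2
])
membrane = combine_membrane(image, [1, 2])
print(membrane.tolist())  # [[4, 5], [7, 2]]
```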

Dual Modality (H&E + MIF)

Use the MIF predictions, which are typically cleaner, together with the H&E image for visualization:

import torch
from vitaminp import VitaminPDual
from vitaminp.inference import WSIPredictor

# Load dual model
model = VitaminPDual(model_size='base').to('cuda')
model.load_state_dict(torch.load("checkpoints/vitamin_p_dual_base.pth"))
model.eval()

# Setup predictor
predictor = WSIPredictor(
    model=model,
    device='cuda',
    mif_channel_config=config   # ChannelConfig from the MIF example above
)

# Process both modalities
results = predictor.predict(
    wsi_path='he_image.png',           # H&E image
    wsi_path_mif='mif_image.png',      # Co-registered MIF
    output_dir='results_dual',
    branches=['he_nuclei', 'he_cell', 'mif_nuclei', 'mif_cell'],
    save_geojson=True
)

# H&E results now use high-quality MIF predictions automatically!
print(f"H&E nuclei: {results['he_nuclei']['num_detections']}")
print(f"MIF nuclei: {results['mif_nuclei']['num_detections']}")

Key feature: With dual models, the H&E branches automatically take their segmentations from the higher-quality MIF predictions, while the H&E image is kept as the background for visualization.


📊 Output Files

Running inference creates the following files:

results/
├── nuclei_detections.geojson    # QuPath-compatible annotations
├── nuclei_detections.json       # Raw instance data
├── nuclei_boundaries.png        # Visualization with contours
└── nuclei_centroids.csv         # (optional) Centroid coordinates

The GeoJSON output is compatible with QuPath for interactive viewing.
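
The GeoJSON file can also be inspected without QuPath. The sketch below uses only the standard library and assumes the file is either a GeoJSON FeatureCollection or a bare list of features with Polygon geometries in pixel coordinates; the exact schema VitaminP writes is not documented here, so treat the field access as an assumption. All helper names are hypothetical.

```python
import json


def load_features(path):
    """Return the features from a FeatureCollection or a bare feature list."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    if isinstance(data, dict):
        return data.get("features", [])
    return data


def polygon_area_px(ring):
    """Shoelace area of a ring of (x, y) pixel coordinates (closed or open)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        area += x0 * y1 - x1 * y0
    return abs(area) / 2.0


def summarize(features, mpp):
    """Count polygons and report their mean area in μm² for a given μm/pixel."""
    areas = [
        # Only the outer ring is used; holes are ignored in this sketch.
        polygon_area_px(f["geometry"]["coordinates"][0]) * mpp ** 2
        for f in features
        if (f.get("geometry") or {}).get("type") == "Polygon"
    ]
    mean_area = sum(areas) / len(areas) if areas else 0.0
    return {"count": len(areas), "mean_area_um2": mean_area}
```

For example, `summarize(load_features('results/nuclei_detections.geojson'), mpp=0.25)` would give a quick count and mean nucleus area, provided the file follows the assumed layout.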


🎯 Common Recipes

Process Multiple Branches

results = predictor.predict(
    wsi_path='slide.svs',
    branches=['he_nuclei', 'he_cell'],  # Process both
    output_dir='results'
)

print(f"Nuclei: {results['he_nuclei']['num_detections']}")
print(f"Cells: {results['he_cell']['num_detections']}")

Override MPP (for images without metadata)

results = predictor.predict(
    wsi_path='image.png',
    mpp_override=0.25,  # Force 0.25 μm/pixel
    branch='he_nuclei'
)
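
When choosing a value for mpp_override, it helps to know how MPP relates to magnification and image size. The examples above pair 40x with 0.25 μm/pixel and 20x with 0.5 μm/pixel, which matches the common rule of thumb MPP ≈ 10 / magnification. The sketch below encodes that rule and the rescaling factor needed to bring an image to a target MPP. The rule is an approximation that varies by scanner, and these helper names are hypothetical, not part of VitaminP.

```python
def mpp_from_magnification(magnification):
    """Rule-of-thumb μm/pixel for an objective magnification (40x -> 0.25)."""
    if magnification <= 0:
        raise ValueError("magnification must be positive")
    return 10.0 / magnification


def resize_factor(source_mpp, target_mpp):
    """Factor to multiply pixel dimensions by so the image ends up at target_mpp."""
    if source_mpp <= 0 or target_mpp <= 0:
        raise ValueError("mpp values must be positive")
    return source_mpp / target_mpp


def resized_shape(width, height, source_mpp, target_mpp):
    """Pixel dimensions after rescaling from source_mpp to target_mpp."""
    factor = resize_factor(source_mpp, target_mpp)
    return round(width * factor), round(height * factor)


print(mpp_from_magnification(40))              # 0.25
print(resized_shape(1000, 800, 0.5, 0.25))     # (2000, 1600)
```

A wrong override changes every physical measurement downstream (for example min_area_um), so it is worth checking the scanner documentation when the metadata is missing.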

Custom Area Filtering

results = predictor.predict(
    wsi_path='slide.svs',
    branch='he_nuclei',
    min_area_um=5.0,           # Filter nuclei < 5 μm²
    detection_threshold=0.6     # Higher threshold = fewer false positives
)
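
Because min_area_um is given in μm², its effect in pixels depends on the resolution: the pixel-area cutoff is the physical area divided by MPP squared. The following sketch works through that arithmetic. The helper names are hypothetical, and whether the library treats the cutoff as inclusive is an assumption.

```python
import math


def area_um2_to_px(area_um2, mpp):
    """Convert an area in μm² to an area in pixels at the given μm/pixel."""
    return area_um2 / (mpp ** 2)


def equivalent_diameter_um(area_um2):
    """Diameter of the circle with the same area, as an intuition aid."""
    return 2.0 * math.sqrt(area_um2 / math.pi)


def filter_by_area(areas_px, min_area_um2, mpp):
    """Keep pixel areas at or above the cutoff (inclusive by assumption)."""
    min_px = area_um2_to_px(min_area_um2, mpp)
    return [a for a in areas_px if a >= min_px]


print(area_um2_to_px(3.0, 0.25))   # 48.0 pixels at 0.25 μm/pixel
print(area_um2_to_px(5.0, 0.5))    # 20.0 pixels at 0.5 μm/pixel
print(filter_by_area([10, 48, 200], min_area_um2=3.0, mpp=0.25))  # [48, 200]
```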

Batch Processing

from pathlib import Path

# Sort for a reproducible processing order
slides = sorted(Path('slides').glob('*.svs'))

for slide_path in slides:
    slide_name = slide_path.stem
    results = predictor.predict(
        wsi_path=str(slide_path),
        output_dir=f'results/{slide_name}',
        branch='he_nuclei',
        save_geojson=True
    )
    print(f"{slide_name}: {results['num_detections']} nuclei")
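
For long batches, one unreadable slide should not stop the whole run. The sketch below wraps the loop above with error handling and writes a summary CSV. `run_batch` is a hypothetical helper; it takes the prediction step as a callable so it does not depend on any particular VitaminP signature, and it assumes the returned dict has a 'num_detections' key as in the examples above.

```python
import csv
from pathlib import Path


def run_batch(slide_paths, predict_fn, summary_csv):
    """Run predict_fn(slide_path, slide_name) per slide and write a summary CSV."""
    rows = []
    for slide_path in slide_paths:
        name = Path(slide_path).stem
        try:
            results = predict_fn(slide_path, name)
            rows.append({"slide": name,
                         "num_detections": results["num_detections"],
                         "error": ""})
        except Exception as exc:
            # Record the failure and continue with the remaining slides.
            rows.append({"slide": name, "num_detections": "", "error": str(exc)})
    with open(summary_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=["slide", "num_detections", "error"])
        writer.writeheader()
        writer.writerows(rows)
    return rows


# Example wiring (not executed here), using the predictor from Basic Usage:
#
# def predict_fn(slide_path, name):
#     return predictor.predict(wsi_path=str(slide_path),
#                              output_dir=f'results/{name}',
#                              branch='he_nuclei',
#                              save_geojson=True)
#
# run_batch(sorted(Path('slides').glob('*.svs')), predict_fn, 'results/summary.csv')
```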

🔧 Model Checkpoints

Download pre-trained models:

| Model        | Size  | Modality   | Download |
|--------------|-------|------------|----------|
| VitaminPFlex | Large | H&E or MIF | Link     |
| VitaminPFlex | Base  | H&E or MIF | Link     |
| VitaminPDual | Base  | H&E + MIF  | Link     |

Place checkpoints in the checkpoints/ folder.
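
A missing or misnamed checkpoint otherwise only surfaces as an error inside torch.load. As a small convenience, the sketch below checks for the file first and lists what is actually present. `find_checkpoint` is a hypothetical helper, and the .pth extension is assumed from the file names used in the examples above.

```python
from pathlib import Path


def find_checkpoint(filename, checkpoint_dir="checkpoints"):
    """Return the checkpoint path, or raise with the list of files that exist."""
    directory = Path(checkpoint_dir)
    path = directory / filename
    if path.is_file():
        return path
    available = []
    if directory.is_dir():
        available = sorted(p.name for p in directory.glob("*.pth"))
    raise FileNotFoundError(
        f"{path} not found; available checkpoints: {available or 'none'}"
    )
```

The returned path can then be passed to torch.load as in the Quick Start, for example `torch.load(find_checkpoint("vitamin_p_flex_large.pth"))`.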


🤔 Troubleshooting

"Out of memory" error

predictor = WSIPredictor(
    model=model,
    patch_size=512,
    overlap=32,  # Reduce from 64
    mixed_precision=True  # Enable FP16
)

No MPP in metadata

results = predictor.predict(
    wsi_path='image.png',
    mpp_override=0.25,  # Manually specify
    branch='he_nuclei'
)

Too many false positives

results = predictor.predict(
    wsi_path='slide.svs',
    detection_threshold=0.7,  # Increase from 0.5
    min_area_um=5.0,         # Filter small detections
    branch='he_nuclei'
)

📚 Citation

If you use VitaminP in your research, please cite:

@article{vitaminp2025,
  title={VitaminP: Robust Cell Segmentation for H&E and Multiplex IF},
  author={Your Name},
  journal={arXiv},
  year={2025}
}

📄 License

MIT License. See the LICENSE file.


🙋 Support


Made with ❤️ for the computational pathology community

