VollSeg
Hierarchical, composable segmentation for biological image data — a clean rewrite of the original VollSeg.
pip install kapoorlabs-vollseg # SDK only (PyTorch first-class)
pip install "kapoorlabs-vollseg[napari]" # SDK + the dock-widget plugin
pip install "kapoorlabs-vollseg[keras]" # SDK + legacy keras/csbdeep backend
pip install "kapoorlabs-vollseg[all]" # everything (quotes keep shells such as zsh from expanding the brackets)
PyTorch + PyTorch Lightning + CAREamics is the first-class backend. The original keras / csbdeep / stardist stack is kept as a legacy backend with a Keras suffix on every class name so already-trained .h5 weights still work.
What's new
- PyTorch StarDist inference is now first-class. End-to-end pipeline: tile + predict + stitch → peak detection → triangulated polyhedron rasterisation (matches upstream stardist.polyhedron_to_label) → NMS → label image. Rays use the same golden-spiral parameterisation as stardist.Rays_GoldenSpiral (anisotropy convention included), so weights are transferable.
- from_folder(path) on every singleton. Pairs the Lightning .ckpt with a training_config.json sidecar so conv_dims, unet_depth, n_rays, optimised thresholds, etc. are picked up automatically. Drop a folder on disk, call Singleton.from_folder(folder), done.
- Multi-GPU timelapse prediction. predict_timelapse(pipeline, volume, devices=N, strategy="ddp") shards the T-axis across GPUs via Lightning Trainer.predict, gathers the per-rank outputs onto rank 0, and returns a stacked (T, …) result. Works for any Pipeline, singleton or composite.
- HuggingFace auto-download with disk-priority. Predict scripts read log_path and hf_repo_id from Hydra YAMLs; on-disk path wins when it exists, HF download is the fallback. The new PyTorch model repos live under KapoorLabs (e.g. KapoorLabs/xenopus-stardist-pytorch, KapoorLabs/xenopus-unet-pytorch, KapoorLabs/xenopus-maskunet-pytorch); the legacy keras Xenopus zoo stays under KapoorLabs-Copenhagen.
- StarDist threshold optimisation, cached. scripts/model_training/optimize-stardist-thresholds.py runs the network once per validation patch, precomputes peaks + rasterised polyhedra once at the lowest probability the sweep will visit, and reuses them across every (prob_thresh, nms_thresh) candidate. Writes results back into training_config.json so prediction picks them up automatically.
- Predict scripts for every singleton + the composite: scripts/model_prediction/predict-{care,roi,unet,stardist,combo}.py. All Hydra-driven, all support multi-GPU timelapse, all nest their output inside the input directory so files don't sprawl.
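The caching idea behind the threshold optimisation can be sketched in a few lines. This is a toy illustration under stated assumptions, not the code in optimize-stardist-thresholds.py: 1D intervals stand in for rasterised polyhedra, and pairwise_iou, greedy_nms, sweep and the object-count score are hypothetical names chosen for the example. What it tries to show is the structure: candidates and their pairwise overlaps are computed once, at or below the lowest probability in the sweep, and each (prob_thresh, nms_thresh) pair only filters and suppresses.

```python
import itertools

import numpy as np


def pairwise_iou(intervals: np.ndarray) -> np.ndarray:
    """IoU between 1D intervals given as rows (start, stop); stand-in for polyhedron overlap."""
    start = np.maximum(intervals[:, None, 0], intervals[None, :, 0])
    stop = np.minimum(intervals[:, None, 1], intervals[None, :, 1])
    inter = np.clip(stop - start, 0, None)
    length = intervals[:, 1] - intervals[:, 0]
    union = length[:, None] + length[None, :] - inter
    return inter / union


def greedy_nms(probs, iou, prob_thresh, nms_thresh):
    """Keep candidates above prob_thresh, dropping any that overlap a better one."""
    order = [i for i in np.argsort(-probs) if probs[i] >= prob_thresh]
    kept = []
    for i in order:
        if all(iou[i, j] <= nms_thresh for j in kept):
            kept.append(int(i))
    return kept


def sweep(probs, intervals, prob_grid, nms_grid, n_true):
    iou = pairwise_iou(intervals)  # expensive part, computed once and reused
    results = {}
    for p, n in itertools.product(prob_grid, nms_grid):
        kept = greedy_nms(probs, iou, p, n)
        results[(p, n)] = abs(len(kept) - n_true)  # toy score: object-count error
    return results


probs = np.array([0.9, 0.8, 0.3, 0.7])
intervals = np.array([[0.0, 10.0], [1.0, 11.0], [20.0, 30.0], [40.0, 50.0]])
results = sweep(probs, intervals, prob_grid=(0.2, 0.5), nms_grid=(0.3, 0.9), n_true=3)
best = min(results, key=results.get)
print(best, results[best])
```

A real sweep would score each candidate pair against ground-truth labels with a matching metric rather than an object count; the reuse of the precomputed overlaps is the part that carries over.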
Local checkout — when developing from a git clone the napari extra cannot resolve from PyPI yet, so install the plugin from the in-repo path instead:
pip install -e . # SDK
pip install -e plugins/napari-vollseg # add the segmentation napari plugin
pip install -e plugins/napari-curvature # add the curvature napari plugin
Quick start
import numpy as np
from tifffile import imread
from kapoorlabs_vollseg import StarDistSegmenter, MaskUNetSegmenter, VollSeg
# Layer-1 singletons load themselves from Lightning checkpoints (PyTorch)
# or from the Zenodo / HuggingFace pretrained registry (legacy keras).
stardist = StarDistSegmenter.from_checkpoint(
    "models/nuclei.ckpt",
    rays=np.load("models/nuclei.rays.npy"),
)
roi = MaskUNetSegmenter.from_checkpoint("models/roi.ckpt")
# Layer-3 factory composes the right pipeline shape from the supplied models.
pipe = VollSeg.from_models(
    stardist=stardist,
    roi_unet=roi,  # → wraps in ROIPipeline
    seedpool=False,
)
result = pipe.predict(imread("data/sample.tif"))
print(result.labels.max(), "objects")
Why a rewrite?
The original VollSeg grew organically into a single utils.py with branching if/else chains for every combination of denoise / ROI / U-Net / StarDist / seedpool / 2D / 3D. Adding a mode meant editing the same mega-functions; testing one path required mocking the rest.
This rewrite replaces that with three orthogonal layers, composed at runtime.
Architecture
        ┌─────────────────────────────────┐
Layer 3 │ VollSeg.from_models(...)        │  smart factory
        │  → assembles the right pipeline │
        │ VollCellSeg.from_models(...)    │  sibling, for membrane
        └─────────────────────────────────┘
                       │
        ┌──────────────┴──────────────────┐
Layer 2 │ Composite Pipelines             │  composition, not inheritance
        │  • UNetStarDistPipeline         │
        │  • NucleiSeededCellPosePipeline │
        │  • DenoisedPipeline             │
        │  • ROIPipeline                  │
        │  • Chunked                      │
        └──────────────┬──────────────────┘
                       │
        ┌──────────────┴──────────────────┐
Layer 1 │ Singleton Models                │  one model, one job
        │  • CAREDenoiser                 │
        │  • UNetSegmenter                │
        │  • MaskUNetSegmenter            │
        │  • StarDistSegmenter            │
        │  • CellPoseSegmenter            │
        └─────────────────────────────────┘
Layer 1 — Singletons
Identical contract:
class Pipeline(Protocol):
def predict(self, image: np.ndarray, **kwargs) -> Result: ...
| Class | Job | Output (Result.*) |
|---|---|---|
| CAREDenoiser | Denoise (CAREamics UNet, Lightning) | denoised |
| UNetSegmenter | Binary semantic segmentation + CC labels | labels, semantic, probability |
| MaskUNetSegmenter | Same as UNetSegmenter, separate weights | labels, semantic, probability |
| StarDistSegmenter | Instance segmentation via radial dists | labels, probability |
| CellPoseSegmenter | Membrane/cell segmentation (CellPose) | labels |
2D vs 3D is dispatched inside each singleton on image.ndim — no parallel *2D / *3D class trees.
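A minimal sketch of what the shared contract and the ndim dispatch could look like. The Result dataclass here is a cut-down, hypothetical stand-in (the real one may carry more fields), and ThresholdSegmenter is a toy class invented for the example, not part of the package.

```python
from dataclasses import dataclass
from typing import Optional, Protocol

import numpy as np


@dataclass
class Result:
    """Hypothetical result container; the real Result may carry more fields."""
    labels: Optional[np.ndarray] = None
    probability: Optional[np.ndarray] = None


class Pipeline(Protocol):
    def predict(self, image: np.ndarray, **kwargs) -> Result: ...


class ThresholdSegmenter:
    """Toy singleton: one class, with the 2D or 3D path chosen from image.ndim."""

    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold

    def predict(self, image: np.ndarray, **kwargs) -> Result:
        if image.ndim == 2:
            return self._predict_2d(image)
        if image.ndim == 3:
            return self._predict_3d(image)
        raise ValueError(f"expected a 2D or 3D image, got ndim={image.ndim}")

    def _predict_2d(self, image: np.ndarray) -> Result:
        prob = image.astype(float)
        return Result(labels=(prob > self.threshold).astype(np.int32), probability=prob)

    def _predict_3d(self, volume: np.ndarray) -> Result:
        # Toy 3D path: run the 2D path on every Z-slice and restack.
        slices = [self._predict_2d(plane) for plane in volume]
        return Result(
            labels=np.stack([s.labels for s in slices]),
            probability=np.stack([s.probability for s in slices]),
        )


segmenter: Pipeline = ThresholdSegmenter(0.5)
labels_2d = segmenter.predict(np.array([[0.1, 0.9], [0.7, 0.2]])).labels
labels_3d = segmenter.predict(np.zeros((3, 4, 4))).labels
print(labels_2d.tolist(), labels_3d.shape)
```

Because Pipeline is a structural Protocol, the toy class satisfies it without inheriting from anything, which is the property the composites rely on.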
Layer 2 — Composites (built by wrapping, not subclassing)
# StarDist + U-Net, fused via SeedPool watershed
pipe = UNetStarDistPipeline(unet, stardist, seedpool=True)
# ...preceded by CARE denoising
pipe = DenoisedPipeline(care, downstream=pipe)
# ...gated by an ROI mask
pipe = ROIPipeline(roi_unet, downstream=pipe)
# ...executed in overlapping chunks for huge volumes
pipe = Chunked(pipe, chunk=(64, 256, 256), overlap=(8, 32, 32))
| Composite | Wraps | What it adds |
|---|---|---|
| UNetStarDistPipeline | unet + stardist | Runs both; if seedpool=True, fuses via watershed/IoU |
| NucleiSeededCellPosePipeline | nuclei pipe + cellpose | Nuclei labels seed a CellPose-gated membrane watershed |
| DenoisedPipeline | any downstream | CARE denoise → downstream |
| ROIPipeline | any downstream | U-Net mask → downstream restricted to ROI |
| Chunked | any downstream | Overlapping tiles → predict → label-safe stitch |
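The wrapping pattern can be illustrated with toy classes. Everything below is hypothetical and simplified: the wrappers return plain arrays instead of a Result, and Threshold, Clip, LeftHalf, DenoiseThen and RestrictToROI are names invented for this sketch rather than the package's composites.

```python
import numpy as np


class Threshold:
    """Toy downstream pipeline: returns a binary label mask."""
    def predict(self, image):
        return (image > 0.5).astype(np.int32)


class Clip:
    """Toy stand-in for a denoiser: clips values to [0, 1]."""
    def predict(self, image):
        return np.clip(image, 0.0, 1.0)


class LeftHalf:
    """Toy ROI model: marks the left half of the image as region of interest."""
    def predict(self, image):
        mask = np.zeros(image.shape, dtype=np.int32)
        mask[..., : image.shape[-1] // 2] = 1
        return mask


class DenoiseThen:
    """Wrapper in the spirit of DenoisedPipeline: denoise, then delegate."""
    def __init__(self, denoiser, downstream):
        self.denoiser, self.downstream = denoiser, downstream

    def predict(self, image):
        return self.downstream.predict(self.denoiser.predict(image))


class RestrictToROI:
    """Wrapper in the spirit of ROIPipeline: zero the labels outside the ROI mask."""
    def __init__(self, roi, downstream):
        self.roi, self.downstream = roi, downstream

    def predict(self, image):
        mask = self.roi.predict(image) > 0
        return np.where(mask, self.downstream.predict(image), 0)


# Each wrapper only needs a .predict on whatever it wraps.
pipe = Threshold()
pipe = DenoiseThen(Clip(), downstream=pipe)
pipe = RestrictToROI(LeftHalf(), downstream=pipe)

image = np.array([[0.9, 0.9, 0.9, 0.9], [0.1, 2.0, 0.1, 2.0]])
out = pipe.predict(image)
print(out.tolist())
```

Neither wrapper knows what sits beneath it, so new behaviours stack in any order without touching the singletons.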
Layer 3 — Smart factories
pipe = VollSeg.from_models(
    care=care_model,          # optional → wraps in DenoisedPipeline
    roi_unet=roi_model,       # optional → wraps in ROIPipeline
    unet=unet_model,          # optional
    stardist=stardist_model,  # optional
    seedpool=True,            # only meaningful with both unet+stardist
    chunk=(64, 256, 256),     # optional → wraps in Chunked
)
# Sibling factory for membrane work: consumes a nuclei pipeline as input.
pipe = VollCellSeg.from_models(
    nuclei_pipeline=nuclei_pipe,
    cellpose=cellpose_model,
    care=membrane_denoiser,   # optional
    nuclei_channel=1, membrane_channel=0,
)
Rule: provided models determine the pipeline shape; runtime knobs tune behavior. No silent fallbacks — invalid combinations raise at construction, not at .predict.
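A sketch of the "fail at construction" rule, under loud assumptions: plan_from_models and Plan are hypothetical, the function returns stage names instead of building real pipelines, and the wrapping order and the "at least one segmenter" check are guesses made for illustration. Only the seedpool check is taken directly from the rule stated in this README.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Plan:
    """Hypothetical description of the assembled pipeline, outermost wrapper first."""
    stages: tuple


def plan_from_models(care: Optional[Any] = None, roi_unet: Optional[Any] = None,
                     unet: Optional[Any] = None, stardist: Optional[Any] = None,
                     seedpool: bool = False, chunk: Optional[tuple] = None) -> Plan:
    # Assumption for this sketch: a segmentation model is always required.
    if unet is None and stardist is None:
        raise ValueError("need at least one of unet or stardist")
    # From the stated rule: seedpool without both models is an error, not a no-op.
    if seedpool and (unet is None or stardist is None):
        raise ValueError("seedpool=True requires both unet and stardist")

    if unet is not None and stardist is not None:
        core = "UNetStarDistPipeline"
    elif stardist is not None:
        core = "StarDistSegmenter"
    else:
        core = "UNetSegmenter"

    stages = [core]
    # Assumed wrapping order: denoise innermost, then ROI gating, then chunking.
    if care is not None:
        stages.insert(0, "DenoisedPipeline")
    if roi_unet is not None:
        stages.insert(0, "ROIPipeline")
    if chunk is not None:
        stages.insert(0, "Chunked")
    return Plan(tuple(stages))


plan = plan_from_models(care=object(), roi_unet=object(), stardist=object())
print(plan.stages)

try:
    plan_from_models(stardist=object(), seedpool=True)
except ValueError as err:
    print("rejected at construction:", err)
```

All validation runs before any model is called, so a misconfigured pipeline never gets as far as loading an image.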
Two backends — PyTorch first-class, Keras legacy
| Concern | PyTorch (default) | Keras legacy |
|---|---|---|
| Class names | CAREDenoiser, UNetSegmenter, … | CAREDenoiserKeras, UNetSegmenterKeras, … |
| Model arch | CAREamics UNet + Lightning | csbdeep CARE / stardist |
| Checkpoints | .ckpt (Lightning), loaded with from_checkpoint(path) | csbdeep folder (config.json + weights_*.h5) |
| Pretrained zoo | kapoorlabs_vollseg.hub.XENOPUS_MODELS (HuggingFace) | kapoorlabs_vollseg.pretrained (Zenodo) |
Both implement the same Pipeline.predict(image) -> Result contract, so any composite or factory accepts either or both interchangeably.
The bare-named PyTorch classes are the supported direction. The *Keras variants exist to keep already-trained .h5 weights usable. Both backends now cover training and inference for every model — including StarDist 3D, which uses a triangulated star-convex polyhedron rasteriser equivalent to upstream stardist.polyhedron_to_label.
Prediction — from_folder, multi-GPU, HF auto-download
Every singleton exposes a from_folder constructor that pairs a Lightning .ckpt with the training_config.json sidecar the trainer writes. The loader picks up architecture knobs (conv_dims, unet_depth, in_channels, …), per-model thresholds (StarDist prob_thresh / nms_thresh), and — for StarDist — the rays.npy so inference reuses the same ray geometry the model was trained on.
from tifffile import imread
from kapoorlabs_vollseg import StarDistSegmenter, predict_timelapse
# Single folder: ckpt + training_config.json + rays.npy.
star = StarDistSegmenter.from_folder("models/xenopus_stardist/")
# Single 3D volume.
result = star.predict(imread("frame.tif"))
# 4D (TZYX) timelapse, sharded across 4 GPUs.
out = predict_timelapse(star, imread("timelapse.tif"),
                        devices=4, strategy="ddp")
labels_tzyx = out["labels"]  # full (T, Z, Y, X) stack on rank 0
predict_timelapse wraps any Pipeline (singleton or composite) in a thin TimelapsePredictor LightningModule and dispatches it via Trainer.predict with a DistributedSampler over T. Per-rank outputs are gathered onto rank 0 via torch.distributed.gather_object, so a large stack (35 GB, for instance) lives on only one rank. The gathered frames are then deduplicated against sampler padding, sorted by T, and stacked.
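The dedupe, sort and stack step can be shown without any GPUs. The sketch below is not the package's implementation: shard_indices and merge_rank_outputs are hypothetical helpers, and the sharding mirrors the wrap-around padding that torch's DistributedSampler applies when shuffling is off, which is assumed to be the relevant configuration here.

```python
import numpy as np


def shard_indices(n_frames: int, world_size: int, rank: int) -> list:
    """Frame indices one rank receives, padded by wrap-around so all ranks get equal counts."""
    per_rank = -(-n_frames // world_size)  # ceiling division
    total = per_rank * world_size
    padded = [i % n_frames for i in range(total)]
    return padded[rank:total:world_size]


def merge_rank_outputs(rank_outputs) -> np.ndarray:
    """rank_outputs: one list of (t, array) pairs per rank, as they would arrive on rank 0."""
    by_t = {}
    for pairs in rank_outputs:
        for t, frame in pairs:
            by_t.setdefault(t, frame)  # ignore duplicates created by sampler padding
    return np.stack([by_t[t] for t in sorted(by_t)])


# Five frames over four ranks: three frames are processed twice because of padding.
n_frames, world_size = 5, 4
rank_outputs = [
    [(t, np.full((2, 2), t)) for t in shard_indices(n_frames, world_size, rank)]
    for rank in range(world_size)
]
stack = merge_rank_outputs(rank_outputs)
print(stack.shape, stack[:, 0, 0].tolist())
```

Keying the merge on the frame index, rather than trusting arrival order, is what makes the result independent of how the sampler interleaved and padded the frames.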
Each predict script (scripts/model_prediction/predict-{care,roi,unet,stardist,combo}.py) supports the same priority on its log_path / hf_repo_id YAML entries: disk path wins when it exists; HF download is the fallback. Outputs land in <input_dir>/<output_dir>/<file>.tif so segmentation results are nested inside the raw folder.
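The disk-first priority amounts to a small resolution function. The helper below is a sketch, not the scripts' own code: resolve_model_dir and its download parameter are hypothetical, and the default downloader assumes huggingface_hub is installed and uses its snapshot_download function.

```python
from pathlib import Path
from typing import Callable, Optional


def _hf_download(repo_id: str) -> str:
    # Imported lazily so the helper still works offline when the disk path exists.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id)


def resolve_model_dir(log_path: Optional[str], hf_repo_id: Optional[str],
                      download: Callable[[str], str] = _hf_download) -> Path:
    """Return the on-disk model folder if it exists, otherwise download from the hub."""
    if log_path and Path(log_path).exists():
        return Path(log_path)
    if hf_repo_id:
        return Path(download(hf_repo_id))
    raise FileNotFoundError(
        f"no model at log_path={log_path!r} and no hf_repo_id to fall back on"
    )


# Usage with a stand-in downloader so the example runs offline.
path = resolve_model_dir(
    "models/does-not-exist",
    "KapoorLabs/xenopus-stardist-pytorch",
    download=lambda repo_id: f"/tmp/hf-cache/{repo_id}",
)
print(path)
```

Raising when neither source is available keeps the behaviour in line with the "no silent fallbacks" rule used elsewhere in the project.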
Pretrained Xenopus model zoo (HuggingFace)
Two orgs, two backends:
- KapoorLabs/: the new PyTorch model repos used by the predict scripts (xenopus-stardist-pytorch, xenopus-unet-pytorch, xenopus-maskunet-pytorch, …). Auto-downloaded when a script's hf_repo_id is set and the local log_path doesn't exist on disk.
- KapoorLabs-Copenhagen/: the legacy keras / csbdeep / stardist models published with the original paper, kept around so already-trained .h5 weights keep working. Resolved via kapoorlabs_vollseg.ensure_model from the XENOPUS_MODELS registry.
# Legacy keras zoo:
from kapoorlabs_vollseg import ensure_model, XENOPUS_MODELS
ensure_model("./models/StarDist3D", "nuclei_xenopus_mari")
# → downloads from KapoorLabs-Copenhagen/xenopus-stardist3d-nuclei-mari
The legacy registry mapping lives in src/kapoorlabs_vollseg/hub.py; the new PyTorch repos are addressed by the hf_repo_id entry in each predict YAML under scripts/conf/experiment_data_paths/. See scripts/README.md for the full table.
Curvature & force profiles
Segmentation is a means, not an end — once you have labels you usually want to measure something. The kapoorlabs_vollseg.curvature toolkit takes a label image (2D or 3D) and returns, for each region, a sliding-window curvature profile along its boundary or surface, plus optional Young-Laplace pressure and Helfrich bending-energy profiles when material constants are supplied.
from tifffile import imread
from kapoorlabs_vollseg.curvature import compute_curvature
labels = imread("data/segmented_cells.tif")
profiles = compute_curvature(
    labels,
    spacing=(2.0, 0.6918, 0.6918),  # (dz, dy, dx) in μm
    n_window=21, stride=5,
    geodesic=True,                  # mesh-aware neighbours in 3D
    surface_tension=1e-3,           # N/m, optional → Young-Laplace ΔP
    bending_modulus=2e-20,          # J, optional → Helfrich f
)
for label_id, profile in profiles.items():
    print(label_id, profile.summary())
    # profile.centers, .kappa, .normals, .radii: geometry
    # profile.pressure: γκ (2D) or 2γH (3D)
    # profile.bending_density: κ_b·(2H-C₀)² + κ_G·K
Pipeline:
- 2D: skimage.measure.find_contours per label → ordered sub-pixel contour → sliding window of n_window consecutive points → Kasa algebraic circle fit → κ = ±1/r (sign from dot(radius_vec, outward_normal)).
- 3D: skimage.measure.marching_cubes per label → triangle mesh + per-vertex outward normals → at every stride-th vertex, the n_window nearest neighbours by geodesic distance along the mesh (BFS-hop default, Dijkstra optional, or Euclidean KDTree as opt-in) → Coope linear sphere fit → signed mean curvature.
- Physics is bolt-on: pass surface_tension to get a Young-Laplace pressure column, pass bending_modulus (and optionally spontaneous_curvature / saddle_splay_modulus) for a Helfrich bending-energy column. Both are skipped when their constants are absent.
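The circle and sphere fits named above share one linear least-squares form, which a short NumPy sketch can make concrete. This is an illustration of the technique, not the toolkit's code: fit_sphere and signed_curvature are hypothetical names, and the sign convention (positive when the radius vector points along the outward normal) is an assumption, since the text only says the sign comes from that dot product.

```python
import numpy as np


def fit_sphere(points: np.ndarray):
    """Linear least-squares circle (2D) or sphere (3D) fit.

    Rewrites |p - c|^2 = r^2 as 2 p.c + (r^2 - |c|^2) = |p|^2, which is linear
    in the centre c and in d = r^2 - |c|^2.
    """
    points = np.asarray(points, dtype=float)
    design = np.hstack([2.0 * points, np.ones((len(points), 1))])
    target = np.sum(points**2, axis=1)
    solution, *_ = np.linalg.lstsq(design, target, rcond=None)
    centre, d = solution[:-1], solution[-1]
    radius = np.sqrt(d + centre @ centre)
    return centre, radius


def signed_curvature(points: np.ndarray, outward_normal: np.ndarray) -> float:
    """kappa = +/- 1/r; assumed positive when the window bulges along the outward normal."""
    centre, radius = fit_sphere(points)
    radius_vec = np.mean(points, axis=0) - centre
    sign = 1.0 if np.dot(radius_vec, outward_normal) > 0 else -1.0
    return sign / radius


# A 21-point window on a circle of radius 5 centred at (1, 2).
theta = np.linspace(0.2, 1.0, 21)
window = np.column_stack([1.0 + 5.0 * np.cos(theta), 2.0 + 5.0 * np.sin(theta)])
normal = np.array([np.cos(0.6), np.sin(0.6)])
centre, radius = fit_sphere(window)
kappa = signed_curvature(window, normal)
print(centre, radius, kappa)
```

The same function handles 3D neighbourhoods unchanged, because the design matrix simply gains one column per extra coordinate.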
Anisotropic voxels are first-class: pass spacing=(dz, dy, dx) and the resulting curvatures come out in 1/length of that unit (so feed μm in, get 1/μm out).
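The pressure and bending-density expressions quoted in the profile comments can be written out directly. The functions below are a hypothetical sketch that follows those expressions as this README writes them (some texts put a factor of ½ on the bending term). Whether the toolkit converts units internally is not stated here, so the example converts the curvature from 1/μm to 1/m by hand before combining it with SI constants.

```python
def young_laplace_pressure(surface_tension: float, curvature: float, ndim: int) -> float:
    """Delta P = gamma * kappa for a 2D contour, 2 * gamma * H for a 3D surface."""
    if ndim == 2:
        return surface_tension * curvature
    if ndim == 3:
        return 2.0 * surface_tension * curvature
    raise ValueError("ndim must be 2 or 3")


def helfrich_density(bending_modulus: float, mean_curvature: float,
                     spontaneous_curvature: float = 0.0,
                     saddle_splay_modulus: float = 0.0,
                     gaussian_curvature: float = 0.0) -> float:
    """kappa_b * (2H - C0)^2 + kappa_G * K, as written in the profile comments."""
    return (bending_modulus * (2.0 * mean_curvature - spontaneous_curvature) ** 2
            + saddle_splay_modulus * gaussian_curvature)


# Sphere of radius 5 um: H = 0.2 per um, i.e. 2e5 per metre in SI units.
H_per_um = 1.0 / 5.0
H_si = H_per_um * 1e6
pressure = young_laplace_pressure(1e-3, H_si, ndim=3)  # Pa, with gamma in N/m
density = helfrich_density(2e-20, H_si)                # J/m^2, with kappa_b in J
print(pressure, density)
```

With these inputs the pressure comes out near 400 Pa and the bending density near 3.2e-9 J/m², which gives a feel for the magnitudes involved at cellular length scales.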
Repository layout
KapoorLabs-VollSeg/
├── src/kapoorlabs_vollseg/
│ ├── _backbones/ csbdeep / stardist / careamics / cellpose backbone wrappers
│ ├── _lightning/ inlined Lightning support (CareModule, dataset, stitch, transforms)
│ ├── models/ Layer-1 singletons (PyTorch + Keras siblings)
│ ├── pipelines/ Layer-2 composites + Layer-3 factories
│ ├── stardist/ pure-PyTorch StarDist (rays, distance, model, losses, training, inference)
│ ├── curvature/ per-label curvature + Young-Laplace / Helfrich force profiles
│ ├── train/ Lightning + csbdeep trainers
│ ├── data/ file IO, label morphology, Sequence loaders, SmartPatches
│ ├── eval/ matching metrics, NMS, threshold optimization
│ ├── fusion.py watershed_fuse, cellpose_watershed_fuse
│ ├── hub.py HuggingFace auto-download for the Xenopus model zoo
│ ├── pretrained.py legacy Zenodo registry (csbdeep weights)
│ └── seedpool.py SeedPool / UnetStarMask geometry primitives
├── plugins/
│ └── napari/ kapoorlabs-vollseg-napari — QTabWidget dock plugin (PyTorch-only)
├── scripts/ Hydra-driven CLI: enhance, segment, score, train_stardist
├── docs/ Per-module READMEs: care.md, unet.md, stardist.md
├── tests/ pytest suite (PyTorch path; keras kept legacy)
├── pyproject.toml packaging + dependencies
├── setup.cfg setuptools metadata
└── update_version.py git-tag → src/kapoorlabs_vollseg/_version.py
Documentation
- docs/care.md: CARE denoising in PyTorch (Backbone, Singleton, Trainer)
- docs/unet.md: U-Net + MaskUNet semantic segmentation
- docs/stardist.md: full PyTorch StarDist rewrite (algorithm, training, inference)
- scripts/README.md: Hydra segmentation pipelines + the StarDist demo script + HF model upload
Design rules
- Composition over inheritance for combining behaviors — wrap, don't subclass.
- One responsibility per class. A class either trains, or predicts, or composes — never two.
- No 2D/3D class duplication. Dispatch on ndim inside the class.
- Runtime concerns are decorators. Chunking, ROI gating, and denoising all wrap a downstream pipeline; none are baked into the singletons.
- Fail at construction, not prediction. Invalid model combinations raise in from_models, not mid-inference.
- No silent fallbacks. If a user asks for seedpool=True without both models, raise.
- Trainers produce models; they are not models.
Development
git clone https://github.com/Kapoorlabs-CAPED/KapoorLabs-VollSeg
cd KapoorLabs-VollSeg
pip install -e ".[testing]"
pre-commit install
pytest tests/ -v
The pre-commit hooks run pyupgrade (py39+), black, flake8, autoflake, plus a local update_version.py hook that syncs src/kapoorlabs_vollseg/_version.py from the most recent git tag.
License
BSD-3-Clause — see LICENSE. Same as upstream VollSeg.