Framework for Cell & Region Image Segmentation/Classification
CRISP - Cell and Region Image Segmentation for Pathology
A Generic Framework for Developing Algorithms for Histopathological Image Analysis
Use Cases
Cell Detection and Classification • Region Segmentation • Cell Segmentation
Table of Contents
- Project Structure
- Dependencies
- Main Technologies
- Setup and Installation
- Quickstart
- Workflow
- Use Cases
- Best Practices
- Contributing guidelines
- Support
Project Structure
├── configs/ # Configuration files
├── data/ # Data files
├── docs/ # Documentation files
├── notebooks/ # Jupyter notebooks
├── scripts/ # Scripts for various tasks
├── src/ # Source code
│ └── roche/
│ └── crisp/
│ ├── callbacks/ # Callback functions
│ ├── datamodules/ # Data modules
│ ├── inference/ # Inference scripts
│ ├── metrics/ # Metrics calculation
│ ├── model_engines/ # Model engines
│ ├── networks/ # Network architectures
│ ├── utils/ # Utility functions
│ ├── main.py # Entrypoint script
│ └── version.py # Version information
├── tests/ # Test code
├── .gitignore # Git ignore file
├── .pre-commit-config.yaml # Pre-commit hooks configuration
├── environment.yaml # Conda environment configuration
├── pyproject.toml # Python project configuration
├── README.md # Project README
└── run_task.sh # Helper script for tasks
Dependencies
- Python: 3.11+
- PyTorch: 2.5+
- CUDA: 12.1
Main Technologies
Setup and Installation
Installation for CRISP users
You can install the CRISP package directly via pip. Follow these steps to authenticate and install from the GitLab Package Registry:
- Authenticate with the GitLab Package Registry:
  - Follow the official instructions to authenticate with the GitLab Package Registry. Setting up authentication with a group is sufficient to access all packages within that group.
- Create a Personal Access Token:
  - Follow the official instructions to create a personal access token.
- Update your pip configuration:
  - Update your ~/.config/pip/pip.conf file with the following content:
[global]
index-url = https://pypi.org/simple
extra-index-url = https://__token__:<access_token>@code.roche.com/api/v4/projects/424003/packages/pypi/simple
trusted-host = pypi.org
code.roche.com
- Install the package
pip install roche.crisp
Setup for developers of CRISP
Prerequisites - Visual Studio Code is recommended
To setup the CRISP package for development, follow these steps:
- Clone the repository
git clone https://code.roche.com/rds-csi-dp/crisp-ai.git
cd crisp-ai
Configuring the environment
We use micromamba for environment configuration and uv for package management.
1. Install micromamba:
- On sHPC systems: load micromamba by running:
  ml Python
- On other Linux systems: install micromamba by running:
  curl -Ls https://micro.mamba.pm/api/micromamba/linux-64/latest | tar -xvj bin/micromamba
2. Create and activate the virtual environment:
- Create a conda environment and install the package with all its dependencies using the provided helper script:
  ./run_task.sh install
- To install in editable mode, run the following command using the provided helper script:
  ./run_task.sh install-db
- To install the package without creating a new virtual environment, activate your preferred environment, navigate to the root of the repository, and install from source with pip. For example, for editable mode, execute:
  pip install uv
  uv pip install -e .
NOTE: Installation of extra packages through channels other than conda-forge is restricted.
Quickstart
Run the framework with a configuration file:
crisp --config-name=train_detection.yaml
Alternatively, use the helper script:
bash scripts/run_main.sh 'crisp' lfs /home/user/global/nuclei-segmentation-classification/configs train_detection.yaml 1
Both commands internally call src/roche/crisp/main.py.
CRISP Workflow
The main.py script serves as the entry point for the CRISP framework. Here's a brief overview of its functionalities:
- Configuration Management: Loads and parses configuration files using Hydra.
- Reproducibility: Ensures reproducible results by setting seeds for pseudo-random number generators.
- Data Module Initialization: Instantiates the data module for data loading and preprocessing.
- Model Initialization: Sets up the network and model engine, with optional JIT compilation for faster training.
- Callbacks and Logger Setup: Initializes callbacks and configures Weights & Biases (wandb) for experiment tracking.
- Trainer Initialization: Creates a PyTorch Lightning trainer with support for distributed training.
- Training, Testing, and Prediction: Executes training, testing, or prediction based on the specified pipeline_mode.
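The reproducibility step can be sketched as follows. This is a minimal stand-in for what a Lightning-style seed_everything does; CRISP's actual implementation may differ and would additionally seed numpy and torch:

```python
import os
import random

def seed_everything(seed: int) -> None:
    # Minimal sketch: seed Python's RNG and the hash seed. A full
    # implementation would also call np.random.seed and torch.manual_seed.
    random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)

seed_everything(42)
first = [random.random() for _ in range(3)]
seed_everything(42)
second = [random.random() for _ in range(3)]
assert first == second  # identical seeds give identical draws
```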
Example configuration file: train_detection.yaml
defaults:
- _self_
- datamodule: detection_data
- network: resunet
- model_engine: cell_detection
- logging
- experiment: mosaic-detection
# logger args
logger:
_target_: lightning.pytorch.loggers.WandbLogger
mode: online
project: crisp
entity: csi-dp
name: mosaic-detection
save_dir: ${oc.env:HOME}/crisp-logs
log_model: true
# define callbacks
callbacks:
model_checkpoint:
_target_: lightning.pytorch.callbacks.ModelCheckpoint
monitor: validate_macro_f1
mode: max
wandb_log_predictions:
_target_: roche.crisp.callbacks.DataVisualizer
class_map: ${model_engine.metrics.detection_stats.class_map}
# early_stopping:
# _target_: lightning.pytorch.callbacks.EarlyStopping
# monitor: val_total_f1
# patience: 50
# mode: max
model_summary:
_target_: lightning.pytorch.callbacks.ModelSummary
max_depth: -1
learning_rate_monitor:
_target_: lightning.pytorch.callbacks.LearningRateMonitor
logging_interval: 'step'
log_momentum: True
# trainer args
resume_from_checkpoint_path: null
trainer:
_target_: lightning.Trainer
fast_dev_run: false
num_sanity_val_steps: 0
accelerator: auto
strategy: auto
devices: 1
enable_model_summary: true
max_epochs: 1500
precision: 32
benchmark: false
deterministic: false
# flag to use pytorch's latest JIT-compiling for faster training
torch_compile: false
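An experiment file can override any of these defaults. A hypothetical configs/experiment/mosaic-detection.yaml might look like the sketch below; the keys shown are illustrative assumptions, since the actual option names depend on the datamodule and network configs in the repository:

```yaml
# @package _global_
# Illustrative overrides only; actual keys depend on the repository's configs.
datamodule:
  train_csv: /path/to/train_dataset.csv
  val_csv: /path/to/validation_dataset.csv
network:
  num_classes: 5
trainer:
  max_epochs: 300
logger:
  name: my-mosaic-experiment
```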
Use Cases
Cell Detection and Classification
Train cell detection and classification model on your custom dataset. Dataset should include images with point masks.
Example:
Data Preparation
- Convert point masks
  - Use the point2gauss utility to convert point masks to multi-channel Gaussian masks.
- Prepare CSV files
  - Create CSV files for your training, validation, and test datasets in the following format:
    imagefile,point_mask,gaussian_mask
    /path/to/image1.png,/path/to/point_mask1.png,/path/to/gaussian_mask1.tiff
    /path/to/image2.png,/path/to/point_mask2.png,/path/to/gaussian_mask2.tiff
    /path/to/image3.png,/path/to/point_mask3.png,/path/to/gaussian_mask3.tiff
    ...
- Configuration
  - train_detection.yaml contains the default configurations.
  - To override these settings, create an experiment YAML file, for example, configs/experiment/mosaic-detection.yaml, and specify the configurations relevant to your dataset, such as the network architecture, number of classes, and dataset CSV file paths.
  - Set experiment to mosaic-detection in train_detection.yaml.
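To illustrate the point-to-Gaussian conversion, here is a minimal pure-Python sketch of what a point2gauss-style utility might do for a single channel; the real CRISP utility's signature, multi-channel handling, and output format may differ:

```python
import math

def point_to_gaussian(shape, points, sigma=2.0):
    """Render point annotations as a Gaussian heatmap (illustrative only)."""
    h, w = shape
    mask = [[0.0] * w for _ in range(h)]
    for (py, px) in points:
        for y in range(h):
            for x in range(w):
                g = math.exp(-((y - py) ** 2 + (x - px) ** 2) / (2 * sigma ** 2))
                # keep the strongest response where Gaussians overlap
                mask[y][x] = max(mask[y][x], g)
    return mask

heat = point_to_gaussian((9, 9), [(4, 4)], sigma=1.5)
# the heatmap peaks at the annotated point and decays with distance
```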
Region Segmentation
Train a region segmentation model on your custom dataset. The dataset should include images with region masks.
Example:
Data Preparation
- Prepare CSV files
  - Create CSV files for your training, validation, and test datasets in the following format:
    image,mask
    /path/to/image1.png,/path/to/region_mask1.npy
    /path/to/image2.png,/path/to/region_mask2.npy
    /path/to/image3.png,/path/to/region_mask3.npy
    ...
- Configuration
  - train_region_segmentation.yaml contains the default configurations.
  - To override these settings, create an experiment YAML file, for example, configs/experiment/usz_region_segmentation.yaml, and specify the configurations relevant to your dataset, such as the network architecture, number of classes, and dataset CSV file paths.
  - Set experiment to usz_region_segmentation in train_region_segmentation.yaml.
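The dataset CSVs above can be generated with the standard library. The helper below is hypothetical (not a CRISP API); it simply pairs image paths with mask paths under the column names used in this format:

```python
import csv

def write_dataset_csv(csv_path, pairs):
    """Write a CRISP-style dataset CSV with image/mask columns (illustrative)."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["image", "mask"])
        writer.writerows(pairs)

pairs = [
    ("/path/to/image1.png", "/path/to/region_mask1.npy"),
    ("/path/to/image2.png", "/path/to/region_mask2.npy"),
]
write_dataset_csv("train_dataset.csv", pairs)
```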
Cell Segmentation
Train a cell segmentation model on your custom dataset. The dataset should include images with cell masks.
Example:
Data Preparation
- Convert instance masks
  - The current implementation requires cell masks to be converted into flows for training.
  - Use the labels_to_flows utility to perform the conversion.
- Prepare CSV files
  - Create CSV files for your training, validation, and test datasets in the following format:
    - train_dataset.csv and validation_dataset.csv
      image,mask
      /path/to/image1.png,/path/to/mask1_flows.tif
      /path/to/image2.png,/path/to/mask2_flows.tif
      /path/to/image3.png,/path/to/mask3_flows.tif
      ...
    - test_dataset.csv
      image,mask
      /path/to/image1.png,/path/to/mask1.png
      /path/to/image2.png,/path/to/mask2.png
      /path/to/image3.png,/path/to/mask3.png
      ...
- Configuration
  - train_cellpose.yaml contains the default configurations.
  - To override these settings, create an experiment YAML file, for example, configs/experiment/cellpose-segmentation.yaml, and specify the configurations relevant to your dataset, such as the network architecture, number of classes, and dataset CSV file paths.
  - Set experiment to cellpose-segmentation in train_cellpose.yaml.
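To give an intuition for the flow representation, the toy sketch below maps each labeled pixel to a unit vector pointing at its instance's centroid. This is only an illustration of the idea; the actual labels_to_flows utility (as in Cellpose) computes flows via heat diffusion and writes multi-page TIFFs:

```python
import math

def labels_to_center_flows(labels):
    """Toy flow conversion: unit vectors toward each instance centroid."""
    h, w = len(labels), len(labels[0])
    # accumulate per-instance coordinate sums to find centroids
    sums = {}
    for y in range(h):
        for x in range(w):
            lbl = labels[y][x]
            if lbl:
                sy, sx, n = sums.get(lbl, (0.0, 0.0, 0))
                sums[lbl] = (sy + y, sx + x, n + 1)
    cent = {lbl: (sy / n, sx / n) for lbl, (sy, sx, n) in sums.items()}
    flow = [[(0.0, 0.0)] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            lbl = labels[y][x]
            if lbl:
                cy, cx = cent[lbl]
                dy, dx = cy - y, cx - x
                norm = math.hypot(dy, dx) or 1.0  # avoid division by zero at centroid
                flow[y][x] = (dy / norm, dx / norm)
    return flow

flows = labels_to_center_flows([[0, 0, 0], [0, 1, 1], [0, 1, 1]])
# background pixels keep a zero flow; labeled pixels point at the centroid
```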
Best Practices
Style Conventions
Follow common style conventions to make code more readable, maintainable, and consistent. The repository comes with a set of pre-configured defaults that automate most of this process.
Format Python source code in compliance with PEP 8.
Coding style
- Highlight compliance issues in your editor using a flake8 plugin or extension.
- Auto-format source code using black.
- Annotate all parameters and return values with type hints in compliance with PEP 484, using type definitions provided by the typing module. Check type hints using the mypy command line tool.
- Document every package, module, class, function, method, property, constant, and attribute with a docstring in compliance with PEP 257, using the NumPy style for docstrings.
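As a hypothetical illustration of these conventions (the function below is not part of CRISP), a style-compliant helper with PEP 484 type hints and a NumPy-style docstring might look like:

```python
def dice_score(intersection: int, size_a: int, size_b: int) -> float:
    """Compute the Dice coefficient from set sizes.

    Parameters
    ----------
    intersection : int
        Number of elements shared by both sets.
    size_a : int
        Size of the first set.
    size_b : int
        Size of the second set.

    Returns
    -------
    float
        Dice coefficient in [0, 1]; 0.0 when both sets are empty.
    """
    if size_a + size_b == 0:
        return 0.0
    return 2.0 * intersection / (size_a + size_b)
```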
To run a static code analysis, execute:
./run_task.sh lint
To automatically format the code, run:
./run_task.sh fmt
This code analysis is also triggered on every commit and is required to pass.
Using pre-commit
Pre-commit hooks help improve the quality of commits by ensuring that each commit meets some minimal requirements. For details, please see the Coding style section above.
The hooks are defined in .pre-commit-config.yaml.
Run the static code analysis command (./run_task.sh lint) once to set up the
hooks. After that, pre-commit will run automatically on every git commit!
You can use the ./run_task.sh fmt command to automatically format your code to be compliant with most of the static code analysis.
Unit testing
Write and run unit tests using pytest.
You can run tests using the pytest command line tool with the command below:
./run_task.sh test
For guidelines and best practices on writing unit tests, please refer to the documentation of this package.
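A minimal test module in the plain-function style that pytest discovers might look like the sketch below; both the helper and the tests are illustrative, not actual CRISP code:

```python
# Hypothetical tests/test_utils.py: pytest collects functions named test_*.
def clip_to_unit(value: float) -> float:
    """Clamp a value into the [0, 1] range (illustrative helper)."""
    return min(1.0, max(0.0, value))

def test_clip_below_range():
    assert clip_to_unit(-0.5) == 0.0

def test_clip_inside_range():
    assert clip_to_unit(0.25) == 0.25

def test_clip_above_range():
    assert clip_to_unit(1.7) == 1.0
```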
Versioning
This package has an automated versioning system. Each merge to the dev branch is deployed as an unstable version of the form X.Y.Z.devT+githash, which combines the X.Y.Z version number, the dev branch name, and the hash of the last commit. These automated versions should only be used by experienced users who can track potential bugs. Official releases need to be tagged manually; a manual tag determines the X.Y.Z part of subsequent unstable versions.
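A small parser for this version scheme can be sketched with the standard library; the exact string format produced by the CI may differ, so the pattern below is an assumption based on the description above:

```python
import re

# Matches e.g. "0.0.2.dev5+abc1234" or a plain release like "1.2.3".
VERSION_RE = re.compile(
    r"^(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
    r"(?:\.dev(?P<build>\d+)\+(?P<githash>[0-9a-f]+))?$"
)

def parse_version(version: str) -> dict:
    """Split a CRISP-style version string into its components."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return match.groupdict()

info = parse_version("0.0.2.dev5+abc1234")
```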
Contributing guidelines
- Begin by opening an issue with a concise, descriptive title (this will determine the branch name).
- Create a branch from the issue (instructions can be found here).
- Initiate a merge request targeting the dev branch.
- It is possible to create a merge request before it is ready for review - simply mark it as a draft.
- Developers should respond to reviewer comments, but only the reviewer should resolve them.
- The developer must provide the identifier of the commit (or the corresponding link) that addresses the comment. This makes it easier for the reviewer to track and determine whether the comment has been adequately addressed.
- Once approved, anyone is permitted to click the merge button.
Support
For any questions or doubts, please reach out to the maintainers:
You can also join our Slack support channel for real-time support and discussions: