An evaluation protocol for standard metrics per connected component

CC-Metrics

Every Component Counts: Rethinking the Measure of Success for Medical Semantic Segmentation in Multi-Instance Segmentation Tasks

Description

Traditional metrics often fail to adequately capture the performance of models in multi-instance segmentation scenarios, particularly when dealing with heterogeneous structures of varying sizes. CC-Metrics addresses this by:

1. Identifying individual connected components in ground-truth labels
2. Creating Voronoi regions around each component to define its territory
3. Mapping predictions within each Voronoi region to the corresponding ground-truth component
4. Computing standard metrics on these mapped regions for more granular assessment

Below is an example visualization of the Voronoi-based mapping process:

[Figure: CC-Metrics Workflow]

For more details, see the full paper listed under Citation.
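
The four steps listed above can be illustrated on a tiny 2D grid. The following is a minimal pure-Python sketch and not the library's implementation: the helper names `label_components` and `voronoi_regions` are hypothetical, the sketch assumes 4-connectivity and a brute-force nearest-pixel search, and CC-Metrics itself works on 3D tensors and may handle connectivity and ties differently.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected foreground components of a 2D binary mask (hypothetical helper)."""
    height, width = len(mask), len(mask[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or labels[r][c]:
                continue
            # New component found: flood-fill it with breadth-first search
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


def voronoi_regions(labels):
    """Assign every pixel to its nearest labelled pixel (assumes at least one component)."""
    height, width = len(labels), len(labels[0])
    seeds = [(r, c, labels[r][c])
             for r in range(height) for c in range(width) if labels[r][c]]
    regions = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Squared Euclidean distance; ties go to the lower component id
            _, regions[r][c] = min(
                ((r - sr) ** 2 + (c - sc) ** 2, lab) for sr, sc, lab in seeds
            )
    return regions


gt_mask = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
]
pred_mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
]

labels, n = label_components(gt_mask)   # step 1: connected components
regions = voronoi_regions(labels)       # step 2: Voronoi territories

# Step 3: group predicted foreground pixels by the region they fall into
pred_by_region = {
    lab: [(r, c)
          for r in range(len(pred_mask)) for c in range(len(pred_mask[0]))
          if pred_mask[r][c] and regions[r][c] == lab]
    for lab in range(1, n + 1)
}
print(n, {lab: len(px) for lab, px in pred_by_region.items()})
```

Step 4 then amounts to computing a standard metric once per region, using only the ground-truth and predicted pixels that belong to it.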

Table of Contents

- Description
- Installation
- How to Use CC-Metrics
- FAQ
- Contributing
- Citation
- License

Installation

Prerequisites

Python 3.8+
PyTorch 1.8+
MONAI 0.9+

git clone https://github.com/alexanderjaus/CC-Metrics.git
cd CC-Metrics
pip install -e .

How to Use CC-Metrics

CC-Metrics defines wrappers around MONAI's Cumulative metrics to enable per-component evaluation.

Basic Usage

Here's a simple example using the CCDiceMetric:

from CCMetrics import CCDiceMetric
import torch

# Create the metric with desired parameters
cc_dice = CCDiceMetric(
    cc_reduction="patient",  # Aggregation mode
    use_caching=True,        # Enable caching for faster repeat evaluations
    caching_dir=".cache"     # Directory to store cached Voronoi diagrams
)

# Create sample prediction and ground truth tensors
# Tensors must be in shape (B, C, D, H, W) where:
# B = batch size (currently only B=1 is supported)
# C = number of channels (must be 2: background and foreground)
# D, H, W = depth, height, width of the volumetric data
y = torch.zeros((1, 2, 64, 64, 64))
y_hat = torch.zeros((1, 2, 64, 64, 64))

# Create two ground truth components
y[0, 1, 20:25, 20:25, 20:25] = 1  # Component 1
y[0, 1, 40:45, 40:45, 40:45] = 1  # Component 2
y[0, 0] = 1 - y[0, 1]  # Background

# Create prediction (slightly offset from ground truth)
y_hat[0, 1, 21:26, 21:26, 21:26] = 1  # Prediction for component 1
y_hat[0, 1, 41:46, 39:44, 41:46] = 1  # Prediction for component 2
y_hat[0, 0] = 1 - y_hat[0, 1]  # Background

# Compute the metric
cc_dice(y_pred=y_hat, y=y)

# Get the results (one mean score per patient)
patient_wise_results = cc_dice.cc_aggregate()
# tensor([0.5120])

print(f"CC-Dice score: {patient_wise_results.mean().item()}")

# The aggregation mode can be overridden at aggregation time
# (one score per connected component)
component_wise_results = cc_dice.cc_aggregate(mode="overall")
# tensor([0.5120, 0.5120])

Supported Metrics

CC-Metrics includes the following metrics, all derived from MONAI:

CCDiceMetric: Component-wise Dice coefficient
```
CCDiceMetric()
```
CCHausdorffDistanceMetric: Component-wise Hausdorff distance
```
CCHausdorffDistanceMetric(metric_worst_score=30)
```
CCHausdorffDistance95Metric: Component-wise 95th percentile Hausdorff distance
```
CCHausdorffDistance95Metric(metric_worst_score=30)
```
CCSurfaceDistanceMetric: Component-wise average surface distance
```
CCSurfaceDistanceMetric(metric_worst_score=30)
```
CCSurfaceDiceMetric: Component-wise Surface Dice score
```
CCSurfaceDiceMetric(class_thresholds=[1])
```
CCSurfaceDiceMetric requires the additional parameter class_thresholds, a list of class-specific thresholds. Each threshold sets the acceptable deviation of the segmentation boundary, in pixels, and must be a finite, non-negative number. See MONAI's SurfaceDiceMetric documentation for more details.

Metric Aggregation

The CCBaseMetric class supports two aggregation modes:

Patient-Level Aggregation (patient):
- Computes the mean metric score for each patient by aggregating all connected components within the patient
- Returns a list of mean scores, one for each patient
- Useful when you want to evaluate performance on a per-patient basis
Overall Aggregation (overall):
- Treats all connected components across all patients equally
- Aggregates the metric scores for all components into a single list
- Useful when you want to evaluate performance across all components regardless of patient boundaries

The aggregation mode can be specified using the cc_aggregate method, with the default mode being patient.

# Patient-level aggregation (default)
patient_results = cc_dice.cc_aggregate(mode="patient")

# Overall aggregation
overall_results = cc_dice.cc_aggregate(mode="overall")
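
To see why the two modes can lead to different summary numbers, the sketch below reimplements the aggregation logic on plain Python lists. The function `aggregate` and the example scores are hypothetical and only mirror the behaviour described above; they are not the library's code.

```python
from statistics import mean

# Hypothetical per-component scores, grouped by patient
scores_per_patient = [
    [1.0, 0.0],            # patient A: one component found, one missed
    [1.0, 1.0, 1.0, 1.0],  # patient B: all four components found
]


def aggregate(scores_per_patient, mode="patient"):
    """Mimic the two aggregation modes on nested lists (illustrative only)."""
    if mode == "patient":
        # One mean score per patient
        return [mean(scores) for scores in scores_per_patient]
    if mode == "overall":
        # One flat list holding every component score
        return [s for scores in scores_per_patient for s in scores]
    raise ValueError(f"unknown mode: {mode!r}")


patient_scores = aggregate(scores_per_patient, mode="patient")
overall_scores = aggregate(scores_per_patient, mode="overall")

print(patient_scores, mean(patient_scores))       # each patient weighs the same
print(len(overall_scores), mean(overall_scores))  # each component weighs the same
```

In this example the patient-level mean is 0.75, while the overall mean is 5/6, because patient B contributes four components to the flat list but only one value to the patient-level list.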

Caching Mechanism

CC-Metrics requires the computation of a generalized Voronoi diagram which serves as the mapping mechanism between predictions and ground-truth. As the separation of the image space only depends on the ground-truth, the mapping can be cached and reused between intermediate evaluations or across metrics.
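
Because the regions depend only on the ground truth, a cache can be keyed on the ground-truth content alone. The sketch below shows one way this could work, assuming a SHA-256 hash of the raw ground-truth bytes as the key and pickle files as storage. Both choices, as well as the names `cache_key`, `cached_regions`, and `expensive_voronoi`, are assumptions made for illustration and do not describe how CC-Metrics stores its cache.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def cache_key(gt_bytes):
    """Derive a cache key from the ground truth only (assumed scheme)."""
    return hashlib.sha256(gt_bytes).hexdigest()


def cached_regions(gt_bytes, compute_fn, cache_dir):
    """Return cached regions if present, otherwise compute and store them."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{cache_key(gt_bytes)}.pkl"
    if path.exists():
        with path.open("rb") as fh:
            return pickle.load(fh)
    regions = compute_fn(gt_bytes)
    with path.open("wb") as fh:
        pickle.dump(regions, fh)
    return regions


calls = []


def expensive_voronoi(gt_bytes):
    """Stand-in for the real Voronoi computation; records each invocation."""
    calls.append(1)
    return list(gt_bytes)


with tempfile.TemporaryDirectory() as tmp:
    gt = bytes([0, 1, 1, 0])
    first = cached_regions(gt, expensive_voronoi, tmp)   # computed and stored
    second = cached_regions(gt, expensive_voronoi, tmp)  # loaded from disk

print(first == second, len(calls))
```

The second call never reaches the stand-in computation, which is the effect caching is meant to have when several metrics or repeated evaluations share the same ground truth.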

Benefits of Caching

Significantly faster repeated evaluations
Ability to precompute Voronoi regions for large datasets
Consistent component mapping across different metrics

Using the Caching Feature

Enable caching when instantiating any CC-Metrics metric:

cc_dice = CCDiceMetric(use_caching=True, caching_dir="/path/to/cache")

Precomputing Cache

For large datasets, you can precompute the Voronoi regions using the provided script:

python prepare_caching.py --gt /path/to/ground_truth_nifti_files --cache_dir /path/to/cache --nof_workers 8

This will process all .nii.gz files in the specified directory and store the computed Voronoi regions in the cache directory.

Advanced Examples

Evaluating Multiple Metrics on the Same Data

from CCMetrics import CCDiceMetric, CCSurfaceDiceMetric, CCHausdorffDistance95Metric
import torch

# Create sample data
y = torch.zeros((1, 2, 64, 64, 64))
y_hat = torch.zeros((1, 2, 64, 64, 64))

# Set up components (simplified example)
y[0, 1, 20:25, 20:25, 20:25] = 1
y[0, 0] = 1 - y[0, 1]
y_hat[0, 1, 21:26, 21:26, 21:26] = 1
y_hat[0, 0] = 1 - y_hat[0, 1]

# Define shared cache directory
cache_dir = ".cache"

# Initialize metrics
metrics = {
    "dice": CCDiceMetric(use_caching=True, caching_dir=cache_dir),
    "surface_dice": CCSurfaceDiceMetric(use_caching=True, caching_dir=cache_dir, class_thresholds=[1]),
    "hd95": CCHausdorffDistance95Metric(use_caching=True, caching_dir=cache_dir, metric_worst_score=30)
}

# Compute all metrics
results = {}
for name, metric in metrics.items():
    metric(y_pred=y_hat, y=y)
    results[name] = metric.cc_aggregate().mean().item()

print(f"Results: {results}")

FAQ

Q: Why use CC-Metrics instead of traditional metrics?

A: Traditional metrics like Dice can be misleading in multi-instance segmentation tasks. CC-Metrics provides a more granular assessment of performance by evaluating each component separately, making it particularly valuable for medical imaging tasks with multiple structures of varying sizes.

Q: How does CC-Metrics handle false negatives (ground truth components with no matching predictions)?

A: CC-Metrics assigns the worst score to false negative regions, ensuring they appropriately penalize the overall performance score.

Q: How does CC-Metrics handle false positives (predicted components with no matching ground truth)?

A: CC-Metrics evaluates locally, so false positive predictions reduce the score of the Voronoi region into which they fall.
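
The effect of false negatives and false positives on a per-component Dice score can be worked through on a toy 1D example. The sketch below is illustrative only: voxels are plain integers, `region_of` is a hypothetical stand-in for the Voronoi lookup, and `cc_dice_scores` is not a function of the library. For Dice the worst score of 0 arises naturally from an empty overlap; the distance-based metrics shown earlier take a `metric_worst_score` argument, which presumably plays the corresponding role.

```python
def dice(pred, gt):
    """Dice between two sets of voxel coordinates (gt is assumed non-empty)."""
    return 2 * len(pred & gt) / (len(pred) + len(gt))


def cc_dice_scores(gt_components, pred_voxels, region_of):
    """Compute one Dice score per ground-truth component (illustrative only)."""
    scores = {}
    for cid, gt in gt_components.items():
        # Keep only the predictions that fall into this component's region
        local_pred = {v for v in pred_voxels if region_of(v) == cid}
        scores[cid] = dice(local_pred, gt)
    return scores


# Two ground-truth components on a line of 20 voxels
gt_components = {1: {2, 3, 4, 5}, 2: {14, 15, 16, 17}}


def region_of(voxel):
    """Voronoi region on the line: voxels 0-9 are closer to component 1."""
    return 1 if voxel < 10 else 2


# Component 1 is segmented perfectly but has two false positive voxels nearby;
# component 2 is missed entirely (false negative).
pred_voxels = {2, 3, 4, 5, 8, 9}

scores = cc_dice_scores(gt_components, pred_voxels, region_of)
print(scores)
```

Component 1 scores 2·4 / (6 + 4) = 0.8 instead of 1.0 because of the two stray voxels in its region, and component 2 scores 0 because nothing was predicted inside its region.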

Q: Is multi-class segmentation supported?

A: Currently, CC-Metrics only supports binary segmentation (background and foreground). Multi-class support is planned for future releases.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Citation

If you make use of this project in your work, please cite the CC-Metrics paper:

@inproceedings{jaus2024every,
  title={Every Component Counts: Rethinking the Measure of Success for Medical Semantic Segmentation in Multi-Instance Segmentation Tasks},
  author={Jaus, Alexander and Seibold, Constantin Marc and Rei{\ss}, Simon and Marinov, Zdravko and Li, Keyi and Ye, Zeling and Krieg, Stefan and Kleesiek, Jens and Stiefelhagen, Rainer},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={39},
  number={4},
  pages={3904--3912},
  year={2025}
}

License

This project is licensed under the Apache 2.0 License.

Release history

0.0.3: Sep 9, 2025
0.0.2: Jun 3, 2025 (this version)

