
Gamified image segmentation quality estimation


The segmentation game - for napari




Usage

The Segmentation Game allows you to quantitatively compare segmentation results to a given ground truth annotation. This makes it possible to fine-tune the parameters of image processing workflows for optimal segmentation quality. It also allows you to compare different segmentation algorithms and objectively identify which one performs best.

The game can be found in napari's Tools > Games > The Segmentation Game menu.

Ground Truth Annotation

Before you can start playing the game, some annotated cells/nuclei are necessary, from which segmentation quality can later be computed. Depending on the metric used, it might be sufficient to annotate a handful of objects. Use napari's annotation tools as shown below. Use the + and - keys on your keyboard to increase and decrease the label number that is currently drawn. Note: Avoid label gaps; the labels must be consecutive. If there are pixels annotated with value 2, there must also be pixels annotated with value 1.
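If your annotation already contains label gaps, they can be closed programmatically. The following is a minimal NumPy sketch (the function name is illustrative, not part of the plugin's API) that remaps the non-zero label values onto 1..n; the array stands in for a napari labels layer's `.data`:

```python
# Remap annotation labels so they are consecutive (1, 2, 3, ...) with no
# gaps, as the game requires. Background (0) is kept as-is.
import numpy as np

def relabel_consecutive(labels: np.ndarray) -> np.ndarray:
    """Map the non-zero labels in `labels` onto 1..n, keeping 0 as background."""
    values = np.unique(labels)
    values = values[values != 0]                 # background stays 0
    # lookup table: old label value -> new consecutive label value
    mapping = np.zeros(labels.max() + 1, dtype=labels.dtype)
    mapping[values] = np.arange(1, len(values) + 1, dtype=labels.dtype)
    return mapping[labels]

annotation = np.array([[0, 2, 2],
                       [0, 0, 5],
                       [7, 7, 0]])
fixed = relabel_consecutive(annotation)
print(fixed)   # labels 2, 5, 7 become 1, 2, 3
```

If you work with scikit-image, `skimage.segmentation.relabel_sequential` does the same job.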

Parameter tuning

If you work with one of napari's segmentation plugins that produce labels layers, you can tune their parameters and see how this influences segmentation quality.

Segmentation algorithm comparison

If you aim to compare different segmentation algorithms, you can collect their results as labels layers in the napari viewer. You can then select each segmentation result from the corresponding pulldown and save the quantitative comparison results in the Highscore table.

Metrics

Currently, these metrics are implemented:

  • Jaccard Index (sparse): The Jaccard Index is a measure of overlap. It lies between 0 (no overlap) and 1 (perfect overlap). For each annotated ground truth label, the maximum overlap of any segmented label is determined. The mean overlap over all annotated labels serves as the metric result.
  • Jaccard Index (binary): The annotated ground truth labels and the segmentation result are first binarized considering all annotated pixels as positive and all other labels as negative. Afterwards, the overlap between the two binary images is computed. This allows comparing binary segmentation algorithms, such as thresholding techniques.
  • Jaccard Index (binary, sparse): The annotated ground truth image can contain values 1 (negative, false) and 2 (positive, true) and the segmentation result image will be binarized (0: False, otherwise: True). This allows comparing object/no-object annotations with label images.
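
The sparse Jaccard Index described in the first bullet can be sketched as follows. This is a hedged NumPy illustration of the idea, not the plugin's actual implementation; for every ground-truth label, the best-overlapping segmented label is found and the per-label scores are averaged:

```python
# Sparse Jaccard Index: for each annotated ground-truth object, take the
# Jaccard score of the best-overlapping segmented object, then average.
import numpy as np

def jaccard_sparse(ground_truth: np.ndarray, segmentation: np.ndarray) -> float:
    scores = []
    for gt_label in np.unique(ground_truth):
        if gt_label == 0:
            continue                             # skip background
        gt_mask = ground_truth == gt_label
        # only segmented labels overlapping this object can win
        candidates = np.unique(segmentation[gt_mask])
        best = 0.0
        for seg_label in candidates:
            if seg_label == 0:
                continue
            seg_mask = segmentation == seg_label
            intersection = np.logical_and(gt_mask, seg_mask).sum()
            union = np.logical_or(gt_mask, seg_mask).sum()
            best = max(best, intersection / union)
        scores.append(best)
    return float(np.mean(scores)) if scores else 0.0

gt  = np.array([[1, 1, 0], [0, 0, 2]])
seg = np.array([[1, 0, 0], [0, 0, 2]])
print(jaccard_sparse(gt, seg))   # label 1: 1/2, label 2: 1/1 -> mean 0.75
```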

Receiver operating characteristic (ROC)

Consider a two-class thresholding problem (binary pixel-wise classification into object/background), in which each outcome is labeled either positive (p) or negative (n). There are four possible outcomes of a binary classifier. If the prediction is p and the actual value is also p, it is called a true positive (TP); if the actual value is n, it is a false positive (FP). Conversely, a true negative (TN) occurs when both the prediction and the actual value are n, and a false negative (FN) occurs when the prediction is n while the actual value is p. These results can be organized in a table called a confusion matrix, with positive/negative predictions in the rows and true/false outcomes in the columns. From the confusion matrix we can derive many metrics of varying usefulness. The ones currently implemented for classification evaluation are:

  • Sensitivity, recall, hit rate, or true positive rate (TPR): TP / (TP + FN). Sensitivity and specificity mathematically describe the accuracy of a test that reports the presence or absence of a condition: individuals for which the condition is satisfied are considered "positive", those for which it is not are considered "negative". Sensitivity quantifies how many of the actual positives are detected.
  • Specificity, selectivity or true negative rate (TNR): TN / (TN + FP). Specificity quantifies how many of the actual negatives are correctly rejected.
  • Precision or positive predictive value (PPV): TP / (TP + FP). Precision quantifies how many of the predicted positives are actually positive.
  • Accuracy: (TP + TN) / (TP + FP + TN + FN). Accuracy measures how close the predictions are, overall, to the true values. However, it usually fails on imbalanced sets.
  • Balanced Accuracy: (TP/(TP + FN) + TN/(TN + FP)) / 2. Balanced Accuracy tries to even out the problems accuracy has on imbalanced sets.
  • F1 Score: 2TP / (2TP + FP + FN). In statistical analysis of binary classification, the F-score or F-measure is a measure of a test's accuracy. It is calculated as the harmonic mean of the precision and recall of the test.
  • Threat score (TS) or critical success index (CSI): TP / (TP + FP + FN). The TS is another name for the Jaccard Index (binary).
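
All of these metrics follow directly from the four confusion-matrix counts. A minimal NumPy sketch, using a small synthetic pixel vector in place of real image data:

```python
# Derive the listed metrics from the pixel counts TP, FP, TN, FN of a
# binary segmentation compared against a binary ground truth.
import numpy as np

def confusion_counts(truth: np.ndarray, prediction: np.ndarray):
    t = truth.astype(bool)
    p = prediction.astype(bool)
    tp = np.sum(t & p)       # predicted positive, actually positive
    fp = np.sum(~t & p)      # predicted positive, actually negative
    tn = np.sum(~t & ~p)     # predicted negative, actually negative
    fn = np.sum(t & ~p)      # predicted negative, actually positive
    return tp, fp, tn, fn

truth      = np.array([1, 1, 1, 0, 0, 0, 0, 0])
prediction = np.array([1, 1, 0, 1, 0, 0, 0, 0])
tp, fp, tn, fn = confusion_counts(truth, prediction)

sensitivity       = tp / (tp + fn)                   # TPR / recall
specificity       = tn / (tn + fp)                   # TNR
precision         = tp / (tp + fp)                   # PPV
accuracy          = (tp + tn) / (tp + fp + tn + fn)
balanced_accuracy = (sensitivity + specificity) / 2
f1                = 2 * tp / (2 * tp + fp + fn)
threat_score      = tp / (tp + fp + fn)              # Jaccard (binary)
```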

The ROC measures derived from the confusion matrix are invaluable when our binary classifier is not ideal (which is often the case) and we are aiming not for a generally good result but for a specific low error. In that case we usually need to decide on a trade-off: for example, we may need all (or as many as possible) true positive objects while accepting some (ideally as few as possible) false positive objects.
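This trade-off can be explored by sweeping the threshold and watching how the true positive rate and false positive rate move against each other, which is exactly what an ROC curve plots. A hedged sketch on synthetic intensities (the data and threshold values are illustrative):

```python
# Sweep an intensity threshold and track TPR vs. FPR, so an operating
# point with an acceptably low false positive rate can be chosen.
import numpy as np

intensities  = np.array([0.1, 0.2, 0.4, 0.5, 0.7, 0.8, 0.9])
ground_truth = np.array([0,   0,   0,   1,   0,   1,   1  ], dtype=bool)

for threshold in [0.3, 0.6]:
    predicted = intensities >= threshold
    tp = np.sum(ground_truth & predicted)
    fp = np.sum(~ground_truth & predicted)
    fn = np.sum(ground_truth & ~predicted)
    tn = np.sum(~ground_truth & ~predicted)
    tpr = tp / (tp + fn)     # how many real objects we keep
    fpr = fp / (fp + tn)     # how much background we wrongly accept
    print(f"threshold={threshold}: TPR={tpr:.2f}, FPR={fpr:.2f}")
```

Here the lower threshold catches every object pixel (TPR = 1.0) at the cost of a higher FPR; the higher threshold halves the false positives but misses one object pixel.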

What we want to achieve

Figure: precision versus accuracy (source: 10.13140/RG.2.1.1668.7603)

When doing semantic segmentation, we aim to classify each pixel (ideally correctly) into one of our classes. But that can be a huge amount of information, and our object might have significantly fewer pixels than the background and/or the other classes. Before choosing the right metric, we need to set a goal for our classification results. Ideally, we would like high accuracy and high precision for each class (as in the picture above), but we might be happy with high accuracy and good precision. Realistically, we may need to be more specific, e.g. choose how big an error we are prepared to accept, or decide whether it is acceptable to have FN findings but no FP.

Picking a metric for highly unbalanced classification such as semantic segmentation is challenging. Most of the classic metrics will fail (though they remain usable object-wise), so we usually end up with the Jaccard Index/Threat Score, the F1 Score, or anything else that reflects the true positive rate (since we expect fewer pixels for objects than for background and/or other classes).
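The failure mode is easy to demonstrate. In this small synthetic sketch, a baseline that segments nothing at all still scores 99% accuracy on an image where the object covers 1% of the pixels, while the Jaccard Index/Threat Score exposes the total failure:

```python
# Why accuracy misleads on imbalanced pixel classes while the
# Threat Score / binary Jaccard Index does not.
import numpy as np

ground_truth = np.zeros(1000, dtype=bool)
ground_truth[:10] = True                 # object covers only 1% of pixels

prediction = np.zeros(1000, dtype=bool)  # "segment nothing" baseline

tp = np.sum(ground_truth & prediction)
fp = np.sum(~ground_truth & prediction)
fn = np.sum(ground_truth & ~prediction)
tn = np.sum(~ground_truth & ~prediction)

accuracy = (tp + tn) / 1000              # 0.99, looks excellent
jaccard  = tp / (tp + fp + fn)           # 0.0, reveals total failure
print(accuracy, jaccard)
```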

Literature recommendation

How to choose the right metric for comparing segmentation results is explained in this paper:

Related plugins

If you aim at automatically optimizing segmentation quality, there are also napari plugins available with this capability:

Installation

You can install the-segmentation-game via pip:

pip install the-segmentation-game

Contributing

Contributions - especially new image segmentation quality metrics - are very welcome. Tests can be run with tox, please ensure the coverage at least stays the same before you submit a pull request.

License

Distributed under the terms of the BSD-3 license, "the-segmentation-game" is free and open source software.

Issues

If you encounter any problems, please open a thread on image.sc along with a detailed description.
