# DAVIS Interactive Evaluation Framework

This is a framework to evaluate interactive segmentation models over the [DAVIS 2017](http://davischallenge.org/index.html) dataset. The code aims to provide an easy-to-use interface to test and validate interactive segmentation models.

This is the tool that will be used to evaluate the interactive track of the DAVIS Challenge on Video Object Segmentation 2018. More information about the challenge is available on the [website](http://davischallenge.org/challenge2018/interactive.html).

You can find an example of how to use the package in the following repository:

* [Scribble-OSVOS](https://github.com/kmaninis/Scribble-OSVOS)

## DAVIS Scribbles

In the DAVIS **Main** Challenge track, the task consists of object segmentation in a *semi-supervised* manner, i.e. the given input is the ground-truth mask of the first frame. In the DAVIS **Interactive** Challenge, in contrast, the user input takes the form of scribbles, which can be drawn much faster by humans and are thus a more realistic type of input.

<img src="docs/images/scribbles/dogs-jump-image.jpg" width="30%"/> <img src="docs/images/scribbles/dogs-jump-scribble01.jpg" width="30%"/> <img src="docs/images/scribbles/dogs-jump-scribble02.jpg" width="30%"/>

Interactive annotation and segmentation consist of an iterative loop, which is evaluated as follows:

* On the first iteration, a human-annotated scribble is provided to the segmentation model. All scribbles are annotated over the DAVIS 2017 dataset, and the annotated objects are the same as in the ground-truth masks.<br> **Note**: the annotated frame can be any frame of the sequence, as the human annotators were asked to mark the frames they found most relevant and meaningful to annotate.
* On every subsequent iteration, once the predicted masks have been submitted, a new scribble is simulated by the server. This annotation is made on a single frame, chosen as the one on which the current result is worst.
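As an illustration, the loop above can be sketched with toy binary masks and a hypothetical model that perfectly corrects the annotated frame. This is a simplified stand-in, not the framework's actual API: the real framework simulates scribbles server-side, and `jaccard` and `worst_frame` here only mimic its frame-selection logic.

```python
import numpy as np

def jaccard(pred, gt):
    """Intersection over union of two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union > 0 else 1.0

def worst_frame(preds, gts):
    """Index of the frame where the current result is worst."""
    return int(np.argmin([jaccard(p, g) for p, g in zip(preds, gts)]))

# Toy sequence: 3 frames of 4x4 ground-truth masks, one row of
# foreground per frame.
gts = [np.zeros((4, 4), dtype=bool) for _ in range(3)]
for i, g in enumerate(gts):
    g[i, :] = True

# The model starts with empty predictions for every frame.
preds = [np.zeros((4, 4), dtype=bool) for _ in range(3)]

for interaction in range(3):
    frame = worst_frame(preds, gts)  # server picks the worst frame
    # A real model would receive a simulated scribble on `frame` and
    # refine all its masks; here we simply copy the ground truth.
    preds[frame] = gts[frame].copy()

scores = [jaccard(p, g) for p, g in zip(preds, gts)]
```

Each interaction annotates exactly one frame, so after three interactions every frame of this toy sequence has been corrected.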

**Evaluation**: For now, the evaluation metric is the Jaccard similarity $\mathcal{J}$.
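For reference, given a predicted mask $\hat{M}$ and a ground-truth mask $M$, the Jaccard similarity is the intersection over union of the two regions:

$$\mathcal{J}(\hat{M}, M) = \frac{|\hat{M} \cap M|}{|\hat{M} \cup M|}$$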

## Citation

```bibtex
@article{Caelles_arXiv_2018,
  author  = {Sergi Caelles and Alberto Montes and Kevis-Kokitsi Maninis and Yuhua Chen and Luc {Van Gool} and Federico Perazzi and Jordi Pont-Tuset},
  title   = {The 2018 DAVIS Challenge on Video Object Segmentation},
  journal = {arXiv:1803.00557},
  year    = {2018}
}
```


```bibtex
@article{Pont-Tuset_arXiv_2017,
  author  = {Jordi Pont-Tuset and Federico Perazzi and Sergi Caelles and Pablo Arbel\'aez and Alexander Sorkine-Hornung and Luc {Van Gool}},
  title   = {The 2017 DAVIS Challenge on Video Object Segmentation},
  journal = {arXiv:1704.00675},
  year    = {2017}
}
```