
General multi-threaded benchmarking script for video object segmentation.

Project description

Simple Video Object Segmentation Benchmarking

Quick Start

Installation

Locally (recommended):

git clone https://github.com/hkchengrex/vos-benchmark
pip install -e vos-benchmark

Via pip (as a library):

pip install vos-benchmark

Using it as a script:

(from the vos-benchmark root directory)

python benchmark.py -g <path to ground-truth directory> -m <path to prediction directory> -n <number of processes, 16 by default>
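For example, with placeholder paths (substitute your own ground-truth and prediction directories):

python benchmark.py -g DAVIS/2017/trainval/Annotations/480p -m output/DAVIS-17-val -n 16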

Using it as a library:

from vos_benchmark.benchmark import benchmark

# both arguments are passed as a list -- multiple datasets can be specified
benchmark([<path to ground-truth directory>], [<path to prediction directory>])
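For instance, a rough sketch evaluating two datasets in one call (all paths below are placeholders, and I assume the two lists pair up positionally):

from vos_benchmark.benchmark import benchmark

# Placeholder paths -- replace with your own ground-truth and prediction
# directories. The i-th ground-truth directory is assumed to be evaluated
# against the i-th prediction directory.
benchmark(
    ['DAVIS/2017/trainval/Annotations/480p', 'LongVideos/Annotations'],
    ['output/DAVIS-17-val', 'output/LongVideos'],
)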

See benchmark.py for an example, and see vos_benchmark/benchmark.py for the function signature with additional options.

A results.csv will be saved in the prediction directory.

Background

I built this tool to accelerate evaluations (J&F) on different video object segmentation benchmarks. Previously, I used davis2017-evaluation, which has several limitations:

  • It is slow. Evaluating predictions on the validation set of DAVIS-2017 (30 videos) takes 73.3 seconds. It would take longer for any larger dataset. Ours takes 5.36 seconds (with 16 threads).
  • It is tailored to the DAVIS dataset. Evaluating on other datasets (like converted OVIS, UVO, or the long video dataset) requires mocking them as DAVIS (setting up "split" text files and following a non-trivial file structure). We don't care. We just take the paths to two folders (ground-truth and predictions) as input.
  • It does not work with non-continuous object IDs. We do.

I have tested this script on DAVIS-16/17 and confirmed that it produces results identical to the official evaluation script.

Technical Details / Troubleshooting

  1. This benchmarking script is simple and dumb. It does not intelligently resolve input problems. If something does not work, most likely the input is problematic. Garbage in, garbage out. Check your input (see below). We read the input masks using Image.open from PIL. Paletted and grayscale PNG files should both work.
  2. We determine the objects in a frame (ground-truth or prediction) by running np.unique. If antialiasing, blurring, smoothing, or similar processing introduces new pixel values, this will not work. A quick way to inspect a mask is sketched after this list.
  3. From the start of the video, we keep a list of all objects that are seen in either the ground-truth or the prediction. This is to support datasets where some ground-truth objects only appear in later frames. Predicting objects that are not in the ground-truth harms the final score.
  4. By default, we skip the first and the last frame during evaluation. This is in line with the standard semi-supervised video object segmentation evaluation protocol, as in DAVIS. This can be overridden by specifying -d or --do_not_skip_first_and_last_frame, or passing skip_first_and_last=False (see the second sketch after this list).
  5. By default, we don't care whether all the videos in the ground-truth folder have corresponding predictions. This is to support datasets that keep videos from different splits in a single folder (e.g., DAVIS puts the train/val splits together). If the prediction folder only contains videos from the validation set, we only evaluate those videos. This can be overridden by specifying -s or --strict, or passing strict=True. In strict mode, an exception is thrown if the sets of videos do not match.
  6. If a video is being evaluated, all the frames in the ground-truth folder must have corresponding predictions. Predicted frames that do not have corresponding ground-truth frames are simply ignored.
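A quick way to sanity-check an input mask is to read it the same way the script does (Image.open) and list its distinct pixel values with np.unique; the file name below is a placeholder:

import numpy as np
from PIL import Image

# Read a mask the same way the benchmark does (PIL) and list its pixel values.
# For a paletted PNG this gives the palette indices, i.e. the object IDs;
# values beyond the expected IDs (plus 0 for background) suggest
# antialiasing/blurring/smoothing in the mask.
mask = np.array(Image.open('video0001/00001.png'))  # placeholder path
print(mask.dtype, mask.shape)
print(np.unique(mask))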
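The same switches are available through the library call. A sketch with placeholder paths, using the skip_first_and_last and strict arguments mentioned in points 4 and 5:

from vos_benchmark.benchmark import benchmark

# Placeholder paths -- replace with your own directories.
benchmark(
    ['path/to/ground_truth'],
    ['path/to/predictions'],
    skip_first_and_last=False,  # also evaluate the first and last frames
    strict=True,                # require identical sets of videos in both folders
)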

Related projects:

Official DAVIS 2017 evaluation implementation: https://github.com/davisvideochallenge/davis2017-evaluation

BURST benchmark (evaluates HOTA, which is not supported here): https://github.com/Ali2500/BURST-benchmark

TrackEval (a powerful tool with more functionalities): https://github.com/JonathonLuiten/TrackEval

My video object segmentation projects:

XMem, latest: https://github.com/hkchengrex/XMem

STCN: https://github.com/hkchengrex/STCN

MiVOS: https://github.com/hkchengrex/MiVOS

