Skip to main content

PyTorch code for 'FovVideoVDP': a full-referencevisual quality metric that predicts the perceptualdifference between pairs of images or videos.

Project description

FovVideoVDP: A visible difference predictor for wide field-of-view video

FovVideoVDP is a full-reference visual quality metric that predicts the perceptual difference between pairs of images and videos. Similar to popular metrics like PSNR and SSIM, it is aimed at comparing a ground truth reference video against a distorted (e.g. compressed, lower framerate) version.

However, unlike traditional quality metrics, FovVideoVDP works for videos in addition to images, accounts for peripheral acuity, works with SDR and HDR content. We model the response of the human visual system to changes over time as well as across the visual field, so we can predict temporal artifacts like flicker and judder, as well as spatiotemporal artifacts as perceived at different degrees of peripheral vision. Such a metric is important for head-mounted displays as it accounts for both the dynamic content, as well as the large field of view.

FovVideoVDP has both a PyTorch and MATLAB implementations. The usage is described below.

The details of the metric can be found in:

Mantiuk, Rafał K., Gyorgy Denes, Alexandre Chapiro, Anton Kaplanyan, Gizem Rufo, Romain Bachy, Trisha Lian, and Anjul Patney. “FovVideoVDP : A Visible Difference Predictor for Wide Field-of-View Video.” ACM Transaction on Graphics 40, no. 4 (2021): 49. https://doi.org/10.1145/3450626.3459831.

The paper, videos and additional results can be found at the project web page: https://www.cl.cam.ac.uk/research/rainbow/projects/fovvideovdp/

If you use the metric in your research, please cite the paper above.

PyTorch quickstart

Start by installing the right version of PyTorch (with CUDA if supported) by following these instructions. Then, install pyfvvdp with PyPI pip install pyfvvdp or Anaconda conda install -c gfxdisp -c conda-forge pyfvvdp. You can run fvvdp directly from the command line:

fvvdp --test test_file --ref ref_file --display standard_fhd

The test and reference files can be images or videos. The option --display specifies a display on which the content is viewed. See fvvdp_data/display_models.json for the available displays.

Note that the default installation skips the PyEXR package and uses ImageIO instead. It is recommended to separately install this package since ImageIO's handling of OpenEXR files is unreliable as evidenced here. PyEXR is not automatically installed because it depends on the OpenEXR library, whose installation is operating system specific.

See Command line interface for further details. FovVideoVDP can be also run directly from Python - see Low-level Python interface.

Table of contents

Display specification

Unlike most image quality metrics, FovVideoVDP needs physical specification of the display (e.g. its size, resolution, peak brightness) and viewing conditions (viewing distance, ambient light) to compute accurate predictions. The specifications of the displays are stored in fvvdp_data/display_models.json. You can add the exact specification of your display to this file, or create a new JSON file and pass the directory it is located in as --config-dir parameter (fvvdp command). If the display specification is unknown to you, you are encouraged to use one of the standard display specifications listed on the top of that file, for example standard_4k, or standard_fhd. If you use one of the standard displays, there is a better chance that your results will be comparable with other studies.

You specify the display by passing --display argument to the PyTorch code, or display_name parameter to the MATLAB code.

Note the the specification in display_models.json is for the display and not the image. If you select to use standard_4k with the resolution of 3840x2160 for your display and pass a 1920x1080 image, the metric will assume that the image occupies one quarter of that display (the central portion). If you want to enlarge the image to the full resolution of the display, pass --full-screen-resize {fast_bilinear,bilinear,bicubic,lanczos} option (now works with video only).

The command line version of FovVideoVDP can take as input HDR video streams encoded using the PQ transfer function (from version 1.1.4). To correctly model HDR content, it is necessary to pass a display model with EOTF="PQ", for example standard_hdr.

Custom specification

The display photometry and geometry is typically specified by passing display_name parameter to the metric. Alternatively, if you need more flexibility in specifying display geometry (size, fov, viewing distance) and its colorimetry, you can instead pass objects of the classes fvvdp_display_geometry, fvvdp_display_photo_gog for most SDR displays, and fvvdp_display_photo_absolute for HDR displays. You can also create your own subclasses of those classes for custom display specification.

HDR content

(Python command line only) You can use the metric to compare:

  • HDR video files encoded using PQ EOTF function (SMPTE ST 2084). Pass the video files as --test and --ref arguments and specify --display standard_hdr_pq.

  • OpenEXR images. The images MUST contain absolute linear colour values (colour graded values, emitted from the display). That is, if the disply peak luminance is 1000, RGB=(1000,1000,1000) corresponds to the maximum value emitted from the display. If you pass images with the maximum value of 1, the metric will assume that the images are very dark (the peak of 1 nit) and result in incorerect predictrions. You need to specify --display standard_hdr_linear to use correct EOTF.

Reporting metric results

When reporting the results of the metric, please include the string returned by the metric, such as: "FovVideoVDP v1.2.0, 75.4 [pix/deg], Lpeak=200, Lblack=0.5979 [cd/m^2], non-foveated, (standard_4k)" This is to ensure that you provide enough details to reproduce your results.

Predicted quality scores

FovVideoVDP reports image/video quality in the JOD (Just-Objectionable-Difference) units. The highest quality (no difference) is reported as 10 and lower values are reported for distorted content. In case of very strong distortion, or when comparing two unrelated images, the quality value can drop below 0.

The main advantage of JODs is that they (a) should be linearly related to the perceived magnitude of the distortion and (b) the difference of JODs can be interpreted as the preference prediction across the population. For example, if method A produces a video with the quality score of 8 JOD and method B gives the quality score of 9 JOD, it means that 75% of the population will choose method B over A. The plots below show the mapping from the difference between two conditions in JOD units to the probability of selecting the condition with the higher JOD score (black numbers on the left) and the percentage increase in preference (blue numbers on the right). For more explanation, please refer to Section 3.9 and Fig. 9 in the main paper.

The differences in JOD scores can be converted to the percentage increase in preference (or the probability selecting A over B) using the MATLAB function fvvdp_preference.

Fine JOD scale Coarse JOD scale

PyTorch

Command line interface

The main script to run the model on a set of images or videos is run_fvvdp.py, from which the binary fvvdp is created . Run fvvdp --help for detailed usage information.

For the first example, a video was downsampled (4x4) and upsampled (4x4) by different combinations of Bicubic and Nearest filters. To predict quality, you can run:

fvvdp --test example_media/aliasing/ferris-*-*.mp4 --ref example_media/aliasing/ferris-ref.mp4 --gpu 0 --display standard_fhd --heatmap supra-threshold
Original ferris wheel Quality Difference map
Bicubic ↓
Bicubic ↑
(4x4)
bicubic-bicubic 6.469 bicubic-bicubic-diff
Bicubic ↓
Nearest ↑
(4x4)
bicubic-nearest 6.328 bicubic-nearest-diff
Nearest ↓
Bicubic ↑
(4x4)
nearest-bicubic 5.923 nearest-bicubic-diff
Nearest ↓
Nearest ↑
(4x4)
nearest-nearest 5.821 nearest-nearest-diff

Low-level Python interface

FovVideoVDP can also be run through the Python interface by instatiating the pyfvvdp.fvvdp.fvvdp class. This example shows how to predict the quality of images degraded by Gaussian noise and blur.

import os
import numpy as np
import pyfvvdp
import ex_utils as utils

I_ref = pyfvvdp.load_image_as_array(os.path.join('example_media', 'wavy_facade.png'))
fv = pyfvvdp.fvvdp(display_name='standard_4k', heatmap='threshold')

# Gaussian noise with variance 0.003
I_test_noise = utils.imnoise(I_ref, np.sqrt(0.003))
Q_JOD_noise, stats_noise = fv.predict( I_test_noise, I_ref, dim_order="HWC" )

# Gaussian blur with sigma=2
I_test_blur = utils.imgaussblur(I_ref, 2)
Q_JOD_blur, stats_blur = fv.predict( I_test_blur, I_ref, dim_order="HWC" )
Original wavy-facade Quality Difference map
Gaussian noise (σ2 = 0.003) noise 9.537 noise-diff
Gaussian blur (σ = 2) blur 8.693 blur-diff

More examples can be found in these example scripts.

MATLAB

MATLAB code for the metric can be found in matlab/fvvdp.m. The full documentation of the metric can be shown by typing doc fvvdp.

The best starting point is the examples, which can be found in matlab/examples. For example, to measure the quality of a noisy image and display the difference map, you can use the code:

I_ref = imread( 'wavy_facade.png' );
I_test_noise = imnoise( I_ref, 'gaussian', 0, 0.001 );

[Q_JOD_noise, diff_map_noise] = fvvdp( I_test_noise, I_ref, 'display_name', 'standard_phone', 'heatmap', 'threshold' );

clf
imshow( diff_map_noise );

By default, FovVideoVDP will run the code on a GPU using gpuArrays, which require functioning CUDA on your computer. If you do not have GPU with CUDA support (e.g. you are on Mac), the code will automatically fallback to the CPU, which will be much slower.

Low-level MATLAB interface

fvvdp function is the suitable choice for most cases. But if you need to run metric on large datasets, you can use a low-level function fvvdp_core. It requires as input an object of the class fvvdp_video_source, which supplies the metric with the frames. Refer to the documentation of that class for further details.

Differences between MATLAB and Pytorch versions

  • Both versions are implementation of the same metric, but due to differences in the video loaders, you can expect to see small differences in their predictions - typically up to 0.05 JOD.
  • PyTorch version is a bit faster when running on a GPU.

Release notes

  • v1.2.0 - 19 April 2023

    • The python command line interface can now accept HDR video files and OpenEXR images.
    • PU21-PSNR can be run in addition to FovVideoVDP
    • Added new command line options: --full-screen-resize, --metrics, --temp-padding, --feature, --output-dir
    • Faster decoding for large (4k) videos when run on CUDA (python, the command line interface fvvdp).
    • The prediction may differ slightly because of the way video files are handled and processed. There are no changes in the metric.
    • Fixed issue when reading video frames in python freezed on some platforms (issue with python's /dev/null redirection)
  • v1.1.3 - 18 October 2022

    • Added "raw" heatmap type to the command line.
    • Changed the way pyfvvdp classes are imported to avoid clash between the file and class names (see updated pytortch_examples)
    • Installation of PyEXR is now optional (caused problems on some operating systems).
  • v1.1.2 - 23 September 2022

    • Updated Python dependencies - now works with earlier versions of PyTorch, Numpy and SciPy
  • v1.1.1 - 28 August 2022

    • We found a small inconsistency in eccentricity calculations. After fixing this, the metric has been retrained on the same datasets as described in the paper. FovVideoVDP v1.1 will return JOD values that are different than v1.0. For that reason, it is important to mention the version number when reporting the results.
    • Python interface has been thoroughly redesigned and make more consistent with Matlab's conterpart. Now it should be much easier to call the metric from Python.
    • All Matlab examples have been ported to Python.
    • Python version is now faster.
    • Published as a PIP repository.
  • v1.0 - 21 April 2021

    • The original FovVideoVDP release, released with the paper.

The detailed list of changes can be found in ChangeLog.md.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pyfvvdp-1.2.0.tar.gz (264.1 kB view details)

Uploaded Source

File details

Details for the file pyfvvdp-1.2.0.tar.gz.

File metadata

  • Download URL: pyfvvdp-1.2.0.tar.gz
  • Upload date:
  • Size: 264.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.6

File hashes

Hashes for pyfvvdp-1.2.0.tar.gz
Algorithm Hash digest
SHA256 83a8e7e1681211edcc1ec5083bf4c50ea8e31442e6550dd76beacd9e3a31ad0f
MD5 3f900c459a45b76c0b6e92cc7e58252f
BLAKE2b-256 c011e99e42939a2178aa35619e633910b3774484e7b94e20d86957aa590b830f

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page