Compressed Video Quality Assessment Tool

FDIM

Introduction

FDIM is a feature-distance-based video quality assessment (VQA) metric designed to generalize across:

  • Traditional and neural codecs
  • SDR and HDR formats
  • Diverse resolutions and content types

FDIM uses a hybrid architecture consisting of:

  • Deep branch: learns multi-scale representations to capture distortions ranging from low-level fidelity degradation to high-level semantic differences, using a content-adaptive feature comparison mechanism.
  • Hand-crafted branch: improves robustness and generalization across domains.

FDIM is trained on the large-scale DCVQA dataset (16k+ samples covering both conventional and neural codecs) and delivers strong, consistent performance across multiple public SDR and HDR VQA benchmarks.

[Figures: FDIM overview; SDR radar results; HDR radar results; bubble plot results]

The package supports quality evaluation for either a single compressed video or a collection of compressed videos in YUV or RGB format.

Citation

If you find this work useful, please cite:

@misc{wang2026fdimfeaturedistancebasedgenericvideo,
  title={FDIM: A Feature-distance-based Generic Video Quality Metric for Versatile Codecs},
  author={Jiayi Wang and Lichun Zhang and Xiaoqi Zhuang and Jiaqi Zhang and Lu Yu and Yin Zhao},
  year={2026},
  eprint={2604.24123},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2604.24123}
}

Installation

Prerequisite

  • CUDA 12.2 (verified)
  • Python 3.9
  • ffmpeg available in PATH

Set up the conda environment

conda create -n fdim python=3.9.20
conda activate fdim

Install packages for inference

Install from PyPI:

pip install fdim

The package provides a command line entry point named fdim. The default checkpoint and VMAF executable are included in the package.

For local development, run the following command from the repository root:

pip install -e .

You can also use:

bash install.sh

Note: install.sh will create and activate a conda environment named fdim automatically. If you already created and activated the environment manually, prefer pip install -e . to avoid duplicated setup.

Publish to PyPI

Build and upload the package from the repository root:

python -m pip install build twine
python -m build
python -m twine upload dist/*

After upload, users can install it with pip install fdim and run fdim --help.

Instructions for running FDIM

1. Download the checkpoint/model

The default checkpoint path used by the scripts is:

  • fdim: put it in ./fdim/dist/checkpoints/

The current repository uses ./fdim/dist/checkpoints/dist_5.0.0.ckpt by default. If you want to use another checkpoint, pass it explicitly with --model_path <path_to_checkpoint>.

2. Prepare video information file

Create a CSV file in ./data/dataset/ and enter the information for every video you want to evaluate, as follows:

| ref_name | dis_name | mos | ref_width | ref_height | dis_width | dis_height | ref_bits | dis_bits |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SRC1001_1920x1080_25_yuv420p.mp4 | SRC1001_1920x1080_25_yuv420p.mp4.x265.r0.265.mp4 | 4.854890404 | 1920 | 1080 | 1920 | 1080 | 8 | 8 |

  • ref_name: The name of the reference video.
  • dis_name: The name of the test video.
  • mos: The ground truth of video quality. If unavailable, set it to 0.
  • ref_width, ref_height: The video resolution of reference video.
  • dis_width, dis_height: The video resolution of distorted video.

If your YUV videos are 8-bit, you can omit the "ref_bits" and "dis_bits" columns.
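
The file can also be generated programmatically. Below is a minimal sketch using Python's standard csv module. The column names and the example row come from the table above; the helper name write_video_csv, the output file name my_videos.csv, and the comma delimiter are assumptions for illustration, not part of the package.

```python
import csv
from pathlib import Path

# Column names as listed in the table above.
COLUMNS = ["ref_name", "dis_name", "mos", "ref_width", "ref_height",
           "dis_width", "dis_height", "ref_bits", "dis_bits"]


def write_video_csv(rows, csv_path):
    """Write one row per reference/distorted pair (hypothetical helper).

    A row without a "mos" key gets 0, matching the rule for unavailable MOS.
    """
    csv_path = Path(csv_path)
    csv_path.parent.mkdir(parents=True, exist_ok=True)
    with csv_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        for row in rows:
            writer.writerow({"mos": 0, **row})
    return csv_path


if __name__ == "__main__":
    example = [{
        "ref_name": "SRC1001_1920x1080_25_yuv420p.mp4",
        "dis_name": "SRC1001_1920x1080_25_yuv420p.mp4.x265.r0.265.mp4",
        "mos": 4.854890404,
        "ref_width": 1920, "ref_height": 1080,
        "dis_width": 1920, "dis_height": 1080,
        "ref_bits": 8, "dis_bits": 8,
    }]
    # Assumed file name inside the directory named in step 2.
    write_video_csv(example, "./data/dataset/my_videos.csv")
```

The resulting path is what you would pass to --csv_path in the inference step.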

3. Inference

Evaluate the quality of all videos in a dataset

After pip install fdim, use the batch CLI:

fdim batch \
    --save_dir ./data/result \
    --save_name fdim_test \
    --ref_dir <path_to_reference_video_dir> \
    --dis_dir <path_to_distorted_video_dir> \
    --csv_path <path_to_csv_file> \
    --ref_fmt rgb \
    --dis_fmt rgb \
    --preprocess none \
    --video_temp_path ./data/video_temp/ \
    --gpu_idx 0
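
If you drive the batch CLI from a Python script, the same call can be assembled as an argument list. This is a sketch under stated assumptions: it uses only the flags shown in the command above, and the function names build_batch_command and run_batch are hypothetical wrappers, not part of the fdim package.

```python
import shlex
import subprocess


def build_batch_command(ref_dir, dis_dir, csv_path,
                        save_dir="./data/result", save_name="fdim_test",
                        gpu_idx=0):
    """Assemble the `fdim batch` argument list from the flags shown above."""
    return [
        "fdim", "batch",
        "--save_dir", save_dir,
        "--save_name", save_name,
        "--ref_dir", ref_dir,
        "--dis_dir", dis_dir,
        "--csv_path", csv_path,
        "--ref_fmt", "rgb",
        "--dis_fmt", "rgb",
        "--preprocess", "none",
        "--video_temp_path", "./data/video_temp/",
        "--gpu_idx", str(gpu_idx),
    ]


def run_batch(**kwargs):
    """Print the command, then run it; raises CalledProcessError on failure."""
    cmd = build_batch_command(**kwargs)
    print("Running:", shlex.join(cmd))
    return subprocess.run(cmd, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems when directory names contain spaces.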

If you run from source with the legacy scripts, make sure the VMAF executable has execute permission:

chmod +x ./fdim/vmaf/vmaf
  1. If the reference video and distorted video are not in YUV format:

    python dataset_test.py \
        --metric fdim \
        --save_dir ./data/result \
        --save_name fdim_test \
        --ref_dir <path_to_reference_video_dir> \
        --dis_dir <path_to_distorted_video_dir> \
        --csv_path <path_to_csv_file> \
        --ref_fmt rgb \
        --dis_fmt rgb \
        --preprocess none \
        --video_temp_path ./data/video_temp/ \
        --gpu_idx 0
    
  2. If the reference video is in YUV format, --ref_width_column, --ref_height_column and --ref_fmt must be provided. If its bit depth is not 8, --ref_bit_depth_column must also be provided.

    If the distorted video is in YUV format, --dis_width_column, --dis_height_column and --dis_fmt must be provided. If its bit depth is not 8, --dis_bit_depth_column must also be provided.

    python dataset_test.py \
        --metric fdim \
        --save_dir data/result \
        --save_name fdim_eem_sample \
        --csv_path <path_to_csv_file> \
        --ref_dir <path_to_reference_video_dir> \
        --dis_dir  <path_to_distorted_video_dir> \
        --ref_column <reference video name column in csv file> \
        --dis_column <distorted video name column in csv file> \
        --ref_fmt <reference video format column in csv file> \
        --dis_fmt <distorted video format column in csv file> \
        --ref_width_column <reference video width column in csv file> \
        --ref_height_column <reference video height column in csv file> \
        --dis_width_column <distorted video width column in csv file> \
        --dis_height_column <distorted video height column in csv file> \
        --ref_bit_depth_column <reference video bitdepth column in csv file> \
        --dis_bit_depth_column <distorted video bitdepth column in csv file> \
        --video_temp_path ./data/video_temp/ \
        --gpu_idx 0
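
Raw YUV files carry no header, which is why width, height, and bit depth have to be supplied for them. The sketch below shows the arithmetic for planar 4:2:0 formats such as yuv420p and yuv420p10le, and can serve as a sanity check on the values you put in the CSV. The function names are illustrative and not part of the package.

```python
def yuv420_frame_bytes(width, height, bit_depth=8):
    """Bytes in one planar YUV 4:2:0 frame.

    One full-size luma plane plus two chroma planes subsampled by 2 in each
    direction. Samples deeper than 8 bits are stored in 2 bytes each.
    """
    bytes_per_sample = 1 if bit_depth <= 8 else 2
    chroma_w = (width + 1) // 2
    chroma_h = (height + 1) // 2
    samples = width * height + 2 * chroma_w * chroma_h
    return samples * bytes_per_sample


def frame_count(file_size_bytes, width, height, bit_depth=8):
    """Number of frames in a raw file, or ValueError if the size does not fit."""
    frame_bytes = yuv420_frame_bytes(width, height, bit_depth)
    if file_size_bytes % frame_bytes != 0:
        raise ValueError(
            "file size is not a whole number of frames; "
            "check width, height, and bit depth"
        )
    return file_size_bytes // frame_bytes
```

A file size that is not a whole multiple of the frame size usually means one of the three values is wrong.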
    

Evaluate the quality of a test video

After pip install fdim, use the single-video CLI:

fdim single \
    --ref <ref_video_path> \
    --dis <dis_video_path> \
    --video_temp_path ./data/video_temp/ \
    --gpu_idx 0
  1. If the reference video and distorted video are not in YUV format:

    python single_test.py --metric fdim --ref_video_root <ref_video_path> --dis_video_root <dis_video_path> --video_temp_path ./data/video_temp/ --gpu_idx 0

  2. If the reference video and distorted video are in YUV format:

    python single_test.py \
        --metric fdim \
        --ref_video_root <ref_video_path> \
        --dis_video_root <dis_video_path> \
        --ref_fmt <reference video format, such as yuv420p, yuv420p10le> \
        --dis_fmt <distorted video format, such as yuv420p, yuv420p10le> \
        --ref_width <reference video width> \
        --ref_height <reference video height> \
        --dis_width <distorted video width> \
        --dis_height <distorted video height> \
        --ref_bit_depth <reference video bitdepth> \
        --dis_bit_depth <distorted video bitdepth> \
        --video_temp_path ./data/video_temp/ \
        --gpu_idx 0
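
When file names already encode the geometry, as in the example name SRC1001_1920x1080_25_yuv420p.mp4, the YUV flags can be derived instead of typed by hand. This sketch assumes a `<name>_<width>x<height>_<fps>_<pixel format>` naming convention inferred from that one example; it is not a documented requirement of FDIM, and both helper names are hypothetical.

```python
import re

# Assumed convention: <name>_<width>x<height>_<fps>_<pixel format>
NAME_PATTERN = re.compile(
    r"_(?P<width>\d+)x(?P<height>\d+)_\d+_(?P<fmt>yuv[0-9a-z]+)"
)


def parse_video_name(name):
    """Extract pixel format, width, height, and bit depth from a file name."""
    m = NAME_PATTERN.search(name)
    if m is None:
        raise ValueError(f"name does not follow the assumed pattern: {name}")
    fmt = m.group("fmt")
    # Formats such as yuv420p10le carry the bit depth; plain yuv420p is 8-bit.
    depth = re.search(r"p(\d+)(?:le|be)$", fmt)
    return {
        "fmt": fmt,
        "width": int(m.group("width")),
        "height": int(m.group("height")),
        "bit_depth": int(depth.group(1)) if depth else 8,
    }


def yuv_flags(prefix, info):
    """Build the --<prefix>_* flags shown above; prefix is "ref" or "dis"."""
    return [
        f"--{prefix}_fmt", info["fmt"],
        f"--{prefix}_width", str(info["width"]),
        f"--{prefix}_height", str(info["height"]),
        f"--{prefix}_bit_depth", str(info["bit_depth"]),
    ]
```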
    

Inference on HDR content

For PQ/HLG content, enable PU21 preprocessing (--preprocess pu21) and select (or customize) the correct display model (--display_model <name>). The available display model definitions are stored in fdim/dist/pycvvdp/vvdp_data/display_models.json.

If --display_model is not provided while --preprocess pu21 is enabled, the code uses standard_hdr_pq_tv by default.
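
The default rule above can be sketched as follows, together with a helper for listing the names available in display_models.json. Both function names are hypothetical, and the assumption that the JSON file is a top-level object keyed by display model name has not been checked against the packaged file.

```python
import json

DEFAULT_HDR_DISPLAY_MODEL = "standard_hdr_pq_tv"


def resolve_display_model(preprocess, display_model=None):
    """Mirror the documented default: PU21 without a model uses the PQ TV model."""
    if preprocess == "pu21" and not display_model:
        return DEFAULT_HDR_DISPLAY_MODEL
    # Only the PU21 default is documented, so other cases pass through unchanged.
    return display_model


def list_display_models(json_path):
    """Return the sorted model names, assuming a top-level JSON object."""
    with open(json_path, encoding="utf-8") as f:
        return sorted(json.load(f))
```

Listing the names first helps avoid passing a --display_model value that is not defined in the file.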

Reference notes:

  • pycvvdp in this repository is a vendored third-party module adapted from ColorVideoVDP, which provides the display model and video source utilities used by the HDR preprocessing path.
  • PU21 refers to the perceptually uniform HDR encoding proposed in "PU21: A novel perceptually uniform encoding for adapting existing quality metrics for HDR" and is used here through the integrated pycvvdp implementation.

Example:

python single_test.py \
    --metric fdim \
    --ref_video_root /path/to/ref.mp4 \
    --dis_video_root /path/to/dis.mp4 \
    --preprocess pu21 \
    --display_model standard_hdr_pq_tv \
    --video_temp_path ./data/video_temp/ \
    --gpu_idx 0

Low-complexity implementation for 4K videos

If your input videos are 4K and you want faster inference, set the resolution parameter --input_resolution 1080 to downsample frames before the deep model. In our experiments, this significantly improves runtime while only slightly reducing objective-subjective consistency.
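
To check the trade-off on your own data, you can compare predicted scores against the mos column with and without --input_resolution 1080. Objective-subjective consistency is commonly measured with a rank correlation such as SROCC; the source does not state which measure was used, so the pure-Python sketch below is an assumption, with tied values receiving average ranks.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values starting at i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def srocc(predicted, mos):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(predicted), average_ranks(mos))
```

Computing srocc on the full-resolution and downsampled predictions for the same videos gives a direct view of how much consistency is lost.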
