
BM3D denoising filter for VapourSynth, implemented in CUDA


VapourSynth-BM3DCUDA

This package contains the CUDA implementation of the VapourSynth-BM3DCUDA filter.

Installation

pip install vapoursynth-bm3dcuda

Building from source

Requirements

  • CUDA Toolkit: 12.8 or newer.

  • C++ Compiler: A C++20 compatible compiler.

    • Windows: Visual Studio 2022 or newer.
    • Linux: GCC
  • Windows

    $env:CUDA_PATH = "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0"
    uv build --package vapoursynth-bm3dcuda
    
  • Linux

    export CUDA_PATH=/usr/local/cuda-12.8
    uv build --package vapoursynth-bm3dcuda
    

Detailed parameter information from the parent project follows.


VapourSynth-BM3DCUDA

Copyright© 2021 WolframRhodium

BM3D denoising filter for VapourSynth, implemented in CUDA.

Description

  • Please check VapourSynth-BM3D.

  • The _rtc version compiles GPU code at runtime, which might run faster than the standard version at the cost of a slight overhead.

  • The cpu version is implemented with AVX and AVX2 intrinsics and serves as a reference implementation on the CPU. However, bitwise-identical outputs are not guaranteed across the CPU and CUDA implementations.
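Since the three variants live in separate plugin namespaces, a script may want to check which ones are loaded before calling them. The following is a minimal sketch, not part of the project: the helper names and the preference order are assumptions made for illustration, and only the namespace names (bm3dcuda_rtc, bm3dcuda, bm3dcpu) come from this README.

```python
# Namespaces named in this README. Which of them exist at runtime depends on
# the plugins that are actually installed and loadable.
# The preference order below is an assumption made for this sketch.
BM3D_NAMESPACES = ("bm3dcuda_rtc", "bm3dcuda", "bm3dcpu")


def available_bm3d_namespaces(core):
    """Return the BM3D namespaces that `core` exposes, in preference order."""
    return [name for name in BM3D_NAMESPACES if hasattr(core, name)]


def pick_bm3d_namespace(core):
    """Return the first available namespace, or raise if none is loaded."""
    found = available_bm3d_namespaces(core)
    if not found:
        raise RuntimeError("no BM3D plugin namespace found on this core")
    return found[0]
```

In a real script, `core` would be the object obtained with `from vapoursynth import core`, and the selected plugin could then be reached with `getattr(core, pick_bm3d_namespace(core))`.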

Requirements

  • CPU with AVX support.

  • CUDA-enabled GPU(s) of compute capability 5.0 or higher (Maxwell+).

  • GPU driver 450 or newer.

Compute capability 3.5 is the absolute minimum, but targeting it requires manual compilation (specifying the nvcc flag -gencode arch=compute_35,code=sm_35).

The cpu version does not require any external libraries, but it additionally requires AVX2 support on the CPU.

Parameters

{bm3dcuda, bm3dcuda_rtc, bm3dcpu}.BM3D(clip clip[, clip ref=None, float[] sigma=3.0, int[] block_step=8, int[] bm_range=9, int radius=0, int[] ps_num=2, int[] ps_range=4, bint chroma=False, int device_id=0, bool fast=True, int extractor_exp=0])
  • clip:

    The input clip. Must be in 32-bit float format. Each plane is denoised separately if chroma is set to False. Data of unprocessed planes is undefined. Frame properties of the output clip are copied from it.

  • ref:

    The reference clip. Must be of the same format, width, height, number of frames as clip.

    Used in block-matching and as the reference in empirical Wiener filtering, i.e. bm3d.Final / bm3d.VFinal:

    basic = core.{bm3dcpu, bm3dcuda, bm3dcuda_rtc}.BM3D(src, radius=0)
    final = core.{bm3d...}.BM3D(src, ref=basic, radius=0)
    
    # temporal denoising, r > 0
    vbasic = core.{bm3d...}.BM3D(src, radius=r).bm3d.VAggregate(radius=r)
    vfinal = core.{bm3d...}.BM3D(src, ref=vbasic, radius=r).bm3d.VAggregate(radius=r)
    
    # alternatively, using the v2 interface
    basic_or_vbasic = core.{bm3dcpu, bm3dcuda, bm3dcuda_rtc}.BM3Dv2(src, radius=r)
    final_or_vfinal = core.{bm3d...}.BM3Dv2(src, ref=basic_or_vbasic, radius=r)
    

    These correspond, respectively, to the following VapourSynth-BM3D calls (ignoring color space handling and other differences in implementation):

    basic = core.bm3d.Basic(src)
    final = core.bm3d.Final(src, ref=basic)
    
    vbasic = core.bm3d.VBasic(src, radius=r).bm3d.VAggregate(radius=r, sample=1)
    vfinal = core.bm3d.VFinal(src, ref=vbasic, radius=r).bm3d.VAggregate(radius=r)
    
  • sigma: The strength of denoising for each plane.

    The strength is similar, but not strictly equal, to that of VapourSynth-BM3D due to differences in implementation (for example, coefficient normalization is not implemented).

    Default [3,3,3].

  • block_step, bm_range, radius, ps_num, ps_range:

    Same as those in VapourSynth-BM3D.

    If chroma is set to True, only the first value is in effect.

    Otherwise an array of values may be specified for each plane (except radius).

    Note: A large value of ps_num is generally not recommended, because the current implementations do not take duplicate block-matching candidates into account during temporal searching, which may lead to a regression in denoising quality. This issue is not present in VapourSynth-BM3D.

    Note 2: Lowering block_step reduces blocking artifacts at the cost of slower processing.

  • chroma:

    Enables the CBM3D algorithm. clip must be in YUV444PS format.

    Y channel is used in block-matching of chroma channels.

    Default False.

  • device_id:

    Index of the GPU to use.

    Default 0.

  • fast:

    Uses multi-threaded copies between CPU and GPU at the expense of 4x memory consumption.

    Default True.

  • extractor_exp:

    Used for deterministic (bitwise) output. This parameter is not present in the cpu version since the implementation always produces deterministic output.

    Pre-rounding is employed to make floating-point summation associative.

    To enable deterministic output, the value should be an integer not less than 3, and may need to be higher depending on the source video and filter parameters.

    Default 0 (non-deterministic output).
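The extractor_exp description relies on the fact that floating-point addition is not associative, so sums accumulated in a different order can differ in their last bits. The sketch below illustrates the general pre-rounding idea in plain Python double precision. It is not the plugin's kernel code: the function `preround` and its knob `k` are hypothetical, and how extractor_exp maps onto the rounding granularity inside the filter is not specified here.

```python
def preround(x, k=32):
    """Round x to a multiple of 2**(k - 52) by adding and subtracting a constant.

    Valid for |x| < 2**(k - 1) in IEEE-754 double precision. `k` is a
    hypothetical knob for this demo, not the plugin's extractor_exp.
    """
    # x + big lands in [2**k, 2**(k + 1)), where doubles are spaced 2**(k - 52) apart.
    big = 1.5 * 2.0 ** k
    return (x + big) - big


values = [0.1, 0.2, 0.3]

# Plain summation depends on how the additions are grouped.
naive_forward = (values[0] + values[1]) + values[2]
naive_backward = values[0] + (values[1] + values[2])

# After pre-rounding, every partial sum is exactly representable,
# so any grouping produces the same bits.
rounded = [preround(v) for v in values]
fixed_forward = (rounded[0] + rounded[1]) + rounded[2]
fixed_backward = rounded[0] + (rounded[1] + rounded[2])

print("naive sums equal:     ", naive_forward == naive_backward)
print("pre-rounded sums equal:", fixed_forward == fixed_backward)
```

The price of this determinism is a small, bounded rounding error per summand (at most 2**-21 with the default `k` in this demo), which is the same kind of trade-off the parameter description hints at when it says the value may need to be raised for some sources.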

Notes

  • bm3d.VAggregate should be called after temporal filtering, as in VapourSynth-BM3D. Alternatively, you may use the BM3Dv2() interface for both spatial and temporal denoising in one step.

  • The _rtc version has three additional experimental parameters:

    • bm_error_s: (string)

      Specify cost for block similarity measurement.

      Currently implemented costs: SSD (Sum of Squared Differences), SAD (Sum of Absolute Differences), ZSSD (Zero-mean SSD), ZSAD (Zero-mean SAD), SSD/NORM.

      Default SSD.

    • transform_2d_s/transform_1d_s: (string)

      Specify type of transform.

      Currently implemented transforms: DCT (Discrete Cosine Transform), Haar (Haar Transform), WHT (Walsh–Hadamard Transform), Bior1.5 (transform based on a bi-orthogonal spline wavelet).

      Default DCT.

    These features are not implemented in the standard version due to performance and binary size concerns.
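The two temporal workflows mentioned in these notes (BM3D followed by bm3d.VAggregate, or BM3Dv2 in one step) can be wrapped in a small helper. This is a sketch under assumptions: the function name `bm3d_temporal` and its arguments are hypothetical, the same keyword arguments are forwarded to both stages for simplicity, and the VAggregate path assumes VapourSynth-BM3D is also installed. Only the plugin calls themselves are taken from the examples in this README.

```python
def bm3d_temporal(core, src, radius, namespace="bm3dcuda", use_v2=True, **kwargs):
    """Two-stage (basic + final) temporal BM3D using the calls shown in this README.

    `core` is a VapourSynth core, `src` a 32-bit float clip, and radius > 0.
    Extra keyword arguments (for example sigma, or the _rtc-only options)
    are forwarded unchanged to both stages.
    """
    if radius <= 0:
        raise ValueError("radius must be positive for temporal denoising")
    plugin = getattr(core, namespace)

    if use_v2:
        # BM3Dv2 performs filtering and aggregation in one step.
        basic = plugin.BM3Dv2(src, radius=radius, **kwargs)
        return plugin.BM3Dv2(src, ref=basic, radius=radius, **kwargs)

    # BM3D output must be passed through bm3d.VAggregate after each stage.
    basic = plugin.BM3D(src, radius=radius, **kwargs).bm3d.VAggregate(radius=radius)
    final = plugin.BM3D(src, ref=basic, radius=radius, **kwargs)
    return final.bm3d.VAggregate(radius=radius)
```

With the _rtc plugin, a call might look like `bm3d_temporal(core, src, radius=1, namespace="bm3dcuda_rtc", bm_error_s="ZSSD")`; the exact option strings accepted should be checked against the list above.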

Statistics

GPU memory consumption:

(ref ? 4 : 3) * (chroma ? 3 : 1) * (fast ? 4 : 1) * (2 * radius + 1) * size_of_a_single_frame
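As a worked example, the formula can be evaluated with a few lines of Python. The function name is hypothetical, and the sample frame size assumes that size_of_a_single_frame means one 1920x1080 plane of 32-bit floats; the README does not state whether the term refers to a single plane or a whole frame.

```python
def bm3d_gpu_memory_bytes(frame_bytes, radius=0, ref=False, chroma=False, fast=True):
    """Evaluate the README's GPU memory formula; the result is in bytes."""
    return (
        (4 if ref else 3)
        * (3 if chroma else 1)
        * (4 if fast else 1)
        * (2 * radius + 1)
        * frame_bytes
    )


# Assumption: one 1920x1080 plane of 32-bit floats (4 bytes per sample).
frame_bytes = 1920 * 1080 * 4
estimate = bm3d_gpu_memory_bytes(frame_bytes, radius=1, ref=True, chroma=False, fast=True)
print(f"{estimate} bytes = {estimate / 2**20:.1f} MiB")
```

Under these assumptions the factors are 4 * 1 * 4 * 3 = 48 frame-sized buffers; setting fast=False in the same call would divide the estimate by four.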

Compilation

  • The CMake configuration of BM3DCUDA_RTC links to NVRTC static library by default, which requires CUDA 11.5 or later.

    cmake -S . -B build -D CMAKE_BUILD_TYPE=Release -D CMAKE_CUDA_FLAGS="--threads 0 --use_fast_math -Wno-deprecated-gpu-targets" -D CMAKE_CUDA_ARCHITECTURES="50;61-real;75-real;86"

    cmake --build build --config Release
