BM3D denoising filter for VapourSynth, implemented in AVX2

VapourSynth-BM3DCPU

This package contains the CPU implementation of the VapourSynth-BM3DCUDA filter.

Installation

Install the multi-target wheel, which includes binaries for multiple architectures (AVX2 and Zen4):

pip install vapoursynth-bm3dcpu

To compile and install the package with optimizations tailored to your specific CPU:

  • Windows, from within the developer shell:

    pip install vapoursynth-bm3dcpu --no-binary vapoursynth-bm3dcpu -C "cmake.define.CMAKE_CXX_COMPILER=clang++.exe"
    
  • Linux:

    pip install vapoursynth-bm3dcpu --no-binary vapoursynth-bm3dcpu
    

Building from source

Requirements

  • C++ Compiler: C++17 compatible.
    • Windows: MSVC (Visual Studio 2019+) or Clang.
    • Linux: GCC or Clang.

Example build command:

    uv build --package vapoursynth-bm3dcpu -C "cmake.define.CMAKE_CXX_COMPILER=clang++.exe" -C "cmake.define.BM3D_MULTI_TARGET=ON"

Detailed parameter information from the parent project follows.


VapourSynth-BM3DCUDA

Copyright © 2021 WolframRhodium

BM3D denoising filter for VapourSynth, implemented in CUDA.

Description

  • Please check VapourSynth-BM3D.

  • The _rtc version compiles GPU code at runtime, which might run faster than the standard version at the cost of a slight overhead.

  • The cpu version is implemented with AVX and AVX2 intrinsics and serves as a reference implementation on CPU. However, bitwise identical outputs are not guaranteed across the CPU and CUDA implementations.

Requirements

  • CPU with AVX support.

  • CUDA-enabled GPU(s) of compute capability 5.0 or higher (Maxwell+).

  • GPU driver 450 or newer.

The minimum supported compute capability is 3.5, but targeting it requires manual compilation (specifying the nvcc flag -gencode arch=compute_35,code=sm_35).

The cpu version does not require any external libraries, but it additionally requires a CPU with AVX2 support.
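One way to check the AVX/AVX2 requirement before installing is to look at the CPU feature flags. The following is a minimal sketch that assumes a Linux system exposing /proc/cpuinfo with a "flags" line; the helper names cpu_flags and supports are illustrative and are not part of this package.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set:
    """Collect the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def supports(cpuinfo_text: str, *required: str) -> bool:
    """Return True if every required flag (e.g. 'avx', 'avx2') is listed."""
    return set(required) <= cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    # /proc/cpuinfo is Linux-specific; other platforms need a different source.
    path = Path("/proc/cpuinfo")
    if path.exists():
        text = path.read_text()
        print("AVX:", supports(text, "avx"), "AVX2:", supports(text, "avx2"))
```

On Windows or macOS the flags have to come from another source, so treat this only as a quick sanity check.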

Parameters

{bm3dcuda, bm3dcuda_rtc, bm3dcpu}.BM3D(clip clip[, clip ref=None, float[] sigma=3.0, int[] block_step=8, int[] bm_range=9, int radius=0, int[] ps_num=2, int[] ps_range=4, bint chroma=False, int device_id=0, bool fast=True, int extractor_exp=0])
  • clip:

    The input clip. Must be in 32-bit float format. Each plane is denoised separately if chroma is set to False. Data of unprocessed planes is undefined. Frame properties of the output clip are copied from it.

  • ref:

    The reference clip. Must have the same format, width, height, and number of frames as clip.

    Used in block-matching and as the reference in empirical Wiener filtering, i.e. bm3d.Final / bm3d.VFinal:

    basic = core.{bm3dcpu, bm3dcuda, bm3dcuda_rtc}.BM3D(src, radius=0)
    final = core.{bm3d...}.BM3D(src, ref=basic, radius=0)
    
    # temporal denoising, with r > 0
    vbasic = core.{bm3d...}.BM3D(src, radius=r).bm3d.VAggregate(radius=r)
    vfinal = core.{bm3d...}.BM3D(src, ref=vbasic, radius=r).bm3d.VAggregate(radius=r)
    
    # alternatively, using the v2 interface
    basic_or_vbasic = core.{bm3dcpu, bm3dcuda, bm3dcuda_rtc}.BM3Dv2(src, radius=r)
    final_or_vfinal = core.{bm3d...}.BM3Dv2(src, ref=basic_or_vbasic, radius=r)
    

    corresponds, respectively, to the following (ignoring color space handling and other differences in implementation):

    basic = core.bm3d.Basic(src)
    final = core.bm3d.Final(src, ref=basic)
    
    vbasic = core.bm3d.VBasic(src, radius=r).bm3d.VAggregate(radius=r, sample=1)
    vfinal = core.bm3d.VFinal(src, ref=vbasic, radius=r).bm3d.VAggregate(radius=r)
    
  • sigma: The strength of denoising for each plane.

    The strength is similar (but not strictly equal) to that of VapourSynth-BM3D due to differences in implementation (for example, coefficient normalization is not implemented).

    Default [3,3,3].

  • block_step, bm_range, radius, ps_num, ps_range:

    Same as those in VapourSynth-BM3D.

    If chroma is set to True, only the first value is in effect.

    Otherwise an array of values may be specified for each plane (except radius).

    Note: A large value of ps_num is generally not recommended, because the current implementations do not take duplicate block-matching candidates into account during temporal searching, which may lead to a regression in denoising quality. This issue is not present in VapourSynth-BM3D.

    Note 2: Lowering the value of block_step helps reduce blocking artifacts at the cost of slower processing.

  • chroma:

    Enables the CBM3D algorithm. clip must be in YUV444PS format.

    The Y channel is used for block-matching of the chroma channels.

    Default False.

  • device_id:

    Set GPU to be used.

    Default 0.

  • fast:

    Multi-threaded copy between CPU and GPU at the expense of 4x memory consumption.

    Default True.

  • extractor_exp:

    Used for deterministic (bitwise reproducible) output. This parameter is not present in the cpu version, whose implementation always produces deterministic output.

    Pre-rounding is employed to make floating-point summation associative.

    To enable it, set a positive integer not less than 3; a higher value may be needed depending on the source video and filter parameters.

    Default 0 (non-deterministic output).
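The pre-rounding idea mentioned for extractor_exp can be illustrated in isolation. The sketch below is a generic double-precision demonstration and not the plugin's code: the names left_fold_sum, pre_round and pre_rounded_sum are hypothetical, and the way the exp argument maps to a rounding constant is an assumption made for illustration (the plugin works on 32-bit floats, and its exact use of extractor_exp is not described here). Adding and then subtracting a large constant snaps every term onto a common grid, after which the additions are exact and the result no longer depends on summation order.

```python
import itertools


def left_fold_sum(values):
    """Plain left-to-right float summation; the result can depend on order."""
    total = 0.0
    for v in values:
        total += v
    return total


def pre_round(value, exp):
    """Snap value onto a fixed grid by adding and subtracting a large constant.

    With extractor = 1.5 * 2**exp in double precision, the result is a
    multiple of 2**(exp - 52), provided abs(value) < 2**(exp - 1).
    """
    extractor = 1.5 * 2.0 ** exp
    return (value + extractor) - extractor


def pre_rounded_sum(values, exp):
    """Sum pre-rounded terms; exact while partial sums stay below 2**(exp + 1)."""
    return left_fold_sum(pre_round(v, exp) for v in values)


if __name__ == "__main__":
    data = [0.1, 0.2, 0.3]
    naive = {left_fold_sum(p) for p in itertools.permutations(data)}
    stable = {pre_rounded_sum(p, 10) for p in itertools.permutations(data)}
    print("distinct naive results:", len(naive))
    print("distinct pre-rounded results:", len(stable))
```

The trade-off is precision: a larger exponent gives a coarser grid but more headroom before the sum overflows the exactly representable range, which is consistent with the note that the value may need to be raised for some sources.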

Notes

  • bm3d.VAggregate should be called after temporal filtering, as in VapourSynth-BM3D. Alternatively, you may use the BM3Dv2() interface for both spatial and temporal denoising in one step.

  • The _rtc version has three additional experimental parameters:

    • bm_error_s: (string)

      Specify cost for block similarity measurement.

      Currently implemented costs: SSD (Sum of Squared Differences), SAD (Sum of Absolute Differences), ZSSD (Zero-mean SSD), ZSAD (Zero-mean SAD), SSD/NORM.

      Default SSD.

    • transform_2d_s/transform_1d_s: (string)

      Specify type of transform.

      Currently implemented transforms: DCT (Discrete Cosine Transform), Haar (Haar Transform), WHT (Walsh–Hadamard Transform), Bior1.5 (transform based on a bi-orthogonal spline wavelet).

      Default DCT.

    These features are not implemented in the standard version due to performance and binary size concerns.
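The basic/final workflow through the BM3Dv2() interface mentioned in the first note can be wrapped in a small helper. This is a sketch under assumptions: bm3d_two_pass is a hypothetical name, and it assumes BM3Dv2 accepts the same sigma keyword as BM3D. The plugin namespace is passed in explicitly so the same function works with bm3dcpu, bm3dcuda or bm3dcuda_rtc.

```python
def bm3d_two_pass(plugin, src, sigma=3.0, radius=0, **kwargs):
    """Run a basic estimate, then a final estimate that uses it as reference.

    plugin: a plugin namespace such as core.bm3dcpu (assumed to expose BM3Dv2).
    src:    the input clip, expected to be in 32-bit float format.
    kwargs: forwarded unchanged to both BM3Dv2 calls.
    """
    # First pass: basic (or temporal basic, when radius > 0) estimate.
    basic = plugin.BM3Dv2(src, sigma=sigma, radius=radius, **kwargs)
    # Second pass: the basic estimate guides block-matching and Wiener filtering.
    return plugin.BM3Dv2(src, ref=basic, sigma=sigma, radius=radius, **kwargs)


# Hypothetical usage inside a VapourSynth script:
#   import vapoursynth as vs
#   core = vs.core
#   out = bm3d_two_pass(core.bm3dcpu, src, sigma=3.0, radius=1)
```

Using one sigma for both passes is only a simplification of this sketch; the two calls can be given different strengths by calling BM3Dv2 directly.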

Statistics

GPU memory consumption:

(ref ? 4 : 3) * (chroma ? 3 : 1) * (fast ? 4 : 1) * (2 * radius + 1) * size_of_a_single_frame
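The formula above translates directly into a small estimator. In this sketch the function names are hypothetical, and float_plane_bytes assumes that size_of_a_single_frame means one 32-bit float plane (width * height * 4 bytes), which the formula does not state explicitly.

```python
def float_plane_bytes(width, height):
    """Size of one 32-bit float plane in bytes (assumed meaning of frame size)."""
    return width * height * 4


def bm3d_gpu_memory_bytes(frame_bytes, radius=0, ref=False, chroma=False, fast=True):
    """Evaluate the GPU memory formula for the given filter settings."""
    return (
        (4 if ref else 3)
        * (3 if chroma else 1)
        * (4 if fast else 1)
        * (2 * radius + 1)
        * frame_bytes
    )


if __name__ == "__main__":
    plane = float_plane_bytes(1920, 1080)
    estimate = bm3d_gpu_memory_bytes(plane, radius=1, ref=True, chroma=True)
    print(f"{estimate / 2**20:.1f} MiB")
```

Setting fast=False or lowering radius are the most direct ways to shrink the estimate, since they scale it by a factor of 4 and (2 * radius + 1) respectively.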

Compilation

  • The CMake configuration of BM3DCUDA_RTC links to the NVRTC static library by default, which requires CUDA 11.5 or later.

    cmake -S . -B build -D CMAKE_BUILD_TYPE=Release -D CMAKE_CUDA_FLAGS="--threads 0 --use_fast_math -Wno-deprecated-gpu-targets" -D CMAKE_CUDA_ARCHITECTURES="50;61-real;75-real;86"

    cmake --build build --config Release
