Optimized CUDA implementation of differentiable SSIM for PyTorch

fussim

Fast CUDA SSIM for PyTorch — pip install fussim with pre-built wheels.

~6.4× faster than pytorch-msssim (RTX 4090 benchmark) | FP16/AMP support | Drop-in replacement

Based on MrNeRF/optimized-fused-ssim.


Installation

pip install fussim

Works out of the box for Linux (PyTorch 2.9 + CUDA 12.8) and Windows (PyTorch 2.8 + CUDA 12.8).

Other PyTorch/CUDA versions

Pre-built wheels for PyTorch 2.5–2.9 and CUDA 11.8–12.8:

pip install fussim --extra-index-url https://opsiclear.github.io/fussim/whl/

This auto-selects the correct wheel for your installed PyTorch version.

PyTorch Python CUDA 11.8 CUDA 12.1 CUDA 12.4 CUDA 12.6 CUDA 12.8
2.5.1 3.10–3.12
2.6.0 3.10–3.12
2.7.1 3.10–3.12
2.8.0 3.10–3.13
2.9.0 3.10–3.13 ✓* ✓*

*Linux only. Windows has a known PyTorch bug.
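
To find which row of the table above applies, it can help to inspect the local PyTorch build string. The sketch below is illustrative only: `split_torch_version` is a hypothetical helper, not part of fussim, and it assumes the usual `<release>+<build tag>` form of `torch.__version__` (for example `2.9.0+cu128`).

```python
def split_torch_version(version: str) -> tuple[str, str]:
    """Split a PyTorch version string such as '2.9.0+cu128' into (release, build tag).

    Hypothetical helper for illustration; the tag is empty when there is no '+' suffix.
    """
    release, _, tag = version.partition("+")
    return release, tag


if __name__ == "__main__":
    try:
        import torch  # only needed to inspect the local install
    except ImportError:
        print("PyTorch is not installed")
    else:
        release, tag = split_torch_version(torch.__version__)
        # torch.version.cuda is None for CPU-only builds
        print(f"PyTorch {release} (build tag: {tag or 'none'}), CUDA: {torch.version.cuda}")
```

The printed release and CUDA version can then be compared against the PyTorch and CUDA columns of the table.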

Build from source

Requires the CUDA Toolkit and a C++ compiler.

git clone https://github.com/OpsiClear/fussim.git && cd fussim
pip install .

# For specific GPU architecture:
TORCH_CUDA_ARCH_LIST="8.9" pip install .  # RTX 4090
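
If the right value for `TORCH_CUDA_ARCH_LIST` is not known, one option is to ask PyTorch for the compute capability of the installed GPU. This is a minimal sketch: `arch_list_entry` is a hypothetical helper, and it assumes PyTorch is already installed with CUDA support.

```python
def arch_list_entry(capability: tuple[int, int]) -> str:
    """Format a (major, minor) compute capability as a TORCH_CUDA_ARCH_LIST entry."""
    major, minor = capability
    return f"{major}.{minor}"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        # get_device_capability() returns a (major, minor) tuple, e.g. (8, 9)
        print(arch_list_entry(torch.cuda.get_device_capability()))
    else:
        print("No CUDA device visible to PyTorch")
```

On an RTX 4090 this should yield the `8.9` used in the command above.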

Other platforms

For macOS, older CUDA versions, or other unsupported configurations, use the original fused-ssim, which builds from source.


Quick Start

import torch
from fussim import fused_ssim

img1 = torch.rand(1, 3, 256, 256, device="cuda", requires_grad=True)
img2 = torch.rand(1, 3, 256, 256, device="cuda")

# Compute SSIM
ssim_value = fused_ssim(img1, img2)

# Use as loss
loss = 1.0 - fused_ssim(img1, img2)
loss.backward()

FP16 / AMP:

with torch.autocast(device_type="cuda"):
    ssim_value = fused_ssim(img1, img2)  # Uses FP16 kernel automatically

API

fused_ssim

fused_ssim(img1, img2, padding="same", train=True, window_size=11) -> Tensor

Parameter Type Default Description
img1 Tensor First image (B, C, H, W). Receives gradients.
img2 Tensor Second image (B, C, H, W)
padding str "same" "same" or "valid"
train bool True Enable gradient computation
window_size int 11 Gaussian window: 7, 9, or 11

Returns: Scalar mean SSIM value.
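
The `padding` option changes how many window positions contribute to the mean. The sketch below assumes the conventional convolution meaning of the two modes ("same" keeps the input size, "valid" only keeps positions where the window fits entirely inside the image). `ssim_map_shape` is a hypothetical helper for illustration and is not part of fussim.

```python
def ssim_map_shape(height: int, width: int, window_size: int = 11, padding: str = "same") -> tuple[int, int]:
    """Return the assumed spatial size of the per-pixel SSIM map before averaging."""
    if window_size not in (7, 9, 11):
        raise ValueError("window_size must be 7, 9, or 11")
    if padding == "same":
        # Zero-padded borders: one SSIM value per input pixel
        return height, width
    if padding == "valid":
        # No padding: the window must fit fully inside the image
        return height - window_size + 1, width - window_size + 1
    raise ValueError('padding must be "same" or "valid"')


if __name__ == "__main__":
    print(ssim_map_shape(256, 256, padding="same"))
    print(ssim_map_shape(256, 256, padding="valid"))
```

Under this assumption, "valid" excludes border pixels whose windows would overlap the padding, so the two modes can give slightly different means for the same image pair.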

ssim (pytorch-msssim compatible)

ssim(X, Y, data_range=255, size_average=True, win_size=11, K=(0.01, 0.03), nonnegative_ssim=False) -> Tensor

Parameter Type Default Description
X, Y Tensor Images (B, C, H, W). Gradients computed for X.
data_range float 255 Value range (255 for uint8, 1.0 for normalized)
size_average bool True Return scalar mean or per-batch values
win_size int 11 7, 9, or 11
K tuple (0.01, 0.03) SSIM constants (K1, K2)
nonnegative_ssim bool False Clamp negative values to 0
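
In the standard SSIM definition, `data_range` (L) and `K` combine into two stabilizing constants, C1 = (K1·L)² and C2 = (K2·L)². The sketch below computes them for the defaults listed above. `ssim_constants` is a hypothetical helper, and it is an assumption that fussim derives its constants in exactly this way.

```python
import math


def ssim_constants(data_range: float = 255, K: tuple[float, float] = (0.01, 0.03)) -> tuple[float, float]:
    """Return (C1, C2) from the standard SSIM definition: C_i = (K_i * L)^2."""
    k1, k2 = K
    return (k1 * data_range) ** 2, (k2 * data_range) ** 2


if __name__ == "__main__":
    c1, c2 = ssim_constants(255)        # images stored as 0..255
    print(c1, c2)
    c1_n, c2_n = ssim_constants(1.0)    # images normalized to 0..1
    print(c1_n, c2_n)
    assert math.isclose(c1, 6.5025) and math.isclose(c2, 58.5225)
```

Passing a `data_range` that does not match the actual value range of the images makes these constants too large or too small, which distorts the resulting score.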

SSIM Module

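import torch
from fussim import SSIM

# Example inputs in [0, 1]; pred requires gradients so the loss can backpropagate
pred = torch.rand(1, 3, 256, 256, device="cuda", requires_grad=True)
target = torch.rand(1, 3, 256, 256, device="cuda")

module = SSIM(data_range=1.0)
loss = 1 - module(pred, target)
loss.backward()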

Performance

RTX 4090, batch 5×5×1080×1920, 100 iterations:

Implementation Forward Backward Total Speedup
pytorch-msssim 28.7 ms 28.9 ms 57.5 ms 1.0×
fussim 4.38 ms 4.66 ms 9.04 ms 6.4×
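
The speedup column follows from dividing the baseline time by the fussim time. As a quick check of the arithmetic, using only the numbers in the table above (`speedup` is a hypothetical helper):

```python
# Timings in milliseconds, copied from the table above
timings_ms = {
    "pytorch-msssim": {"forward": 28.7, "backward": 28.9, "total": 57.5},
    "fussim": {"forward": 4.38, "backward": 4.66, "total": 9.04},
}


def speedup(phase: str) -> float:
    """Ratio of baseline time to fussim time for one phase."""
    return timings_ms["pytorch-msssim"][phase] / timings_ms["fussim"][phase]


if __name__ == "__main__":
    for phase in ("forward", "backward", "total"):
        print(f"{phase}: {speedup(phase):.2f}x")
```

The total ratio, 57.5 / 9.04, rounds to the 6.4× reported in the table.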

Limitations

Constraint Reason
window_size: 7, 9, or 11 CUDA kernel templates
win_sigma: 1.5 (fixed) Hardcoded in kernel
Custom win not supported Uses built-in Gaussian
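
For reference, a normalized 1D Gaussian window with the fixed sigma of 1.5 can be computed as follows. This is a pure-Python sketch of the conventional SSIM window, not the kernel's actual code; `gaussian_window` is a hypothetical name, and the 2D window is assumed to be the outer product of this vector with itself.

```python
import math


def gaussian_window(window_size: int = 11, sigma: float = 1.5) -> list[float]:
    """Return a 1D Gaussian window of the given size, normalized to sum to 1."""
    if window_size not in (7, 9, 11):
        raise ValueError("window_size must be 7, 9, or 11")
    center = window_size // 2
    weights = [math.exp(-((i - center) ** 2) / (2 * sigma ** 2)) for i in range(window_size)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    for size in (7, 9, 11):
        window = gaussian_window(size)
        print(size, [round(w, 4) for w in window])
```

Because sigma is fixed, the smaller window sizes truncate the same Gaussian rather than narrowing it.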

Attribution

Citation

@software{optimized-fused-ssim,
    author = {Janusch Patas},
    title = {Optimized Fused-SSIM},
    year = {2025},
    url = {https://github.com/MrNeRF/optimized-fused-ssim},
}
