Skip to main content

Pure-Python NVIDIA Warp backend for 3D Gaussian Splatting

Project description

gswarp

Pure-Python NVIDIA Warp backend for 3D Gaussian Splatting.

gswarp reimplements the three core CUDA modules used in 3DGS training — the rasterizer, SSIM loss, and KNN initialization — using NVIDIA Warp. No C++/CUDA compilation required; install via pip and drop it in as a replacement for the original CUDA packages.

Full documentation and source: https://github.com/fancifulland2718/gswarp


Three Replacement Modules

Module Replaces Import Path
Rasterizer diff_gaussian_rasterization gswarp
SSIM fused_ssim gswarp.fused_ssim
KNN simple_knn gswarp.knn

Requirements

  • Python 3.10+
  • NVIDIA GPU with compute capability ≥ 7.0 (Volta)
  • PyTorch 1.13+ (with CUDA support)
  • NVIDIA Warp 1.8.0+

Installation

pip install gswarp

warp-lang is installed automatically as a dependency. No compilation steps needed after installation.


Quick Start

Standard Rasterizer

Replace the original import in gaussian_renderer/__init__.py:

# before
from diff_gaussian_rasterization import GaussianRasterizationSettings, GaussianRasterizer

# after
from gswarp import GaussianRasterizationSettings, GaussianRasterizer

GaussianRasterizer.forward() returns a three-element tuple (color, radii, meta). The meta object is a RasterizerMeta NamedTuple with five auxiliary fields:

color, radii, meta = rasterizer(means3D=..., means2D=..., ...)

# meta fields:
#   meta.depth          (1, H, W)  — per-pixel accumulated depth
#   meta.alpha          (1, H, W)  — per-pixel accumulated opacity
#   meta.proj_2D        (N, 2)     — 2D projected positions
#   meta.conic_2D       (N, 3)     — 2D conic representation
#   meta.conic_2D_inv   (N, 3)     — inverse 2D conic

# If you only need color and radii:
color, radii, _ = rasterizer(means3D=..., means2D=..., ...)

Optional runtime tuning (call once before the training loop):

from gswarp import initialize_runtime_tuning, set_binning_sort_mode

initialize_runtime_tuning(device="cuda:0", verbose=True)
set_binning_sort_mode("warp_depth_stable_tile")  # recommended for large scenes

Flow Backend (optical-flow pipelines)

A dedicated backend adds per-pixel top-K contributor tracking:

from gswarp.rasterizer_flow import GaussianRasterizationSettings, GaussianRasterizer

raster_settings = GaussianRasterizationSettings(
    ...,
    enable_flow_grad=True,   # default True — enables flow gradients
    compute_flow_aux=True,   # default None (runtime True) — fills top-K outputs
)

color, radii, depth, alpha, proj_2D, conic_2D, conic_2D_inv, \
    gs_per_pixel, weight_per_gs_pixel, x_mu = rasterizer(...)
Aux output Shape dtype Meaning
gs_per_pixel (K, H, W) int32 Index of each top-K Gaussian per pixel (-1 = empty)
weight_per_gs_pixel (K, H, W) float32 Alpha-compositing weight
x_mu (2, K, H, W) float32 Offset (dx, dy) from projected center to pixel

K defaults to 20; adjust with set_flow_topk(k) from gswarp.rasterizer_flow.

SSIM and KNN

# SSIM — identical signature to fused_ssim
from gswarp.fused_ssim import fused_ssim
loss_ssim = fused_ssim(img1, img2, padding="same", train=True)

# KNN — identical signature to simple_knn
from gswarp.knn import distCUDA2
dist2 = distCUDA2(points)  # points: (N, 3) float32 CUDA tensor

Performance

Full 30K-step training, RTX 5090D V2 (sm_120), PyTorch 2.11.0+cu130, Warp 1.12.0. All three modules use the Warp backend.

NeRF Synthetic (8 scenes): average ×1.08 speedup over CUDA baseline.
Tanks & Temples / Deep Blending (4 large scenes): average ×1.03 speedup.

Sample results:

Dataset CUDA (it/s) Warp (it/s) Speedup
chair 103.6 113.1 ×1.09
drums 103.0 115.3 ×1.12
hotdog 144.5 156.5 ×1.08
mic 95.4 105.1 ×1.10
truck 39.4 40.1 ×1.02
drjohnson 30.8 32.0 ×1.04

Quality

NeRF Synthetic 8-scene average after 30K steps:

Metric CUDA Warp Δ
PSNR (dB) 33.31 33.33 +0.02
SSIM 0.9692 0.9693 +0.0001
LPIPS 0.0303 0.0302 −0.0001

Per-scene PSNR differences are within ±0.25 dB. Training quality is equivalent to the CUDA baseline across all tested scenes.


Links


Acknowledgements

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gswarp-1.0.5.tar.gz (103.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

gswarp-1.0.5-py3-none-any.whl (101.8 kB view details)

Uploaded Python 3

File details

Details for the file gswarp-1.0.5.tar.gz.

File metadata

  • Download URL: gswarp-1.0.5.tar.gz
  • Upload date:
  • Size: 103.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gswarp-1.0.5.tar.gz
Algorithm Hash digest
SHA256 4a1cce055b60cb48a9d2f7fdcf8da17675cc914c0727fab69fe36251e6a584c3
MD5 81cc8c3894e0d622ba4690a4ba80cd04
BLAKE2b-256 8cfc8931594198632bb6c95140f9dcb14372d66c37f7b111423ef7de61eadd55

See more details on using hashes here.

Provenance

The following attestation bundles were made for gswarp-1.0.5.tar.gz:

Publisher: python-publish.yml on fancifulland2718/gswarp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file gswarp-1.0.5-py3-none-any.whl.

File metadata

  • Download URL: gswarp-1.0.5-py3-none-any.whl
  • Upload date:
  • Size: 101.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gswarp-1.0.5-py3-none-any.whl
Algorithm Hash digest
SHA256 785c1d722820dd8e38370ac65102c76188d7c3beec4a07fe65783559ef7294aa
MD5 677d1d33cb83a251fd6f601fb4b5d596
BLAKE2b-256 86bc632130b3b370f2fe606e78ab9fb0f60649155e7dc837240dcbdb4b8d4994

See more details on using hashes here.

Provenance

The following attestation bundles were made for gswarp-1.0.5-py3-none-any.whl:

Publisher: python-publish.yml on fancifulland2718/gswarp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page