CUDA global sort + CUDA-GL interop + geometry shader quad emitting + hardware rasterization + CUDA-GL interop = fast gaussian splatting
Fast Gaussian Splatting
- 5-10x faster rendering than the original software CUDA rasterizer (diff-gaussian-rasterization).
- 2-3x faster if using offline rendering. (Bottleneck: copying rendered images around, thinking about improvements.)
No backward pass is supported yet. Will think of ways to add one; depth peeling (as in 4K4D) is too slow.
Installation
No CUDA compilation is required.
pip install fast_gauss
Usage
Replace the original import of diff_gaussian_rasterization with fast_gauss.
For example, replace this:
from diff_gaussian_rasterization import GaussianRasterizationSettings, GaussianRasterizer
with this:
from fast_gauss import GaussianRasterizationSettings, GaussianRasterizer
And you're good to go.
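If the same code has to run on machines with and without fast_gauss installed, the swap can be made conditional. A minimal sketch, assuming both packages expose the two class names shown in the import lines above; the helper name load_rasterizer is hypothetical and not part of either package:

```python
import importlib


def load_rasterizer(candidates=("fast_gauss", "diff_gaussian_rasterization")):
    """Return (module_name, settings_cls, rasterizer_cls) from the first importable module.

    Hypothetical helper for illustration; not part of fast_gauss.
    """
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # package not installed, try the next candidate
        return name, module.GaussianRasterizationSettings, module.GaussianRasterizer
    raise ImportError(f"none of {candidates!r} could be imported")
```

Calling name, GaussianRasterizationSettings, GaussianRasterizer = load_rasterizer() leaves the rest of the code unchanged, and name records which backend was picked.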
Tips
Note that the second output of the GaussianRasterizer is no longer radii (since we are not going to use it for the backward pass), but the alpha values of the rendered image instead.
The alpha channel content currently seems to be bugged and still needs debugging.
- TODO: Debug alpha channel
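Code that previously unpacked radii from the second output needs adjusting. A sketch of one way to make the difference explicit; the function name, the backend strings, and the handling of None outputs are assumptions for illustration, not fast_gauss API:

```python
def unpack_render_output(outputs, backend):
    """Label the rasterizer outputs according to the backend that produced them.

    `backend` is a hypothetical string tag: "fast_gauss" or
    "diff_gaussian_rasterization".
    """
    # Assumed: when drawing straight into a bound framebuffer nothing is
    # returned (either a bare None or a tuple of None values).
    if outputs is None or all(o is None for o in outputs):
        return {"image": None, "alpha": None, "radii": None}

    image, second = outputs
    if backend == "fast_gauss":
        # Second output is the alpha of the rendered image (reported as buggy for now).
        return {"image": image, "alpha": second, "radii": None}
    # Original rasterizer: second output is the per-Gaussian screen-space radii.
    return {"image": image, "alpha": None, "radii": second}
```

Downstream code can then check for the key it needs instead of silently treating alpha values as radii.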
It's also recommended to pass in a CPU tensor in the GaussianRasterizationSettings to avoid explicit synchronizations for even better performance.
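One way to act on this is to check which settings values still live on the GPU before building the settings object. The sketch below is duck-typed and assumes PyTorch-style tensors that expose an is_cuda attribute; the helper name is hypothetical, and the source does not say which fields matter most:

```python
def fields_on_gpu(settings_kwargs):
    """Return the names of settings values that report living on a CUDA device.

    Hypothetical helper: relies only on the PyTorch-style `is_cuda` attribute,
    so plain Python numbers and CPU tensors are treated as fine.
    """
    return sorted(
        name
        for name, value in settings_kwargs.items()
        if getattr(value, "is_cuda", False)
    )
```

For example, fields_on_gpu(dict(viewmatrix=view, projmatrix=proj, campos=cam)) lists the entries you may want to create on the CPU in the first place. Building them on the CPU is preferable to calling .cpu() on a GPU tensor, since that copy is itself a synchronization point.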
Note: for the ultimate 5-10x performance increase, you'll need to let fast_gauss's shader directly write to your desired framebuffer.
Currently, we will try to automatically detect whether you're managing your own OpenGL context (i.e. opening up a GUI) by checking for the OpenGL module during the import of fast_gauss.
If detected, all rendering commands will return None values and we will directly write to the bound framebuffer at the time of the draw call.
- TODO: Improve offline rendering performance.
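The detection described above could look roughly like the following. This is a guess at the mechanism, not the package's actual code: it assumes the check means looking for an already-imported PyOpenGL module (OpenGL) in sys.modules, and the function name is hypothetical:

```python
import sys


def user_manages_gl_context(loaded_modules=None):
    """Guess whether the host application already set up OpenGL itself.

    Assumption: a GUI application imports PyOpenGL (`OpenGL`) before importing
    the rasterizer, so its presence among loaded modules is used as the signal.
    """
    modules = sys.modules if loaded_modules is None else loaded_modules
    return "OpenGL" in modules
```

Under this assumption import order matters: importing OpenGL before fast_gauss would select the direct-to-framebuffer path, while a headless script that never imports it would get the offline path.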
TODOs
- TODO: Think of ways to add a backward pass.
- TODO: Compute covariance from scaling and rotation in the shader, currently it's on the CUDA side.
- TODO: Compute SH in the shader, currently it's on the CUDA side.
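For reference, the covariance mentioned in the TODO list above follows the standard 3D Gaussian Splatting construction, Sigma = R S S^T R^T, with R built from a unit quaternion and S a diagonal scale matrix. Below is a plain-Python sketch of that formula (not shader or CUDA code), assuming the (w, x, y, z) quaternion ordering; the function names are illustrative:

```python
import math


def quat_to_rotmat(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix (row-major lists)."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n  # normalize so R is a proper rotation
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def covariance_3d(scale, quat):
    """Compute Sigma = R S S^T R^T for per-axis scales and a rotation quaternion."""
    R = quat_to_rotmat(quat)
    # Sigma[i][j] = sum_k R[i][k] * scale[k]^2 * R[j][k]
    return [
        [sum(R[i][k] * scale[k] ** 2 * R[j][k] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]
```

With the identity quaternion (1, 0, 0, 0) and scales (1, 2, 3) this yields a diagonal matrix with entries 1, 4, 9; a shader version would perform the same arithmetic per Gaussian.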
Environment
This project requires an NVIDIA GPU that can interop between CUDA and OpenGL. Thus, neither WSL nor macOS is supported.
For offline rendering (the drop-in replacement of the original CUDA rasterizer), we also need a valid EGL environment. It can sometimes be hard to set up for virtualized machines. Potential fix.
- TODO: Test on more platforms.
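A script can fail early on the unsupported platforms named above. A heuristic sketch: the function name is hypothetical, and detecting WSL through the string "microsoft" in the kernel release is a common convention rather than anything fast_gauss documents:

```python
import platform


def unsupported_platform_reason(system=None, release=None):
    """Return a short reason if the platform is listed as unsupported, else None."""
    system = platform.system() if system is None else system
    release = platform.release() if release is None else release
    if system == "Darwin":
        return "macOS is not supported"
    if system == "Linux" and "microsoft" in release.lower():
        return "WSL is not supported"
    return None
```

A None result only means the platform is not on the unsupported list; it does not confirm that an NVIDIA GPU or a working EGL environment is present.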
Credits
Inspired by those insanely fast WebGL-based 3DGS viewers:
- GaussianSplats3D for inspiring our vertex-geometry-fragment shader pipeline.
- gsplat.tech.
- splat.
Using the algorithm and improvements from:
- diff-gaussian-rasterization for the main Gaussian Splatting algorithm.
- diff_gauss for the fixed culling.
CUDA-GL interop & EGL environment inspired by:
- 4K4D where they (I) used the interop for depth-peeling.
- EasyVolcap for the collection of utilities, including EGL setup.
- nvdiffrast for their EGL context setup and CUDA-GL interop setup.
Citation
@misc{fast_gauss,
title = {Fast Gaussian Splatting},
howpublished = {GitHub},
year = {2024},
url = {https://github.com/dendenxu/fast-gaussian-rasterization}
}