Unified CUDA kernels for FastVideo

FastVideo Kernel

CUDA kernels for FastVideo video generation.

Installation

Standard Installation (Local Development)

Running build.sh automatically detects your GPU architecture. If an NVIDIA Hopper GPU (H100, sm_90a) is detected, the ThunderKittens kernels are enabled. Otherwise they are skipped, and the package uses Triton fallbacks at runtime.

git submodule update --init --recursive
cd fastvideo-kernel
./build.sh
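
The detection step described above can be approximated from Python. The following is a minimal sketch, not the logic build.sh actually uses: the function names and the backend labels "thunderkittens" and "triton" are hypothetical, and it assumes PyTorch is the way you query the device.

```python
from typing import Optional, Tuple

# Compute capability reported by Hopper GPUs such as the H100 (sm_90a).
HOPPER_CAPABILITY = (9, 0)


def detect_capability() -> Optional[Tuple[int, int]]:
    """Return (major, minor) for CUDA device 0, or None if it cannot be queried."""
    try:
        import torch
    except ImportError:
        return None
    # ROCm builds of PyTorch also answer torch.cuda calls, so exclude them here.
    if torch.version.hip is not None or not torch.cuda.is_available():
        return None
    return tuple(torch.cuda.get_device_capability(0))


def select_backend(capability: Optional[Tuple[int, int]]) -> str:
    """Map a capability to a hypothetical backend label (illustration only)."""
    if capability is not None and tuple(capability) == HOPPER_CAPABILITY:
        return "thunderkittens"
    return "triton"


if __name__ == "__main__":
    print(select_backend(detect_capability()))
```

Keeping the decision in a pure function (select_backend) makes it easy to check on machines without a GPU.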

ROCm Build

If you are in a ROCm environment without the CUDA compilation toolchain, pass the --rocm flag to the build script:

cd fastvideo-kernel
./build.sh --rocm

Usage

Sliding Tile Attention (STA) & Video Sparse Attention (VSA)

For detailed usage, please check the Attention Documentation.

from fastvideo_kernel import sliding_tile_attention, video_sparse_attn, moba_attn_varlen

# Example: Sliding Tile Attention
out = sliding_tile_attention(q, k, v, window_sizes, text_len)

# Example: Video Sparse Attention (with Triton fallback)
out = video_sparse_attn(q, k, v, block_sizes, block_sizes, topk=5)

# Example: VMoBA
out = moba_attn_varlen(q, k, v, cu_seqlens_q, cu_seqlens_k, ...)
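
The cu_seqlens_q and cu_seqlens_k arguments are not described here. Variable-length attention APIs commonly expect cumulative sequence offsets for a packed batch, so the sketch below assumes that convention; check the Attention Documentation before relying on it. The helper name build_cu_seqlens is hypothetical and not part of fastvideo_kernel.

```python
from itertools import accumulate
from typing import List, Sequence


def build_cu_seqlens(seq_lens: Sequence[int]) -> List[int]:
    """Return cumulative offsets [0, l0, l0 + l1, ...] for a packed batch."""
    if any(n < 0 for n in seq_lens):
        raise ValueError("sequence lengths must be non-negative")
    return [0, *accumulate(seq_lens)]


# Three sequences of 3, 5, and 2 tokens packed into one 10-token tensor.
offsets = build_cu_seqlens([3, 5, 2])
print(offsets)  # [0, 3, 8, 10]
```

Sequence i then occupies rows offsets[i] to offsets[i + 1] of the packed tensor. Under the same assumption, the list would typically be converted to an int32 tensor on the same device as q before being passed to the kernel.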

TurboDiffusion Kernels

This package also includes kernels adapted from TurboDiffusion: INT8 GEMM, quantization, RMSNorm, and LayerNorm.
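
The Python entry points for these kernels are not listed here. When validating a fused kernel, a plain reference implementation is useful for comparison. The sketch below shows the standard RMSNorm formula in pure Python; the function name, argument order, and default eps are assumptions for illustration, not the package's API.

```python
import math
from typing import List, Sequence


def rms_norm_reference(
    x: Sequence[float], weight: Sequence[float], eps: float = 1e-6
) -> List[float]:
    """Standard RMSNorm over one vector: x / sqrt(mean(x^2) + eps) * weight."""
    if len(x) != len(weight):
        raise ValueError("x and weight must have the same length")
    if len(x) == 0:
        raise ValueError("x must not be empty")
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


# With unit weights and eps = 0, the output has a mean square of 1.
y = rms_norm_reference([3.0, 4.0], [1.0, 1.0], eps=0.0)
```

A GPU kernel would apply the same computation along the last dimension of a tensor, usually in lower precision, so comparisons against a reference like this need a tolerance.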

Requirements

  • Runtime:
    • NVIDIA H100 (sm_90a) for the optimized C++ (ThunderKittens) kernels.
    • Any CUDA GPU for Triton-based fallbacks.
  • Build:
    • CUDA Toolkit 12.3+
    • C++20 compatible compiler (GCC 10+, Clang 11+)

Acknowledgement

This package structure and build system are based on sgl-kernel from the SGLang project.

The implementation of the TurboDiffusion kernels is adapted from TurboDiffusion. If you use these kernels, please cite:

@article{zhang2025turbodiffusion,
  title={TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times},
  author={Zhang, Jintao and Zheng, Kaiwen and Jiang, Kai and Wang, Haoxu and Stoica, Ion and Gonzalez, Joseph E and Chen, Jianfei and Zhu, Jun},
  journal={arXiv preprint arXiv:2512.16093},
  year={2025}
}
