StreamDiffusionV2 offline and online video diffusion inference.

StreamDiffusionV2: A Streaming System for Dynamic and Interactive Video Generation (MLSys 2026)

Tianrui Feng¹, Zhi Li², Shuo Yang², Haocheng Xi², Muyang Li³, Xiuyu Li¹, Lvmin Zhang⁴, Keting Yang⁵, Kelly Peng⁶, Song Han⁷, Maneesh Agrawala⁴, Kurt Keutzer², Akio Kodaira⁸, Chenfeng Xu†,¹

¹UT Austin, ²UC Berkeley, ³Nunchaku AI, ⁴Stanford University, ⁵Independent Researcher, ⁶First Intelligence, ⁷MIT, ⁸Shizhuku AI

† Project lead; correspondence to xuchenfeng@utexas.edu

Overview

StreamDiffusionV2 is an open-source interactive diffusion pipeline for real-time streaming applications. It scales across diverse GPU setups, supports flexible denoising steps, and delivers high FPS for creators and platforms. Further details are available on our project homepage.

News

[2026-03-27] StreamDiffusionV2 is now available on PyPI. Install it with pip install streamdiffusionv2.
[2026-03-27] Added optional TAEHV-VAE support for inference via --use_taehv and USE_TAEHV=1.
[2026-03-06] Updated the ring-buffer KV cache for efficient sliding-window attention.
[2026-01-26] 🎉 StreamDiffusionV2 has been accepted to MLSys 2026!
[2025-11-10] 🚀 We have released our paper on arXiv. Check it out for more details!
[2025-10-18] Released our model checkpoints on Hugging Face.
[2025-10-06] 🔥 StreamDiffusionV2 is publicly released! Check our project homepage for more details.

Prerequisites

OS: Linux with NVIDIA GPU
CUDA-compatible GPU and drivers

Installation

conda create -n streamdiffusionv2 python=3.10 -y
conda activate streamdiffusionv2

# PyPI
pip install streamdiffusionv2

# Needed for Blackwell GPUs
pip install torch==2.11.0 torchvision==0.26.0

# Optional but recommended for better throughput
pip install "streamdiffusionv2[flash-attn]"

If you are installing from a local checkout of this repository instead of PyPI:

conda create -n streamdiffusionv2 python=3.10
conda activate streamdiffusionv2
pip install .

# Optional but recommended for better throughput
pip install ".[flash-attn]"

The package install includes the Python dependencies required for both offline inference and the demo backend. The demo frontend still requires Node.js 18 as described in demo/README.md.

Download Checkpoints

# 1.3B Model
huggingface-cli download --resume-download Wan-AI/Wan2.1-T2V-1.3B --local-dir wan_models/Wan2.1-T2V-1.3B
huggingface-cli download --resume-download jerryfeng/StreamDiffusionV2 --local-dir ./ckpts --include "wan_causal_dmd_v2v/*"

# 14B Model
huggingface-cli download --resume-download Wan-AI/Wan2.1-T2V-14B --local-dir wan_models/Wan2.1-T2V-14B
huggingface-cli download --resume-download jerryfeng/StreamDiffusionV2 --local-dir ./ckpts --include "wan_causal_dmd_v2v_14b/*"

We use the 14B model from CausVid-Plus for the offline inference demo.
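Before launching inference, it can help to confirm that the downloads landed where the later commands expect them. The following is a minimal sketch of such a check; the directory names are inferred from the --local-dir and --include arguments above, and missing_checkpoint_dirs is a hypothetical helper, not part of the streamdiffusionv2 package.

```python
from pathlib import Path

# Directory names are inferred from the download commands above.
# missing_checkpoint_dirs is a hypothetical helper for illustration only.
EXPECTED_DIRS = {
    "1.3b": ["wan_models/Wan2.1-T2V-1.3B", "ckpts/wan_causal_dmd_v2v"],
    "14b": ["wan_models/Wan2.1-T2V-14B", "ckpts/wan_causal_dmd_v2v_14b"],
}


def missing_checkpoint_dirs(model_size, root="."):
    """Return the expected directories that are absent or empty under root."""
    missing = []
    for rel in EXPECTED_DIRS[model_size]:
        path = Path(root) / rel
        # An empty directory usually means the download did not complete.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing
```

Calling missing_checkpoint_dirs("1.3b") from the repository root returns an empty list when both directories are populated. It only checks for presence, not for file integrity.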

Optional: TAEHV-VAE Checkpoint

If you want to enable the lightweight TAEHV decoder, download its checkpoint once:

curl -L https://github.com/madebyollin/taehv/raw/main/taew2_1.pth -o ckpts/taew2_1.pth

The offline inference code can also download this file automatically on first use, but keeping it in ckpts/taew2_1.pth avoids that extra startup step.
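The same "download once, reuse afterwards" behavior can be scripted from Python. This is a sketch using only the standard library; it is not the package's own download logic, and ensure_taehv_checkpoint is a hypothetical name. The URL and target path are the ones from the curl command above.

```python
from pathlib import Path
from urllib.request import urlretrieve

# URL and default path are copied from the curl command above.
TAEHV_URL = "https://github.com/madebyollin/taehv/raw/main/taew2_1.pth"


def ensure_taehv_checkpoint(path="ckpts/taew2_1.pth", url=TAEHV_URL):
    """Download the checkpoint only if it is not already on disk."""
    target = Path(path)
    if target.is_file() and target.stat().st_size > 0:
        return target  # already present, nothing to do
    target.parent.mkdir(parents=True, exist_ok=True)
    # Download to a temporary name so an interrupted transfer
    # never leaves a truncated file at the final path.
    tmp = target.with_name(target.name + ".part")
    urlretrieve(url, str(tmp))
    tmp.replace(target)
    return target
```

Writing to a .part file first is a design choice of this sketch: a later run will retry the download instead of treating a partial file as valid.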

Usage Example

We provide a simple example of how to use StreamDiffusionV2. For more detailed examples, please refer to the streamv2v directory.

Single GPU

import numpy as np

from streamdiffusionv2 import StreamDiffusionV2Pipeline, export_video, load_video

stream = StreamDiffusionV2Pipeline(
    checkpoint_folder="ckpts/wan_causal_dmd_v2v",
    mode="single",
)
stream.prepare("A dog walks on the grass, realistic")

video = load_video("examples/original.mp4", height=480, width=832)
decoded_chunks = []
noise_scale = stream.noise_scale

for video_chunk in stream.chunk_video(video):
    encoded_chunk = stream.encode_chunk(
        video,
        video_chunk,
        previous_noise_scale=noise_scale,
        initial_noise_scale=stream.noise_scale,
    )
    noise_scale = encoded_chunk.noise_scale
    denoised_chunk = stream.denoise_chunk(encoded_chunk)
    if denoised_chunk is None:
        continue
    decoded_chunks.append(stream.decode_chunk(denoised_chunk))

output = np.concatenate(decoded_chunks, axis=0)
export_video(output, "outputs/python_single.mp4", fps=16)

Single GPU Without Stream-Batch

import numpy as np

from streamdiffusionv2 import StreamDiffusionV2Pipeline, export_video, load_video

stream = StreamDiffusionV2Pipeline(
    checkpoint_folder="ckpts/wan_causal_dmd_v2v",
    mode="single-wo",
)
stream.prepare("A dog walks on the grass, realistic")

video = load_video("examples/original.mp4", height=480, width=832)
decoded_chunks = []
noise_scale = stream.noise_scale

for video_chunk in stream.chunk_video(video):
    encoded_chunk = stream.encode_chunk(
        video,
        video_chunk,
        previous_noise_scale=noise_scale,
        initial_noise_scale=stream.noise_scale,
    )
    noise_scale = encoded_chunk.noise_scale
    denoised_chunk = stream.denoise_chunk(encoded_chunk)
    if denoised_chunk is None:
        continue
    decoded_chunks.append(stream.decode_chunk(denoised_chunk))

output = np.concatenate(decoded_chunks, axis=0)
export_video(output, "outputs/python_single_wo.mp4", fps=16)

Multi-GPU Pipeline

Pipeline-parallel inference still launches multiple worker processes, so the Python API for that mode stays as one imported function:

from streamdiffusionv2 import run_video_to_video

run_video_to_video(
    mode="pipe",
    checkpoint_folder="ckpts/wan_causal_dmd_v2v",
    video_path="examples/original.mp4",
    prompt="A dog walks on the grass, realistic",
    output_path="outputs/python_pipe.mp4",
    gpu_ids=[0, 1],
    num_gpus=2,
)
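The call above passes both gpu_ids and num_gpus. A small sketch of one way to keep the two in sync is shown below. It assumes that num_gpus is expected to equal the number of entries in gpu_ids, which the example suggests but the text does not state, and pipe_gpu_kwargs is a hypothetical helper rather than part of the package.

```python
def pipe_gpu_kwargs(gpu_ids):
    """Build matching gpu_ids / num_gpus keyword arguments.

    Assumption: num_gpus should equal len(gpu_ids), as in the example above.
    """
    ids = [int(i) for i in gpu_ids]
    if not ids:
        raise ValueError("gpu_ids must not be empty")
    if any(i < 0 for i in ids):
        raise ValueError("GPU ids must be non-negative")
    if len(set(ids)) != len(ids):
        raise ValueError("gpu_ids contains duplicates")
    return {"gpu_ids": ids, "num_gpus": len(ids)}
```

The result can be unpacked into the call, for example run_video_to_video(mode="pipe", ..., **pipe_gpu_kwargs([0, 1])), so the two values cannot drift apart.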

Optional Acceleration

The staged API can be reconfigured before prepare(...):

from streamdiffusionv2 import StreamDiffusionV2Pipeline

stream = StreamDiffusionV2Pipeline(checkpoint_folder="ckpts/wan_causal_dmd_v2v")
stream.enable_acceleration(fast=True)
stream.prepare("A dog walks on the grass, realistic")

fast=True enables use_taehv and use_tensorrt, and it automatically switches the default config from wan_causal_dmd_v2v.yaml to wan_causal_dmd_v2v_fast.yaml.
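To make the effect of the flag concrete, the sketch below models the described behavior as plain data. It is an illustration of the sentence above, not the library's implementation: AccelerationChoice and describe_acceleration are hypothetical names, and the values shown for fast=False (both options off) are an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccelerationChoice:
    use_taehv: bool
    use_tensorrt: bool
    default_config: str


def describe_acceleration(fast):
    """Mirror the documented effect of enable_acceleration(fast=...)."""
    if fast:
        # Documented above: both options on, and the fast config is selected.
        return AccelerationChoice(True, True, "wan_causal_dmd_v2v_fast.yaml")
    # Assumption: without fast=True both options stay off.
    return AccelerationChoice(False, False, "wan_causal_dmd_v2v.yaml")
```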

Offline Inference

All offline inference entrypoints are unified under run_v2v.sh.

Choose one mode first:

single: single-GPU streaming inference
single-wo: single-GPU inference without Stream-Batch
pipe: multi-GPU pipeline inference

Quick start:

./run_v2v.sh single
./run_v2v.sh single-wo
./run_v2v.sh pipe
./run_v2v.sh pipe --profile

Use --profile only when you want synchronized throughput measurements.

The legacy wrappers v2v.sh, v2v_wo.sh, and pipe_v2v.sh still work, but they now forward to the same shared entrypoint.
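If you drive the shared entrypoint from Python, for example in a batch job, a thin wrapper can reject a mistyped mode before any process is launched. This is a sketch built on the standard subprocess module; build_command and run_offline are hypothetical helpers, and the only script behavior assumed is what the quick-start commands above show.

```python
import subprocess

# Mode names come from the list above. build_command and run_offline are
# hypothetical helpers, not part of the streamdiffusionv2 package.
MODES = ("single", "single-wo", "pipe")


def build_command(mode, profile=False, extra_args=(), script="./run_v2v.sh"):
    """Return the argv list for one offline run, rejecting unknown modes."""
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {MODES}")
    cmd = [script, mode]
    if profile:
        cmd.append("--profile")
    cmd.extend(extra_args)
    return cmd


def run_offline(mode, **kwargs):
    """Run the script and raise CalledProcessError on a non-zero exit."""
    return subprocess.run(build_command(mode, **kwargs), check=True)
```

For instance, build_command("pipe", profile=True) yields the same argument list as the last quick-start line.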

Common Arguments

The most important options are:

--config_path: model config YAML
--checkpoint_folder: checkpoint directory
--video_path: input video
--prompt_file_path: prompt text file
--output_folder: output directory
--height and --width: output resolution
--fps: target output FPS
--step: number of denoising steps used during inference
--use_taehv: use Wan stream encode with the TAEHV decoder for faster VAE decoding

You can pass overrides either as CLI flags or as environment variables. For example:

OUTPUT_FOLDER=outputs/run_single ./run_v2v.sh single
VIDEO_PATH=examples/original.mp4 PROMPT_FILE_PATH=examples/prompt.txt ./run_v2v.sh single-wo
NPROC_PER_NODE=2 MASTER_PORT=29511 ./run_v2v.sh pipe
./run_v2v.sh single --use_taehv
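The environment-variable form of the overrides can also be set from Python when launching the script as a subprocess. The sketch below only prepares the environment mapping; the variable names are the ones used in the examples above, and env_with_overrides is a hypothetical helper.

```python
import os


def env_with_overrides(overrides, base=None):
    """Copy the base environment and apply string-valued overrides.

    base defaults to the current process environment; it is never mutated.
    """
    env = dict(os.environ if base is None else base)
    # Environment values must be strings, so numbers are converted here.
    env.update({name: str(value) for name, value in overrides.items()})
    return env
```

A pipeline run could then be started with subprocess.run(["./run_v2v.sh", "pipe"], env=env_with_overrides({"NPROC_PER_NODE": 2, "MASTER_PORT": 29511}), check=True).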

Single GPU

This is the standard offline path when you run on one GPU.

./run_v2v.sh single \
--config_path configs/wan_causal_dmd_v2v.yaml \
--checkpoint_folder ckpts/wan_causal_dmd_v2v \
--output_folder outputs/ \
--prompt_file_path examples/prompt.txt \
--video_path examples/original.mp4 \
--height 480 \
--width 832 \
--fps 16 \
--step 2

To enable the TAEHV decoder in this mode:

./run_v2v.sh single --use_taehv

Multi-GPU

Use this mode when you want to split inference across multiple GPUs.

./run_v2v.sh pipe \
--config_path configs/wan_causal_dmd_v2v.yaml \
--checkpoint_folder ckpts/wan_causal_dmd_v2v \
--output_folder outputs/ \
--prompt_file_path examples/prompt.txt \
--video_path examples/original.mp4 \
--height 480 \
--width 832 \
--fps 16 \
--step 2
# --schedule_block  # optional: enable block scheduling

To enable the TAEHV decoder in pipeline mode:

./run_v2v.sh pipe --use_taehv

Notes:

--schedule_block is optional and can improve throughput on some multi-GPU setups.
Adjust NPROC_PER_NODE, --height, --width, and --fps to match your hardware and target workload.
./run_v2v.sh pipe --profile is intended for profiling runs, not normal benchmarking or deployment.

Online Inference (Web UI)

A minimal web demo is available under demo/. For setup and startup instructions, please refer to demo/README.md.

Access in a browser after startup: http://0.0.0.0:7860 or http://localhost:7860
To enable the TAEHV decoder in the web demo, start it with USE_TAEHV=1.
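When scripting around the demo, a quick way to tell whether the backend has finished starting is to test whether the port accepts TCP connections. The following sketch uses only the standard library; demo_is_listening is a hypothetical helper, and the default port is the one listed above.

```python
import socket


def demo_is_listening(host="localhost", port=7860, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False
```

This only confirms that a server is accepting connections on the port, not that the demo page itself has loaded correctly.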

To-do List

Demo and inference pipeline.
Dynamic scheduler for various workloads.
Training code.
FP8 support.
TensorRT support.

Acknowledgements

StreamDiffusionV2 is inspired by the prior works StreamDiffusion and StreamV2V. Our Causal DiT builds upon CausVid, and the rolling KV cache design is inspired by Self-Forcing.

We are grateful to the team members of StreamDiffusion for their support. We also thank First Intelligence and the Daydream team for their great feedback.

We especially thank the Daydream team for the great collaboration and for incorporating our StreamDiffusionV2 pipeline into their demo UI.

Citation

If you find this repository useful in your research, please consider giving a star ⭐ or a citation.

@article{feng2025streamdiffusionv2,
  title={StreamDiffusionV2: A Streaming System for Dynamic and Interactive Video Generation},
  author={Feng, Tianrui and Li, Zhi and Yang, Shuo and Xi, Haocheng and Li, Muyang and Li, Xiuyu and Zhang, Lvmin and Yang, Keting and Peng, Kelly and Han, Song and others},
  journal={arXiv preprint arXiv:2511.07399},
  year={2025}
}

Release history

0.1.1 (this version): May 18, 2026
0.1.0: Mar 27, 2026

Download files

Source Distribution

streamdiffusionv2-0.1.1.tar.gz (92.6 kB)

Uploaded May 18, 2026 Source

Built Distribution

streamdiffusionv2-0.1.1-py3-none-any.whl (106.0 kB)

Uploaded May 18, 2026 Python 3

File details

Details for the file streamdiffusionv2-0.1.1.tar.gz.

File metadata

Download URL: streamdiffusionv2-0.1.1.tar.gz
Upload date: May 18, 2026
Size: 92.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for streamdiffusionv2-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`d852a59a70946c06623fcdbaf2757021e27886706a7d341f7ceec2bd0cdd8d7f`
MD5	`728a3833760484a3e027f06be4771bf4`
BLAKE2b-256	`ae8312625c90d814271667534ba29b22e29bf221b350896cc29aa48f2f981bd1`
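A downloaded file can be checked against the SHA256 digest listed above with the standard hashlib module. This is a generic sketch; sha256_of and matches_sha256 are hypothetical helper names.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops at the first empty read (end of file).
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, the expected value would be the SHA256 entry in the table above, for example matches_sha256("streamdiffusionv2-0.1.1.tar.gz", "d852a59a70946c06623fcdbaf2757021e27886706a7d341f7ceec2bd0cdd8d7f").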

Provenance

The following attestation bundles were made for streamdiffusionv2-0.1.1.tar.gz:

Publisher: publish-to-pypi.yml on jerryfeng2003/StreamDiffusionV2

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: streamdiffusionv2-0.1.1.tar.gz
- Subject digest: d852a59a70946c06623fcdbaf2757021e27886706a7d341f7ceec2bd0cdd8d7f
- Sigstore transparency entry: 1566711133
- Sigstore integration time: May 18, 2026
Source repository:
- Permalink: jerryfeng2003/StreamDiffusionV2@1ebc184c2759cd955f20016b6bf4f94bc45ea01a
- Branch / Tag: refs/heads/master
- Owner: https://github.com/jerryfeng2003
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-to-pypi.yml@1ebc184c2759cd955f20016b6bf4f94bc45ea01a
- Trigger Event: workflow_dispatch

File details

Details for the file streamdiffusionv2-0.1.1-py3-none-any.whl.

File metadata

Download URL: streamdiffusionv2-0.1.1-py3-none-any.whl
Upload date: May 18, 2026
Size: 106.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for streamdiffusionv2-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8a04de9e0c44ed30d971d6aff849ce6e302338187a39a0f44cfe924df1e9c9c2`
MD5	`bf47bf6e8686cba56a594998f5b73f61`
BLAKE2b-256	`f75c6f006da31e82088c1ea1f75871556ecbb1eeb22e18113dfb0989fcecf332`

Provenance

The following attestation bundles were made for streamdiffusionv2-0.1.1-py3-none-any.whl:

Publisher: publish-to-pypi.yml on jerryfeng2003/StreamDiffusionV2

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: streamdiffusionv2-0.1.1-py3-none-any.whl
- Subject digest: 8a04de9e0c44ed30d971d6aff849ce6e302338187a39a0f44cfe924df1e9c9c2
- Sigstore transparency entry: 1566711223
- Sigstore integration time: May 18, 2026
Source repository:
- Permalink: jerryfeng2003/StreamDiffusionV2@1ebc184c2759cd955f20016b6bf4f94bc45ea01a
- Branch / Tag: refs/heads/master
- Owner: https://github.com/jerryfeng2003
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-to-pypi.yml@1ebc184c2759cd955f20016b6bf4f94bc45ea01a
- Trigger Event: workflow_dispatch
