
A Unified and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for 🤗Diffusers.

Project description


🔥Highlights

We are excited to announce that the first API-stable version (v1.0.0) of cache-dit has been released! cache-dit is a unified and flexible inference engine for 🤗 Diffusers that enables cache acceleration with just ♥️one line♥️ of code. Key features: Unified Cache APIs, Forward Pattern Matching, Automatic Block Adapter, DBCache, DBPrune, Hybrid TaylorSeer Calibrator, Hybrid Cache CFG, Context Parallelism, Tensor Parallelism, Torch Compile compatibility, and 🎉SOTA performance.

pip3 install -U cache-dit # pip3 install git+https://github.com/vipshop/cache-dit.git

You can install the stable release of cache-dit from PyPI, or the latest development version from GitHub. Then try ♥️cache acceleration with just one line of code♥️:

>>> import cache_dit
>>> from diffusers import DiffusionPipeline
>>> pipe = DiffusionPipeline.from_pretrained("Qwen/Qwen-Image") # Can be any diffusion pipeline
>>> cache_dit.enable_cache(pipe) # One-line code with default cache options.
>>> output = pipe(...) # Just call the pipe as normal.
>>> stats = cache_dit.summary(pipe) # Then, get the summary of cache acceleration stats.
>>> cache_dit.disable_cache(pipe) # Disable cache and run original pipe.
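Conceptually, this enable/disable pattern wraps the pipeline's expensive forward pass with a caching layer and restores the original on disable. The toy sketch below illustrates that idea only; `DummyPipe` and the value-based caching policy are illustrative stand-ins, not cache-dit's real internals or API:

```python
# Conceptual sketch: a toy enable_cache/disable_cache pair that wraps an
# object's forward method with a cache. Real engines like cache-dit cache
# by denoising step and residual dynamics, not by raw input value.

class DummyPipe:
    def forward(self, x):
        return x * 2  # stands in for an expensive transformer call

def enable_cache(pipe):
    original = pipe.forward
    cache = {}

    def cached_forward(x):
        if x not in cache:          # naive policy: cache by input value
            cache[x] = original(x)  # compute once, then reuse
        return cache[x]

    pipe._original_forward = original
    pipe.forward = cached_forward

def disable_cache(pipe):
    # Restore the unwrapped forward, mirroring cache_dit.disable_cache.
    pipe.forward = pipe._original_forward

pipe = DummyPipe()
enable_cache(pipe)
out = pipe.forward(3)  # computed once, then served from the cache
disable_cache(pipe)
```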

📚Core Features

  • 🎉Full 🤗Diffusers Support: Notably, cache-dit now supports nearly all of Diffusers' DiT-based pipelines, covering 30+ model series and nearly 100 pipelines, such as FLUX.1, Qwen-Image, Qwen-Image-Lightning, Wan 2.1/2.2, HunyuanImage-2.1, HunyuanVideo, HiDream, AuraFlow, CogView3Plus, CogView4, CogVideoX, LTXVideo, ConsisID, SkyReelsV2, VisualCloze, PixArt, Chroma, Mochi, SD 3.5, DiT-XL, etc.
  • 🎉Extremely Easy to Use: In most cases, you only need one line of code: cache_dit.enable_cache(...). After calling this API, just use the pipeline as normal.
  • 🎉Easy New Model Integration: Features like Unified Cache APIs, Forward Pattern Matching, Automatic Block Adapter, Hybrid Forward Pattern, and Patch Functor make it highly functional and flexible. For example, we achieved 🎉 Day 1 support for HunyuanImage-2.1 with 1.7x speedup w/o precision loss—even before it was available in the Diffusers library.
  • 🎉State-of-the-Art Performance: Compared with algorithms including Δ-DiT, Chipmunk, FORA, DuCa, TaylorSeer and FoCa, cache-dit achieved SOTA performance with a 7.4x↑🎉 speedup on ClipScore!
  • 🎉Support for 4/8-Steps Distilled Models: Surprisingly, cache-dit's DBCache works for extremely few-step distilled models—something many other methods fail to do.
  • 🎉Compatibility with Other Optimizations: Designed to work seamlessly with torch.compile, Quantization (torchao, 🔥nunchaku), CPU or Sequential Offloading, 🔥Context Parallelism, 🔥Tensor Parallelism, etc.
  • 🎉Hybrid Cache Acceleration: Now supports hybrid Block-wise Cache + Calibrator schemes (e.g., DBCache or DBPrune + TaylorSeerCalibrator). DBCache or DBPrune acts as the Indicator to decide when to cache, while the Calibrator decides how to cache. More mainstream cache acceleration algorithms (e.g., FoCa) will be supported in the future, along with additional benchmarks—stay tuned for updates!
  • 🤗Diffusers Ecosystem Integration: 🔥cache-dit has joined the Diffusers community ecosystem as the first DiT-specific cache acceleration framework! Check out the Diffusers documentation for details.
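The Indicator/Calibrator split described above can be illustrated with a toy sketch: a residual-difference indicator loosely mirrors the idea behind DBCache (reuse when activations barely changed), and a first-order extrapolation stands in for a TaylorSeer-style calibrator. All names and formulas here are simplified assumptions for illustration, not cache-dit's real API:

```python
# Toy illustration of "the Indicator decides WHEN to cache, the
# Calibrator decides HOW". Scalars stand in for activation tensors.

def should_reuse(curr_in, prev_in, threshold=0.05):
    """Indicator (DBCache-like idea): reuse the cached block output
    when the relative change of the block input is below a threshold."""
    diff = abs(curr_in - prev_in) / (abs(prev_in) + 1e-8)
    return diff < threshold

def calibrate(prev_out, prev_prev_out):
    """Calibrator (TaylorSeer-like idea): instead of returning the stale
    cached output verbatim, extrapolate it with a first-order term."""
    return prev_out + (prev_out - prev_prev_out)

# When the indicator fires, the calibrator refines the reused value:
if should_reuse(curr_in=1.001, prev_in=1.0):
    approx_out = calibrate(prev_out=4.0, prev_prev_out=3.0)  # -> 5.0
```

In the real engine the same division of labor applies per transformer block: DBCache/DBPrune gate the reuse decision, while the calibrator refines what gets reused.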

🔥 Supported DiTs

[!Tip] One model series may contain many pipelines. cache-dit applies optimizations at the Transformer level, so any pipeline that includes a supported transformer is already supported by cache-dit. ✅: known to work and officially supported; ✖️: not officially supported yet, but may be supported in the future; 4-bits: w/ nunchaku + svdq int4.

| 📚Model | Cache | CP | TP | 📚Model | Cache | CP | TP |
|:---|:---:|:---:|:---:|:---|:---:|:---:|:---:|
| 🎉FLUX.1 | ✅ | ✅ | ✅ | 🎉FLUX.1 4-bits | ✅ | ✅ | ✖️ |
| 🎉Qwen-Image | ✅ | ✅ | ✅ | 🎉Qwen-Image 4-bits | ✅ | ✅ | ✖️ |
| 🎉Qwen...Lightning | ✅ | ✅ | ✅ | 🎉Qwen...Lightning 4-bits | ✅ | ✅ | ✖️ |
| 🎉CogVideoX | ✅ | ✅ | ✖️ | 🎉OmniGen | ✅ | ✖️ | ✖️ |
| 🎉Wan 2.1 | ✅ | ✅ | ✅ | 🎉PixArt | ✅ | ✅ | ✖️ |
| 🎉Wan 2.2 | ✅ | ✅ | ✅ | 🎉CogVideoX 1.5 | ✅ | ✅ | ✖️ |
| 🎉HunyuanVideo | ✅ | ✅ | ✅ | 🎉Sana | ✅ | ✖️ | ✖️ |
| 🎉LTX | ✅ | ✅ | ✖️ | 🎉VisualCloze | ✅ | ✅ | ✅ |
| 🎉Allegro | ✅ | ✖️ | ✖️ | 🎉AuraFlow | ✅ | ✖️ | ✖️ |
| 🎉CogView4 | ✅ | ✅ | ✖️ | 🎉ShapE | ✅ | ✖️ | ✖️ |
| 🎉CogView3Plus | ✅ | ✅ | ✖️ | 🎉Chroma | ✅ | ✅ | ✅ |
| 🎉Cosmos | ✅ | ✖️ | ✖️ | 🎉HiDream | ✅ | ✖️ | ✖️ |
| 🎉EasyAnimate | ✅ | ✖️ | ✖️ | 🎉HunyuanDiT | ✅ | ✖️ | ✖️ |
| 🎉SkyReelsV2 | ✅ | ✖️ | ✖️ | 🎉HunyuanDiTPAG | ✅ | ✖️ | ✖️ |
| 🎉StableDiffusion3 | ✅ | ✖️ | ✖️ | 🎉Kandinsky5 | ✅ | ✖️ | ✅ |
| 🎉ConsisID | ✅ | ✅ | ✖️ | 🎉PRX | ✅ | ✖️ | ✖️ |
| 🎉DiT | ✅ | ✅ | ✖️ | 🎉HunyuanImage | ✅ | ✅ | ✅ |
| 🎉Amused | ✅ | ✖️ | ✖️ | 🎉LongCatVideo | ✅ | ✖️ | ✖️ |
| 🎉StableAudio | ✅ | ✖️ | ✖️ | 🎉Bria | ✅ | ✖️ | ✖️ |
| 🎉Mochi | ✅ | ✖️ | ✖️ | 🎉Lumina | ✅ | ✖️ | ✖️ |

🎉Now, cache-dit covers almost All Diffusers' DiT Pipelines🎉
🔥Qwen-Image | Qwen-Image-Edit | Qwen-Image-Edit-Plus 🔥
🔥FLUX.1 | Qwen-Image-Lightning 4/8 Steps | Wan 2.1 | Wan 2.2 🔥
🔥HunyuanImage-2.1 | HunyuanVideo | HunyuanDiT | HiDream | AuraFlow🔥
🔥CogView3Plus | CogView4 | LTXVideo | CogVideoX | CogVideoX 1.5 | ConsisID🔥
🔥Cosmos | SkyReelsV2 | VisualCloze | OmniGen 1/2 | Lumina 1/2 | PixArt🔥
🔥Chroma | Sana | Allegro | Mochi | SD 3/3.5 | Amused | ... | DiT-XL🔥

🔥Wan2.2 MoE | +cache-dit:2.0x↑🎉 | HunyuanVideo | +cache-dit:2.1x↑🎉

🔥Qwen-Image | +cache-dit:1.8x↑🎉 | FLUX.1-dev | +cache-dit:2.1x↑🎉

🔥Qwen...Lightning | +cache-dit:1.14x↑🎉 | HunyuanImage | +cache-dit:1.7x↑🎉

🔥Qwen-Image-Edit | Input w/o Edit | Baseline | +cache-dit:1.6x↑🎉 | 1.9x↑🎉

🔥FLUX-Kontext-dev | Baseline | +cache-dit:1.3x↑🎉 | 1.7x↑🎉 | 2.0x↑ 🎉

🔥HiDream-I1 | +cache-dit:1.9x↑🎉 | CogView4 | +cache-dit:1.4x↑🎉 | 1.7x↑🎉

🔥CogView3 | +cache-dit:1.5x↑🎉 | 2.0x↑🎉| Chroma1-HD | +cache-dit:1.9x↑🎉

🔥Mochi-1-preview | +cache-dit:1.8x↑🎉 | SkyReelsV2 | +cache-dit:1.6x↑🎉

🔥VisualCloze-512 | Model | Cloth | Baseline | +cache-dit:1.4x↑🎉 | 1.7x↑🎉

🔥LTX-Video-0.9.7 | +cache-dit:1.7x↑🎉 | CogVideoX1.5 | +cache-dit:2.0x↑🎉

🔥OmniGen-v1 | +cache-dit:1.5x↑🎉 | 3.3x↑🎉 | Lumina2 | +cache-dit:1.9x↑🎉

🔥Allegro | +cache-dit:1.36x↑🎉 | AuraFlow-v0.3 | +cache-dit:2.27x↑🎉

🔥Sana | +cache-dit:1.3x↑🎉 | 1.6x↑🎉| PixArt-Sigma | +cache-dit:2.3x↑🎉

🔥PixArt-Alpha | +cache-dit:1.6x↑🎉 | 1.8x↑🎉| SD 3.5 | +cache-dit:2.5x↑🎉

🔥Amused | +cache-dit:1.1x↑🎉 | 1.2x↑🎉 | DiT-XL-256 | +cache-dit:1.8x↑🎉
♥️ Please consider leaving a ⭐️ Star to support us ~ ♥️

📖Table of Contents

For more advanced features such as Unified Cache APIs, Forward Pattern Matching, Automatic Block Adapter, Hybrid Forward Pattern, Patch Functor, DBCache, DBPrune, TaylorSeer Calibrator, Hybrid Cache CFG, Context Parallelism and Tensor Parallelism, please refer to the 🎉User_Guide.md for details.

👋Contribute

How to contribute? Star ⭐️ this repo to support us, or check CONTRIBUTE.md.

🎉Projects Using CacheDiT

Here is a curated list of open-source projects integrating CacheDiT, including popular repositories such as jetson-containers, flux-fast, and sdnext. 🎉CacheDiT has been recommended by Wan2.2, Qwen-Image-Lightning, Qwen-Image, LongCat-Video, Kandinsky-5, and Featured|HelloGitHub, among others.

©️Acknowledgements

Special thanks to vipshop's Computer Vision AI Team for supporting the documentation, testing, and production-level deployment of this project.

©️Citations

@misc{cache-dit@2025,
  title={cache-dit: A Unified and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for Diffusers.},
  url={https://github.com/vipshop/cache-dit.git},
  note={Open-source software available at https://github.com/vipshop/cache-dit.git},
  author={DefTruth, vipshop.com},
  year={2025}
}


Download files


Source Distributions

No source distribution files are available for this release.

Built Distribution


cache_dit-1.0.13-py3-none-any.whl (162.0 kB)

Uploaded Python 3

File details

Details for the file cache_dit-1.0.13-py3-none-any.whl.

File metadata

  • Download URL: cache_dit-1.0.13-py3-none-any.whl
  • Upload date:
  • Size: 162.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for cache_dit-1.0.13-py3-none-any.whl
| Algorithm | Hash digest |
|:---|:---|
| SHA256 | 6a0855ba2bd94b3d7b30c1d63798266ad2b0be19259841039eb3cc88d340bb8e |
| MD5 | fcd4789ceac11c3b1f93521039c0eb1a |
| BLAKE2b-256 | 96e8547f113f17d62fcb5cbc1081dd818042665c9661f23bd5c129e4ab9e1454 |

