A PyTorch-native and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for 🤗DiTs.


Baseline   SCM Slow   SCM Fast   SCM Ultra   +compile   +FP8*     +CP2
24.85s     15.4s      11.4s      8.2s        🎉7.1s     🎉4.5s    🎉2.9s
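As a quick arithmetic sketch (using only the timings in the table above), the end-to-end speedups over the 24.85 s baseline work out to roughly 1.6x for SCM Slow up to about 8.6x for the +CP2 configuration:

```python
# Speedups implied by the timing table above (seconds per run).
baseline = 24.85
timings = {
    "SCM Slow": 15.4,
    "SCM Fast": 11.4,
    "SCM Ultra": 8.2,
    "+compile": 7.1,
    "+FP8*": 4.5,
    "+CP2": 2.9,
}
# Speedup = baseline time / optimized time, rounded to 2 decimals.
speedups = {name: round(baseline / t, 2) for name, t in timings.items()}
for name, s in speedups.items():
    print(f"{name}: {s}x")
```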

🤗Why Cache-DiT❓ Cache-DiT is built on top of the Diffusers library and now supports nearly 🔥ALL DiTs from Diffusers (🤗70+ models). Please refer to our online documentation at readthedocs.io for more details. The optimizations made by Cache-DiT include:

  • 🎉Hybrid Cache Acceleration (DBCache, DBPrune, TaylorSeer, SCM and more)
  • 🎉Context Parallelism (w/ Ulysses, Ring, USP, Ulysses Anything, FP8 Comm)
  • 🎉Tensor Parallelism (w/ PyTorch native DTensor and Tensor Parallelism APIs)
  • 🎉Hybrid 2D and 3D Parallelism (Scale up the performance of 💥Large DiTs)
  • 🎉Text Encoder Parallelism (TE-P w/ PyTorch native Tensor Parallelism APIs)
  • 🎉Auto Encoder Parallelism (VAE-P w/ Tile Parallelism, faster, avoid OOM)
  • 🎉ControlNet Parallelism (CN-P w/ Context Parallelism for ControlNet)
  • 🎉Built-in HTTP serving deployment support with simple REST APIs
  • 🎉Natively compatible with Compile, Offloading, Quantization, ...
  • 🎉Integration into vLLM-Omni, SGLang Diffusion, SD.Next, ...
  • 🎉Natively supports NVIDIA GPUs, Ascend NPUs (>= 1.2.0), ...

🔥Latest News

  • [2026/02] 🎉v1.2.1 release is ready. Major updates include: Ring Attention w/ batched P2P, USP (Hybrid Ring and Ulysses), Hybrid 2D and 3D Parallelism (💥USP + TP), and reduced VAE-P communication overhead.
  • [2026/01] 🎉v1.2.0 stable release is ready: new model support (Z-Image, FLUX.2, LTX-2, etc.), request-level Cache Context, HTTP Serving, Ulysses Anything, TE-P, VAE-P, CN-P, and Ascend NPU support.

🚀Quick Start

You can install cache-dit from PyPI or from source:

pip3 install -U cache-dit # or, pip3 install git+https://github.com/vipshop/cache-dit.git

Then accelerate your DiTs with just ♥️one line♥️ of code:

>>> import cache_dit
>>> from diffusers import DiffusionPipeline
>>> # The pipe can be any diffusion pipeline.
>>> pipe = DiffusionPipeline.from_pretrained("Qwen/Qwen-Image")
>>> # Cache Acceleration with One-line code.
>>> cache_dit.enable_cache(pipe)
>>> # Or, Hybrid Cache Acceleration + 1D Parallelism.
>>> from cache_dit import DBCacheConfig, ParallelismConfig
>>> cache_dit.enable_cache(
...   pipe, cache_config=DBCacheConfig(), # w/ default
...   parallelism_config=ParallelismConfig(ulysses_size=2))
>>> # Or, Use Distributed Inference without Cache Acceleration.
>>> cache_dit.enable_cache(
...   pipe, parallelism_config=ParallelismConfig(ulysses_size=2))
>>> # Or, Hybrid Cache Acceleration + 2D Parallelism.
>>> cache_dit.enable_cache(
...   pipe, cache_config=DBCacheConfig(), # w/ default
...   parallelism_config=ParallelismConfig(ulysses_size=2, tp_size=2))
>>> from cache_dit import load_configs
>>> # Or, Load Acceleration config from a custom yaml file.
>>> cache_dit.enable_cache(pipe, **load_configs("config.yaml"))
>>> # Optional, set attention backend for better performance.
>>> cache_dit.set_attn_backend(pipe, attention_backend=...)
>>> output = pipe(...) # Just call the pipe as normal.
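The `load_configs("config.yaml")` call above returns a dict of keyword arguments that is splatted into `enable_cache`. As a purely hypothetical illustration (the actual schema is defined by cache-dit; the key names below are assumptions that simply mirror the keyword arguments shown above), such a file might look like:

```yaml
# Hypothetical config.yaml -- key names are assumptions mirroring the
# enable_cache() kwargs above; see the cache-dit docs for the real schema.
cache_config: {}          # e.g. DBCache with default settings
parallelism_config:
  ulysses_size: 2
```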

Please refer to our online documentation at readthedocs.io for more details.

🚀Quick Links

  • 📊Examples - The easiest way to enable hybrid cache acceleration and parallelism for DiTs with cache-dit is to start with our examples for popular models: FLUX, Z-Image, Qwen-Image, Wan, etc.
  • 🌐HTTP Serving - Deploy cache-dit models with HTTP API for text-to-image, image editing, multi-image editing, and text/image-to-video generation.
  • 🎉User Guide - For more advanced features, please refer to the 🎉User Guide for details.
  • ❓FAQ - Frequently asked questions including attention backend configuration, troubleshooting, and optimization tips.

🌐Community Integration

©️Acknowledgements

Special thanks to vipshop's Computer Vision AI Team for supporting the documentation, testing, and deployment of this project. We learned from the design of, and reused code from, the following projects: Diffusers, SGLang, vLLM-Omni, ParaAttention, xDiT, TaylorSeer, and LeMiCa.

©️Citations

@misc{cache-dit@2025,
  title={cache-dit: A PyTorch-native and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for DiTs.},
  url={https://github.com/vipshop/cache-dit.git},
  note={Open-source software available at https://github.com/vipshop/cache-dit.git},
  author={DefTruth, vipshop.com},
  year={2025}
}


Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distribution


cache_dit-1.2.2-py3-none-any.whl (347.1 kB)

File details

Details for the file cache_dit-1.2.2-py3-none-any.whl.

File metadata

  • Download URL: cache_dit-1.2.2-py3-none-any.whl
  • Upload date:
  • Size: 347.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for cache_dit-1.2.2-py3-none-any.whl
Algorithm Hash digest
SHA256 f1609511638394a75d7732b5e86c3a7bd93b47907e687a53e16b2b852a4973d7
MD5 a346cf83dfba34245c4fb893e57ca1c3
BLAKE2b-256 69ac28c287a8236b995e6f2ca06db99c4a465ebca3ec929f23e162262e862591

