Oven

A PyTorch-native and Flexible Inference Engine with
Hybrid Cache Acceleration and Parallelism for 🤗DiTs

| Baseline | SCM S S* | SCM F D* | SCM U D* | +TS | +compile | +FP8* |
|---|---|---|---|---|---|---|
| 24.85s | 15.4s | 11.4s | 8.2s | 8.2s | 🎉7.1s | 🎉4.5s |

Scheme: DBCache + SCM(steps_computation_mask) + TS(TaylorSeer) + FP8*, L20x1, S*: static cache,
D*: dynamic cache, S: Slow, F: Fast, U: Ultra Fast, TS: TaylorSeer, FP8*: FP8 DQ + Sage, FLUX.1-Dev
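
To put the latencies above in relative terms, the short sketch below divides the baseline latency by each configuration's latency. The numbers are copied from the benchmark row; the `latencies` dictionary and the `speedups` helper are illustrative names, not part of cache-dit.

```python
# Latencies in seconds, copied from the benchmark row above (FLUX.1-Dev).
latencies = {
    "Baseline": 24.85,
    "SCM S S*": 15.4,
    "SCM F D*": 11.4,
    "SCM U D*": 8.2,
    "+TS": 8.2,
    "+compile": 7.1,
    "+FP8*": 4.5,
}


def speedups(timings, baseline_key="Baseline"):
    """Return {config: baseline_latency / config_latency} for every entry."""
    base = timings[baseline_key]
    return {name: base / seconds for name, seconds in timings.items()}


# Print each configuration with its latency and speedup over the baseline.
for name, ratio in speedups(latencies).items():
    print(f"{name:>10}: {latencies[name]:6.2f}s  {ratio:.2f}x")
```

Read this way, the final `+FP8*` column corresponds to roughly a 5.5x speedup over the baseline (24.85s / 4.5s).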

<img src=https://github.com/vipshop/cache-dit/raw/main/assets/speedup_v4.png>

SGLang Diffusion x Cache-DiT News | vLLM Omni x Cache-DiT News

🔥Highlight

We are excited to announce that the 🎉v1.1.0 version of cache-dit has finally been released! It brings 🔥Context Parallelism and 🔥Tensor Parallelism to cache-dit, thus making it a PyTorch-native and Flexible Inference Engine for 🤗DiTs. Key features: Unified Cache APIs, Forward Pattern Matching, Block Adapter, DBCache, DBPrune, Cache CFG, TaylorSeer, SCM, Context Parallelism (w/ UAA), Tensor Parallelism and 🎉SOTA performance.

pip3 install -U cache-dit # Also, pip3 install git+https://github.com/huggingface/diffusers.git (latest)

You can install the stable release of cache-dit from PyPI, or the latest development version from GitHub. Then try ♥️ Cache Acceleration with just one line of code ~ ♥️

>>> import cache_dit
>>> from diffusers import DiffusionPipeline
>>> pipe = DiffusionPipeline.from_pretrained("Qwen/Qwen-Image") # Can be any diffusion pipeline
>>> cache_dit.enable_cache(pipe) # One-line code with default cache options.
>>> output = pipe(...) # Just call the pipe as normal.
>>> stats = cache_dit.summary(pipe) # Then, get the summary of cache acceleration stats.
>>> cache_dit.disable_cache(pipe) # Disable cache and run original pipe.
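
To check what caching buys you on your own hardware, one option is to time the same call with and without the cache. The sketch below is a minimal, hedged example: `timed` and `compare_cache` are hypothetical helper names, and it relies only on the `cache_dit.enable_cache` and `cache_dit.disable_cache` calls shown above. It assumes the pipeline call returns only after generation has finished.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def compare_cache(pipe, **call_kwargs):
    """Time one cached and one uncached run of the same pipeline call.

    Returns (baseline_seconds, cached_seconds).
    """
    # Imported lazily so that `timed` stays usable without cache-dit installed.
    import cache_dit

    cache_dit.enable_cache(pipe)   # default cache options, as in the quick start
    _, cached_s = timed(pipe, **call_kwargs)
    cache_dit.disable_cache(pipe)  # back to the original pipeline
    _, baseline_s = timed(pipe, **call_kwargs)
    return baseline_s, cached_s
```

A call such as `compare_cache(pipe, prompt="a cat")` would then return the two timings, where `prompt` stands in for whatever arguments your pipeline expects. Because the first call to a pipeline usually includes one-off warm-up costs, running the pipeline once before measuring gives a fairer comparison.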

📚Core Features

🔥Supported DiTs

[!Tip] One model series may contain many pipelines. cache-dit applies optimizations at the Transformer level, so any pipeline that includes a supported transformer is already supported by cache-dit. ✅: known to work and officially supported now; ✖️: not officially supported yet, but may be supported in the future; Q: 4-bit models w/ nunchaku W4A4; TE: Text Encoder Parallelism; 💡C*: Hybrid Cache Acceleration.

📚Model C* CP TP TE | 📚Model C* CP TP TE
🔥Z-Image | 🔥Z-Image-Control ✖️✖️
🔥Ovis-Image | 🔥HunyuanVideo 1.5 ✖️✖️
🔥FLUX.2 | 🎉FLUX.1 Q ✖️
🎉FLUX.1 | 🎉Qwen-Image Q ✖️
🎉Qwen-Image | 🎉Qwen...Edit Q ✖️
🎉Qwen...Edit | 🎉Qwen.E.Plus Q ✖️
🎉Qwen..Light | 🎉Qwen...Light Q ✖️
🎉Wan 2.2 T2V/ITV | 🎉Qwen.E.Light Q ✖️
🎉Wan 2.2 VACE | 🎉Mochi ✖️
🎉Wan 2.1 T2V/ITV | 🎉HiDream ✖️✖️
🎉Wan 2.1 VACE | 🎉HunyuanDiT ✖️
🎉HunyuanVideo | 🎉Sana ✖️✖️
🎉ChronoEdit | 🎉Bria ✖️✖️
🎉CogVideoX | 🎉SkyReelsV2
🎉CogVideoX 1.5 | 🎉Lumina 1/2 ✖️
🎉CogView4 | 🎉DiT-XL ✖️
🎉CogView3Plus | 🎉Allegro ✖️✖️
🎉PixArt Sigma | 🎉Cosmos ✖️✖️
🎉PixArt Alpha | 🎉OmniGen ✖️✖️
🎉Chroma-HD ✅ | 🎉EasyAnimate ✖️✖️
🎉VisualCloze | 🎉StableDiffusion3 ✖️✖️
🎉HunyuanImage | 🎉PRX T2I ✖️✖️
🎉Kandinsky5 ✅✅ | 🎉Amused ✖️✖️
🎉LTXVideo | 🎉AuraFlow ✖️✖️
🎉ConsisID | 🎉LongCatVideo ✖️✖️
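
Since support is decided at the Transformer level, a quick way to see which row of the table applies to a given pipeline is to look at the class of its transformer component. The sketch below is an illustration under assumptions: `transformer_components` is a hypothetical helper (not a cache-dit API), and it assumes the pipeline either exposes a `components` dictionary, as diffusers pipelines do, or stores its modules as instance attributes whose names start with `transformer`.

```python
def transformer_components(pipe):
    """Return {attribute_name: class_name} for transformer-like parts of pipe."""
    # Prefer a `components` dict if the pipeline exposes one (assumption),
    # otherwise fall back to the plain instance attributes.
    components = getattr(pipe, "components", None)
    if not isinstance(components, dict):
        components = vars(pipe)
    return {
        name: type(module).__name__
        for name, module in components.items()
        if name.startswith("transformer") and module is not None
    }
```

Calling `transformer_components(pipe)` on a loaded pipeline returns a small dictionary of transformer class names, which you can then compare against the model series listed above.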
🔥Image/Video Cases🔥

🎉Now, cache-dit covers almost All Diffusers' DiT Pipelines🎉
🔥Qwen-Image | Qwen-Image-Edit | Qwen-Image-Edit-Plus 🔥
🔥FLUX.1 | Qwen-Image-Lightning 4/8 Steps | Wan 2.1 | Wan 2.2 🔥
🔥HunyuanImage-2.1 | HunyuanVideo | HunyuanDiT | HiDream | AuraFlow🔥
🔥CogView3Plus | CogView4 | LTXVideo | CogVideoX | CogVideoX 1.5 | ConsisID🔥
🔥Cosmos | SkyReelsV2 | VisualCloze | OmniGen 1/2 | Lumina 1/2 | PixArt🔥
🔥Chroma | Sana | Allegro | Mochi | SD 3/3.5 | Amused | ... | DiT-XL🔥

🔥Wan2.2 MoE | +cache-dit:2.0x↑🎉 | HunyuanVideo | +cache-dit:2.1x↑🎉

🔥Qwen-Image | +cache-dit:1.8x↑🎉 | FLUX.1-dev | +cache-dit:2.1x↑🎉

🔥Qwen...Lightning | +cache-dit:1.14x↑🎉 | HunyuanImage | +cache-dit:1.7x↑🎉

🔥Qwen-Image-Edit | Input w/o Edit | Baseline | +cache-dit:1.6x↑🎉 | 1.9x↑🎉

🔥FLUX-Kontext-dev | Baseline | +cache-dit:1.3x↑🎉 | 1.7x↑🎉 | 2.0x↑ 🎉

🔥HiDream-I1 | +cache-dit:1.9x↑🎉 | CogView4 | +cache-dit:1.4x↑🎉 | 1.7x↑🎉

🔥CogView3 | +cache-dit:1.5x↑🎉 | 2.0x↑🎉| Chroma1-HD | +cache-dit:1.9x↑🎉

🔥Mochi-1-preview | +cache-dit:1.8x↑🎉 | SkyReelsV2 | +cache-dit:1.6x↑🎉

🔥VisualCloze-512 | Model | Cloth | Baseline | +cache-dit:1.4x↑🎉 | 1.7x↑🎉

🔥LTX-Video-0.9.7 | +cache-dit:1.7x↑🎉 | CogVideoX1.5 | +cache-dit:2.0x↑🎉

🔥OmniGen-v1 | +cache-dit:1.5x↑🎉 | 3.3x↑🎉 | Lumina2 | +cache-dit:1.9x↑🎉

🔥Allegro | +cache-dit:1.36x↑🎉 | AuraFlow-v0.3 | +cache-dit:2.27x↑🎉

🔥Sana | +cache-dit:1.3x↑🎉 | 1.6x↑🎉| PixArt-Sigma | +cache-dit:2.3x↑🎉

🔥PixArt-Alpha | +cache-dit:1.6x↑🎉 | 1.8x↑🎉| SD 3.5 | +cache-dit:2.5x↑🎉

🔥Amused | +cache-dit:1.1x↑🎉 | 1.2x↑🎉 | DiT-XL-256 | +cache-dit:1.8x↑🎉

♥️ Please consider leaving a ⭐️ Star to support us ~ ♥️

📖Table of Contents

For more advanced features such as Unified Cache APIs, Forward Pattern Matching, Automatic Block Adapter, Hybrid Forward Pattern, Patch Functor, DBCache, DBPrune, TaylorSeer Calibrator, SCM, Hybrid Cache CFG, Context Parallelism (w/ UAA) and Tensor Parallelism, please refer to the 🎉User_Guide.md for details.

🚀Quick Links

  • 📊Examples - The easiest way to enable hybrid cache acceleration and parallelism for DiTs with cache-dit is to start with our examples for popular models: FLUX, Z-Image, Qwen-Image, Wan, etc.
  • 🌐HTTP Serving - Deploy cache-dit models with HTTP API for text-to-image, image editing, multi-image editing, and text-to-video generation.
  • ❓FAQ - Frequently asked questions including attention backend configuration, troubleshooting, and optimization tips.

📚Documentation

👋Contribute

How to contribute? Star ⭐️ this repo to support us or check CONTRIBUTE.md.

🎉Projects Using CacheDiT

Here is a curated list of open-source projects integrating CacheDiT, including popular repositories like jetson-containers, flux-fast, sdnext, 🔥vLLM-Omni, and 🔥SGLang Diffusion. 🎉CacheDiT has been recommended by many well-known open-source projects: 🔥Z-Image, 🔥Wan 2.2, 🔥Qwen-Image, 🔥LongCat-Video, Qwen-Image-Lightning, Kandinsky-5, LeMiCa, 🤗diffusers, HelloGitHub and GiantPandaCV.

©️Acknowledgements

Special thanks to vipshop's Computer Vision AI Team for supporting the documentation, testing, and production-level deployment of this project. We learned from the design of the following projects and reused code from them: 🤗diffusers, SGLang, ParaAttention, xDiT, TaylorSeer and LeMiCa.

©️Citations

@misc{cache-dit@2025,
  title={cache-dit: A PyTorch-native and Flexible Inference Engine with Hybrid Cache Acceleration and Parallelism for DiTs.},
  url={https://github.com/vipshop/cache-dit.git},
  note={Open-source software available at https://github.com/vipshop/cache-dit.git},
  author={DefTruth, vipshop.com},
  year={2025}
}