3 projects
ffpa-attn
FFPA: Yet another Faster Flash Prefill Attention for large head dimensions (headdim), 1.5x to 3x faster than SDPA.
cache-dit-cu13
Cache-DiT: A PyTorch-native Inference Engine with Cache, Parallelism and Quantization for Diffusion Transformers.
cache-dit
Cache-DiT: A PyTorch-native Inference Engine with Cache, Parallelism and Quantization for Diffusion Transformers.