Last released Apr 16, 2026
Up to 5x faster Qwen3-TTS inference through Triton kernel fusion
Last released Apr 6, 2026
Up to 3.4x faster OmniVoice inference through Triton kernel fusion and CUDA Graph optimization