Last released Mar 9, 2026
Expert-Aware Multi-Batch Pipeline for MoE + Speculative Decoding inference optimization (CPU-PCIe-GPU).