Last released Mar 29, 2026
First open-source implementation of TurboQuant (arXiv 2504.19874) — 4-7x LLM KV cache compression
Supported by