Last released May 31, 2026
Eigenspectral KV cache compression for transformer inference. Up to 6.55x compression with FP16-equivalent quality, drop-in for HuggingFace LLMs and vision transformers.
Supported by