Last released Apr 25, 2026
TurboQuant+ compression for vLLM. 4.3x weight and 3.7x KV-cache compression with zero calibration.
Last released Mar 28, 2026
NumPy-only TurboQuant vector quantization. No PyTorch, no CUDA.
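For context, a minimal sketch of what NumPy-only vector quantization can look like: vectors are mapped to the index of their nearest codebook centroid and reconstructed from that index. All names, shapes, and the codebook-selection step here are illustrative assumptions, not TurboQuant's actual API or algorithm.

```python
import numpy as np

# Hypothetical example, not the TurboQuant API: quantize float32 vectors
# to 8-bit codebook indices and reconstruct them lossily.
rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 8)).astype(np.float32)

# Toy codebook: 16 centroids sampled from the data itself
# (a real quantizer would train these, e.g. with k-means).
codebook = data[rng.choice(len(data), 16, replace=False)]

# Squared distance from every vector to every centroid, then argmin.
d2 = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
codes = d2.argmin(axis=1).astype(np.uint8)  # compressed: one byte per vector
recon = codebook[codes]                     # lossy reconstruction

print(codes.shape, recon.shape)
```

Storing one `uint8` index per 8-float vector is what yields the compression; quality depends entirely on how well the codebook covers the data.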