Last released Aug 24, 2025
DFloat11: Fast and memory-efficient GPU inference for losslessly compressed LLMs and diffusion models
Last released Feb 24, 2025
The inference kernels for LeanQuant models.