Last released Feb 23, 2024
Ouroboros
Last released Feb 5, 2024
Dual Decoding
Last released Jan 31, 2024
A high-throughput and memory-efficient inference and serving engine for LLMs