Last released Apr 12, 2026
Single-header LLM inference engine with KV cache compression (7× compression at fp32 parity)
Last released Mar 31, 2026
Index-Heavy, Query-Light RAG Engine — add your docs, ask questions, it just works.