Last released Apr 12, 2026
Single-header LLM inference engine with KV cache compression (7× compression at fp32 parity)
Last released Mar 31, 2026
Index-Heavy, Query-Light RAG Engine — add your docs, ask questions, it just works.