Last released Mar 15, 2026
Hardware-aware CLI that selects the best runtime and quantization for efficient LLM inference.