Profile of ARBI CITY

2 projects

turbo-attn

Last released Jul 28, 2026

Optimized, CUDA Graph-enabled kernels and an attention backend for vLLM, SGLang, and more, based on TurboQuant near-lossless KV cache compression. State-of-the-art performance with Gemma 4, Qwen 3.6, and other modern LLMs.

arbi

Last released Apr 14, 2026

Python client for the ARBI API
