13 projects
baseten-performance-client
An ultra-high-performance package for sending requests to Baseten Embedding Inference
truss-transfer
Speed up file transfers with baseten.co + baseten_fs.
llm-runtime-metrics
Rust-backed performance metrics and request tracing
fastokens-b10
None
mm-cache-client
Tiny aiohttp client for the mm-cache distributed cache service
radix-mlp
RadixMLP: Prefix-based computation sharing for transformer models
infinity-client
A client library for accessing ♾️ Infinity - Embedding Inference Server
infinity-emb
Infinity is a high-throughput, low-latency REST API for serving text embeddings, reranking models, and CLIP.
briton
Python component for using Briton
embed
A stable, fast and easy-to-use inference library with a focus on a sync-to-async API
gradientai
Gradient AI API
hf-hub-ctranslate2
Connecting Transformers on the Hugging Face Hub with CTranslate2.
rlskyjo
Multi-agent reinforcement learning environment for the card game SkyJo, compatible with PettingZoo and RLlib