5 projects
libembedding
Fast ONNX-based text, image, and sparse embeddings for Python. 5-8x faster than fastembed.
fractionally
Persistent memory layer for LLM agents. Zero LLM calls, sub-100ms ingestion, deterministic extraction.
untoken
Token compression for LLM prompts.
orkestra-router
Smart LLM routing across providers. Automatically picks the most cost-efficient model for your prompt.
noniml
Noni — a tiny tensor library with autograd, for humans.