8 projects
colsearch
ColSearch: production late-interaction retrieval for ColBERT- and ColPali-style workloads. Single-node CPU/GPU, Triton MaxSim, 1.58-bit ROQ quantization at 40 B/token, BM25 hybrid search, durable CRUD/WAL, multimodal preprocessing, and a base64-ready reference API.
voyager-index
Shard-first late-interaction retrieval for ColBERT- and ColPali-style workloads with CPU/GPU modes, Triton MaxSim, BM25 hybrid search, durable CRUD/WAL, multimodal preprocessing, and base64-ready reference APIs.
latence-solver
CPU-reference-first Tabu Search Quadratic Knapsack solver with optional accelerator hooks
latence-shard-engine
Lock-free shard engine for multi-vector retrieval — Rust data plane with GPU scoring callbacks
vllm-factory
The LEGO set for custom vLLM model plugins — build, test, and deploy custom encoders, poolers, and kernels
latence-hnsw
Qdrant HNSW Indexer Wrapper
latence-gem-router
GEM-inspired set-native multi-vector routing core for ColBERT and ColPali
latence
Official Python SDK for the Latence AI API