Provider-agnostic embedding functions for ChromaDB with OpenRouter and local fallback support.
chromaroute
Provider-agnostic embedding functions for ChromaDB with automatic fallback support.
Features
- ChromaDB-native interface: Drop-in EmbeddingFunction implementations
- Provider fallback chain: OpenRouter → Local (SentenceTransformers)
- OpenRouter integration: Full support for OpenRouter's embedding API with provider routing
- Production-ready: Comprehensive error handling, configurable timeouts, actionable error messages
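To make the fallback chain idea concrete, here is a minimal, library-independent sketch of trying providers in order. It is not chromaroute's implementation: `embed_with_fallback`, `flaky_remote`, and `tiny_local` are hypothetical names, and the stand-in providers return toy vectors instead of calling OpenRouter or SentenceTransformers.

```python
from typing import Callable, Sequence

# Hypothetical type: an embedder maps a batch of texts to one vector per text.
Embedder = Callable[[Sequence[str]], list[list[float]]]


def embed_with_fallback(
    texts: Sequence[str],
    providers: Sequence[tuple[str, Embedder]],
) -> tuple[str, list[list[float]]]:
    """Try each provider in order; return the first successful result."""
    errors: list[str] = []
    for name, embed in providers:
        try:
            return name, embed(texts)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all embedding providers failed: " + "; ".join(errors))


# Stand-in providers for demonstration only (no network, no models).
def flaky_remote(texts: Sequence[str]) -> list[list[float]]:
    raise TimeoutError("simulated timeout")


def tiny_local(texts: Sequence[str]) -> list[list[float]]:
    # Toy "embedding": a single number per text (its length).
    return [[float(len(t))] for t in texts]


used, vectors = embed_with_fallback(
    ["Hello world"],
    [("openrouter", flaky_remote), ("local", tiny_local)],
)
```

Collecting each provider's error before raising keeps the final message actionable when every provider fails.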
Installation
pip install chromaroute
# With local embeddings (SentenceTransformers)
pip install "chromaroute[local]"
Quick Start
from chromaroute import build_embedding_function, load_config
# Auto-detect available providers
config = load_config()
embed_fn = build_embedding_function(config)
# Or rely on environment auto-detection
embed_fn = build_embedding_function()
# Use with ChromaDB
import chromadb
client = chromadb.EphemeralClient()
collection = client.create_collection(
    name="my_collection",
    embedding_function=embed_fn,
)
collection.add(documents=["Hello world"], ids=["doc1"])
Configuration
Set environment variables:
# OpenRouter (primary)
OPENROUTER_API_KEY=sk-or-...
OPENROUTER_EMBEDDINGS_MODEL=openai/text-embedding-3-small
OPENROUTER_EMBED_PROVIDER_JSON='{"order":["openai","mistral"],"allow_fallbacks":true}'
# Local fallback uses sentence-transformers/all-MiniLM-L6-v2 by default
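Since OPENROUTER_EMBED_PROVIDER_JSON carries a JSON object inside a string, a malformed value is easy to miss until request time. The sketch below shows one way such a value could be decoded and checked up front. `read_provider_routing` is a hypothetical helper, not part of chromaroute, and the library's own validation may differ.

```python
import json
from typing import Any, Mapping, Optional


def read_provider_routing(env: Mapping[str, str]) -> Optional[dict[str, Any]]:
    """Decode OPENROUTER_EMBED_PROVIDER_JSON, or return None when it is unset."""
    raw = env.get("OPENROUTER_EMBED_PROVIDER_JSON", "").strip()
    if not raw:
        return None
    try:
        routing = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(
            f"OPENROUTER_EMBED_PROVIDER_JSON is not valid JSON: {exc}"
        ) from exc
    if not isinstance(routing, dict):
        raise ValueError("OPENROUTER_EMBED_PROVIDER_JSON must be a JSON object")
    return routing


# Same value as in the configuration example
# (the shell's single quotes are not part of the value).
example_env = {
    "OPENROUTER_EMBED_PROVIDER_JSON": '{"order":["openai","mistral"],"allow_fallbacks":true}',
}
routing = read_provider_routing(example_env)
```

Passing `os.environ` instead of `example_env` would run the same check against the real environment.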
Direct OpenRouter Usage
from chromaroute import OpenRouterEmbeddingFunction
embed_fn = OpenRouterEmbeddingFunction(
    model="openai/text-embedding-3-small",
    api_key="sk-or-...",
)
# Call it directly to embed a batch of texts
embeddings = embed_fn(["text to embed"])
# Returns list[list[float]] with one embedding per input text.
VectorStore (Optional)
For simplified collection management with automatic batching:
from chromaroute import VectorStore
store = VectorStore("my_docs", persist_path="./chroma_db")
store.add_documents(
    documents=["Hello world", "Goodbye world"],
    metadatas=[{"source_id": "doc_a"}, {"source_id": "doc_b"}],
)
# Single query; each field in the result holds one inner list per query
result = store.query(
    query_texts="greeting",  # Accepts string or list[str]
    n_results=2,
    include=["documents", "metadatas", "distances"],
)
# Rows for the first (and only) query
top_docs = result["documents"][0]
# Row-like convenience result
records = store.query_one_records(
    "greeting",
    n_results=2,
    where={"source_id": "doc_a"},  # Filter by metadata
    include=["documents", "metadatas", "distances"],
)
query()/get() pass include, where, and where_document directly to ChromaDB.
If include is omitted, ChromaDB defaults are used.
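For readers unfamiliar with ChromaDB's nested query format, the following sketch shows how the per-field lists for the first query can be zipped into one dict per hit. It works on a hand-written sample dict and is only an illustration of the result shape: `first_query_rows` and its output keys are hypothetical, and this is not how `query_one_records()` is necessarily implemented.

```python
from typing import Any


def first_query_rows(result: dict[str, Any]) -> list[dict[str, Any]]:
    """Zip the first query's columns into one dict per hit."""
    rows: list[dict[str, Any]] = [{"id": doc_id} for doc_id in result["ids"][0]]
    field_to_key = (
        ("documents", "document"),
        ("metadatas", "metadata"),
        ("distances", "distance"),
    )
    for field, key in field_to_key:
        column = result.get(field)
        if column is None:  # field was not requested via include
            continue
        for row, value in zip(rows, column[0]):
            row[key] = value
    return rows


# Hand-written sample in ChromaDB's nested query format (values are made up).
sample = {
    "ids": [["id-1", "id-2"]],
    "documents": [["Hello world", "Goodbye world"]],
    "metadatas": [[{"source_id": "doc_a"}, {"source_id": "doc_b"}]],
    "distances": [[0.12, 0.48]],
}
rows = first_query_rows(sample)
```

Fields left out of `include` come back as `None` in ChromaDB results, which is why the sketch skips missing columns instead of failing.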
Common Recipes
- Provenance-aware retrieval: Store source_id in metadata during ingest and include "metadatas" at query time. Filter by it using where={"source_id": "..."}.
- Application-ready rows: Use query_one_records() when you want id/document/distance/metadata bundled into one object per hit.
- Stale data removal: Use store.delete(where={"source_id": "..."}) before re-ingesting a previously processed document source to prevent duplicates.
- Advanced ChromaDB features: For partial updates or upserts, use the underlying Chroma collection directly via store.collection.
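The stale data removal recipe can be wrapped in a small helper. The sketch below relies only on the two store methods shown in this README, `delete(where=...)` and `add_documents(documents=..., metadatas=...)`. The helper name `reingest_source` and the `FakeStore` class are hypothetical. `FakeStore` is an in-memory stand-in so the example runs without chromaroute installed. In real use you would pass a `VectorStore` instead.

```python
from typing import Any, Protocol


class SupportsReingest(Protocol):
    """The two store methods this sketch relies on."""

    def delete(self, where: dict[str, Any]) -> Any: ...

    def add_documents(
        self, documents: list[str], metadatas: list[dict[str, Any]]
    ) -> Any: ...


def reingest_source(
    store: SupportsReingest, source_id: str, documents: list[str]
) -> None:
    """Drop everything tagged with source_id, then add the fresh documents."""
    store.delete(where={"source_id": source_id})
    store.add_documents(
        documents=documents,
        metadatas=[{"source_id": source_id} for _ in documents],
    )


class FakeStore:
    """In-memory stand-in for demonstration; not part of chromaroute."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, dict[str, Any]]] = []

    def delete(self, where: dict[str, Any]) -> None:
        # Keep only rows whose metadata does not match every key in `where`.
        self.rows = [
            (doc, meta)
            for doc, meta in self.rows
            if any(meta.get(k) != v for k, v in where.items())
        ]

    def add_documents(
        self, documents: list[str], metadatas: list[dict[str, Any]]
    ) -> None:
        self.rows.extend(zip(documents, metadatas))


store = FakeStore()
reingest_source(store, "doc_b", ["other source"])
reingest_source(store, "doc_a", ["old text"])
reingest_source(store, "doc_a", ["new text"])  # replaces, does not duplicate
```

Deleting before adding means a failed ingest can leave the source temporarily empty. Whether that trade-off is acceptable depends on the application.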
Environment Variables
| Variable | Default | Description |
|---|---|---|
| OPENROUTER_API_KEY | — | OpenRouter API key (enables OpenRouter provider) |
| OPENROUTER_BASE_URL | https://openrouter.ai/api/v1 | Override OpenRouter base URL (advanced) |
| OPENROUTER_EMBEDDINGS_MODEL | openai/text-embedding-3-small | Model for OpenRouter embeddings |
| OPENROUTER_EMBED_PROVIDER_JSON | — | Provider routing config (JSON) |
| LOCAL_EMBEDDINGS_MODEL | sentence-transformers/all-MiniLM-L6-v2 | Model for local embeddings |
| EMBED_PROVIDER | auto | Force provider: auto, openrouter, or local |
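As a rough illustration of how EMBED_PROVIDER and OPENROUTER_API_KEY might interact, the sketch below resolves a provider name from an environment mapping. The rule used for auto (prefer OpenRouter when a key is present, otherwise local) is an assumption based on the table, not a statement of chromaroute's actual logic. `resolve_provider` is a hypothetical helper.

```python
from typing import Mapping

VALID_CHOICES = ("auto", "openrouter", "local")


def resolve_provider(env: Mapping[str, str]) -> str:
    """Return "openrouter" or "local" for the given environment mapping."""
    choice = (env.get("EMBED_PROVIDER") or "auto").strip().lower()
    if choice not in VALID_CHOICES:
        raise ValueError(
            f"EMBED_PROVIDER must be one of {VALID_CHOICES}, got {choice!r}"
        )
    has_key = bool((env.get("OPENROUTER_API_KEY") or "").strip())
    if choice == "auto":
        # Assumed rule: prefer OpenRouter whenever a key is present.
        return "openrouter" if has_key else "local"
    if choice == "openrouter" and not has_key:
        raise ValueError("EMBED_PROVIDER=openrouter requires OPENROUTER_API_KEY")
    return choice


provider = resolve_provider({"OPENROUTER_API_KEY": "sk-or-example"})
```

Failing fast when openrouter is forced without a key is one possible design. Silently falling back to local would be another, at the cost of hiding a misconfiguration.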
Advanced Usage (Best-Effort)
chromaroute is optimized for OpenRouter, but includes a few intentional escape hatches for custom setups. These are not the primary path and are supported on a best-effort basis. See docs/advanced.md.
License