A chromadb embeddings plugin for OVOS
ovos-chromadb-embeddings-plugin
ChromaDB-backed EmbeddingsDB vector store plugin for OpenVoiceOS.
Install
pip install ovos-chromadb-embeddings-plugin
What is an EmbeddingsDB?
EmbeddingsDB is the abstract base class from ovos-plugin-manager for vector stores.
Plugins implementing it are discovered automatically by OPM under the entry-point group opm.embeddings.
This plugin registers as:
opm.embeddings → ovos-chromadb-embeddings-plugin → ChromaEmbeddingsDB
OVOS subsystems call OVOSPluginFactory.get_plugin("opm.embeddings") to obtain a configured store without coupling to a specific backend, so any EmbeddingsDB plugin (ChromaDB here, or e.g. Qdrant) is a drop-in swap.
Where this fits in OVOS
This plugin is the vector store half of the stack — it stores and searches vectors but does not produce them. Pair it with an embedding producer such as ovos-gguf-embeddings-plugin (text → vectors), or the face / voice embedders.
Concrete consumers that can be backed by this store:
| Consumer | Uses the store for |
|---|---|
| ovos-persona-server | RAG: the OpenAI-compatible Files / Vector-Stores / /search endpoints |
| ovos-memory-plugins | long-term semantic memory for a persona |
| face / voice recognition | nearest-neighbour identity lookup over enrolment vectors |
It is local-first: in persistent mode it runs fully offline on a CPU with no server.
Quickstart
import tempfile

import numpy as np
from ovos_chromadb_embeddings import ChromaEmbeddingsDB

with tempfile.TemporaryDirectory() as tmp:
    db = ChromaEmbeddingsDB(config={"path": tmp})

    # Store a few 4-d vectors
    db.add_embeddings("apple", np.array([0.9, 0.1, 0.0, 0.0]))
    db.add_embeddings("banana", np.array([0.0, 0.9, 0.1, 0.0]))
    db.add_embeddings("cherry", np.array([0.0, 0.0, 0.9, 0.1]))

    # Nearest-neighbour query
    query = np.array([0.85, 0.15, 0.0, 0.0])
    results = db.query(query, top_k=2)
    # → [("apple", ~0.002), ("banana", ~0.83)] with the default cosine metric
    print(results[0][0])  # "apple"
query returns (id, distance) tuples ordered nearest-first. The score is a distance,
not a similarity — lower is closer for the default cosine metric (and for l2). Change
the metric with hnsw:space (see Configuration). The query vector must have
the same dimensionality as the stored vectors.
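The distance semantics can be reproduced without ChromaDB. The sketch below is plain Python and assumes the metric definitions given in Chroma's documentation: cosine distance is 1 minus cosine similarity, and l2 is the squared Euclidean distance. The `rank` helper is hypothetical; it only mimics the nearest-first ordering described above and is not the plugin's implementation.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity: 0 for the same direction, 1 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def squared_l2_distance(a, b):
    """Sum of squared coordinate differences (no square root taken)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank(query, vectors, metric=cosine_distance):
    """Return (key, distance) pairs sorted nearest-first."""
    if any(len(v) != len(query) for v in vectors.values()):
        raise ValueError("query and stored vectors must share one dimensionality")
    scored = ((key, metric(query, vec)) for key, vec in vectors.items())
    return sorted(scored, key=lambda pair: pair[1])


vectors = {
    "apple": [0.9, 0.1, 0.0, 0.0],
    "banana": [0.0, 0.9, 0.1, 0.0],
    "cherry": [0.0, 0.0, 0.9, 0.1],
}
query = [0.85, 0.15, 0.0, 0.0]

ranking = rank(query, vectors)
print([key for key, _ in ranking])  # ['apple', 'banana', 'cherry']
```

Both metrics give the same ordering for these vectors. A similarity-style threshold such as "score above 0.8" has to be inverted when applied to distances.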
Configuration
Pass a config dict to ChromaEmbeddingsDB(config=...) or set it in your OVOS
configuration under the plugin key.
| Key | Type | Default | Description |
|---|---|---|---|
| path | str | "./chromadb_storage" | Local persistence directory (PersistentClient mode). |
| host | str | (none) | Remote ChromaDB server host. When set, uses HttpClient instead of PersistentClient. |
| port | int | 8000 | Port for the remote ChromaDB server (HttpClient mode only). |
| default_collection_name | str | "embeddings" | Name of the collection created/used on init. |
| hnsw:space | str | "cosine" | Distance metric for HNSW index. Accepted: "cosine", "l2", "ip". Set via collection metadata. |
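The way these keys interact can be summarised as a small pure function. This is an illustrative sketch of the rules in the table, not the plugin's source: `resolve_config`, `DEFAULTS`, and the shape of the returned dict are all hypothetical.

```python
from typing import Any, Dict

# Defaults copied from the configuration table above.
DEFAULTS: Dict[str, Any] = {
    "path": "./chromadb_storage",
    "port": 8000,
    "default_collection_name": "embeddings",
    "hnsw:space": "cosine",
}
VALID_SPACES = {"cosine", "l2", "ip"}


def resolve_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Apply defaults and decide between remote (HTTP) and local (persistent) mode."""
    merged = {**DEFAULTS, **config}
    if merged["hnsw:space"] not in VALID_SPACES:
        raise ValueError(f"unsupported hnsw:space: {merged['hnsw:space']!r}")
    if merged.get("host"):
        # A host switches the store to remote mode; path is then unused.
        client = {"mode": "http", "host": merged["host"], "port": int(merged["port"])}
    else:
        client = {"mode": "persistent", "path": merged["path"]}
    return {
        "client": client,
        "collection": merged["default_collection_name"],
        # The metric travels as collection metadata.
        "collection_metadata": {"hnsw:space": merged["hnsw:space"]},
    }


print(resolve_config({})["client"])
print(resolve_config({"host": "192.168.1.10"})["client"])
```

Because the metric is stored as collection metadata, it is safest to choose it before the collection is first created.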
Local (persistent) mode
db = ChromaEmbeddingsDB(config={"path": "/var/lib/ovos/chromadb"})
Remote server mode
db = ChromaEmbeddingsDB(config={"host": "192.168.1.10", "port": 8000})
API overview
| Method | Description |
|---|---|
| add_embeddings(key, embedding, metadata, collection_name) | Upsert a single vector. |
| add_embeddings_batch(keys, embeddings, metadata, collection_name) | Upsert a list of vectors. |
| get_embeddings(key, collection_name, return_metadata) | Retrieve a vector by key. |
| get_embeddings_batch(keys, collection_name, return_metadata) | Retrieve multiple vectors. |
| delete_embeddings(key, collection_name) | Delete a vector by key. |
| delete_embeddings_batch(keys, collection_name) | Delete multiple vectors. |
| query(embedding, top_k, return_metadata, collection_name) | ANN search; returns [(id, distance)]. |
| create_collection(name, metadata) | Create (or get) a named collection. |
| get_collection(name) | Retrieve a collection handle (raises ValueError if absent). |
| delete_collection(name) | Drop a collection. |
| list_collections() | List all collections. |
| count_embeddings_in_collection(collection_name) | Count stored vectors. |
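As an illustration of how these methods combine, the sketch below outlines an enrol-then-identify flow of the kind used for face or voice lookup. It accepts any object exposing the methods listed above. It assumes that the metadata arguments are optional and that collection_name can be passed by keyword, which the table does not confirm. The helper names and the `max_distance` threshold are hypothetical.

```python
from typing import Any, Dict, Optional, Sequence


def enrol(db: Any, vectors: Dict[str, Sequence[float]],
          collection: str = "faces") -> None:
    """Upsert one enrolment vector per identity into a dedicated collection."""
    db.create_collection(collection)  # documented as create-or-get
    db.add_embeddings_batch(list(vectors), list(vectors.values()),
                            collection_name=collection)


def identify(db: Any, probe: Sequence[float], collection: str = "faces",
             max_distance: float = 0.3) -> Optional[str]:
    """Return the nearest enrolled key, or None if nothing is close enough."""
    results = db.query(probe, top_k=1, collection_name=collection)
    if not results:
        return None
    key, distance = results[0][0], results[0][1]
    # Scores are distances, so smaller means a better match.
    return key if distance <= max_distance else None
```

With the real plugin, `db` would be a ChromaEmbeddingsDB instance and the vectors would typically be numpy arrays, as in the Quickstart. A sensible `max_distance` depends on the embedder and the metric, so it needs tuning on real data.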
Documentation
- docs/configuration.md: full config reference (local vs remote, distance metrics)
- docs/usage.md: collections, CRUD, batch ops, metadata, numpy in/out
Examples
- examples/quickstart.py: add vectors + query
- examples/collections.py: multi-collection workflow
- examples/remote_server.py: HttpClient usage
Testing
pip install -e ".[test]"
pytest test/ -v
The test suite uses a temporary PersistentClient with no network access.
test/test_e2e.py runs a real end-to-end flow (add → query → verify nearest neighbour)
using a small deterministic local embedder so it passes in CI without model downloads.
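For readers who want a similar model-free embedder in their own tests, one possible approach is feature hashing. The sketch below is an assumption about what such an embedder could look like; it is not the one shipped in test/test_e2e.py, and `hash_embed` is a hypothetical name.

```python
import hashlib
import math
import re
from typing import List


def hash_embed(text: str, dim: int = 64) -> List[float]:
    """Deterministic bag-of-words embedding via feature hashing, L2-normalised."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim  # bucket for this token
        sign = 1.0 if digest[4] % 2 == 0 else -1.0       # signed hashing
        vec[index] += sign
    norm = math.sqrt(sum(x * x for x in vec))
    # Empty input (or fully cancelled buckets) yields the zero vector unchanged.
    return [x / norm for x in vec] if norm else vec


# Same text always maps to the same vector, on any machine, with no downloads.
assert hash_embed("turn on the light") == hash_embed("turn on the light")
```

Texts that share words land in the same buckets and therefore tend to score as closer, which is enough to exercise an add, query, and verify flow. It carries no semantic meaning, so it is only suitable for tests.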
Credits
Originally developed by TigreGótico for OpenVoiceOS, sponsored by VisioLab. Modernized under the NGI0 Commons Fund / NLnet.
This work was sponsored by VisioLab. VisioLab, part of Royal Dutch Visio, is the test, education, and research center in the field of (innovative) assistive technology for blind and visually impaired people and professionals. We explore (new) technological developments such as Voice, VR and AI and make the knowledge and expertise we gain available to everyone.
This project was funded through the NGI0 Commons Fund, a fund established by NLnet with financial support from the European Commission's Next Generation Internet programme, under the aegis of DG Communications Networks, Content and Technology under grant agreement No 101135429.