philiprehberger-embedding-store
In-memory vector store with multi-metric similarity search.
Installation
pip install philiprehberger-embedding-store
Usage
from philiprehberger_embedding_store import VectorStore
store = VectorStore(dimensions=1536)
# Add vectors with metadata
store.add("doc1", embedding=[0.1, 0.2, ...], metadata={"title": "First doc"})
store.add("doc2", embedding=[0.3, 0.1, ...], metadata={"title": "Second doc"})
# Search by similarity
results = store.search(query_embedding=[0.15, 0.18, ...], top_k=5)
for result in results:
    print(f"{result.id}: score={result.score:.3f}, {result.metadata}")
Distance metrics
Choose a metric per store or override per search call:
from philiprehberger_embedding_store import VectorStore
# Set default metric at store level
store = VectorStore(dimensions=128, metric="euclidean")
query = [0.1] * 128  # placeholder query vector; must match the store's 128 dimensions
results = store.search(query, top_k=5)
# Override metric for a single search
results = store.search(query, top_k=5, metric="manhattan")
Supported metrics: "cosine" (default), "dot", "euclidean", "manhattan".
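For reference, the four metric names correspond to standard vector measures. The following is a minimal plain-Python sketch of their textbook definitions, not the library's own code. The function names are illustrative, and how the library turns the distance metrics ("euclidean", "manhattan") into a ranking score is not stated here, so treat that part as unknown.

```python
import math


def dot(a, b):
    # Sum of element-wise products; larger means more similar.
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    # Dot product divided by the product of the vector lengths.
    # Returns 0.0 for a zero-length vector (a choice made for this sketch).
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


def euclidean(a, b):
    # Straight-line distance; smaller means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Sum of absolute coordinate differences; smaller means more similar.
    return sum(abs(x - y) for x, y in zip(a, b))


print(cosine([1.0, 0.0], [0.0, 1.0]))     # orthogonal vectors
print(euclidean([0.0, 0.0], [3.0, 4.0]))  # 3-4-5 triangle
```

Cosine ignores vector length, while dot product, Euclidean, and Manhattan are all sensitive to it, which is worth keeping in mind when embeddings are not normalised.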
Metadata filtering
from philiprehberger_embedding_store import VectorStore
store = VectorStore()
store.add("d1", [1.0, 0.0], {"category": "docs", "lang": "en"})
store.add("d2", [0.9, 0.1], {"category": "code", "lang": "en"})
query = [1.0, 0.0]  # example query vector
# Filter by single field
results = store.search(query, filter=lambda m: m["category"] == "docs")
# Filter by multiple conditions
results = store.search(
    query,
    filter=lambda m: m["category"] == "docs" and m["lang"] == "en",
)
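The lambdas above raise a KeyError if an entry lacks the key being tested. One way to avoid that is a small predicate factory, sketched below. `field_equals` is a hypothetical helper, not part of the library, and it assumes the filter callable receives the entry's metadata dict (or possibly None when no metadata was given).

```python
def field_equals(**expected):
    """Build a predicate that is True only when every given field matches."""

    def predicate(metadata):
        # Treat missing metadata as an empty dict (an assumption of this sketch).
        metadata = metadata or {}
        return all(
            key in metadata and metadata[key] == value
            for key, value in expected.items()
        )

    return predicate


is_english_docs = field_equals(category="docs", lang="en")
print(is_english_docs({"category": "docs", "lang": "en"}))
print(is_english_docs({"category": "code"}))  # missing "lang" is a non-match, not an error
```

Under those assumptions, the predicate could be passed as `filter=is_english_docs` in a search call.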
Batch operations
from philiprehberger_embedding_store import VectorStore
store = VectorStore()
# Add many vectors at once
store.add_many([
    ("id1", [0.1, 0.2], {"label": "first"}),
    ("id2", [0.3, 0.4], {"label": "second"}),
])
# Search with multiple queries at once
all_results = store.search_many(
    [query_embedding_1, query_embedding_2],
    top_k=5,
)
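When IDs, vectors, and metadata live in separate lists, they need to be combined into the (id, vector, metadata) tuples shown above. A minimal sketch follows; `build_items` is a hypothetical helper, not part of the library, and the length check is a precaution added here because plain `zip` silently truncates to the shortest input.

```python
def build_items(ids, vectors, metadatas):
    """Combine parallel lists into (id, vector, metadata) tuples."""
    if not (len(ids) == len(vectors) == len(metadatas)):
        raise ValueError("ids, vectors, and metadatas must have the same length")
    return list(zip(ids, vectors, metadatas))


items = build_items(
    ["id1", "id2"],
    [[0.1, 0.2], [0.3, 0.4]],
    [{"label": "first"}, {"label": "second"}],
)
print(items)
```

The resulting list has the same shape as the argument passed to `add_many` in the example above.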
Score a single entry
Use score() to compute the similarity between a stored entry and an arbitrary query vector without running a full top-k search. This is handy for re-ranking or one-off comparisons.
from philiprehberger_embedding_store import VectorStore
store = VectorStore(metric="cosine")
store.add("doc1", [1.0, 0.0, 0.0])
store.score("doc1", [1.0, 0.0, 0.0]) # 1.0
store.score("doc1", [0.0, 1.0, 0.0]) # ~0.0
store.score("doc1", [1.0, 1.0, 1.0], metric="dot") # 1.0
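The re-ranking use case can be sketched as a small function that scores a shortlist of IDs and sorts them. `rerank` and `fake_score` are hypothetical names introduced for illustration; `fake_score` stands in for `store.score` so the snippet runs without the library. The `higher_is_better` flag is an assumption, since whether distance metrics return larger-is-better values is not stated here.

```python
def rerank(candidate_ids, query, score_fn, higher_is_better=True):
    """Score each candidate against the query and return (id, score) pairs, best first."""
    scored = [(cid, score_fn(cid, query)) for cid in candidate_ids]
    scored.sort(key=lambda pair: pair[1], reverse=higher_is_better)
    return scored


# Stand-in for store.score: a dot product over a tiny in-memory table.
_vectors = {"doc1": [1.0, 0.0], "doc2": [0.0, 1.0], "doc3": [0.6, 0.8]}


def fake_score(entry_id, query):
    return sum(x * y for x, y in zip(_vectors[entry_id], query))


ranked = rerank(["doc2", "doc3", "doc1"], [1.0, 0.0], fake_score)
print([cid for cid, _ in ranked])
```

With a real store, the call would presumably look like `rerank(ids, query, store.score)`, given the `score(id, query)` signature shown above.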
Persistence
from philiprehberger_embedding_store import VectorStore
store = VectorStore()
store.add("doc1", [0.1, 0.2], {"title": "Example"})
# Save to disk
store.save("vectors.json")
# Load from disk
loaded = VectorStore.load("vectors.json")
Store management
from philiprehberger_embedding_store import VectorStore
store = VectorStore()
store.add("a", [1.0, 0.0])
store.remove("a") # Remove by ID
store.clear() # Remove all entries
Updating and clearing
from philiprehberger_embedding_store import VectorStore
store = VectorStore(dimensions=3)
store.add("a", [1.0, 0.0, 0.0], {"version": 1})
# Replace the vector in place
store.update("a", vector=[0.0, 1.0, 0.0])
# Replace the metadata (wholesale)
store.update("a", metadata={"version": 2})
# Update both at once
store.update("a", vector=[0.0, 0.0, 1.0], metadata={"version": 3})
# Remove everything but keep the dimensionality (3) and metric configuration
store.clear()
assert len(store) == 0
store.add("b", [0.1, 0.2, 0.3]) # still constrained to 3 dimensions
API
| Function / Class | Description |
|---|---|
| `VectorStore(dimensions?, metric?)` | Create a store with optional dimensionality and metric |
| `add(id, embedding, metadata?)` | Add a vector with optional metadata |
| `add_many(items)` | Batch add multiple vectors |
| `search(query, top_k?, metric?, filter?, min_score?)` | Similarity search |
| `search_many(queries, top_k?, metric?, filter?, min_score?)` | Batch similarity search |
| `score(id, query, metric?)` | Compute similarity between a stored entry and a query vector |
| `get(id)` | Get entry by ID |
| `delete(id)` | Delete entry by ID |
| `remove(id)` | Remove entry by ID (alias for `delete`) |
| `update_metadata(id, metadata)` | Update metadata for an entry |
| `update(id, vector=None, metadata=None)` | Replace an entry's vector and/or metadata in place |
| `save(path)` | Save store to JSON file |
| `VectorStore.load(path)` | Load store from JSON file |
| `clear()` | Remove all entries (preserves dimensionality and metric) |
| `ids()` | List all stored IDs |
| `len(store)` | Number of entries |
| `id in store` | Check if ID exists |
| `store.size` | Number of entries (property) |
| `store.metric` | Current distance metric (property) |
Development
pip install -e .
python -m pytest tests/ -v
License