philiprehberger-embedding-store

In-memory vector store with multi-metric similarity search.

Installation

pip install philiprehberger-embedding-store

Usage

from philiprehberger_embedding_store import VectorStore

store = VectorStore(dimensions=1536)

# Add vectors with metadata
store.add("doc1", embedding=[0.1, 0.2, ...], metadata={"title": "First doc"})
store.add("doc2", embedding=[0.3, 0.1, ...], metadata={"title": "Second doc"})

# Search by similarity
results = store.search(query_embedding=[0.15, 0.18, ...], top_k=5)
for result in results:
    print(f"{result.id}: score={result.score:.3f}, {result.metadata}")
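For intuition, an in-memory similarity search can be sketched as a brute-force scan: score every stored vector against the query and keep the best `top_k`. The snippet below is a minimal illustration under that assumption, not the package's actual implementation. The names `brute_force_search` and `dot`, and the dict-based storage, are hypothetical.

```python
import heapq
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product, used here as a simple 'higher is better' score."""
    return sum(x * y for x, y in zip(a, b))


def brute_force_search(
    vectors: Dict[str, Vector],
    query: Vector,
    top_k: int = 5,
    score: Callable[[Vector, Vector], float] = dot,
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and return the top_k best."""
    scored = ((score(query, vec), vec_id) for vec_id, vec in vectors.items())
    best = heapq.nlargest(top_k, scored)  # sorted from highest to lowest score
    return [(vec_id, s) for s, vec_id in best]


# Hypothetical toy data, two dimensions for readability
vectors = {"doc1": [1.0, 0.0], "doc2": [0.0, 1.0], "doc3": [0.7, 0.7]}
hits = brute_force_search(vectors, [1.0, 0.2], top_k=2)
for vec_id, s in hits:
    print(f"{vec_id}: score={s:.3f}")
```

A linear scan like this costs one score computation per stored vector per query, which is usually acceptable for small collections held in memory.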

Distance metrics

Choose a metric per store or override per search call:

from philiprehberger_embedding_store import VectorStore

# Set default metric at store level
store = VectorStore(dimensions=128, metric="euclidean")
query = [0.1] * 128  # placeholder query vector; use a real embedding in practice
results = store.search(query, top_k=5)

# Override metric for a single search
results = store.search(query, top_k=5, metric="manhattan")

Supported metrics: "cosine" (default), "dot", "euclidean", "manhattan".
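As a reference for what the four metric names conventionally mean, here are their textbook definitions in plain Python. This is a sketch of the standard formulas only. How the package turns distances such as "euclidean" and "manhattan" into a `score` (for example, whether lower or higher is better) is not documented here, so treat that part as unknown.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product: larger means more similar (sensitive to vector length)."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity in [-1, 1]; returns 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: treat zero vectors as having no similarity
    return dot(a, b) / (norm_a * norm_b)


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean (L2) distance: smaller means closer."""
    return math.dist(a, b)


def manhattan(a: Vector, b: Vector) -> float:
    """Manhattan (L1) distance: smaller means closer."""
    return sum(abs(x - y) for x, y in zip(a, b))


print(dot([1.0, 2.0], [3.0, 4.0]))        # 11.0
print(euclidean([0.0, 0.0], [3.0, 4.0]))  # 5.0
print(manhattan([1.0, 2.0], [4.0, 6.0]))  # 7.0
```

Cosine ignores vector magnitude, so it is a common default for text embeddings. Dot product gives the same ranking as cosine when all vectors are normalized to unit length.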

Metadata filtering

from philiprehberger_embedding_store import VectorStore

store = VectorStore()
store.add("d1", [1.0, 0.0], {"category": "docs", "lang": "en"})
store.add("d2", [0.9, 0.1], {"category": "code", "lang": "en"})

query = [1.0, 0.0]

# Filter by single field
results = store.search(query, filter=lambda m: m["category"] == "docs")

# Filter by multiple conditions
results = store.search(
    query,
    filter=lambda m: m["category"] == "docs" and m["lang"] == "en",
)
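Since a filter is just a callable that receives an entry's metadata, predicates can be built and tested independently of the store. Note that `m["category"]` raises `KeyError` when an entry lacks that key. The helper below is a hypothetical sketch (the name `fields_equal` is not part of the package) that treats missing keys as non-matching. It assumes the store passes the metadata as a dict, or possibly `None` for entries added without metadata.

```python
from typing import Any, Callable, Mapping, Optional

Metadata = Optional[Mapping[str, Any]]


def fields_equal(**expected: Any) -> Callable[[Metadata], bool]:
    """Build a predicate that is True only if every expected field is present and equal."""

    def predicate(metadata: Metadata) -> bool:
        m = metadata or {}  # assumption: missing metadata may arrive as None or {}
        return all(key in m and m[key] == value for key, value in expected.items())

    return predicate


docs_en = fields_equal(category="docs", lang="en")

# Try the predicate on plain dicts before handing it to a search call
sample_metadata = [
    {"category": "docs", "lang": "en"},
    {"category": "code", "lang": "en"},
    {"category": "docs"},  # no "lang" key: does not match, and does not raise
]
matches = [m for m in sample_metadata if docs_en(m)]
print(matches)
```

A predicate built this way could then be passed as `filter=docs_en` in place of the lambdas shown above.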

Batch operations

from philiprehberger_embedding_store import VectorStore

store = VectorStore()

# Add many vectors at once
store.add_many([
    ("id1", [0.1, 0.2], {"label": "first"}),
    ("id2", [0.3, 0.4], {"label": "second"}),
])

# Search with multiple queries at once
query_embedding_1 = [0.1, 0.2]
query_embedding_2 = [0.3, 0.4]
all_results = store.search_many(
    [query_embedding_1, query_embedding_2],
    top_k=5,
)

Persistence

from philiprehberger_embedding_store import VectorStore

store = VectorStore()
store.add("doc1", [0.1, 0.2], {"title": "Example"})

# Save to disk
store.save("vectors.json")

# Load from disk
loaded = VectorStore.load("vectors.json")

Store management

from philiprehberger_embedding_store import VectorStore

store = VectorStore()
store.add("a", [1.0, 0.0])

store.remove("a")      # Remove by ID
store.clear()           # Remove all entries

Updating and clearing

from philiprehberger_embedding_store import VectorStore

store = VectorStore(dimensions=3)
store.add("a", [1.0, 0.0, 0.0], {"version": 1})

# Replace the vector in place
store.update("a", vector=[0.0, 1.0, 0.0])

# Replace the metadata (wholesale)
store.update("a", metadata={"version": 2})

# Update both at once
store.update("a", vector=[0.0, 0.0, 1.0], metadata={"version": 3})

# Remove everything but keep the dimensionality (3) and metric configuration
store.clear()
assert len(store) == 0
store.add("b", [0.1, 0.2, 0.3])  # still constrained to 3 dimensions

API

Function / Class                                            Description
VectorStore(dimensions?, metric?)                           Create a store with optional dimensionality and metric
add(id, embedding, metadata?)                               Add a vector with optional metadata
add_many(items)                                             Batch add multiple vectors
search(query, top_k?, metric?, filter?, min_score?)         Similarity search
search_many(queries, top_k?, metric?, filter?, min_score?)  Batch similarity search
get(id)                                                     Get entry by ID
delete(id)                                                  Delete entry by ID
remove(id)                                                  Remove entry by ID (alias for delete)
update_metadata(id, metadata)                               Update metadata for an entry
update(id, vector=None, metadata=None)                      Replace an entry's vector and/or metadata in place
save(path)                                                  Save store to JSON file
VectorStore.load(path)                                      Load store from JSON file
clear()                                                     Remove all entries (preserves dimensionality and metric)
ids()                                                       List all stored IDs
len(store)                                                  Number of entries
id in store                                                 Check if ID exists
store.size                                                  Number of entries (property)
store.metric                                                Current distance metric (property)

Development

pip install -e .
python -m pytest tests/ -v

License

MIT
