NumPy Vector Store
A fast, lightweight, zero-setup in-memory vector store powered by NumPy.
- Tiny local vector search for projects that do not need a vector database
- Fast exact cosine search using vectorized NumPy operations
- Simple typed API returning VectorHit(index, value, metadata)
- Composable filtering by passing prefiltered row indexes with within_rows
- Portable persistence as trusted local .npz files with vectors + metadata
- No framework opinions: bring your own embeddings, chunking, async, and metadata model
Why?
This library is purpose-built for small to medium-scale vector search tasks and offers a simple alternative to heavyweight vector databases when you do not need network services, indexing infrastructure, ingestion pipelines, or domain-specific metadata filtering.
When/Where?
Below are benchmark results for cosine similarity search to help you assess its suitability for your use case.
| Embedding Type | Dimensions | ~5ms | ~25ms | ~100ms | ~500ms |
|---|---|---|---|---|---|
| Sentence Transformers | 384 | 1K vectors, 1.5MB | 10K vectors, 15MB | 100K vectors, 147MB | 500K vectors, 732MB |
| OpenAI Small | 1536 | 500 vectors, 3MB | 5K vectors, 29MB | 25K vectors, 147MB | 100K vectors, 586MB |
| OpenAI Large | 3072 | 200 vectors, 2MB | 2.5K vectors, 29MB | 5K vectors, 59MB | 25K vectors, 293MB |
Benchmarks performed on Apple M2 hardware.
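The memory figures in the table are close to what a dense float32 matrix of the same shape would occupy. The source does not state the storage dtype, so treat the 4 bytes per value below as an assumption inferred from the numbers. The helper name raw_vector_mib is illustrative and not part of the library. A rough sizing sketch for your own corpus:

```python
def raw_vector_mib(num_vectors: int, dimensions: int, bytes_per_value: int = 4) -> float:
    """Size of a dense matrix in MiB; 4 bytes per value assumes float32 storage."""
    return num_vectors * dimensions * bytes_per_value / (1024 * 1024)


# Rows taken from the ~100ms column of the benchmark table.
for label, n, dims in [
    ("Sentence Transformers", 100_000, 384),
    ("OpenAI Small", 25_000, 1536),
    ("OpenAI Large", 5_000, 3072),
]:
    print(f"{label}: {raw_vector_mib(n, dims):.1f} MiB")
```

This counts only the vector matrix. Metadata payloads and temporary arrays created during a search add to the total.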
Installation
uv add numpy-vector-store
Quick Start
import numpy as np
from numpy_vector_store import VectorStore
store = VectorStore[dict[str, str]](dimensions=3)
store.add(
    vectors=np.array([
        [1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ]),
    metadata=[
        {"title": "x-axis"},
        {"title": "y-axis"},
        {"title": "z-axis"},
    ],
)
hits = store.cosine_search(
    query=np.array([0.9, 0.1, 0.0]),
    top_k=2,
)
for hit in hits:
    print(f"{hit.metadata['title']}: {hit.value:.3f}")
metadata is an opaque row payload returned with hits. It can be a dict,
dataclass, string, integer row ID, or any other Python object that fits your
application.
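To sanity-check the Quick Start scores by hand, the same cosine similarities can be computed with plain NumPy. This sketch does not use the library. It assumes hit.value is the cosine similarity between the query and the stored vector, which the method name suggests but the text does not spell out.

```python
import numpy as np

vectors = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
titles = ["x-axis", "y-axis", "z-axis"]
query = np.array([0.9, 0.1, 0.0])

# Cosine similarity = dot product divided by the product of the norms.
scores = (vectors @ query) / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(query))

# Highest score first, keep the best two.
top2 = np.argsort(-scores)[:2]
for i in top2:
    print(f"{titles[i]}: {scores[i]:.3f}")
# x-axis: 0.994
# y-axis: 0.110
```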
Prefiltering
The store does not implement a metadata query language. To filter by metadata,
produce row indexes first, then pass them with within_rows.
# query is assumed to be a 1-D NumPy array matching the store's dimensions.
rows = [
    i
    for i, metadata in enumerate(store.metadata)
    if metadata["title"].startswith("x")
]
hits = store.cosine_search(query, top_k=10, within_rows=rows)
For structured NumPy metadata, use NumPy to produce the row indexes:
# vectors (one row per metadata_table entry) and query are assumed to exist already.
metadata_table = np.array(
    [
        ("intro", "A", 2024),
        ("setup", "A", 2023),
        ("guide", "B", 2024),
    ],
    dtype=[("title", "U20"), ("product", "U10"), ("year", "i4")],
)
store = VectorStore[int](dimensions=3)
# Store only the row number as metadata; the table stays outside the store.
store.add(vectors, metadata=np.arange(len(metadata_table)))
mask = (metadata_table["product"] == "A") & (metadata_table["year"] >= 2024)
rows = np.flatnonzero(mask)
hits = store.cosine_search(query, within_rows=rows)
for hit in hits:
    row = metadata_table[hit.metadata]
    print(row["title"], hit.value)
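Conceptually, restricting a search to prefiltered rows means scoring only those rows and then mapping the results back to their original row numbers. The sketch below shows that idea in plain NumPy. It is not the library's implementation, and the function name cosine_top_k_within is hypothetical.

```python
import numpy as np


def cosine_top_k_within(unit_vectors, query, rows, top_k=10):
    """Exact cosine search over a subset of rows; returns (original_row, score) pairs."""
    rows = np.asarray(rows, dtype=np.intp)
    if rows.size == 0:
        return []
    q = query / np.linalg.norm(query)
    scores = unit_vectors[rows] @ q        # only the candidate rows are scored
    order = np.argsort(-scores)[:top_k]    # positions within the subset, best first
    return [(int(rows[i]), float(scores[i])) for i in order]


unit_vectors = np.eye(3)  # three unit-length vectors, one per axis
query = np.array([0.9, 0.1, 0.0])

# Row 0 would score highest, but it is excluded by the prefilter.
hits = cosine_top_k_within(unit_vectors, query, rows=[1, 2], top_k=1)
print(hits)
```

Because only the candidate rows are multiplied, a selective prefilter also reduces the work per query.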
Persistence
Pass a file_path and call save() / load() explicitly:
store = VectorStore[dict[str, str]](dimensions=1536, file_path="vectors.npz")
store.add(embeddings, metadata)
store.save()
loaded = VectorStore[dict[str, str]](dimensions=1536, file_path="vectors.npz")
loaded.load()
Context manager usage auto-saves on exit:
with VectorStore[dict[str, str]](dimensions=1536, file_path="vectors.npz") as store:
    store.add(embeddings, metadata)
Persistence uses a minimal NumPy .npz contract with vectors and metadata
arrays. Vectors are normalized when added or loaded, and similarity search only
normalizes the query vector. Loading validates shape, dimensions, row counts, and
zero-norm vectors. It also uses allow_pickle=True for flexible Python metadata
payloads, so only load files generated by your own application or another trusted
local process. Loading untrusted .npz files is not a supported security model.
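The contract described above can be illustrated with NumPy alone. This is a minimal sketch of a comparable save/load round trip, not the library's code. The helper names (normalize_rows, save_npz, load_npz) are hypothetical, and the array keys vectors and metadata follow the description in this section.

```python
import os
import tempfile

import numpy as np


def normalize_rows(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit length; reject rows that cannot be normalized."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    if np.any(norms == 0):
        raise ValueError("zero-norm vector found")
    return vectors / norms


def save_npz(path: str, vectors: np.ndarray, metadata: list) -> None:
    # An object array lets arbitrary Python payloads ride along (pickled by NumPy).
    meta = np.empty(len(metadata), dtype=object)
    for i, item in enumerate(metadata):
        meta[i] = item
    np.savez(path, vectors=vectors, metadata=meta)


def load_npz(path: str, dimensions: int):
    # allow_pickle=True unpickles Python objects: only load files you created.
    with np.load(path, allow_pickle=True) as data:
        vectors = data["vectors"]
        metadata = data["metadata"]
    if vectors.ndim != 2 or vectors.shape[1] != dimensions:
        raise ValueError(f"expected shape (n, {dimensions}), got {vectors.shape}")
    if len(metadata) != len(vectors):
        raise ValueError("vectors and metadata row counts differ")
    return normalize_rows(vectors), list(metadata)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vectors.npz")
    save_npz(path, np.array([[3.0, 4.0], [0.0, 2.0]]), [{"title": "a"}, {"title": "b"}])
    vectors, metadata = load_npz(path, dimensions=2)

print(vectors)
print(metadata)
```

Normalizing once at load time is what lets a later search normalize only the query: the dot product of two unit vectors is already their cosine similarity.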
Migrating from 0.1
The preferred 0.2 API is add(...) and cosine_search(...).
The 0.1 methods remain temporarily available, but emit DeprecationWarning and
will be removed in a future 0.x release:
store.add_vectors(vectors_2d, metadata_array)
results = store.search(query, top_k=3, score_cutoff=0.5)
for index, value, metadata in results:
    ...
Legacy search(...) keeps returning tuples. New cosine_search(...) returns
VectorHit objects.
metadata_schema was removed. For vectorized metadata filtering, keep metadata
in a sidecar table and pass matching row indexes with within_rows.
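While migrating, it can help to surface every remaining call to a deprecated method. The sketch below uses only Python's standard warnings module. legacy_search is a hypothetical stand-in that emits a DeprecationWarning the way the 0.1 methods are described to, and its message text is invented for illustration.

```python
import warnings


def legacy_search():
    """Hypothetical stand-in for a deprecated 0.1 method; not the library's code."""
    warnings.warn(
        "search() is deprecated; use cosine_search()",
        DeprecationWarning,
        stacklevel=2,
    )
    return []


# Record warnings instead of printing them, so leftover legacy calls can be listed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_search()

deprecated = [str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)]
print(deprecated)

# In a test run, turning the warning into an error makes leftover calls fail loudly.
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    try:
        legacy_search()
        raised = False
    except DeprecationWarning:
        raised = True
print(raised)
```

The same effect is available without code changes by running the test suite with pytest's -W error::DeprecationWarning option.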
Contributing
git clone https://github.com/tvanreenen/numpy-vector-store.git
cd numpy-vector-store
uv sync --frozen --group dev
Before submitting a pull request:
- Run uv run ruff check
- Run uv run ruff format --check
- Run uv run mypy src/
- Run uv run pytest
License
MIT License - see LICENSE file for details.