NumPy Vector Store

A fast, lightweight, zero-setup in-memory vector store powered by NumPy.

  • Tiny local vector search for projects that do not need a vector database
  • Fast exact cosine search using vectorized NumPy operations
  • Simple typed API returning VectorHit(index, value, metadata)
  • Composable filtering by passing prefiltered row indexes with within_rows
  • Portable persistence as trusted local .npz files with vectors + metadata
  • No framework opinions: bring your own embeddings, chunking, async, and metadata model
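The "exact cosine search using vectorized NumPy operations" mentioned above can be pictured with a minimal sketch. This is not the library's implementation; the function name cosine_top_k and its return shape are hypothetical and chosen only for illustration. It assumes no stored row has zero norm.

```python
import numpy as np


def cosine_top_k(matrix: np.ndarray, query: np.ndarray, top_k: int) -> list[tuple[int, float]]:
    """Return (row index, cosine similarity) pairs for the top_k most similar rows."""
    k = min(top_k, matrix.shape[0])
    if k <= 0:
        return []

    # Normalize every row and the query to unit length.
    # Assumes no zero-norm rows; a real store would validate that.
    unit_rows = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    unit_query = query / np.linalg.norm(query)

    # One matrix-vector product scores every row exactly (no approximate index).
    scores = unit_rows @ unit_query

    # argpartition selects the k best candidates without a full sort;
    # only those k candidates are then sorted by descending score.
    candidates = np.argpartition(-scores, k - 1)[:k]
    ordered = candidates[np.argsort(-scores[candidates])]
    return [(int(i), float(scores[i])) for i in ordered]


vectors = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
top = cosine_top_k(vectors, np.array([0.9, 0.1, 0.0]), top_k=2)
for index, score in top:
    print(index, round(score, 3))
```

Because every row is scored, the result is exact; the cost grows linearly with the number of stored vectors, which is why this approach suits small to medium collections.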

Why?

This library is purpose-built for small- to medium-scale vector search. It offers a simple alternative to heavyweight vector databases when you do not need network services, indexing infrastructure, ingestion pipelines, or domain-specific metadata filtering.

When/Where?

The benchmark results below show roughly how many vectors, and how much vector memory, a cosine similarity search covers at each approximate query latency, to help you assess its suitability for your use case.

Embedding Type         Dimensions  ~5ms                ~25ms                ~100ms                ~500ms
Sentence Transformers  384         1K vectors (1.5MB)  10K vectors (15MB)   100K vectors (147MB)  500K vectors (732MB)
OpenAI Small           1536        500 vectors (3MB)   5K vectors (29MB)    25K vectors (147MB)   100K vectors (586MB)
OpenAI Large           3072        200 vectors (2MB)   2.5K vectors (29MB)  5K vectors (59MB)     25K vectors (293MB)

Benchmarks were performed on Apple M2 hardware.
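To estimate memory for a collection that is not listed, a rough calculation is count × dimensions × bytes per value. The sketch below assumes 4-byte float32 values and reports MiB. That assumption is an inference from the sizes in the table, which it approximately reproduces; the source does not state the storage dtype. The helper name vector_memory_mib is hypothetical.

```python
def vector_memory_mib(num_vectors: int, dimensions: int, bytes_per_value: int = 4) -> float:
    """Approximate raw size of a dense vector matrix in MiB.

    bytes_per_value=4 assumes float32 storage (an assumption, not a documented fact).
    """
    return num_vectors * dimensions * bytes_per_value / (1024 ** 2)


# A few collection sizes taken from the benchmark table.
for count, dims in [(1_000, 384), (5_000, 1536), (25_000, 3072)]:
    print(f"{count:>6} vectors x {dims:>4} dims: {vector_memory_mib(count, dims):.1f} MiB")
```

This covers only the raw vector matrix; metadata payloads and temporary arrays created during a search add to the total.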

Installation

uv add numpy-vector-store

Quick Start

import numpy as np
from numpy_vector_store import VectorStore

store = VectorStore[dict[str, str]](dimensions=3)

store.add(
    vectors=np.array([
        [1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ]),
    metadata=[
        {"title": "x-axis"},
        {"title": "y-axis"},
        {"title": "z-axis"},
    ],
)

hits = store.cosine_search(
    query=np.array([0.9, 0.1, 0.0]),
    top_k=2,
)

for hit in hits:
    print(f"{hit.metadata['title']}: {hit.value:.3f}")

metadata is an opaque row payload returned with hits. It can be a dict, dataclass, string, integer row ID, or any other Python object that fits your application.

Prefiltering

The store does not implement a metadata query language. To filter by metadata, produce row indexes first, then pass them with within_rows.

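# Continuing from the Quick Start: keep only rows whose title starts with "x".
rows = [
    i
    for i, metadata in enumerate(store.metadata)
    if metadata["title"].startswith("x")
]

# Search is restricted to the prefiltered rows.
query = np.array([0.9, 0.1, 0.0])
hits = store.cosine_search(query, top_k=10, within_rows=rows)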
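Conceptually, restricting a search to prefiltered rows means scoring only that subset and then mapping the subset positions back to the original row indexes. The following is a minimal NumPy sketch of that idea, not the library's code; the function cosine_search_within is hypothetical and assumes the stored rows are already unit length.

```python
import numpy as np


def cosine_search_within(
    unit_rows: np.ndarray, query: np.ndarray, rows, top_k: int
) -> list[tuple[int, float]]:
    """Score only the allowed rows and report hits by their original row index."""
    rows = np.asarray(rows, dtype=np.intp)
    unit_query = query / np.linalg.norm(query)

    # Fancy indexing copies just the allowed rows, so excluded rows are never scored.
    scores = unit_rows[rows] @ unit_query

    # Positions in `order` refer to the subset; rows[...] maps them back.
    order = np.argsort(-scores)[:top_k]
    return [(int(rows[i]), float(scores[i])) for i in order]


unit_rows = np.eye(3)  # three unit vectors: x, y, and z axes
allowed = [1, 2]       # pretend a metadata filter excluded row 0
result = cosine_search_within(unit_rows, np.array([0.9, 0.1, 0.0]), allowed, top_k=10)
print(result)
```

Row 0 is the closest match to the query overall, but it never appears in the result because it was filtered out before scoring.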

For structured NumPy metadata, use NumPy to produce the row indexes:

metadata_table = np.array(
    [
        ("intro", "A", 2024),
        ("setup", "A", 2023),
        ("guide", "B", 2024),
    ],
    dtype=[("title", "U20"), ("product", "U10"), ("year", "i4")],
)

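# Store each row's position in metadata_table as its metadata payload,
# so a hit can be joined back to the sidecar table.
# `vectors` is assumed to be a (3, 3) array, one row per table entry.
store = VectorStore[int](dimensions=3)
store.add(vectors, metadata=np.arange(len(metadata_table)))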

mask = (metadata_table["product"] == "A") & (metadata_table["year"] >= 2024)
rows = np.flatnonzero(mask)

hits = store.cosine_search(query, within_rows=rows)

for hit in hits:
    row = metadata_table[hit.metadata]
    print(row["title"], hit.value)
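To make the masking step concrete, here is the same sidecar table evaluated on its own with plain NumPy. It shows the intermediate boolean arrays and the row indexes that np.flatnonzero produces. The parentheses around each comparison matter because & binds more tightly than == and >= in Python.

```python
import numpy as np

metadata_table = np.array(
    [
        ("intro", "A", 2024),
        ("setup", "A", 2023),
        ("guide", "B", 2024),
    ],
    dtype=[("title", "U20"), ("product", "U10"), ("year", "i4")],
)

# Each comparison yields one boolean per table row.
is_product_a = metadata_table["product"] == "A"  # [True, True, False]
is_recent = metadata_table["year"] >= 2024       # [True, False, True]

# Combine with & (and) or | (or), then convert the mask to row indexes.
rows_and = np.flatnonzero(is_product_a & is_recent)
rows_or = np.flatnonzero(is_product_a | is_recent)

print(rows_and)  # [0]
print(rows_or)   # [0 1 2]
```

Only the "intro" row satisfies both conditions, so a search restricted to rows_and can return at most one hit.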

Persistence

Pass a file_path and call save() / load() explicitly:

store = VectorStore[dict[str, str]](dimensions=1536, file_path="vectors.npz")
store.add(embeddings, metadata)
store.save()

loaded = VectorStore[dict[str, str]](dimensions=1536, file_path="vectors.npz")
loaded.load()

Context manager usage auto-saves on exit:

with VectorStore[dict[str, str]](dimensions=1536, file_path="vectors.npz") as store:
    store.add(embeddings, metadata)

Persistence uses a minimal NumPy .npz contract with two arrays, vectors and metadata. Vectors are normalized when added or loaded, so similarity search only normalizes the query vector. Loading validates shape, dimensions, and row counts, and checks for zero-norm vectors.

Loading also uses allow_pickle=True to support flexible Python metadata payloads, and unpickling can execute arbitrary code. Only load files generated by your own application or another trusted local process. Loading untrusted .npz files is not a supported security model.
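The general shape of such a contract can be sketched with np.savez and np.load. This is an illustration under stated assumptions, not the library's code: the array keys "vectors" and "metadata", the float32 cast, the helper names save_npz and load_npz, and the exact validation rules are all guesses made for the example.

```python
import os
import tempfile

import numpy as np


def save_npz(path: str, vectors: np.ndarray, metadata: list) -> None:
    """Write vectors plus arbitrary Python metadata into one .npz file."""
    # An object array holds arbitrary payloads; NumPy pickles it on save.
    payload = np.empty(len(metadata), dtype=object)
    for i, item in enumerate(metadata):
        payload[i] = item
    np.savez(path, vectors=vectors, metadata=payload)


def load_npz(path: str, dimensions: int) -> tuple[np.ndarray, list]:
    """Load, validate, and normalize. Only use on trusted files (pickle is enabled)."""
    with np.load(path, allow_pickle=True) as data:
        vectors = np.asarray(data["vectors"], dtype=np.float32)
        metadata = list(data["metadata"])

    if vectors.ndim != 2 or vectors.shape[1] != dimensions:
        raise ValueError(f"expected shape (n, {dimensions}), got {vectors.shape}")
    if len(metadata) != vectors.shape[0]:
        raise ValueError("vectors and metadata have different row counts")

    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    if np.any(norms == 0):
        raise ValueError("zero-norm vectors cannot be normalized")
    return vectors / norms, metadata


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vectors.npz")
    save_npz(path, np.array([[3.0, 4.0], [0.0, 2.0]]), [{"title": "a"}, {"title": "b"}])
    loaded_vectors, loaded_metadata = load_npz(path, dimensions=2)

print(loaded_vectors)
print(loaded_metadata)
```

Normalizing once at load time is what lets each later search skip per-row normalization and reduce to a single matrix-vector product.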

Migrating from 0.1

The preferred 0.2 API is add(...) and cosine_search(...).

The 0.1 methods remain temporarily available, but emit DeprecationWarning and will be removed in a future 0.x release:

store.add_vectors(vectors_2d, metadata_array)
results = store.search(query, top_k=3, score_cutoff=0.5)
for index, value, metadata in results:
    ...

Legacy search(...) keeps returning tuples. New cosine_search(...) returns VectorHit objects.
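During a migration it can help to find every remaining call to a deprecated method. The sketch below uses only the standard warnings module. The function legacy_search is a hypothetical stand-in that mimics a method emitting DeprecationWarning and returning tuples; it is not the library's code.

```python
import warnings


def legacy_search() -> list[tuple[int, float, dict]]:
    """Hypothetical stand-in for a deprecated method that returns tuples."""
    warnings.warn(
        "search() is deprecated; use cosine_search()",
        DeprecationWarning,
        stacklevel=2,
    )
    return [(0, 0.99, {"title": "x-axis"})]


# Option 1: record warnings to list the legacy calls that are still made.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    results = legacy_search()

deprecated_calls = [w for w in caught if issubclass(w.category, DeprecationWarning)]
print(len(deprecated_calls), "deprecated call(s) recorded")

# Option 2: promote the warning to an error so leftover calls fail loudly.
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    try:
        legacy_search()
        raised = False
    except DeprecationWarning:
        raised = True
print("raised as error:", raised)
```

The same effect is available for a whole test run with pytest -W error::DeprecationWarning, which turns any remaining legacy call into a test failure.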

metadata_schema was removed. For vectorized metadata filtering, keep metadata in a sidecar table and pass matching row indexes with within_rows.

Contributing

git clone https://github.com/tvanreenen/numpy-vector-store.git
cd numpy-vector-store
uv sync --frozen --group dev

Before submitting a pull request:

  1. Run uv run ruff check
  2. Run uv run ruff format --check
  3. Run uv run mypy src/
  4. Run uv run pytest

License

MIT License. See the LICENSE file for details.
