High-performance vector database with a C++ core and Python bindings

Project description

Vector-Vault-DB

A high-performance vector database written from scratch. The engine is C++17; the public interface is a Python extension built with pybind11. There is no dependency on an existing vector store — the index, distance kernels, allocator, and on-disk format are all implemented directly.

Features

  • Approximate nearest-neighbor search over float32 vectors using either an HNSW graph or an IVF (inverted-file) index.
  • SIMD-accelerated distance kernels (AVX-512) for Euclidean, cosine, and dot product, with a scalar fallback selected once at startup via CPU detection.
  • Custom arena allocator that hands out 64-byte aligned blocks for vector storage and tracks per-collection usage.
  • Memory-mapped persistence: a versioned, CRC-checked binary snapshot format. Saves are atomic (temp file + fsync + rename); loads map the vector region instead of copying it onto the heap.
  • Python bindings with NumPy support and an exception hierarchy that mirrors the engine's error categories.
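
The atomic-save pattern named in the persistence bullet above (temp file + fsync + rename) can be sketched with the Python standard library. This illustrates the general pattern only, not the engine's C++ implementation; the function name atomic_write and the temp-file naming are hypothetical.

```python
import os
import tempfile


def atomic_write(path: str, payload: bytes) -> None:
    """Write payload so readers see either the old file or the complete new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the target directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".vv")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to stable storage before renaming
        os.replace(tmp_path, path)  # swaps the destination in one step (atomic on POSIX)
    except BaseException:
        # Best-effort cleanup if anything failed before the rename happened.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Because the rename is the last step, a crash mid-write leaves the previous snapshot untouched and at worst a stray temp file behind.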

Documentation

Full documentation lives in docs/.

Architecture

Python (vectorvault)            pybind11 extension, NumPy interop, exceptions
        │
        ▼
Engine                          collection registry, shared allocator + persistence
        │
        ▼
Collection                      record store, validation, query orchestration
   ├── Index                HNSW / IVF (graph and inverted-file ANN)
   ├── DistanceCalculator   AVX-512 kernels with scalar fallback
   ├── MemoryAllocator      64-byte aligned arena, per-collection accounting
   └── PersistenceManager   atomic save, mmap-backed load, CRC validation

A collection is guarded by a single readers-writer lock: reads (get, query) take a shared lock, mutations (insert, delete, build, save) take an exclusive lock, so index membership stays consistent with the record store.
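
The locking rule above can be sketched in Python as follows. The standard library has no readers-writer lock, so this builds a minimal one on threading.Condition; the ReadWriteLock class and the Collection stand-in are hypothetical and are not the engine's C++ types.

```python
import threading
from contextlib import contextmanager


class ReadWriteLock:
    """Minimal readers-writer lock: many concurrent readers or one writer."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._readers = 0      # threads currently holding the shared lock
        self._writer = False   # True while a thread holds the exclusive lock

    @contextmanager
    def shared(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                if self._readers == 0:
                    self._cond.notify_all()

    @contextmanager
    def exclusive(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True
        try:
            yield
        finally:
            with self._cond:
                self._writer = False
                self._cond.notify_all()


class Collection:
    """Hypothetical stand-in: a dict of records guarded by one lock."""

    def __init__(self) -> None:
        self._lock = ReadWriteLock()
        self._records = {}

    def insert(self, record_id, vector):
        with self._lock.exclusive():   # mutation takes the exclusive lock
            self._records[record_id] = list(vector)

    def get(self, record_id):
        with self._lock.shared():      # read takes the shared lock
            return self._records[record_id]
```

This sketch makes no fairness guarantee: a steady stream of readers can starve a waiting writer, which a production lock would typically address.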

Requirements

  • A C++17 compiler (GCC, Clang, or MSVC)
  • CMake >= 3.20
  • Python >= 3.9 with NumPy (for the bindings)
  • pybind11 (resolved by the build backend)

Catch2, RapidCheck, and Hypothesis are fetched automatically for the test builds.

Install

pip install -e ".[test]"

This builds the native extension through scikit-build-core and installs the vectorvault package in editable mode.

Quickstart

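import numpy as np
import vectorvault as vv

# Placeholder inputs: any float32 vectors of length 128 (the collection's dim) work here.
rng = np.random.default_rng(0)
embedding_a, embedding_b, query_vector = rng.random((3, 128), dtype=np.float32)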

engine = vv.Engine()
coll = engine.create_collection("documents", dim=128, metric="cosine")

coll.insert("doc-1", embedding_a, metadata={"title": "intro"})
coll.insert("doc-2", embedding_b)

coll.build_index("hnsw", m=16, ef_construction=200)

results = coll.query(query_vector, k=10, ef_search=64)
for record_id, distance in results:
    print(record_id, distance)

engine.save("documents", "documents.vv")
restored = vv.Engine().load("documents.vv")

Metrics: "euclidean" ("l2"), "cosine", "dot" ("dot_product"). Index types: "hnsw", "ivf".
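
For reference, the three metrics can be written as plain scalar Python. This is a sketch under assumptions, not the engine's kernels: cosine is taken here as 1 minus cosine similarity, and "dot" returns the raw inner product, whereas the engine may use a different sign or normalization convention. The METRICS table is hypothetical and only mirrors the accepted names and aliases.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def euclidean(a, b):
    """L2 distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    # Assumption: distance = 1 - cosine similarity; the engine may define it differently.
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot(a, b) / norm


# Hypothetical alias table mirroring the metric names listed above.
METRICS = {
    "euclidean": euclidean, "l2": euclidean,
    "cosine": cosine_distance,
    "dot": dot, "dot_product": dot,
}
```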

Errors surface as typed exceptions under vectorvault.VectorVaultError; several also derive from a builtin (for example NotFoundError is a KeyError and SnapshotNotFoundError is a FileNotFoundError).
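
The dual inheritance described above means existing handlers for the builtin types keep working. The classes below are local stand-ins that reuse the names from the text to show the pattern; they are not the package's own definitions, and the lookup helper is hypothetical.

```python
class VectorVaultError(Exception):
    """Root of the hierarchy (local stand-in for vectorvault.VectorVaultError)."""


class NotFoundError(VectorVaultError, KeyError):
    """Missing record or collection; also catchable as KeyError."""


class SnapshotNotFoundError(VectorVaultError, FileNotFoundError):
    """Missing snapshot file; also catchable as FileNotFoundError."""


def lookup(records, record_id):
    """Hypothetical helper that raises the typed error for an unknown id."""
    try:
        return records[record_id]
    except KeyError:
        raise NotFoundError(record_id) from None
```

A caller can therefore catch either the library root (except VectorVaultError) or the familiar builtin (except KeyError) and handle the same error.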

On-disk format

A snapshot is a single file: a fixed header (magic, version, dimensionality, metric, index type, region offsets, CRC-64 of the content), the collection name, a record directory sorted by id with optional metadata, a 64-byte aligned vector region, and the serialized index. Records, metadata keys, and index nodes are written in a canonical order, so saving a collection, loading it, and saving again produces a byte-identical file. Loads validate existence, magic, version, and checksum before mapping the vector region.
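
A toy encoder and decoder can make the layout ideas above concrete: a fixed little-endian header, records in canonical (sorted) order, a 64-byte aligned vector region, and validation before use. Everything specific here is an assumption for illustration: the magic value, the header fields, and the use of CRC-32 from zlib as a stand-in for the CRC-64 the real format uses. Metadata and the serialized index are omitted.

```python
import struct
import zlib

MAGIC = b"VVSK"                      # hypothetical magic for this sketch
VERSION = 1
HEADER = struct.Struct("<4sHHIII")   # magic, version, dim, count, vector_offset, crc32
ALIGN = 64


def encode(records, dim):
    """Serialize {id: [floats]} deterministically."""
    ids = sorted(records)  # canonical order: records sorted by id
    directory = b"".join(
        struct.pack("<H", len(i.encode())) + i.encode() for i in ids
    )
    unpadded = HEADER.size + len(directory)
    vector_offset = (unpadded + ALIGN - 1) // ALIGN * ALIGN  # round up to 64 bytes
    padding = b"\x00" * (vector_offset - unpadded)
    vectors = b"".join(struct.pack(f"<{dim}f", *records[i]) for i in ids)
    content = directory + padding + vectors
    crc = zlib.crc32(content)  # stand-in checksum: CRC-32, not CRC-64
    return HEADER.pack(MAGIC, VERSION, dim, len(ids), vector_offset, crc) + content


def decode(blob):
    """Validate magic, version, and checksum, then rebuild the records."""
    magic, version, dim, count, vector_offset, crc = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("bad magic")
    if version != VERSION:
        raise ValueError("unsupported version")
    if zlib.crc32(blob[HEADER.size:]) != crc:
        raise ValueError("checksum mismatch")
    ids, pos = [], HEADER.size
    for _ in range(count):
        (n,) = struct.unpack_from("<H", blob, pos)
        ids.append(blob[pos + 2 : pos + 2 + n].decode())
        pos += 2 + n
    return {
        record_id: list(struct.unpack_from(f"<{dim}f", blob, vector_offset + k * dim * 4))
        for k, record_id in enumerate(ids)
    }
```

Because encode sorts by id and pads deterministically, encode(decode(blob), dim) reproduces blob byte for byte, which is the round-trip property the format description relies on.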

Testing

C++ (Catch2 + RapidCheck):

cmake -S . -B build -DVECTORVAULT_BUILD_TESTS=ON -DVECTORVAULT_BUILD_PYTHON=OFF
cmake --build build
ctest --test-dir build --output-on-failure

Python (pytest + Hypothesis):

pytest

Correctness-critical behavior — distance accuracy against a reference, allocator accounting, query ordering, index membership, and the snapshot round-trip — is covered by property-based tests. A recall benchmark builds an index over 10,000 vectors and asserts mean recall@10 against an exact baseline.
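
The recall measurement can be sketched as follows: an exact brute-force search supplies the ground truth, and recall@k is the fraction of that ground truth the approximate search returns. The function names and the search(query, k) callable are hypothetical; this is not the project's benchmark code.

```python
import random


def exact_top_k(vectors, query, k):
    """Brute-force baseline: indices of the k vectors closest to query (squared L2)."""
    def dist(v):
        return sum((a - b) ** 2 for a, b in zip(v, query))
    return sorted(range(len(vectors)), key=lambda i: dist(vectors[i]))[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate result recovered."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


def mean_recall(search, vectors, queries, k):
    """Average recall@k of search(query, k) over a batch of queries."""
    scores = [recall_at_k(search(q, k), exact_top_k(vectors, q, k)) for q in queries]
    return sum(scores) / len(scores)


# Usage: scoring the baseline against itself gives perfect recall.
rng = random.Random(0)
vectors = [[rng.random() for _ in range(8)] for _ in range(200)]
queries = [[rng.random() for _ in range(8)] for _ in range(5)]
score = mean_recall(lambda q, k: exact_top_k(vectors, q, k), vectors, queries, k=10)
```

In a real benchmark the lambda would be replaced by a call into the index under test.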

Layout

include/vectorvault/   Public C++ headers
src/core/              Core engine implementation
src/python/            pybind11 module + vectorvault package
tests/cpp/             C++ tests (Catch2 + RapidCheck)
tests/python/          Python tests (pytest + Hypothesis)

Notes and limitations

  • The HNSW index selects neighbors with the diversity heuristic from Malkov & Yashunin (SELECT-NEIGHBORS-HEURISTIC), applied both when linking a new node and when pruning an existing node whose adjacency list overflows. On uniform random data, measured mean recall@10 is ~0.96 at dim 32 and ~0.85 at dim 64 (m=16, ef_construction=200, ef_search=50); recall still tapers on harder, higher-dimensional distributions.
  • The snapshot reader assumes a little-endian host (it reads float components directly from the mapping).
  • AVX-512 is detected at runtime; on hosts without it the scalar kernels are used, which are also the accuracy reference.
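
The diversity heuristic mentioned in the first note can be sketched in a few lines. This is a simplified reading of SELECT-NEIGHBORS-HEURISTIC, assuming Euclidean distance and omitting the paper's optional extendCandidates and keepPrunedConnections steps; it is not the engine's implementation.

```python
import math


def select_neighbors_heuristic(query, candidates, m):
    """Pick up to m diverse neighbors for query from candidate vectors.

    Candidates are visited nearest-first. One is kept only if no
    already-selected neighbor is closer to it than the query is.
    """
    selected = []
    for cand in sorted(candidates, key=lambda c: math.dist(query, c)):
        if len(selected) >= m:
            break
        d_query = math.dist(query, cand)
        if all(math.dist(cand, s) >= d_query for s in selected):
            selected.append(cand)
    return selected
```

Compared with simply taking the m nearest candidates, this skips points that sit in the shadow of an already-chosen neighbor, so the links spread out in different directions around the node.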

License

MIT

Download files

Source Distribution

  • vector_vault_db-0.1.1.tar.gz (99.6 kB): source

Built Distributions

  • vector_vault_db-0.1.1-cp312-cp312-win_amd64.whl (226.5 kB): CPython 3.12, Windows x86-64
  • vector_vault_db-0.1.1-cp312-cp312-win32.whl (182.0 kB): CPython 3.12, Windows x86
  • vector_vault_db-0.1.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (389.3 kB): CPython 3.12, manylinux: glibc 2.17+ x86-64
  • vector_vault_db-0.1.1-cp312-cp312-macosx_11_0_arm64.whl (187.9 kB): CPython 3.12, macOS 11.0+ ARM64
  • vector_vault_db-0.1.1-cp311-cp311-win_amd64.whl (224.6 kB): CPython 3.11, Windows x86-64
  • vector_vault_db-0.1.1-cp311-cp311-win32.whl (181.8 kB): CPython 3.11, Windows x86
  • vector_vault_db-0.1.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (392.4 kB): CPython 3.11, manylinux: glibc 2.17+ x86-64
  • vector_vault_db-0.1.1-cp311-cp311-macosx_11_0_arm64.whl (187.8 kB): CPython 3.11, macOS 11.0+ ARM64
  • vector_vault_db-0.1.1-cp310-cp310-win_amd64.whl (222.3 kB): CPython 3.10, Windows x86-64
  • vector_vault_db-0.1.1-cp310-cp310-win32.whl (180.6 kB): CPython 3.10, Windows x86
  • vector_vault_db-0.1.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (391.1 kB): CPython 3.10, manylinux: glibc 2.17+ x86-64
  • vector_vault_db-0.1.1-cp310-cp310-macosx_11_0_arm64.whl (186.8 kB): CPython 3.10, macOS 11.0+ ARM64
  • vector_vault_db-0.1.1-cp39-cp39-win_amd64.whl (222.5 kB): CPython 3.9, Windows x86-64
  • vector_vault_db-0.1.1-cp39-cp39-win32.whl (180.8 kB): CPython 3.9, Windows x86
  • vector_vault_db-0.1.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (391.2 kB): CPython 3.9, manylinux: glibc 2.17+ x86-64
  • vector_vault_db-0.1.1-cp39-cp39-macosx_11_0_arm64.whl (186.8 kB): CPython 3.9, macOS 11.0+ ARM64

File details

Details for the file vector_vault_db-0.1.1.tar.gz.

File metadata

  • Download URL: vector_vault_db-0.1.1.tar.gz
  • Upload date:
  • Size: 99.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for vector_vault_db-0.1.1.tar.gz
Algorithm Hash digest
SHA256 85c6910a81f50ca06d6929f10da55ceff3d19f5d9e6148b52ff4e1e8b81d64dd
MD5 1f13d6f66bb42000edd555af56eeb8af
BLAKE2b-256 3387867a4a1bab4a0b4fe6ddc17516036d2b80c194dd69ce8385cb6e9b7b13df

See more details on using hashes here.

Provenance

The following attestation bundles were made for vector_vault_db-0.1.1.tar.gz:

Publisher: wheels.yml on shahriar-ahmed-seam/Vector-Vault-DB

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
