vectlite

Embedded vector store for local-first AI applications.

vectlite is a single-file, zero-dependency vector database written in Rust with Python bindings. It gives you dense + sparse hybrid search, HNSW indexing, metadata filtering, transactions, and crash-safe persistence in a single .vdb file -- no server, no Docker, no network calls.

Installation

pip install vectlite

Requires Python 3.9+. Pre-built wheels are available for macOS (x86_64, arm64), Linux (x86_64, aarch64), and Windows (x86_64).

Quick Start

import vectlite

# Create or open a database
db = vectlite.open("knowledge.vdb", dimension=384)

# Insert records with vectors and metadata
# (embedding and embedding2 are 384-dim lists of floats from your embedding model)
db.upsert("doc1", embedding, {"source": "blog", "title": "Auth Guide"})
db.upsert("doc2", embedding2, {"source": "notes", "title": "Billing"})

# Search with filters
results = db.search(embedding_query, k=5, filter={"source": "blog"})

# Checkpoint: fold the WAL into the snapshot
db.compact()

Features

Core

  • Single-file storage -- one .vdb file per database, portable and easy to back up
  • Dense vectors -- cosine similarity with automatic HNSW indexing for large collections
  • Sparse vectors -- BM25-scored inverted index for keyword retrieval
  • Hybrid search -- dense + sparse fusion with linear or RRF strategies
  • Rich metadata -- str, int, float, bool, None, list, dict values
  • Crash-safe WAL -- writes land in a write-ahead log first, then are checkpointed into the snapshot with compact()
  • Transactions -- atomic batched writes with db.transaction()
  • File locking -- advisory locks prevent corruption from concurrent access
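Dense scoring uses cosine similarity. As a quick refresher, the metric can be sketched in plain Python (an illustration of the formula only, not vectlite's Rust implementation):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention: undefined similarity scores zero
    return dot / (norm_a * norm_b)

# Parallel vectors score 1.0; orthogonal vectors score 0.0
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

Because the score depends only on direction, not magnitude, embeddings need not be normalized before insertion.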

Search & Retrieval

  • Metadata filters -- MongoDB-style operators: $eq, $ne, $gt, $gte, $lt, $lte, $in, $nin, $contains, $exists, $and, $or, $not
  • Nested filters -- dot-path traversal (author.name), $elemMatch, $size on lists and dicts
  • Named vectors -- multiple vector spaces per record (vectors={"title": [...], "body": [...]})
  • Multi-vector queries -- weighted search across vector spaces in a single call
  • MMR diversification -- mmr_lambda controls relevance vs. diversity trade-off
  • Namespaces -- logical isolation with per-namespace or cross-namespace search
  • Rerankers -- built-in text_match(), metadata_boost(), cross_encoder(), bi_encoder(), composable with compose()
  • Observability -- search_with_stats() returns timings, BM25 term scores, ANN stats, and per-result explain payloads
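The trade-off that mmr_lambda controls can be illustrated with a minimal greedy MMR selection in plain Python (a sketch of the standard algorithm with precomputed similarities, not vectlite's internals):

```python
def mmr_select(query_sims: list[float], doc_sims: list[list[float]],
               k: int, lam: float) -> list[int]:
    """Greedy MMR: repeatedly pick the candidate maximizing
    lam * sim(query, d) - (1 - lam) * max(sim(d, s) for already-selected s)."""
    selected: list[int] = []
    candidates = list(range(len(query_sims)))
    while candidates and len(selected) < k:
        best, best_score = candidates[0], float("-inf")
        for c in candidates:
            redundancy = max((doc_sims[c][s] for s in selected), default=0.0)
            score = lam * query_sims[c] - (1 - lam) * redundancy
            if score > best_score:
                best, best_score = c, score
        selected.append(best)
        candidates.remove(best)
    return selected

# Three docs; docs 0 and 1 are near-duplicates (pairwise similarity 0.95).
query_sims = [0.9, 0.85, 0.5]
doc_sims = [[1.0, 0.95, 0.1], [0.95, 1.0, 0.1], [0.1, 0.1, 1.0]]
# At lam=0.5 the diverse doc 2 beats the redundant doc 1 for the second slot.
print(mmr_select(query_sims, doc_sims, k=2, lam=0.5))  # [0, 2]
```

With lam=1.0 the selection degenerates to pure relevance ordering; lower values trade relevance for diversity.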

Data Management

  • Physical collections -- vectlite.open_store() manages a directory of independent databases
  • Bulk ingestion -- bulk_ingest() with deferred index rebuilds for fast imports
  • Snapshots -- db.snapshot(path) creates a self-contained copy
  • Backup / Restore -- db.backup(dir) and vectlite.restore(dir, path) for full roundtrips
  • Read-only mode -- vectlite.open(path, read_only=True) for safe concurrent readers
  • Text analyzers -- configurable tokenizer pipeline with stopwords, stemming, and n-grams

Usage

Hybrid Search with Reranking

import vectlite

db = vectlite.open("knowledge.vdb", dimension=384)

# Upsert with dense + sparse vectors
db.upsert(
    "doc1",
    dense_embedding,
    {"source": "docs", "title": "Auth Setup", "text": "How to configure SSO..."},
    sparse=vectlite.sparse_terms("How to configure SSO authentication"),
)

# Hybrid search with reranking
results = db.search(
    query_embedding,
    k=10,
    sparse=vectlite.sparse_terms("SSO authentication"),
    fusion="rrf",
    filter={"source": "docs"},
    explain=True,
    rerank=vectlite.rerankers.compose(
        vectlite.rerankers.text_match(),
        vectlite.rerankers.metadata_boost("source", {"docs": 0.5}),
    ),
)

for result in results:
    print(result["id"], result["score"])

Bulk Ingestion (Recommended for Large Imports)

For ingesting more than a few hundred records, use bulk_ingest() instead of calling upsert() in a loop. It writes records in WAL batches and rebuilds indexes only once at the end, making it orders of magnitude faster.

records = [
    {
        "id": f"doc{i}",
        "vector": embeddings[i],
        "metadata": {"source": "corpus", "chunk": i},
        "sparse": vectlite.sparse_terms(texts[i]),  # optional
    }
    for i in range(len(texts))
]

count = db.bulk_ingest(records, batch_size=5000)
print(f"Ingested {count} records")

The records parameter is a list[dict] where each dict has keys:

  • id (str, required) -- unique record identifier
  • vector (list[float], required) -- dense embedding vector
  • metadata (dict, optional) -- arbitrary metadata
  • sparse (dict[str, float], optional) -- sparse terms from sparse_terms()
  • vectors (dict[str, list[float]], optional) -- named vectors
  • namespace (str, optional) -- namespace override per record

upsert_many() and insert_many() also accept the same list[dict] format and rebuild indexes once, but don't batch WAL writes internally.

Collections

store = vectlite.open_store("./my_collections")
products = store.create_collection("products", dimension=384)
products.upsert("p1", embedding, {"name": "Widget", "price": 9.99})

logs = store.open_or_create_collection("logs", dimension=128)
print(store.collections())  # ["logs", "products"]

Transactions

with db.transaction() as tx:
    tx.upsert("doc1", emb1, {"source": "a"})
    tx.upsert("doc2", emb2, {"source": "b"})
    tx.delete("old_doc")
# All operations commit atomically or roll back on exception

Text Helpers

# Handles embedding + sparse term generation for you
vectlite.upsert_text(db, "doc1", "Auth setup guide", embed_fn, {"source": "docs"})
results = vectlite.search_text(db, "how to authenticate", embed_fn, k=5)

Analyzers

analyzer = vectlite.analyzers.Analyzer().lowercase().stopwords("en").stemmer("english")
terms = analyzer.sparse_terms("How to authenticate users with SSO")
# Use with upsert: db.upsert("doc1", emb, meta, sparse=terms)

Snapshots & Backup

db.snapshot("/backups/knowledge_2024.vdb")  # Self-contained copy
db.backup("/backups/full/")                 # Full backup with ANN sidecars

restored = vectlite.restore("/backups/full/", "restored.vdb")

Read-Only Mode

ro = vectlite.open("knowledge.vdb", read_only=True)
results = ro.search(query, k=5)  # Reads work
ro.upsert(...)                    # Raises VectLiteError

Search Diagnostics

outcome = db.search_with_stats(query, k=5, sparse=terms, explain=True)

print(outcome["stats"]["timings"])       # {"dense_us": 120, "sparse_us": 45, ...}
print(outcome["stats"]["used_ann"])      # True
print(outcome["results"][0]["explain"])  # Detailed scoring breakdown

Filter Operators

| Operator | Example | Description |
| --- | --- | --- |
| $eq | {"field": {"$eq": "value"}} | Equal (also {"field": "value"}) |
| $ne | {"field": {"$ne": "value"}} | Not equal |
| $gt / $gte | {"field": {"$gt": 5}} | Greater than (or equal) |
| $lt / $lte | {"field": {"$lt": 20}} | Less than (or equal) |
| $in / $nin | {"field": {"$in": ["a", "b"]}} | In / not in set |
| $contains | {"field": {"$contains": "auth"}} | Substring match |
| $exists | {"field": {"$exists": True}} | Field presence |
| $and / $or | {"$and": [{...}, {...}]} | Logical combinators |
| $not | {"$not": {...}} | Logical negation |
| $elemMatch | {"tags": {"$elemMatch": {"$eq": "rust"}}} | Match list elements |
| $size | {"tags": {"$size": 3}} | List length |
| dot-path | {"author.name": "Alice"} | Nested field access |
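To make the semantics concrete, here is a tiny pure-Python evaluator for a subset of these operators ($eq shorthand, $gt, $in, $or, dot-path). It illustrates how the filters behave; it is not vectlite's implementation:

```python
def get_path(doc: dict, path: str):
    """Dot-path traversal: 'author.name' -> doc['author']['name']."""
    cur = doc
    for part in path.split("."):
        if not isinstance(cur, dict) or part not in cur:
            return None
        cur = cur[part]
    return cur

def matches(doc: dict, filt: dict) -> bool:
    """Evaluate a filter dict against a metadata dict (subset of operators)."""
    for key, cond in filt.items():
        if key == "$and":
            if not all(matches(doc, f) for f in cond):
                return False
        elif key == "$or":
            if not any(matches(doc, f) for f in cond):
                return False
        elif isinstance(cond, dict):  # operator clause, e.g. {"$gt": 5}
            value = get_path(doc, key)
            for op, arg in cond.items():
                if op == "$eq" and value != arg:
                    return False
                if op == "$gt" and not (value is not None and value > arg):
                    return False
                if op == "$in" and value not in arg:
                    return False
        else:  # bare value is shorthand for $eq
            if get_path(doc, key) != cond:
                return False
    return True

doc = {"source": "docs", "price": 12, "author": {"name": "Alice"}}
print(matches(doc, {"author.name": "Alice", "price": {"$gt": 5}}))           # True
print(matches(doc, {"$or": [{"source": "blog"}, {"price": {"$in": [12]}}]}))  # True
```

Top-level keys combine with implicit AND, matching the MongoDB convention.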

Database Methods Reference

Write Methods

| Method | Description |
| --- | --- |
| db.upsert(id, vector, metadata, sparse=..., vectors=...) | Insert or update a single record |
| db.insert(id, vector, metadata, sparse=..., vectors=...) | Insert a record (raises on duplicate id) |
| db.upsert_many(records, namespace=None) | Upsert a batch of records (single index rebuild) |
| db.insert_many(records, namespace=None) | Insert a batch (raises on duplicate ids) |
| db.bulk_ingest(records, namespace=None, batch_size=10000) | Fastest bulk import with batched WAL writes |
| db.delete(id, namespace=None) | Delete a single record |
| db.delete_many(ids, namespace=None) | Delete multiple records by id |

Read Methods

| Method | Description |
| --- | --- |
| db.get(id, namespace=None) | Get a single record by id |
| db.search(query, k=10, ...) | Search and return a list of results |
| db.search_with_stats(query, k=10, ...) | Search with detailed performance stats |
| db.count() or len(db) | Number of records in the database |
| db.namespaces() | List all namespaces |
| db.dimension | Vector dimension (property) |
| db.path | Database file path (property) |
| db.read_only | Whether the database is read-only (property) |

Maintenance Methods

| Method | Description |
| --- | --- |
| db.compact() | Fold WAL into snapshot and persist ANN indexes |
| db.flush() | Alias for compact() |
| db.snapshot(dest) | Create a self-contained .vdb copy |
| db.backup(dest_dir) | Full backup including ANN sidecar files |
| db.transaction() | Begin an atomic transaction (use as context manager) |

How It Works

  • Records are stored in a compact binary .vdb snapshot file
  • Writes go through a crash-safe WAL (.wal) before being applied in memory
  • compact() folds the WAL into the snapshot and persists HNSW sidecar files
  • Dense search uses HNSW indexes (auto-built for collections above ~128 records)
  • Sparse search uses an inverted index with BM25 scoring
  • Hybrid fusion combines dense + sparse via linear combination or reciprocal rank fusion
  • Advisory file locks (flock) prevent concurrent write corruption


License

MIT
