vd

A facade over vector databases — one Pythonic interface, ~15 backends.

vd lets you work with any supported vector database and switch between them with a one-word change, while keeping each backend's specific features one escape hatch away. It also helps you choose the right backend and set it up.

import vd

client = vd.connect("memory")          # switch DB = change this one word
col = client.create_collection("docs")
col["a"] = vd.Document(id="a", text="cats", vector=[0.1, 0.9, 0.0])
col["b"] = vd.Document(id="b", text="pizza", vector=[0.9, 0.0, 0.1])

for hit in col.search([0.1, 0.8, 0.0], limit=2):
    print(hit["id"], hit["score"])

Install

pip install vd                 # core (zero heavy deps) + the memory backend
pip install vd[chroma]         # + a specific backend's client
pip install vd[embedded]       # + all embedded backends (chroma, qdrant, faiss, …)
pip install vd[all-backends]   # + every backend client

The core is near-zero-dependency. Each backend's client library is an optional extra named after the backend.

The mental model

vd stores and searches vectors. Turning text into vectors — embedding — is deliberately external: vd never embeds on its own. This keeps the facade honest (most vector DBs do not embed for you) and lightweight.

Vector-first. You hold the embedding model. Hand vd Documents that already carry a vector; search with a pre-computed query vector.
Text convenience. Pass an embedder (text -> vector) to connect, and then raw text works: col["k"] = "some text", col.search("a query").

With no embedder configured, passing text raises EmbeddingRequiredError: a loud failure rather than a silent embedding with the wrong model.

client = vd.connect("chroma", persist_directory="./db", embedder=my_embed_fn)
col = client.create_collection("docs")
col["a"] = "cats and kittens"                  # embedded for you
hits = list(col.search("pets", limit=5))       # query embedded for you
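The rule "text needs an embedder, vectors pass through" can be sketched in a few lines. This is an illustration only, not vd's implementation: to_vector and toy_embed are hypothetical names, and the EmbeddingRequiredError class below is a local stand-in rather than the one vd exports.

```python
from typing import Callable, Optional, Sequence, Union

Vector = Sequence[float]


class EmbeddingRequiredError(TypeError):
    """Local stand-in for the error named above; not imported from vd."""


def to_vector(item: Union[str, Vector],
              embedder: Optional[Callable[[str], Vector]] = None) -> list:
    """Return a vector, embedding text only when an embedder was supplied."""
    if isinstance(item, str):
        if embedder is None:
            # Fail loudly instead of guessing an embedding model.
            raise EmbeddingRequiredError("got text but no embedder was configured")
        return list(embedder(item))
    return list(item)  # already a vector: pass through unchanged


def toy_embed(text: str) -> list:
    """Hypothetical embedder: [character count, word count, vowel count]."""
    vowels = sum(ch in "aeiou" for ch in text.lower())
    return [float(len(text)), float(len(text.split())), float(vowels)]
```

A real embedder would call an embedding model; the toy one only makes the control flow visible.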

Choosing a backend

vd ships a provider registry distilled from a practitioner report (misc/docs/11 -- VectorDB Selection & Setup Guide ...md) and a recommender:

vd.print_recommendation(
    corpus_size="medium", persistence=True, can_run_docker=True,
    cloud_ok=True, budget="free", needs_hybrid=False,
)
vd.print_backends_table()                       # the whole landscape
vd.compare_backends(["chroma", "qdrant", "pgvector"])

Setting a backend up

vd.check_requirements("qdrant")    # diagnoses readiness, prints the next step
vd.setup_guide("qdrant")           # full pip / docker / env-var playbook
vd.install_backend("qdrant")       # the pip command (run=True to install)

check_requirements is deployment-aware: it checks the pip package for embedded backends, whether a server answers for self-hosted ones, and the required environment variables for managed ones — always ending with one concrete next action.
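As a rough picture of what "deployment-aware" could mean, the sketch below runs one check per deployment kind using only the standard library. It is an assumption-laden illustration, not vd's code: next_action, its parameters, and the message wording are all hypothetical.

```python
import importlib.util
import os
import socket


def next_action(deployment, *, module=None, host=None, port=None, env_vars=()):
    """Return one human-readable next step for a backend (hypothetical helper)."""
    if deployment == "embedded":
        # Embedded backends only need their client package to be importable.
        if importlib.util.find_spec(module) is None:
            return f"install the client package that provides '{module}'"
    elif deployment == "server":
        # Self-hosted backends must accept a TCP connection.
        try:
            with socket.create_connection((host, port), timeout=1.0):
                pass
        except OSError:
            return f"start the server so that {host}:{port} accepts connections"
    elif deployment == "managed":
        # Managed backends need credentials in the environment.
        missing = [name for name in env_vars if not os.environ.get(name)]
        if missing:
            return "set environment variable(s): " + ", ".join(missing)
    else:
        raise ValueError(f"unknown deployment kind: {deployment!r}")
    return "ready"
```

Each branch returns at most one message, which mirrors the idea of ending with a single concrete next action.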

The API

Object	Is a	Plus
`Client` (from `connect`)	`Mapping[str, Collection]`	`create_collection`, `get_collection`, `delete_collection`, `get_or_create_collection`
`Collection`	`MutableMapping[str, Document]`	`search(...)`
`Document`	dataclass	`id`, `text`, `vector`, `metadata`
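To show what the MutableMapping row buys you, here is a minimal dict-backed sketch built on collections.abc. Doc and ToyCollection are hypothetical stand-ins, not vd's classes, and the check that the key equals the document id is an assumption of this sketch.

```python
from collections.abc import MutableMapping
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Toy document with the four fields listed in the table above."""
    id: str
    text: str = ""
    vector: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


class ToyCollection(MutableMapping):
    """Dict-backed collection; the ABC derives `in`, get, pop, update, etc."""

    def __init__(self):
        self._docs = {}

    def __getitem__(self, key):
        return self._docs[key]

    def __setitem__(self, key, doc):
        # Assumption of this sketch: the mapping key must equal the document id.
        if doc.id != key:
            raise ValueError(f"key {key!r} does not match document id {doc.id!r}")
        self._docs[key] = doc

    def __delitem__(self, key):
        del self._docs[key]

    def __iter__(self):
        return iter(self._docs)

    def __len__(self):
        return len(self._docs)
```

Implementing those five methods is enough for membership tests, len, iteration, get, pop, and update to work, which is why a mapping-shaped interface keeps adapters small.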

col["k"] = vd.Document(id="k", text="…", vector=[...], metadata={"y": 2024})
doc      = col["k"]            # get
del col["k"]                  # delete
"k" in col, len(col), list(col)

col.search(query, *, limit=10, filter=None, egress=None, **backend_kwargs)

search yields dicts with the keys "id", "text", "score", and "metadata"; a higher score means a closer match. To reshape results, pass an egress function: vd.id_only, vd.id_and_score, vd.text_only, vd.id_text_score, or your own.
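The result shape and the egress hook can be illustrated with a brute-force search over plain dicts. This sketch assumes cosine similarity as the score, which the text above does not specify, and toy_search and the local id_and_score are hypothetical stand-ins, not vd's functions.

```python
import math


def cosine(a, b):
    """Cosine similarity; higher means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def id_and_score(hit):
    """Hypothetical egress: keep only the id and the score."""
    return hit["id"], hit["score"]


def toy_search(docs, query, *, limit=10, egress=None):
    """Yield the `limit` best hits, optionally reshaped by `egress`."""
    hits = [
        {"id": d["id"], "text": d["text"],
         "score": cosine(query, d["vector"]),
         "metadata": d.get("metadata", {})}
        for d in docs
    ]
    hits.sort(key=lambda h: h["score"], reverse=True)  # best match first
    for hit in hits[:limit]:
        yield egress(hit) if egress else hit


docs = [
    {"id": "a", "text": "cats", "vector": [0.1, 0.9, 0.0]},
    {"id": "b", "text": "pizza", "vector": [0.9, 0.0, 0.1]},
]
ranked = list(toy_search(docs, [0.1, 0.8, 0.0], limit=2, egress=id_and_score))
```

Here ranked holds (id, score) tuples with "a" ahead of "b", because the query vector points almost the same way as the vector for "a".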

Metadata filtering

One backend-agnostic, MongoDB-style filter language — $eq $ne $gt $gte $lt $lte $in $nin $exists $and $or $not:

col.search(qvec, filter={"year": {"$gte": 2020}, "kind": {"$in": ["news", "blog"]}})

Each backend declares which operators it honors natively; an unsupported one raises UnsupportedFilterError rather than silently mis-filtering. Backends with rich native filtering (Qdrant, Pinecone, MongoDB) translate the filter; the rest apply it client-side with the same semantics.
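Client-side filtering with these operators might look like the evaluator below. It is a sketch under stated assumptions, not vd's implementation: matches is a hypothetical name, UnsupportedFilterError is a local stand-in, a bare value is treated as shorthand for $eq, missing fields follow MongoDB's behavior for $ne and $nin, and $not is accepted both at field level and around a sub-filter.

```python
class UnsupportedFilterError(ValueError):
    """Local stand-in for the error named above; not imported from vd."""


_MISSING = object()  # marks a metadata field that is absent

_COMPARE = {
    "$eq": lambda v, arg: v == arg,
    "$ne": lambda v, arg: v != arg,
    "$gt": lambda v, arg: v is not _MISSING and v > arg,
    "$gte": lambda v, arg: v is not _MISSING and v >= arg,
    "$lt": lambda v, arg: v is not _MISSING and v < arg,
    "$lte": lambda v, arg: v is not _MISSING and v <= arg,
    "$in": lambda v, arg: v in arg,
    "$nin": lambda v, arg: v not in arg,
    "$exists": lambda v, arg: (v is not _MISSING) == bool(arg),
}


def _field_matches(value, cond):
    """Check one field value against a bare value or an operator dict."""
    if not isinstance(cond, dict):
        return value == cond  # bare value is shorthand for $eq
    for op, arg in cond.items():
        if op == "$not":
            ok = not _field_matches(value, arg)
        elif op in _COMPARE:
            ok = _COMPARE[op](value, arg)
        else:
            raise UnsupportedFilterError(f"unsupported operator {op!r}")
        if not ok:
            return False
    return True


def matches(metadata, flt):
    """Return True if `metadata` satisfies the MongoDB-style filter `flt`."""
    for key, cond in flt.items():
        if key == "$and":
            ok = all(matches(metadata, sub) for sub in cond)
        elif key == "$or":
            ok = any(matches(metadata, sub) for sub in cond)
        elif key == "$not":
            ok = not matches(metadata, cond)
        elif key.startswith("$"):
            raise UnsupportedFilterError(f"unsupported operator {key!r}")
        else:
            ok = _field_matches(metadata.get(key, _MISSING), cond)
        if not ok:
            return False
    return True
```

Raising on an unknown operator, instead of ignoring it, is what keeps a filter from silently matching too much.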

Escape hatches

The facade never traps you. client.client is the raw backend client; collection.native is the raw backend collection — both supported, documented API for reaching backend-specific features.
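The escape-hatch pattern itself is small. In the sketch below, FacadeCollection is a hypothetical wrapper (not vd's class) and a plain dict stands in for a backend-specific collection object.

```python
class FacadeCollection:
    """Facade exposing a uniform count() plus a `native` escape hatch."""

    def __init__(self, native):
        self._native = native

    @property
    def native(self):
        """The wrapped backend object, returned unmodified."""
        return self._native

    def count(self):
        return len(self._native)


raw = {"a": [0.1, 0.9]}  # stands in for a backend-specific collection object
col = FacadeCollection(raw)
col.native.setdefault("b", [0.9, 0.0])  # a call the facade itself does not expose
```

Because native returns the same object the facade wraps, changes made through it are visible through the facade as well.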

Backends

Archetype	Backends
Embedded (pip-only)	`memory`, `chroma`, `lancedb`, `sqlite_vec`, `duckdb`, `faiss`
Server (also embedded)	`qdrant`, `weaviate`, `milvus`
Server	`redis`, `elasticsearch`, `pgvector`
Managed	`pinecone`, `mongodb` (Atlas), `turbopuffer`

vd.list_backends() shows what is installed and ready now.

The toolkit

Beyond the facade, vd bundles the composite operations people actually do:

vd.search — multi_query_search, reciprocal_rank_fusion, search_similar_to_document, deduplicate_results.
vd.io — export_collection / import_collection (JSONL, JSON, directory).
vd.migration — migrate_collection, migrate_client, copy_collection — move data between any two backends.
vd.analytics — collection_stats, find_duplicates, find_outliers, validate_collection.
vd.health — health_check_backend, benchmark_search.
vd.text — convenience text cleaning / chunking.
vd.TimeIndexedCollection — a time-windowed wrapper over any collection.
CLI — vd backends, vd install, vd export/import, vd migrate, …
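For readers unfamiliar with reciprocal rank fusion, the usual formulation scores each id as the sum of 1 / (k + rank) over the ranked lists it appears in. The sketch below implements that textbook form; the function name rrf, the default k=60, and the tie-breaking rule are assumptions and may differ from vd.search.reciprocal_rank_fusion.

```python
from collections import defaultdict


def rrf(rankings, k=60):
    """Fuse ranked id lists: score(id) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Best fused score first; ties broken by id so the order is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Three ranked result lists, e.g. from three phrasings of the same query.
fused = rrf([["a", "b", "c"], ["b", "a", "d"], ["b", "c", "a"]])
```

Only ranks enter the formula, so result lists whose raw scores are on different scales can still be combined.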

AI-agent skills

vd ships skills (vd/data/skills/) so coding agents can drive it well: vd-quickstart, vd-backend-choose (choosing and setup), vd-ingest, vd-search, vd-ops.

Design

Embedding is external. The core operates on vectors; an embedder is an injected, optional convenience — never a hard dependency.
Two mappings. A Client is a Mapping of collections; a Collection is a MutableMapping of documents plus search. Idiomatic, minimal, familiar.
Thin adapters. AbstractClient / AbstractCollection implement everything users see; a backend supplies a handful of raw primitives. Adding a backend is ~150 lines — see the vd-add-backend skill.
Capabilities, not a fat base. Optional features (SupportsBatch, SupportsHybrid) are @runtime_checkable protocols you feature-discover.
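Feature discovery with a @runtime_checkable protocol can be sketched as follows. SupportsBatch is the name used above, but its method add_batch, the two collection classes, and ingest are hypothetical; vd's actual protocol members may differ.

```python
from typing import Iterable, Protocol, runtime_checkable


@runtime_checkable
class SupportsBatch(Protocol):
    """Capability protocol; the method name `add_batch` is an assumption."""

    def add_batch(self, docs: Iterable[dict]) -> int: ...


class PlainCollection:
    """Collection that can only add one document at a time."""

    def __init__(self):
        self.docs = []

    def add(self, doc):
        self.docs.append(doc)


class BatchCollection(PlainCollection):
    """Collection that also offers a bulk insert."""

    def add_batch(self, docs):
        docs = list(docs)
        self.docs.extend(docs)
        return len(docs)


def ingest(col, docs):
    """Use the batch path when the collection offers it, else add one by one."""
    docs = list(docs)
    if isinstance(col, SupportsBatch):  # structural check: is add_batch present?
        col.add_batch(docs)
        return "batch"
    for doc in docs:
        col.add(doc)
    return "one-by-one"
```

The isinstance check only verifies that the method exists, not its signature, so a capability protocol of this kind works best with few, clearly named members.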

License

MIT

Release history

Version	Released
0.2.8	May 28, 2026
0.2.7	May 27, 2026
0.2.6	May 27, 2026
0.2.5	May 27, 2026
0.2.4	May 24, 2026
0.2.3	May 24, 2026
0.2.2 (this version)	May 22, 2026
0.2.1	May 21, 2026
0.1.6	May 20, 2026
0.1.5	May 16, 2026
0.1.4	May 14, 2026
0.1.3	Apr 27, 2026
0.1.2	Apr 27, 2026
0.1.1	Apr 27, 2026
0.0.11	Aug 22, 2025
0.0.10	Jul 9, 2025
0.0.9	Jul 1, 2025
0.0.8	Jul 1, 2025
0.0.7	Jun 25, 2025
0.0.6	May 17, 2025
0.0.4	Oct 10, 2022
0.0.3	Oct 4, 2022
0.0.2	Jan 6, 2021
