gvdb

Python client for GVDB distributed vector database.

Install

pip install gvdb

# With bulk import extras (Parquet, NumPy, Pandas, progress bar)
# Quote the requirement so shells such as zsh do not expand the brackets
pip install "gvdb[import]"

# All optional dependencies
pip install "gvdb[import-all]"

Quick Start

from gvdb import GVDBClient

client = GVDBClient("localhost:50051", api_key="your-key")  # api_key is optional

# Create a collection
client.create_collection("my_vectors", dimension=768)

# Insert vectors
vectors = [[0.1] * 768, [0.3] * 768]  # placeholder values; each vector is a list of 768 floats
ids = [1, 2]
client.insert("my_vectors", ids, vectors)

# Search
results = client.search("my_vectors", query_vector=[0.1] * 768, top_k=10)  # placeholder query vector
for r in results:
    print(f"ID: {r.id}, distance: {r.distance}")

# Hybrid search (BM25 + vector)
results = client.hybrid_search(
    "my_vectors",
    query_vector=[0.1] * 768,   # placeholder query vector
    text_query="running shoes",
    top_k=10,
    text_field="description",   # metadata field to search
    return_metadata=True,
)

# Clean up
client.drop_collection("my_vectors")
client.close()
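
The Quick Start ends with explicit drop_collection and close calls, so an exception partway through would skip them. The following is a minimal sketch of one way to guarantee cleanup, using only the client methods shown above. The helper name run_with_temp_collection and its work callback are hypothetical and not part of gvdb; the client is passed in, so the snippet assumes nothing about how it was constructed.

```python
from contextlib import closing


def run_with_temp_collection(client, name, dimension, work):
    """Create a collection, run work(client, name), then always clean up.

    Hypothetical helper: it relies only on create_collection,
    drop_collection and close, the methods used in the Quick Start.
    """
    # closing() calls client.close() on exit, even if an exception is raised.
    with closing(client):
        client.create_collection(name, dimension=dimension)
        try:
            return work(client, name)
        finally:
            # Drop the collection whether or not work() succeeded.
            client.drop_collection(name)
```

A call might look like run_with_temp_collection(GVDBClient("localhost:50051"), "my_vectors", 768, work), where work is any function that takes the client and the collection name. If create_collection itself fails, only close is called, since there is nothing to drop.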

Bulk Import

Import vectors from common ML formats. The importers create the target collection automatically, can be rerun after an interruption because upserts are idempotent, and show progress bars when tqdm is installed.

import numpy as np

# Assumes client is an open GVDBClient, created as in Quick Start

# From NumPy array
vectors = np.random.rand(100_000, 768).astype(np.float32)
result = client.import_numpy(vectors, "embeddings")
print(result)  # ImportResult(total=100000, batches=10, elapsed=12.3s, ...)

# From Parquet (GVDB schema: id + vector + metadata columns)
result = client.import_parquet("vectors.parquet", "embeddings")

# From Pandas DataFrame (df is an existing DataFrame whose "embedding" column holds the vectors)
result = client.import_dataframe(df, "embeddings", vector_column="embedding")

# From CSV (JSON-encoded or dimension-prefixed vector columns)
result = client.import_csv("data.csv", "embeddings")

# From AnnData h5ad (scRNA-seq embeddings)
result = client.import_h5ad("adata.h5ad", "cells", embedding_key="X_pca")
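
The CSV importer is described as accepting JSON-encoded vector columns. As a rough illustration of what such a file could look like, the sketch below writes one using only the standard library and reads it back. The column names "id" and "vector" and both helper names are assumptions made for this example; the names that import_csv actually expects are not stated here.

```python
import csv
import json


def write_vectors_csv(path, ids, vectors):
    """Write ids and vectors to CSV, JSON-encoding each vector in one cell.

    The column names "id" and "vector" are assumptions for illustration.
    """
    if len(ids) != len(vectors):
        raise ValueError("ids and vectors must have the same length")
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "vector"])
        for row_id, vector in zip(ids, vectors):
            # json.dumps turns the list into one string; csv.writer quotes
            # the cell because that string contains commas.
            writer.writerow([row_id, json.dumps(vector)])


def read_vectors_csv(path):
    """Read the file back into (ids, vectors) to check the round trip."""
    ids, vectors = [], []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            ids.append(int(row["id"]))
            vectors.append(json.loads(row["vector"]))
    return ids, vectors
```

Keeping the whole vector in a single quoted cell means the column count stays fixed regardless of the vector dimension.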

All importers accept mode="upsert" (the default, idempotent) or mode="stream_insert" (faster, but cannot resume). Each importer returns an ImportResult that reports batch counts, timing, and failures.
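
To see why idempotent upserts make resuming safe, consider a toy in-memory model. This is not GVDB's implementation: the dict store, the upsert_batches function and the fail_at_batch parameter are all hypothetical and exist only to illustrate the idea that writing the same id twice leaves the same state.

```python
def upsert_batches(store, ids, vectors, batch_size, fail_at_batch=None):
    """Write (id, vector) pairs into a dict in batches, overwriting by id.

    Toy model only. fail_at_batch simulates a crash just before the given
    0-based batch is written. Returns the number of batches written.
    """
    written = 0
    for batch_no, start in enumerate(range(0, len(ids), batch_size)):
        if batch_no == fail_at_batch:
            raise RuntimeError(f"simulated failure at batch {batch_no}")
        for i in range(start, min(start + batch_size, len(ids))):
            store[ids[i]] = vectors[i]  # same id maps to the same slot
        written += 1
    return written


ids = list(range(10))
vectors = [[float(i), float(i) + 0.5] for i in ids]

store = {}
try:
    upsert_batches(store, ids, vectors, batch_size=3, fail_at_batch=2)
except RuntimeError:
    pass  # batches 0 and 1 (ids 0 to 5) are already stored

# Rerunning from the start overwrites ids 0 to 5 with identical values and
# then writes the remaining ids, so the store holds each id exactly once.
upsert_batches(store, ids, vectors, batch_size=3)
```

In this model a rerun needs no record of where the previous run stopped, which is the property the default upsert mode relies on for resume.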

Optional dependency extras

| Extra | Dependencies | For |
|---|---|---|
| gvdb[parquet] | pyarrow | import_parquet |
| gvdb[numpy] | numpy | import_numpy |
| gvdb[pandas] | pandas, pyarrow | import_dataframe, import_csv |
| gvdb[h5ad] | anndata, numpy | import_h5ad |
| gvdb[progress] | tqdm | Progress bars |
| gvdb[import] | All above except anndata | Common ML workflows |
| gvdb[import-all] | Everything + polars | All formats |
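
Before calling an importer, it can help to check whether the modules behind it are importable. The sketch below does this with the standard library only. The OPTIONAL_MODULES mapping is transcribed from the extras listed above, and the missing_modules helper is a hypothetical name, not part of gvdb.

```python
from importlib.util import find_spec

# Module names taken from the Dependencies column of the extras listing.
OPTIONAL_MODULES = {
    "import_parquet": ["pyarrow"],
    "import_numpy": ["numpy"],
    "import_dataframe": ["pandas", "pyarrow"],
    "import_csv": ["pandas", "pyarrow"],
    "import_h5ad": ["anndata", "numpy"],
    "progress bars": ["tqdm"],
}


def missing_modules(feature, is_installed=None):
    """Return the modules a feature needs that cannot be imported."""
    if is_installed is None:
        # find_spec returns None when a top-level module is not installed.
        is_installed = lambda name: find_spec(name) is not None
    return [m for m in OPTIONAL_MODULES[feature] if not is_installed(m)]


for feature in OPTIONAL_MODULES:
    missing = missing_modules(feature)
    status = "ok" if not missing else "missing " + ", ".join(missing)
    print(f"{feature}: {status}")
```

The is_installed parameter exists only so the check can be exercised without installing or uninstalling packages.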

Project details

Release files for version 0.29.0:

| File | Type | Size |
|---|---|---|
| gvdb-0.29.0.tar.gz | Source distribution | 120.9 kB |
| gvdb-0.29.0-py3-none-any.whl | Built distribution (Python 3) | 18.6 kB |

Both files were uploaded with twine/6.1.0 on CPython/3.13.12 using Trusted Publishing. Their attestation bundles name release-please.yml on JonathanBerhe/gvdb as the publisher; these values reflect the state when the release was signed and may no longer be current.

File hashes

| File | Algorithm | Hash digest |
|---|---|---|
| gvdb-0.29.0.tar.gz | SHA256 | d90ca18d08d0cca4574fc2ce959d45af398504934706d4dd7442359fcc2cae53 |
| gvdb-0.29.0.tar.gz | MD5 | 111f0917f837f233b5cff4bb866daa34 |
| gvdb-0.29.0.tar.gz | BLAKE2b-256 | dc5e3b777181933760ffe2ee6cbc20378044b614db4117ba596babf33da7702a |
| gvdb-0.29.0-py3-none-any.whl | SHA256 | 2a1a10b318f10e03a8a093191ff3b419be3b55c3f23d943172e83854aafffa3e |
| gvdb-0.29.0-py3-none-any.whl | MD5 | e3a453b4fcd4f20c8750b50b75e3ddd0 |
| gvdb-0.29.0-py3-none-any.whl | BLAKE2b-256 | 4c2889d2601d91c89dab19d1e8fe7c7d035629c665f9b79a820fa6811767289a |
