Skip to main content

Python client for GVDB distributed vector database

Project description

gvdb

Python client for GVDB distributed vector database.

Install

pip install gvdb

# With bulk import extras (Parquet, NumPy, Pandas, progress bar)
pip install gvdb[import]

# All optional dependencies
pip install gvdb[import-all]

Quick Start

from gvdb import GVDBClient

client = GVDBClient("localhost:50051", api_key="your-key")  # api_key is optional

# Create a collection
client.create_collection("my_vectors", dimension=768)

# Insert vectors
vectors = [[0.1, 0.2, ...], [0.3, 0.4, ...]]  # list of float lists
ids = [1, 2]
client.insert("my_vectors", ids, vectors)

# Search
results = client.search("my_vectors", query_vector=[0.1, 0.2, ...], top_k=10)
for r in results:
    print(f"ID: {r.id}, distance: {r.distance}")

# Hybrid search (BM25 + vector)
results = client.hybrid_search(
    "my_vectors",
    query_vector=[0.1, 0.2, ...],
    text_query="running shoes",
    top_k=10,
    text_field="description",   # metadata field to search
    return_metadata=True,
)

# Clean up
client.drop_collection("my_vectors")
client.close()

Bulk Import

Import vectors from common ML formats. Auto-creates collections, supports resume via upsert idempotency, and shows progress bars (with tqdm).

import numpy as np

# From NumPy array
vectors = np.random.rand(100_000, 768).astype(np.float32)
result = client.import_numpy(vectors, "embeddings")
print(result)  # ImportResult(total=100000, batches=10, elapsed=12.3s, ...)

# From Parquet (GVDB schema: id + vector + metadata columns)
result = client.import_parquet("vectors.parquet", "embeddings")

# From Pandas DataFrame
result = client.import_dataframe(df, "embeddings", vector_column="embedding")

# From CSV (JSON-encoded or dimension-prefixed vector columns)
result = client.import_csv("data.csv", "embeddings")

# From AnnData h5ad (scRNA-seq embeddings)
result = client.import_h5ad("adata.h5ad", "cells", embedding_key="X_pca")

All importers accept mode="upsert" (default, idempotent) or mode="stream_insert" (faster, no resume). See ImportResult for batch counts, timing, and failure tracking.

Optional dependency extras

Extra Dependencies For
gvdb[parquet] pyarrow import_parquet
gvdb[numpy] numpy import_numpy
gvdb[pandas] pandas, pyarrow import_dataframe, import_csv
gvdb[h5ad] anndata, numpy import_h5ad
gvdb[progress] tqdm Progress bars
gvdb[import] All above except anndata Common ML workflows
gvdb[import-all] Everything + polars All formats

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gvdb-0.32.1.tar.gz (120.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

gvdb-0.32.1-py3-none-any.whl (18.6 kB view details)

Uploaded Python 3

File details

Details for the file gvdb-0.32.1.tar.gz.

File metadata

  • Download URL: gvdb-0.32.1.tar.gz
  • Upload date:
  • Size: 120.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gvdb-0.32.1.tar.gz
Algorithm Hash digest
SHA256 7ebd9c3a9b178257038698eb4ae86e1f3d7ea3da3e84f0726d31c01dea99da7c
MD5 5a7baf831a9f8e3eefc142f913cab9ee
BLAKE2b-256 18381b1237274a6fae46ac11f0ea262d272fd545adb1a402a9688760d4581eb6

See more details on using hashes here.

Provenance

The following attestation bundles were made for gvdb-0.32.1.tar.gz:

Publisher: release-please.yml on JonathanBerhe/gvdb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file gvdb-0.32.1-py3-none-any.whl.

File metadata

  • Download URL: gvdb-0.32.1-py3-none-any.whl
  • Upload date:
  • Size: 18.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gvdb-0.32.1-py3-none-any.whl
Algorithm Hash digest
SHA256 d196405c81ee71db200ad62614ff44bc94f561a029babe44cc400592c7400c73
MD5 a8e7f07c8bd9d57bda40fe04975b37ba
BLAKE2b-256 e6092a3b09f83b3a8752f0e0af39020041b5b5006b36933c2f0d1672e7dfc242

See more details on using hashes here.

Provenance

The following attestation bundles were made for gvdb-0.32.1-py3-none-any.whl:

Publisher: release-please.yml on JonathanBerhe/gvdb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page