Skip to main content

Python client for GVDB distributed vector database

Project description

gvdb

Python client for GVDB distributed vector database.

Install

pip install gvdb

# With bulk import extras (Parquet, NumPy, Pandas, progress bar)
pip install gvdb[import]

# All optional dependencies
pip install gvdb[import-all]

Quick Start

from gvdb import GVDBClient

client = GVDBClient("localhost:50051", api_key="your-key")  # api_key is optional

# Create a collection
client.create_collection("my_vectors", dimension=768)

# Insert vectors
vectors = [[0.1, 0.2, ...], [0.3, 0.4, ...]]  # list of float lists
ids = [1, 2]
client.insert("my_vectors", ids, vectors)

# Search
results = client.search("my_vectors", query_vector=[0.1, 0.2, ...], top_k=10)
for r in results:
    print(f"ID: {r.id}, distance: {r.distance}")

# Hybrid search (BM25 + vector)
results = client.hybrid_search(
    "my_vectors",
    query_vector=[0.1, 0.2, ...],
    text_query="running shoes",
    top_k=10,
    text_field="description",   # metadata field to search
    return_metadata=True,
)

# Clean up
client.drop_collection("my_vectors")
client.close()

Bulk Import

Import vectors from common ML formats. Auto-creates collections, supports resume via upsert idempotency, and shows progress bars (with tqdm).

import numpy as np

# From NumPy array
vectors = np.random.rand(100_000, 768).astype(np.float32)
result = client.import_numpy(vectors, "embeddings")
print(result)  # ImportResult(total=100000, batches=10, elapsed=12.3s, ...)

# From Parquet (GVDB schema: id + vector + metadata columns)
result = client.import_parquet("vectors.parquet", "embeddings")

# From Pandas DataFrame
result = client.import_dataframe(df, "embeddings", vector_column="embedding")

# From CSV (JSON-encoded or dimension-prefixed vector columns)
result = client.import_csv("data.csv", "embeddings")

# From AnnData h5ad (scRNA-seq embeddings)
result = client.import_h5ad("adata.h5ad", "cells", embedding_key="X_pca")

All importers accept mode="upsert" (default, idempotent) or mode="stream_insert" (faster, no resume). See ImportResult for batch counts, timing, and failure tracking.

Optional dependency extras

Extra Dependencies For
gvdb[parquet] pyarrow import_parquet
gvdb[numpy] numpy import_numpy
gvdb[pandas] pandas, pyarrow import_dataframe, import_csv
gvdb[h5ad] anndata, numpy import_h5ad
gvdb[progress] tqdm Progress bars
gvdb[import] All above except anndata Common ML workflows
gvdb[import-all] Everything + polars All formats

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gvdb-0.31.0.tar.gz (120.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

gvdb-0.31.0-py3-none-any.whl (18.6 kB view details)

Uploaded Python 3

File details

Details for the file gvdb-0.31.0.tar.gz.

File metadata

  • Download URL: gvdb-0.31.0.tar.gz
  • Upload date:
  • Size: 120.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gvdb-0.31.0.tar.gz
Algorithm Hash digest
SHA256 22c9a8bb861bb73eecf58fdde700a7c9f4d06521bd6f6e894074a780c0dd2579
MD5 29757e40ba3663c71b3fe2b75c51a4e3
BLAKE2b-256 79f594b1b71a23ccafa944b0e6780b078716d961a29a085b4b40c7c2e02ba3c2

See more details on using hashes here.

Provenance

The following attestation bundles were made for gvdb-0.31.0.tar.gz:

Publisher: release-please.yml on JonathanBerhe/gvdb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file gvdb-0.31.0-py3-none-any.whl.

File metadata

  • Download URL: gvdb-0.31.0-py3-none-any.whl
  • Upload date:
  • Size: 18.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gvdb-0.31.0-py3-none-any.whl
Algorithm Hash digest
SHA256 19dd8187bdbc09a766b6d789672fea2d4c7c7cee65b27609f43a29055a660502
MD5 3f173e9a38cea68bddfa5c8672f8e612
BLAKE2b-256 bb4c27bd6a9e6d79373aa4514adbb66f5908fa11501a2698f90b75d19fe387f2

See more details on using hashes here.

Provenance

The following attestation bundles were made for gvdb-0.31.0-py3-none-any.whl:

Publisher: release-please.yml on JonathanBerhe/gvdb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page