Skip to main content

Python client for GVDB distributed vector database

Project description

gvdb

Python client for GVDB distributed vector database.

Install

pip install gvdb

# With bulk import extras (Parquet, NumPy, Pandas, progress bar)
pip install gvdb[import]

# All optional dependencies
pip install gvdb[import-all]

Quick Start

from gvdb import GVDBClient

client = GVDBClient("localhost:50051", api_key="your-key")  # api_key is optional

# Create a collection
client.create_collection("my_vectors", dimension=768)

# Insert vectors
vectors = [[0.1, 0.2, ...], [0.3, 0.4, ...]]  # list of float lists
ids = [1, 2]
client.insert("my_vectors", ids, vectors)

# Search
results = client.search("my_vectors", query_vector=[0.1, 0.2, ...], top_k=10)
for r in results:
    print(f"ID: {r.id}, distance: {r.distance}")

# Hybrid search (BM25 + vector)
results = client.hybrid_search(
    "my_vectors",
    query_vector=[0.1, 0.2, ...],
    text_query="running shoes",
    top_k=10,
    text_field="description",   # metadata field to search
    return_metadata=True,
)

# Clean up
client.drop_collection("my_vectors")
client.close()

Bulk Import

Import vectors from common ML formats. Auto-creates collections, supports resume via upsert idempotency, and shows progress bars (with tqdm).

import numpy as np

# From NumPy array
vectors = np.random.rand(100_000, 768).astype(np.float32)
result = client.import_numpy(vectors, "embeddings")
print(result)  # ImportResult(total=100000, batches=10, elapsed=12.3s, ...)

# From Parquet (GVDB schema: id + vector + metadata columns)
result = client.import_parquet("vectors.parquet", "embeddings")

# From Pandas DataFrame
result = client.import_dataframe(df, "embeddings", vector_column="embedding")

# From CSV (JSON-encoded or dimension-prefixed vector columns)
result = client.import_csv("data.csv", "embeddings")

# From AnnData h5ad (scRNA-seq embeddings)
result = client.import_h5ad("adata.h5ad", "cells", embedding_key="X_pca")

All importers accept mode="upsert" (default, idempotent) or mode="stream_insert" (faster, no resume). See ImportResult for batch counts, timing, and failure tracking.

Optional dependency extras

Extra Dependencies For
gvdb[parquet] pyarrow import_parquet
gvdb[numpy] numpy import_numpy
gvdb[pandas] pandas, pyarrow import_dataframe, import_csv
gvdb[h5ad] anndata, numpy import_h5ad
gvdb[progress] tqdm Progress bars
gvdb[import] All above except anndata Common ML workflows
gvdb[import-all] Everything + polars All formats

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gvdb-0.22.0.tar.gz (120.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

gvdb-0.22.0-py3-none-any.whl (18.6 kB view details)

Uploaded Python 3

File details

Details for the file gvdb-0.22.0.tar.gz.

File metadata

  • Download URL: gvdb-0.22.0.tar.gz
  • Upload date:
  • Size: 120.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gvdb-0.22.0.tar.gz
Algorithm Hash digest
SHA256 06f618aa65b9f76231d917bf9f1817f93b22fe0b3730ad39c0f6407a68b47dcc
MD5 fe924fbab16956c64cf2fb849f7604cf
BLAKE2b-256 8e7bad1ca1a6fdb9c9ce4a7f0f02fd863bde88d3b5c76248805cd2f0685dfdc7

See more details on using hashes here.

Provenance

The following attestation bundles were made for gvdb-0.22.0.tar.gz:

Publisher: release-please.yml on JonathanBerhe/gvdb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file gvdb-0.22.0-py3-none-any.whl.

File metadata

  • Download URL: gvdb-0.22.0-py3-none-any.whl
  • Upload date:
  • Size: 18.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for gvdb-0.22.0-py3-none-any.whl
Algorithm Hash digest
SHA256 6ca709f0c3fafba4c66e5b1fe88e1862b52f89e2a97916ee9778d9a41807f440
MD5 36bd0d355cbe46727e21e6f5200c6c82
BLAKE2b-256 5e34736879e5270366cd8dd1ed6eeb36a97d1368b68d797e6a3233120053c83a

See more details on using hashes here.

Provenance

The following attestation bundles were made for gvdb-0.22.0-py3-none-any.whl:

Publisher: release-please.yml on JonathanBerhe/gvdb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page