gvdb

Python client for GVDB distributed vector database.

Install

pip install gvdb

# With bulk import extras (Parquet, NumPy, Pandas, progress bar)
# Quotes keep shells such as zsh from expanding the square brackets
pip install "gvdb[import]"

# All optional dependencies
pip install "gvdb[import-all]"

Quick Start

from gvdb import GVDBClient

client = GVDBClient("localhost:50051", api_key="your-key")  # api_key is optional

# Create a collection
client.create_collection("my_vectors", dimension=768)

# Insert vectors
vectors = [[0.1, 0.2, ...], [0.3, 0.4, ...]]  # list of float lists; "..." stands for the remaining values (768 per vector)
ids = [1, 2]
client.insert("my_vectors", ids, vectors)

# Search
results = client.search("my_vectors", query_vector=[0.1, 0.2, ...], top_k=10)
for r in results:
    print(f"ID: {r.id}, distance: {r.distance}")

# Hybrid search (BM25 + vector)
results = client.hybrid_search(
    "my_vectors",
    query_vector=[0.1, 0.2, ...],
    text_query="running shoes",
    top_k=10,
    text_field="description",   # metadata field to search
    return_metadata=True,
)

# Clean up
client.drop_collection("my_vectors")
client.close()
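
If an exception is raised between connecting and `client.close()`, the final `close()` call in the example above is never reached. A minimal sketch of one way to guard against that, assuming only that the client exposes the `close()` method shown above: the standard library's `contextlib.closing` calls `close()` on exit, whether or not an error occurred. `run_with_client`, `make_client`, and `work` are hypothetical names for this illustration and are not part of gvdb.

```python
from contextlib import closing


def run_with_client(make_client, work):
    """Call work(client), then always call client.close(), even on error.

    make_client: zero-argument callable returning an object with close(),
                 e.g. lambda: GVDBClient("localhost:50051")
    work:        callable that receives the client and returns a result
    """
    with closing(make_client()) as client:
        return work(client)
```

With gvdb installed this could be used as `run_with_client(lambda: GVDBClient("localhost:50051"), lambda c: c.search("my_vectors", query_vector=q, top_k=10))`, where `q` is a query vector of the collection's dimension.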

Bulk Import

Import vectors from common ML formats. The importers auto-create the target collection, can be re-run after an interruption (upserts are idempotent, so repeated rows are harmless), and show progress bars when tqdm is installed.

import numpy as np

# From NumPy array
vectors = np.random.rand(100_000, 768).astype(np.float32)
result = client.import_numpy(vectors, "embeddings")
print(result)  # ImportResult(total=100000, batches=10, elapsed=12.3s, ...)

# From Parquet (GVDB schema: id + vector + metadata columns)
result = client.import_parquet("vectors.parquet", "embeddings")

# From Pandas DataFrame (df is an existing DataFrame with an "embedding" column)
result = client.import_dataframe(df, "embeddings", vector_column="embedding")

# From CSV (JSON-encoded or dimension-prefixed vector columns)
result = client.import_csv("data.csv", "embeddings")

# From AnnData h5ad (scRNA-seq embeddings)
result = client.import_h5ad("adata.h5ad", "cells", embedding_key="X_pca")
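
The exact CSV layout that `import_csv` expects is not spelled out here. As a rough illustration of what a JSON-encoded vector column can look like, the sketch below parses such a file with the standard library only. The column names `id` and `vector` and the helper `read_json_vector_csv` are assumptions made for this example, not gvdb's documented schema.

```python
import csv
import io
import json


def read_json_vector_csv(text, id_column="id", vector_column="vector"):
    """Parse CSV text whose vector column holds one JSON array per row.

    Column names are assumptions for this sketch.
    Returns (ids, vectors) as two parallel lists.
    """
    ids, vectors = [], []
    for row in csv.DictReader(io.StringIO(text)):
        ids.append(int(row[id_column]))
        # The cell is a string such as "[0.1, 0.2, 0.3]"; decode it to floats.
        vectors.append([float(x) for x in json.loads(row[vector_column])])
    return ids, vectors


# The vector cell is quoted because the JSON array itself contains commas.
sample = 'id,vector\n1,"[0.1, 0.2, 0.3]"\n2,"[0.4, 0.5, 0.6]"\n'
ids, vectors = read_json_vector_csv(sample)
```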

All importers accept mode="upsert" (default, idempotent) or mode="stream_insert" (faster, no resume). See ImportResult for batch counts, timing, and failure tracking.
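
Why upsert mode can resume follows from upsert semantics: writing the same ID twice leaves a single row, so re-running an interrupted import from the start cannot create duplicates. The toy sketch below shows this with an in-memory stand-in for a collection. `InMemoryStore`, `batches`, and `import_all` are hypothetical names for this illustration and say nothing about how gvdb implements its importers.

```python
def batches(ids, vectors, batch_size):
    """Yield (ids, vectors) slices of at most batch_size rows."""
    for start in range(0, len(ids), batch_size):
        yield ids[start:start + batch_size], vectors[start:start + batch_size]


class InMemoryStore:
    """Stand-in for a collection: upsert overwrites rows by ID."""

    def __init__(self):
        self.rows = {}

    def upsert(self, ids, vectors):
        for i, v in zip(ids, vectors):
            self.rows[i] = v  # same ID -> same slot, so repeats are harmless


def import_all(store, ids, vectors, batch_size, fail_at_batch=None):
    """Upsert batch by batch; optionally simulate a crash at one batch."""
    done = 0
    for n, (b_ids, b_vecs) in enumerate(batches(ids, vectors, batch_size)):
        if n == fail_at_batch:
            raise RuntimeError(f"simulated failure at batch {n}")
        store.upsert(b_ids, b_vecs)
        done += 1
    return done


ids = list(range(10))
vectors = [[float(i), float(i) + 0.5] for i in ids]
store = InMemoryStore()

try:
    import_all(store, ids, vectors, batch_size=3, fail_at_batch=2)
except RuntimeError:
    pass  # the first run wrote batches 0 and 1 (6 rows), then stopped

# Re-run from the start: already-written rows are overwritten, not duplicated.
completed = import_all(store, ids, vectors, batch_size=3)
```

With a plain insert instead of an upsert, the second run would either fail on or duplicate the six rows written before the crash, which is consistent with `stream_insert` being described as having no resume.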

Optional dependency extras

| Extra | Dependencies | For |
|---|---|---|
| gvdb[parquet] | pyarrow | import_parquet |
| gvdb[numpy] | numpy | import_numpy |
| gvdb[pandas] | pandas, pyarrow | import_dataframe, import_csv |
| gvdb[h5ad] | anndata, numpy | import_h5ad |
| gvdb[progress] | tqdm | Progress bars |
| gvdb[import] | All above except anndata | Common ML workflows |
| gvdb[import-all] | Everything + polars | All formats |
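
To see which importers are usable in the current environment before calling them, the optional dependencies can be probed without importing them. This is a small sketch using the standard library's `importlib.util.find_spec`. The mapping below copies the table above and assumes each dependency's import name matches its package name; `IMPORTER_MODULES` and `missing_modules` are hypothetical names for this example.

```python
from importlib.util import find_spec

# Import names assumed to match the dependencies listed in the table.
IMPORTER_MODULES = {
    "import_parquet": ["pyarrow"],
    "import_numpy": ["numpy"],
    "import_dataframe": ["pandas", "pyarrow"],
    "import_csv": ["pandas", "pyarrow"],
    "import_h5ad": ["anndata", "numpy"],
}


def missing_modules(modules):
    """Return the top-level modules from `modules` that cannot be found."""
    return [name for name in modules if find_spec(name) is None]


for importer, modules in IMPORTER_MODULES.items():
    missing = missing_modules(modules)
    status = "ok" if not missing else "missing " + ", ".join(missing)
    print(f"{importer}: {status}")
```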

Download files

Source Distribution

gvdb-0.26.1.tar.gz (120.9 kB)

Built Distribution

gvdb-0.26.1-py3-none-any.whl (18.6 kB)

Provenance

The following attestation bundles were made for gvdb-0.26.1.tar.gz:

Publisher: release-please.yml on JonathanBerhe/gvdb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
