Python Client SDK for CyborgDB: The Confidential Vector Database

CyborgDB Python SDK

The CyborgDB Python SDK is the Python client for CyborgDB — the vector database that stays encrypted even while it's searching. Run similarity search directly on encrypted data with client-side keys; only the result of a query is ever decrypted, never the index. Built for Python, it fits into existing AI and data workflows.

This SDK talks to cyborgdb-service, which you self-host in your own VPC or on-prem and run alongside your app. Install and start it separately. See our docs for more info.

Key Features

  • Encryption-in-use: Search runs directly on ciphertext; only the query result is decrypted, never the index or stored vectors
  • Encrypted ANN: Disk-backed encrypted DiskIVF index with recall within 2% of a plaintext baseline (read the benchmarks)
  • Filters on encrypted metadata: Combine vector similarity with equality and range predicates in a single request
  • BYOK / HYOK: Wrap per-index keys with AWS KMS or AWS Secrets Manager, or hold the key client-side — you control the key material
  • Per-tenant key isolation: Per-index, per-user keys with cryptographic RBAC; revoke a user and their keys are erased
  • Pythonic API: Familiar client/index interface that integrates with existing Python AI workflows

Getting Started

To get started in minutes, check out our Quickstart Guide.

Install the SDK

  1. Install cyborgdb-service:
# Pull the CyborgDB Service image
docker pull cyborginc/cyborgdb-service

# Or install via pip
pip install cyborgdb-service
  2. Install the cyborgdb SDK:
# Install the CyborgDB Python SDK
pip install cyborgdb
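
After installing, it can help to confirm which versions are actually present in the active environment. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, and the package names are the ones used in the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report what pip has installed in the current environment.
for name in ("cyborgdb", "cyborgdb-service"):
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed'}")
```

If the service runs from the Docker image rather than pip, only `cyborgdb` is expected to show a version here.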

Index and query vectors

from cyborgdb import Client

client = Client('https://localhost:8000', 'your-service-root-key')  # api_key optional; only if the service was started with one

# Generate a 32-byte encryption key
index_key = client.generate_key()

# Create an encrypted index
index = client.create_index(
    index_name='my-index', 
    index_key=index_key
)

# Add encrypted vector items
items = [
    {
        'id': 'doc1',
        'vector': [0.1] * 128,  # Replace with real embeddings
        'contents': 'Hello world!',
        'metadata': {'category': 'greeting', 'language': 'en'}
    },
    {
        'id': 'doc2',
        'vector': [0.1] * 128,  # Replace with real embeddings
        'contents': 'Bonjour le monde!',
        'metadata': {'category': 'greeting', 'language': 'fr'}
    }
]

index.upsert(items)

# Query the encrypted index
query_vector = [0.2] * 128  # 128 dimensions
results = index.query(query_vectors=query_vector, top_k=5)

# Print the results
for result in results:
    print(f"ID: {result['id']}, Distance: {result['distance']}")
# ID: doc1, Distance: 1.1314
# ID: doc2, Distance: 1.1314
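
The distances printed above are consistent with Euclidean (L2) distance between the placeholder vectors. A small sketch that reproduces the number locally, assuming the index uses the Euclidean metric (the quickstart call does not set a metric explicitly):

```python
import math

stored = [0.1] * 128   # same placeholder vector used for doc1 and doc2
query = [0.2] * 128    # same placeholder query vector

# Euclidean (L2) distance: square root of the summed squared differences.
distance = math.dist(query, stored)
print(f"Distance: {distance:.4f}")  # Distance: 1.1314
```

Because both documents carry the same placeholder vector, they tie at the same distance; real embeddings would produce distinct values.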

Run batch queries

# Search with multiple query vectors simultaneously
query_vectors = [
    [0.1] * 128,
    [0.2] * 128
]

batch_results = index.query(query_vectors=query_vectors, top_k=5)

# Print the results (batch queries return list of lists)
for i, query_results in enumerate(batch_results):
    print(f"\nResults for query {i}:")
    for result in query_results:
        print(f"  ID: {result['id']}, Distance: {result['distance']}")
# Results for query 0:
#   ID: doc1, Distance: 0.0000
#   ID: doc2, Distance: 0.0000
#
# Results for query 1:
#   ID: doc1, Distance: 1.1314
#   ID: doc2, Distance: 1.1314
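
Since a single query yields a flat list of results and a batch yields a list of lists, downstream code may want one shape to iterate over. The sketch below uses a hypothetical helper, `as_batches`, that is not part of the SDK, and runs on hand-written sample data rather than a live index.

```python
def as_batches(results):
    """Wrap single-query results so callers can always iterate per query."""
    if results and isinstance(results[0], dict):
        return [results]   # single query: one flat list of result dicts
    return results         # batch query: already a list of lists (or empty)


# Hand-written sample data shaped like the outputs shown above.
single = [{"id": "doc1", "distance": 1.1314}, {"id": "doc2", "distance": 1.1314}]
batch = [
    [{"id": "doc1", "distance": 0.0}],
    [{"id": "doc1", "distance": 1.1314}],
]

for label, results in (("single", single), ("batch", batch)):
    for i, query_results in enumerate(as_batches(results)):
        ids = [result["id"] for result in query_results]
        print(f"{label} query {i}: {ids}")
```

An empty result list is ambiguous between the two shapes, so the helper returns it unchanged.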

Filter results by metadata

# Search with metadata filters
query_vector = [0.1] * 128
results = index.query(
    query_vectors=query_vector,
    top_k=10,
    n_probes=1,
    greedy=False,
    filters={'category': 'greeting', 'language': 'en'},
    include=['distance', 'metadata']
)

# Print the results
for result in results:
    print(f"ID: {result['id']}, Distance: {result['distance']}, Metadata: {result['metadata']}")
# ID: doc1, Distance: 0.0000, Metadata: {'category': 'greeting', 'language': 'en'}
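
To make the filter in this example concrete, here is a plain-Python model of equality matching over the two sample items. It assumes that multiple keys in `filters` are combined with AND, which is consistent with only doc1 being returned above; it does not model range predicates or how the service evaluates filters on encrypted metadata. `matches` is a hypothetical helper, not an SDK function.

```python
items = [
    {"id": "doc1", "metadata": {"category": "greeting", "language": "en"}},
    {"id": "doc2", "metadata": {"category": "greeting", "language": "fr"}},
]


def matches(metadata: dict, filters: dict) -> bool:
    """True when every filter key is present with an equal value (assumed AND)."""
    return all(
        key in metadata and metadata[key] == value
        for key, value in filters.items()
    )


filters = {"category": "greeting", "language": "en"}
matching_ids = [item["id"] for item in items if matches(item["metadata"], filters)]
print(matching_ids)  # ['doc1']
```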

Bring Your Own Key (BYOK) via KMS

When the service is configured with a kms.registry entry, the SDK can delegate key management entirely to the server-side KMS. The service generates the data encryption key, wraps it under the named KMS slot, and persists the envelope — the SDK never sees or holds the key.

# Create a KMS-backed index — no index_key from the SDK side.
# 'vendor-kms-slot' must match an entry in the service's cyborgdb.yaml.
index = client.create_index(
    index_name='kms-backed-index',
    kms_name='vendor-kms-slot',
    dimension=128,
    metric='euclidean',
)

# Reopening the index later doesn't require a key either; the service
# resolves the data key from the index's stored KMS envelope.
loaded = client.load_index(index_name='kms-backed-index')
loaded.upsert(items)

Alternatively, the SDK can supply the key itself — pass index_key and omit kms_name. This is the no-KMS path, which the service records internally as provider: none:

index = client.create_index(
    index_name='sdk-keyed-index',
    index_key=index_key,
    dimension=128,
)

Supply exactly one of index_key / kms_name — passing both is rejected by the service with a 400, since the named slot already determines the key source.
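
A client-side guard can catch the "both" or "neither" case before the request reaches the service. This is a minimal sketch; `key_source_kwargs` is a hypothetical helper, not part of the SDK, and it only builds keyword arguments without calling the service.

```python
def key_source_kwargs(index_key=None, kms_name=None) -> dict:
    """Return create_index keyword arguments for exactly one key source."""
    if (index_key is None) == (kms_name is None):
        raise ValueError("Supply exactly one of index_key or kms_name")
    if kms_name is not None:
        return {"kms_name": kms_name}
    return {"index_key": index_key}


# KMS path: the slot name comes from the example above.
print(key_source_kwargs(kms_name="vendor-kms-slot"))

# Passing both is rejected locally, mirroring the service's 400.
try:
    key_source_kwargs(index_key=b"\x00" * 32, kms_name="vendor-kms-slot")
except ValueError as exc:
    print(f"Rejected: {exc}")
```

The returned dict could then be unpacked into the call, for example `client.create_index(index_name='my-index', dimension=128, **key_source_kwargs(kms_name='vendor-kms-slot'))`.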

Control access with per-user keys

When the service runs with a root admin key (CYBORGDB_SERVICE_ROOT_KEY) set, RBAC is enabled. The root can mint per-user API keys scoped to a single index, each with a read / write permission set. Permissions are enforced cryptographically: a user's wrapped data-encryption keys are their permission set. A read-only user cannot decrypt for a write operation; revoking a user erases their keys.

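# Admin (root) client: mint users on an existing index.
base_url = 'https://localhost:8000'          # placeholder: your cyborgdb-service URL
SERVICE_ROOT_KEY = 'your-service-root-key'   # placeholder: the CYBORGDB_SERVICE_ROOT_KEY value
admin = Client(base_url, api_key=SERVICE_ROOT_KEY)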
index = admin.load_index(index_name='kms-backed-index')   # KMS-backed (see BYOK)

reader = index.create_user(permissions=['read'])
writer = index.create_user(permissions=['read', 'write'])
# Each returns {'user_id': '<hex>', 'api_key': 'cdbk_...'} — the api_key is
# shown ONCE and never stored by the service. Hand it to the user securely.

index.list_users()                 # [{'user_id': ..., 'permissions': [...]}, ...]
index.delete_user(reader['user_id'])   # revoke; the key stops working immediately

A user authenticates with their cdbk_ key and needs no index key of their own — they load the index by name and the service resolves its key:

user = Client(base_url, api_key=reader['api_key'])
idx = user.load_index(index_name='kms-backed-index')   # no index_key
idx.query(query_vectors=[...], top_k=5)                # allowed for 'read'
idx.upsert(items)                                      # raises ValueError for read-only users

User keys resolve the index key server-side, so they work against KMS-backed indexes. SDK-supplied-key indexes (provider: none) have no server-side key for the service to resolve on a user's behalf, so a user key alone cannot open them. See the service's rbac.md for the full design.
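
Application code that may run under a read-only key can treat the rejected write as an expected outcome rather than a crash. The sketch below relies on the ValueError behavior noted in the example above; `ReadOnlyIndexStub` and `try_upsert` are hypothetical names used for illustration, and the stub stands in for a real index so the snippet runs without a service.

```python
class ReadOnlyIndexStub:
    """Hypothetical stand-in for an index loaded with a read-only user key."""

    def upsert(self, items):
        raise ValueError("write permission required")


def try_upsert(index, items) -> bool:
    """Attempt an upsert; return False instead of raising when it is rejected."""
    try:
        index.upsert(items)
    except ValueError as exc:
        print(f"Upsert rejected: {exc}")
        return False
    return True


ok = try_upsert(ReadOnlyIndexStub(), [{"id": "doc1", "vector": [0.1] * 128}])
print(ok)
```

With a real client, `index` would be the object returned by `load_index`, and the error message text would come from the SDK rather than from this stub.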

Documentation

For more information on CyborgDB, see the Cyborg Docs.

License

The CyborgDB Python SDK is licensed under the MIT License.

Download files

Source Distribution

cyborgdb-0.17.0.tar.gz (146.4 kB)

Uploaded Source

Built Distribution

cyborgdb-0.17.0-py3-none-any.whl (113.8 kB)

Uploaded Python 3

File details

Details for the file cyborgdb-0.17.0.tar.gz.

File metadata

  • Download URL: cyborgdb-0.17.0.tar.gz
  • Upload date:
  • Size: 146.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for cyborgdb-0.17.0.tar.gz
Algorithm Hash digest
SHA256 ec87c471d3ea475f00fc89e6af3fd2bde7b73d49585f1e03ed2c1412268848f7
MD5 a44b2fdf58cdf38cc5b65067d7f3054c
BLAKE2b-256 784c2584258dcef28a023df14fb0ee82a6ea3e556f6ec146c7e6291f70781546
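
A downloaded archive can be checked against the published SHA256 digest with the standard library. This is a minimal sketch; the local file path is an assumption (the archive sitting in the current directory), and `sha256_of` is a hypothetical helper name.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for cyborgdb-0.17.0.tar.gz
EXPECTED_SHA256 = "ec87c471d3ea475f00fc89e6af3fd2bde7b73d49585f1e03ed2c1412268848f7"


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


archive = Path("cyborgdb-0.17.0.tar.gz")  # assumed local download location
if archive.exists():
    print("match" if sha256_of(archive) == EXPECTED_SHA256 else "MISMATCH")
else:
    print(f"{archive} not found; download it first")
```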

Provenance

The following attestation bundles were made for cyborgdb-0.17.0.tar.gz:

Publisher: build_and_package_wheels.yml on cyborginc/cyborgdb-py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file cyborgdb-0.17.0-py3-none-any.whl.

File metadata

  • Download URL: cyborgdb-0.17.0-py3-none-any.whl
  • Upload date:
  • Size: 113.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for cyborgdb-0.17.0-py3-none-any.whl
Algorithm Hash digest
SHA256 cc618f6b3a28704a91f6c438c4cef1e42cf5abd83a718a6daf56d9c3aa4db871
MD5 b18b425e85568ac2602ea458d0f87f63
BLAKE2b-256 fe3a7d03029ce9211aa70b2397595ce48662a09e114180b58c13e61b292d10c5

Provenance

The following attestation bundles were made for cyborgdb-0.17.0-py3-none-any.whl:

Publisher: build_and_package_wheels.yml on cyborginc/cyborgdb-py
