The official Python SDK for Hydra DB (hydradb.com)

Hydra DB Python SDK - hydradb.com

The official Python SDK for the Hydra DB platform. Build powerful, context-aware AI applications in Python.

Hydra DB is plug-and-play memory infrastructure that powers intelligent, context-aware retrieval for any AI app or agent, whether you're building a customer support bot, a research copilot, or an internal knowledge assistant.

Learn more about the SDK from our docs

Core features

  • Dynamic retrieval and querying that surfaces the most relevant context
  • Built-in long-term memory that evolves with every user interaction
  • Personalization hooks for user preferences, intent, and history
  • Raw embeddings support for bring-your-own vector workflows
  • Developer-first SDK with full type safety and IDE autocompletion

Getting started

Installation

pip install hydra-db-python

Client setup

Both synchronous and asynchronous clients are available. Use AsyncHydraDB for async/await patterns and HydraDB for traditional synchronous workflows. Both expose the exact same set of methods.

import os
from hydra_db import HydraDB, AsyncHydraDB

api_key = os.environ["HYDRA_DB_API_KEY"]

# Sync client
client = HydraDB(token=api_key)

# Async client
async_client = AsyncHydraDB(token=api_key)
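
Reading the key with os.environ["HYDRA_DB_API_KEY"] raises a bare KeyError when the variable is unset. A minimal sketch of a friendlier loader is shown here. `load_api_key` is a hypothetical helper, not part of the SDK; only the variable name comes from the example above.

```python
import os
from typing import Mapping, Optional


def load_api_key(
    env: Optional[Mapping[str, str]] = None,
    var_name: str = "HYDRA_DB_API_KEY",
) -> str:
    """Return the API key from the environment, or fail with a clear message.

    Hypothetical helper for illustration; it is not provided by the SDK.
    """
    source = os.environ if env is None else env
    key = source.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it before creating a client"
        )
    return key
```

The result can then be passed straight to the constructor, for example HydraDB(token=load_api_key()).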

Tenant Management

A tenant is a single isolated database. Within it you can create further isolated collections called sub-tenants.

Create a Tenant

response = client.tenant.create(tenant_id="my-company")

You can also create a tenant optimised for raw vector embeddings:

response = client.tenant.create(
    tenant_id="my-embeddings-tenant",
    is_embeddings_tenant=True,
    embeddings_dimension=1536,
)

Get Sub-Tenant IDs

sub_tenants = client.tenant.get_sub_tenant_ids(tenant_id="my-company")
# sub_tenants.sub_tenant_ids -> list of sub-tenant ID strings

Get Infrastructure Status

Check whether the tenant's underlying infrastructure is ready:

status = client.tenant.get_infra_status(tenant_id="my-company")

Monitor Tenant Stats

stats = client.tenant.monitor(tenant_id="my-company")

Delete a Tenant

Warning: This is irreversible and permanently removes all data.

client.tenant.delete_tenant(tenant_id="my-company")

Index Your Data

Upload Knowledge (Files)

Upload documents to make them retrievable via natural language search.

with open("report.pdf", "rb") as f:
    result = client.upload.knowledge(
        tenant_id="my-company",
        sub_tenant_id="my-sub-tenant",
        files=[("report.pdf", f, "application/pdf")],
        upsert=True,
    )
    # result.results[0].source_id -> ID you can use later

You can attach metadata to each file. Pass it as a JSON-encoded array string; each object corresponds to the file at the same index:

import json

file_metadata = json.dumps([
    {
        "id": "doc_a",
        "tenant_metadata": {"dept": "sales"},
        "document_metadata": {"author": "Alice"},
    },
    {
        "id": "doc_b",
        "tenant_metadata": {"dept": "marketing"},
        "document_metadata": {"author": "Bob"},
        "relations": {
            "cortex_source_ids": ["doc_a"],
            "properties": {"relation": "same_upload_batch"},
        },
    },
])

with open("a.pdf", "rb") as f1, open("b.pdf", "rb") as f2:
    result = client.upload.knowledge(
        tenant_id="my-company",
        sub_tenant_id="my-sub-tenant",
        files=[
            ("a.pdf", f1, "application/pdf"),
            ("b.pdf", f2, "application/pdf"),
        ],
        file_metadata=file_metadata,
        upsert=True,
    )
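
Because metadata is matched to files by position, a mismatch in length silently misaligns every later entry. The sketch below checks the counts before serialising. `build_file_metadata` is a hypothetical helper written for this README, not an SDK function.

```python
import json
from typing import Any, Dict, Sequence


def build_file_metadata(
    file_names: Sequence[str],
    entries: Sequence[Dict[str, Any]],
) -> str:
    """Serialise per-file metadata after checking it lines up with the files.

    Hypothetical helper: entry i is assumed to describe file_names[i],
    matching the index-based pairing described above.
    """
    if len(file_names) != len(entries):
        raise ValueError(
            f"{len(file_names)} files but {len(entries)} metadata entries"
        )
    return json.dumps(list(entries))
```

The returned string can be passed as the file_metadata argument of client.upload.knowledge.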

Verify Processing Status

After uploading, check whether files have finished indexing:

status = client.upload.verify_processing(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    file_ids=["source-id-1", "source-id-2"],
)
# status.statuses[0].indexing_status -> "queued" | "processing" | "completed" | "errored"
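
Indexing is asynchronous, so callers usually poll until every file reaches "completed" or "errored". A minimal polling sketch follows. `wait_for_indexing`, its timeout and interval defaults, and the injectable `sleep` are assumptions for illustration; only verify_processing, statuses, and indexing_status come from the example above.

```python
import time
from typing import Any, Callable, List, Sequence

TERMINAL_STATES = {"completed", "errored"}


def wait_for_indexing(
    client: Any,
    tenant_id: str,
    sub_tenant_id: str,
    file_ids: Sequence[str],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll verify_processing until every file is completed or errored.

    Returns the final indexing_status values, one per entry in
    response.statuses. Raises TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        response = client.upload.verify_processing(
            tenant_id=tenant_id,
            sub_tenant_id=sub_tenant_id,
            file_ids=list(file_ids),
        )
        states = [s.indexing_status for s in response.statuses]
        if all(state in TERMINAL_STATES for state in states):
            return states
        if time.monotonic() >= deadline:
            raise TimeoutError(f"indexing still in progress: {states}")
        sleep(interval_s)  # wait before asking again
```

Check the returned list for "errored" before querying, since a terminal state does not imply success.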

Add Memories

Index free-form text, markdown content, or conversation pairs as searchable memories.

Plain text:

result = client.upload.add_memory(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    upsert=True,
    memories=[
        {
            "text": "User prefers detailed explanations and dark mode",
            "infer": True,
            "user_name": "John",
        }
    ],
)

Markdown:

result = client.upload.add_memory(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    upsert=True,
    memories=[
        {
            "text": "# Meeting Notes\n\n## Key Points\n- Budget approved\n- Launch date: Q2",
            "is_markdown": True,
            "infer": False,
            "title": "Meeting Notes",
        }
    ],
)

User–assistant conversation pairs:

result = client.upload.add_memory(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    upsert=True,
    memories=[
        {
            "user_assistant_pairs": [
                {"user": "What are my preferences?", "assistant": "You prefer dark mode and detailed explanations."},
                {"user": "How do I like my reports?", "assistant": "You prefer weekly summary reports with charts."},
            ],
            "infer": True,
            "user_name": "John",
            "custom_instructions": "Extract user preferences",
        }
    ],
)
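
Chat histories are often stored as a flat list of role/content messages rather than as pairs. The sketch below converts such a list into the user_assistant_pairs shape shown above. `pairs_from_messages` and the role/content message layout are assumptions for illustration, not part of the SDK.

```python
from typing import Dict, List, Optional, Sequence


def pairs_from_messages(
    messages: Sequence[Dict[str, str]],
) -> List[Dict[str, str]]:
    """Pair each user message with the next assistant reply.

    If several user messages arrive in a row, only the last one is paired.
    Messages with other roles, and turns without a partner, are skipped.
    """
    pairs: List[Dict[str, str]] = []
    pending_user: Optional[str] = None
    for message in messages:
        role = message.get("role")
        content = message.get("content", "")
        if role == "user":
            pending_user = content
        elif role == "assistant" and pending_user is not None:
            pairs.append({"user": pending_user, "assistant": content})
            pending_user = None
    return pairs
```

The returned list can be used as the value of "user_assistant_pairs" in a memory entry.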

Delete a Memory

client.upload.delete_memory(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    memory_id="memory-source-id",
)

Search & Retrieval

Full Recall

Hybrid semantic + keyword search across both knowledge and memories:

results = client.recall.full_recall(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    query="Which mode does the user prefer?",
    alpha=0.8,        # 1.0 = pure semantic, 0.0 = pure keyword
    recency_bias=0,   # 0.0 = no bias, 1.0 = strongly prefer recent
    max_results=10,
)
# results.chunks -> list of VectorStoreChunk
# results.sources -> list of SourceInfo
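
The alpha parameter is described only by its endpoints. One common way such a weight is applied in hybrid search is a linear blend of the two scores, and the sketch below illustrates that reading for intuition. This is an assumption, not Hydra DB's documented ranking formula, and `blend_scores` is a hypothetical name.

```python
def blend_scores(semantic: float, keyword: float, alpha: float) -> float:
    """Linear blend of two relevance scores (illustrative assumption only).

    alpha=1.0 keeps only the semantic score; alpha=0.0 keeps only the
    keyword score; values in between weight the two proportionally.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    return alpha * semantic + (1.0 - alpha) * keyword
```

Under this reading, alpha=0.8 lets semantic similarity dominate while exact keyword matches can still break ties between otherwise similar chunks.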

Recall Preferences

Search only user memory/preference data:

preferences = client.recall.recall_preferences(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    query="dark mode preference",
    max_results=5,
)

Boolean Recall

Exact keyword / phrase / boolean search (BM25):

results = client.recall.boolean_recall(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    query="dark mode",
    operator="phrase",      # "or" | "and" | "phrase"
    max_results=10,
    search_mode="memories", # "sources" | "memories"
)

Q&A (LLM-powered answer)

Ask a question and get a grounded answer generated by an LLM over your indexed content:

answer = client.recall.qna(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    question="What is the user's preferred reporting format?",
    mode="fast",            # "fast" | "thinking"
    search_mode="memories",
    max_chunks=6,
)

You can optionally choose the LLM provider and model:

answer = client.recall.qna(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    question="Summarise the budget decisions from the meeting notes.",
    mode="thinking",
    search_mode="sources",
    max_chunks=10,
    llm_provider="anthropic",
    model="claude-sonnet-4-6",
    temperature=0.2,
    max_tokens=1024,
)

Fetch & Inspect Data

List All Data

List sources (knowledge) or memories with optional filtering and pagination:

# List knowledge sources
sources = client.fetch.list_data(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    kind="knowledge",
    page=1,
    page_size=50,
)

# List user memories
memories = client.fetch.list_data(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    kind="memories",
)
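
To walk every page rather than a single one, the page and page_size parameters can be wrapped in a generator. The sketch below is generic because the field that holds the items in the list_data response is not shown here: the caller supplies `extract_items`. `iter_pages` is a hypothetical helper, and it assumes that a page with fewer than page_size items is the last one.

```python
from typing import Any, Callable, Iterator, List


def iter_pages(
    fetch_page: Callable[[int], Any],
    extract_items: Callable[[Any], List[Any]],
    page_size: int,
    start_page: int = 1,
) -> Iterator[Any]:
    """Yield items page by page until a short or empty page is returned.

    fetch_page(page) performs one request; extract_items(response) pulls
    the list of items out of that response.
    """
    page = start_page
    while True:
        items = extract_items(fetch_page(page))
        yield from items
        if len(items) < page_size:
            return  # assumed to be the final page
        page += 1
```

In practice fetch_page would be a small function that calls client.fetch.list_data with the given page and the same page_size, and extract_items would return whichever attribute of the response holds the listed sources or memories.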

Filter by metadata:

filtered = client.fetch.list_data(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    kind="knowledge",
    filters={"tenant_metadata": {"dept": "sales"}},
)

Fetch Source Content

Retrieve the full content of a specific source by its ID:

source = client.fetch.content(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    source_id="your-source-id",
    mode="content",  # "content" | "url" | "both"
)

Fetch Graph Relations

Retrieve the graph relations (linked sources) for a given source:

relations = client.fetch.graph_relations_by_source_id(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    source_id="your-source-id",
    is_memory=False,
    limit=10,
)

Delete Data

Delete one or more sources (knowledge or memories) by their IDs:

client.data.delete(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    ids=["source-id-1", "source-id-2"],
)

API Key Management

Note: This endpoint requires a dashboard session token (obtained via your Hydra DB dashboard login), not a standard API key.

new_key = client.key.create_api_key(
    owner="service-account@myapp.com",
    scopes=["query"],
    env="live",
    prefix="sk",
)
# new_key.full_api_key -> the actual key (only shown once)

Raw Embeddings

Use Hydra DB as a vector store with your own embeddings.

Note: Raw embeddings require a tenant created with is_embeddings_tenant=True and a fixed embeddings_dimension. A standard knowledge tenant does not support raw embedding operations.

Insert Embeddings

client.embeddings.insert(
    tenant_id="my-embeddings-tenant",
    sub_tenant_id="my-sub-tenant",
    upsert=True,
    embeddings=[
        {
            "source_id": "my-doc-001",
            "metadata": {"category": "finance", "year": 2024},
            "embeddings": [
                {"chunk_id": "my-doc-001-chunk-0", "embedding": [0.1, 0.2, 0.3]},  # truncated for display; real vectors need 1536 dims
                {"chunk_id": "my-doc-001-chunk-1", "embedding": [0.4, 0.5, 0.6]},
            ],
        }
    ],
)
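
Since an embeddings tenant has a fixed embeddings_dimension, it can help to validate vector lengths locally before sending them. The sketch below builds one entry of the embeddings list in the shape shown above. `build_embedding_record` is a hypothetical helper, and the "<source_id>-chunk-<index>" naming simply follows the example IDs rather than any documented requirement.

```python
from typing import Any, Dict, List, Optional, Sequence


def build_embedding_record(
    source_id: str,
    vectors: Sequence[Sequence[float]],
    dimension: int,
    metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build one entry for the `embeddings` list, validating vector length.

    Raises ValueError if any vector does not have exactly `dimension` values.
    """
    chunks: List[Dict[str, Any]] = []
    for index, vector in enumerate(vectors):
        if len(vector) != dimension:
            raise ValueError(
                f"chunk {index} has {len(vector)} dims, expected {dimension}"
            )
        chunks.append(
            {"chunk_id": f"{source_id}-chunk-{index}", "embedding": list(vector)}
        )
    return {
        "source_id": source_id,
        "metadata": metadata or {},
        "embeddings": chunks,
    }
```

A list of such records can be passed as the embeddings argument of client.embeddings.insert, with dimension set to the value used when the tenant was created.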

Search by Vector

results = client.embeddings.search(
    tenant_id="my-embeddings-tenant",
    sub_tenant_id="my-sub-tenant",
    query_embedding=[0.1, 0.2, 0.3],  # truncated for display; must match the tenant's 1536 dims
    limit=10,
)

Filter Embeddings

# By source
by_source = client.embeddings.filter(
    tenant_id="my-embeddings-tenant",
    sub_tenant_id="my-sub-tenant",
    source_id="my-doc-001",
    limit=50,
)

# By chunk IDs
by_chunks = client.embeddings.filter(
    tenant_id="my-embeddings-tenant",
    sub_tenant_id="my-sub-tenant",
    chunk_ids=["my-doc-001-chunk-0", "my-doc-001-chunk-1"],
)

Delete Embeddings

# Delete all embeddings for a source
client.embeddings.delete(
    tenant_id="my-embeddings-tenant",
    sub_tenant_id="my-sub-tenant",
    source_id="my-doc-001",
)

# Delete specific chunks
client.embeddings.delete(
    tenant_id="my-embeddings-tenant",
    sub_tenant_id="my-sub-tenant",
    chunk_ids=["my-doc-001-chunk-0"],
)

Async Usage

Every method has an async equivalent on AsyncHydraDB. Method names and parameters are identical:

import asyncio
from hydra_db import AsyncHydraDB

async_client = AsyncHydraDB(token="your-api-key")

async def main():
    result = await async_client.recall.full_recall(
        tenant_id="my-company",
        sub_tenant_id="my-sub-tenant",
        query="Which mode does the user prefer?",
        alpha=0.8,
        max_results=10,
    )
    print(result.chunks)

asyncio.run(main())
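
The main benefit of the async client is running several requests concurrently. The sketch below fans out multiple full_recall queries with asyncio.gather and caps the number in flight with a semaphore. `recall_many` and its `max_concurrency` default are assumptions for illustration; extra keyword arguments are forwarded unchanged to full_recall.

```python
import asyncio
from typing import Any, List, Sequence


async def recall_many(
    async_client: Any,
    tenant_id: str,
    sub_tenant_id: str,
    queries: Sequence[str],
    max_concurrency: int = 5,
    **recall_kwargs: Any,
) -> List[Any]:
    """Run full_recall for several queries concurrently.

    Results are returned in the same order as `queries`.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def one(query: str) -> Any:
        async with semaphore:  # cap the number of in-flight requests
            return await async_client.recall.full_recall(
                tenant_id=tenant_id,
                sub_tenant_id=sub_tenant_id,
                query=query,
                **recall_kwargs,
            )

    return await asyncio.gather(*(one(q) for q in queries))
```

Call it from a coroutine, for example with alpha=0.8 and max_results=10 passed as keyword arguments. If any single query raises, asyncio.gather propagates the first exception to the caller.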

SDK Method Reference

Method                                      Description
client.tenant.create                        Create a new tenant (standard or embeddings)
client.tenant.get_sub_tenant_ids            List all sub-tenant IDs within a tenant
client.tenant.get_infra_status              Check tenant infrastructure readiness
client.tenant.monitor                       Get tenant usage and stats
client.tenant.delete_tenant                 Permanently delete a tenant and all its data
client.upload.knowledge                     Upload files to the knowledge base
client.upload.verify_processing             Poll indexing status of uploaded files
client.upload.add_memory                    Index text, markdown, or conversation pairs as memories
client.upload.delete_memory                 Delete a specific memory by ID
client.recall.full_recall                   Hybrid semantic + keyword search
client.recall.recall_preferences            Search user memory / preference data only
client.recall.boolean_recall                Exact keyword / phrase / boolean search
client.recall.qna                           LLM-powered question answering over indexed content
client.fetch.list_data                      List all knowledge sources or memories
client.fetch.content                        Fetch full content of a source by ID
client.fetch.graph_relations_by_source_id   Fetch graph relations for a source
client.data.delete                          Delete sources or memories by ID
client.key.create_api_key                   Create a scoped API key (requires dashboard session token)
client.embeddings.insert                    Store raw vector embeddings (requires embeddings tenant)
client.embeddings.search                    Vector similarity search
client.embeddings.filter                    Retrieve embeddings by source or chunk IDs
client.embeddings.delete                    Delete embeddings by source or chunk IDs

Method Mapping: client.<group>.<method> mirrors api.hydradb.com/<group>/<method>

For example: client.upload.knowledge() -> POST /ingestion/upload_knowledge


Type Safety & IDE Support

The SDK provides exact type parity with the API:

  • Request parameters — every field (required, optional, type, validation) is reflected in method signatures
  • Response objects — return types are fully typed Pydantic models matching the API JSON schema
  • Nested objects — complex parameters and responses preserve their full structure

Your IDE can provide autocompletion, type checking, and inline documentation from these signatures, and a static type checker can flag mismatched arguments before the code runs. Just hit Cmd+Space / Ctrl+Space.


Support

If you have any questions or need help, reach out at founders@hydradb.com.
