The official Python SDK for Hydra DB (hydradb.com)
Hydra DB Python SDK - hydradb.com
The official Python SDK for the Hydra DB platform. Build context-aware AI applications in Python.
Hydra DB is plug-and-play memory infrastructure. It powers context-aware retrieval for any AI app or agent, whether you're building a customer support bot, research copilot, or internal knowledge assistant.
Learn more about the SDK in the docs at docs.hydradb.com.
Core features
- Dynamic retrieval and querying that always retrieves the most relevant context
- Built-in long-term memory that evolves with every user interaction
- Personalization hooks for user preferences, intent, and history
- Raw embeddings support for bring-your-own vector workflows
- Developer-first SDK with full type safety and IDE autocompletion
Getting started
Installation
pip install hydra-db-python
Client setup
Both synchronous and asynchronous clients are available. Use AsyncHydraDB for async/await patterns and HydraDB for traditional synchronous workflows. Both expose the exact same set of methods.
import os
from hydra_db import HydraDB, AsyncHydraDB
api_key = os.environ["HYDRA_DB_API_KEY"]
# Sync client
client = HydraDB(token=api_key)
# Async client
async_client = AsyncHydraDB(token=api_key)
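Reading the key with os.environ["HYDRA_DB_API_KEY"] raises a bare KeyError when the variable is unset. A minimal sketch of a friendlier lookup follows; load_api_key is a hypothetical helper written for this example, not part of the SDK.

```python
import os


def load_api_key(env=None, var_name="HYDRA_DB_API_KEY"):
    """Return the API key from the environment, or raise a descriptive error."""
    # Hypothetical helper: `env` defaults to the real process environment.
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before creating a client."
        )
    return key


# Example with an explicit mapping instead of the real environment.
api_key = load_api_key({"HYDRA_DB_API_KEY": "example-key"})
```

The returned string can then be passed as token=api_key to either client.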
Tenant Management
A tenant is a single isolated database. Within it you can create further isolated collections called sub-tenants.
Create a Tenant
response = client.tenant.create(tenant_id="my-company")
You can also create a tenant optimised for raw vector embeddings:
response = client.tenant.create(
    tenant_id="my-embeddings-tenant",
    is_embeddings_tenant=True,
    embeddings_dimension=1536,
)
Get Sub-Tenant IDs
sub_tenants = client.tenant.get_sub_tenant_ids(tenant_id="my-company")
# sub_tenants.sub_tenant_ids -> list of sub-tenant ID strings
Get Infrastructure Status
Check whether the tenant's underlying infrastructure is ready:
status = client.tenant.get_infra_status(tenant_id="my-company")
Monitor Tenant Stats
stats = client.tenant.monitor(tenant_id="my-company")
Delete a Tenant
Warning: This is irreversible and permanently removes all data.
client.tenant.delete_tenant(tenant_id="my-company")
Index Your Data
Upload Knowledge (Files)
Upload documents to make them retrievable via natural language search.
with open("report.pdf", "rb") as f:
    result = client.upload.knowledge(
        tenant_id="my-company",
        sub_tenant_id="my-sub-tenant",
        files=[("report.pdf", f, "application/pdf")],
        upsert=True,
    )
# result.results[0].source_id -> ID you can use later
You can attach metadata to each file. Pass it as a JSON string — each object corresponds to the file at the same index:
import json
file_metadata = json.dumps([
    {
        "id": "doc_a",
        "tenant_metadata": {"dept": "sales"},
        "document_metadata": {"author": "Alice"},
    },
    {
        "id": "doc_b",
        "tenant_metadata": {"dept": "marketing"},
        "document_metadata": {"author": "Bob"},
        "relations": {
            "cortex_source_ids": ["doc_a"],
            "properties": {"relation": "same_upload_batch"},
        },
    },
])
with open("a.pdf", "rb") as f1, open("b.pdf", "rb") as f2:
    result = client.upload.knowledge(
        tenant_id="my-company",
        sub_tenant_id="my-sub-tenant",
        files=[
            ("a.pdf", f1, "application/pdf"),
            ("b.pdf", f2, "application/pdf"),
        ],
        file_metadata=file_metadata,
        upsert=True,
    )
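Because each metadata object is matched to a file by position, a mismatch in ordering would silently attach metadata to the wrong file. The sketch below builds the JSON string from a name-keyed mapping so the order always follows the file list. build_file_metadata is a hypothetical helper for illustration, not an SDK function.

```python
import json


def build_file_metadata(file_names, metadata_by_name):
    """Return a JSON string with one metadata object per file, in file order."""
    missing = [name for name in file_names if name not in metadata_by_name]
    if missing:
        raise ValueError(f"No metadata for: {', '.join(missing)}")
    return json.dumps([metadata_by_name[name] for name in file_names])


file_names = ["a.pdf", "b.pdf"]
# The mapping order does not matter; the file list decides the output order.
metadata_by_name = {
    "b.pdf": {"id": "doc_b", "tenant_metadata": {"dept": "marketing"}},
    "a.pdf": {"id": "doc_a", "tenant_metadata": {"dept": "sales"}},
}
file_metadata = build_file_metadata(file_names, metadata_by_name)
```

The resulting string is what the file_metadata argument expects, provided the files list is built from the same file_names sequence.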
Verify Processing Status
After uploading, check when files have finished indexing:
status = client.upload.verify_processing(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    file_ids=["source-id-1", "source-id-2"],
)
# status.statuses[0].indexing_status -> "queued" | "processing" | "completed" | "errored"
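Since indexing is asynchronous, callers usually poll until every file reaches "completed" or "errored". The following is a minimal polling sketch under stated assumptions: get_statuses is a hypothetical callable that wraps verify_processing and returns a {file_id: indexing_status} dict, and the timeout and interval defaults are arbitrary.

```python
import time

# Statuses after which polling can stop (taken from the values listed above).
TERMINAL_STATUSES = {"completed", "errored"}


def wait_for_indexing(get_statuses, file_ids, timeout=300.0, interval=2.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until every file reaches a terminal status or the timeout expires.

    get_statuses is a hypothetical callable: it receives the list of file IDs
    and returns a dict mapping each file ID to its indexing status string.
    """
    deadline = clock() + timeout
    while True:
        statuses = get_statuses(file_ids)
        if all(statuses.get(fid) in TERMINAL_STATUSES for fid in file_ids):
            return statuses
        if clock() >= deadline:
            raise TimeoutError(
                f"Indexing not finished after {timeout} seconds: {statuses}"
            )
        sleep(interval)
```

The adapter that turns the verify_processing response into that dict is left out on purpose, because the response field carrying the file ID is not shown here. Callers should still inspect the returned dict for "errored" entries.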
Add Memories
Index free-form text, markdown content, or conversation pairs as searchable memories.
Plain text:
result = client.upload.add_memory(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    upsert=True,
    memories=[
        {
            "text": "User prefers detailed explanations and dark mode",
            "infer": True,
            "user_name": "John",
        }
    ],
)
Markdown:
result = client.upload.add_memory(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    upsert=True,
    memories=[
        {
            "text": "# Meeting Notes\n\n## Key Points\n- Budget approved\n- Launch date: Q2",
            "is_markdown": True,
            "infer": False,
            "title": "Meeting Notes",
        }
    ],
)
User–assistant conversation pairs:
result = client.upload.add_memory(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    upsert=True,
    memories=[
        {
            "user_assistant_pairs": [
                {"user": "What are my preferences?", "assistant": "You prefer dark mode and detailed explanations."},
                {"user": "How do I like my reports?", "assistant": "You prefer weekly summary reports with charts."},
            ],
            "infer": True,
            "user_name": "John",
            "custom_instructions": "Extract user preferences",
        }
    ],
)
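Chat histories are often stored as a flat list of role-tagged messages rather than as pairs. A small sketch of converting such a list into the user_assistant_pairs shape used above is shown here; to_user_assistant_pairs and the (role, text) tuple format are assumptions made for this example.

```python
def to_user_assistant_pairs(messages):
    """Group chat messages into {"user": ..., "assistant": ...} pairs.

    messages is a list of (role, text) tuples. A user turn that is not
    directly followed by an assistant reply is dropped, and an assistant
    turn with no preceding user turn is ignored.
    """
    pairs = []
    pending_user = None
    for role, text in messages:
        if role == "user":
            pending_user = text
        elif role == "assistant" and pending_user is not None:
            pairs.append({"user": pending_user, "assistant": text})
            pending_user = None
    return pairs


chat = [
    ("user", "What are my preferences?"),
    ("assistant", "You prefer dark mode and detailed explanations."),
    ("user", "How do I like my reports?"),
    ("assistant", "You prefer weekly summary reports with charts."),
]
memory = {
    "user_assistant_pairs": to_user_assistant_pairs(chat),
    "infer": True,
    "user_name": "John",
}
```

The memory dict can then be placed in the memories list passed to add_memory.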
Delete a Memory
client.upload.delete_memory(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    memory_id="memory-source-id",
)
Search & Retrieval
Full Recall
Hybrid semantic + keyword search across both knowledge and memories:
results = client.recall.full_recall(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    query="Which mode does the user prefer?",
    alpha=0.8,  # 1.0 = pure semantic, 0.0 = pure keyword
    recency_bias=0,  # 0.0 = no bias, 1.0 = strongly prefer recent
    max_results=10,
)
# results.chunks -> list of VectorStoreChunk
# results.sources -> list of SourceInfo
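To build intuition for alpha, the sketch below blends a semantic score and a keyword score by linear interpolation. This is an assumption: the formula Hydra DB actually uses is not stated here, and linear interpolation is simply a common convention that matches the documented endpoints (1.0 = pure semantic, 0.0 = pure keyword). The scores and document IDs are made up.

```python
def blend_scores(semantic, keyword, alpha):
    """Linearly interpolate between a semantic and a keyword score."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    return alpha * semantic + (1.0 - alpha) * keyword


def rank(candidates, alpha):
    """Sort (doc_id, semantic_score, keyword_score) tuples by blended score."""
    return sorted(
        candidates,
        key=lambda c: blend_scores(c[1], c[2], alpha),
        reverse=True,
    )


# Illustrative scores only: one document is strong semantically, one lexically.
candidates = [
    ("doc_semantic", 0.9, 0.1),
    ("doc_keyword", 0.2, 0.95),
]
semantic_first = rank(candidates, alpha=1.0)
keyword_first = rank(candidates, alpha=0.0)
```

With these toy numbers, the ranking flips as alpha moves from 1.0 to 0.0, which is the behaviour the parameter comment describes.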
Recall Preferences
Search only user memory/preference data:
preferences = client.recall.recall_preferences(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    query="dark mode preference",
    max_results=5,
)
Boolean Recall
Exact keyword / phrase / boolean search (BM25):
results = client.recall.boolean_recall(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    query="dark mode",
    operator="phrase",  # "or" | "and" | "phrase"
    max_results=10,
    search_mode="memories",  # "sources" | "memories"
)
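The three operator values can be illustrated with a toy matcher. This is only a sketch of the conventional meaning of "or", "and", and "phrase"; it does no BM25 ranking, and the tokenization the real endpoint applies may differ from the whitespace split assumed here.

```python
def matches(text, query, operator):
    """Toy matcher: does `text` satisfy `query` under the given operator?"""
    words = text.lower().split()
    terms = query.lower().split()
    if not terms:
        return False
    if operator == "or":
        # At least one query term appears.
        return any(term in words for term in terms)
    if operator == "and":
        # Every query term appears, in any order.
        return all(term in words for term in terms)
    if operator == "phrase":
        # Query terms appear consecutively and in order.
        n = len(terms)
        return any(words[i:i + n] == terms for i in range(len(words) - n + 1))
    raise ValueError(f"Unknown operator: {operator}")


phrase_hit = matches("User prefers dark mode", "dark mode", "phrase")
phrase_miss = matches("Mode is dark", "dark mode", "phrase")
```

Under this reading, "Mode is dark" satisfies "and" for the query "dark mode" but not "phrase", because the words are present but not adjacent in order.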
Q&A (LLM-powered answer)
Ask a question and get a grounded answer generated by an LLM over your indexed content:
answer = client.recall.qna(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    question="What is the user's preferred reporting format?",
    mode="fast",  # "fast" | "thinking"
    search_mode="memories",
    max_chunks=6,
)
You can optionally choose the LLM provider and model:
answer = client.recall.qna(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    question="Summarise the budget decisions from the meeting notes.",
    mode="thinking",
    search_mode="sources",
    max_chunks=10,
    llm_provider="anthropic",
    model="claude-sonnet-4-6",
    temperature=0.2,
    max_tokens=1024,
)
Fetch & Inspect Data
List All Data
List sources (knowledge) or memories with optional filtering and pagination:
# List knowledge sources
sources = client.fetch.list_data(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    kind="knowledge",
    page=1,
    page_size=50,
)
# List user memories
memories = client.fetch.list_data(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    kind="memories",
)
Filter by metadata:
filtered = client.fetch.list_data(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    kind="knowledge",
    filters={"tenant_metadata": {"dept": "sales"}},
)
Fetch Source Content
Retrieve the full content of a specific source by its ID:
source = client.fetch.content(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    source_id="your-source-id",
    mode="content",  # "content" | "url" | "both"
)
Fetch Graph Relations
Retrieve the graph relations (linked sources) for a given source:
relations = client.fetch.graph_relations_by_source_id(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    source_id="your-source-id",
    is_memory=False,
    limit=10,
)
Delete Data
Delete one or more sources (knowledge or memories) by their IDs:
client.data.delete(
    tenant_id="my-company",
    sub_tenant_id="my-sub-tenant",
    ids=["source-id-1", "source-id-2"],
)
API Key Management
Note: This endpoint requires a dashboard session token (obtained via your Hydra DB dashboard login), not a standard API key.
new_key = client.key.create_api_key(
    owner="service-account@myapp.com",
    scopes=["query"],
    env="live",
    prefix="sk",
)
# new_key.full_api_key -> the actual key (only shown once)
Raw Embeddings
Use Hydra DB as a vector store with your own embeddings.
Note: Raw embeddings require a tenant created with is_embeddings_tenant=True and a fixed embeddings_dimension. A standard knowledge tenant does not support raw embedding operations.
Insert Embeddings
client.embeddings.insert(
    tenant_id="my-embeddings-tenant",
    sub_tenant_id="my-sub-tenant",
    upsert=True,
    embeddings=[
        {
            "source_id": "my-doc-001",
            "metadata": {"category": "finance", "year": 2024},
            "embeddings": [
                # Vectors are shortened here; real ones need 1536 values
                {"chunk_id": "my-doc-001-chunk-0", "embedding": [0.1, 0.2, 0.3]},
                {"chunk_id": "my-doc-001-chunk-1", "embedding": [0.4, 0.5, 0.6]},
            ],
        }
    ],
)
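Because an embeddings tenant has a fixed dimension, it can help to check vector lengths locally before sending them. The sketch below walks the payload shape used above; validate_embeddings is a hypothetical helper, and how the server reacts to a wrong-sized vector is not covered here.

```python
def validate_embeddings(records, expected_dimension):
    """Check every chunk vector's length; return the number of chunks checked."""
    count = 0
    for record in records:
        for chunk in record["embeddings"]:
            actual = len(chunk["embedding"])
            if actual != expected_dimension:
                raise ValueError(
                    f"{chunk['chunk_id']}: expected {expected_dimension} "
                    f"dimensions, got {actual}"
                )
            count += 1
    return count


# Placeholder vectors of the right length for a 1536-dimension tenant.
records = [
    {
        "source_id": "my-doc-001",
        "embeddings": [
            {"chunk_id": "my-doc-001-chunk-0", "embedding": [0.0] * 1536},
            {"chunk_id": "my-doc-001-chunk-1", "embedding": [0.0] * 1536},
        ],
    }
]
checked = validate_embeddings(records, expected_dimension=1536)
```

Failing early on the client names the offending chunk_id directly, which is easier to act on than a rejected batch.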
Search by Vector
results = client.embeddings.search(
    tenant_id="my-embeddings-tenant",
    sub_tenant_id="my-sub-tenant",
    query_embedding=[0.1, 0.2, 0.3],  # shortened here; real queries need 1536 values
    limit=10,
)
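For readers new to vector search, the sketch below shows a brute-force similarity search over a few stored vectors. Cosine similarity is used as an assumption, since the metric Hydra DB applies is not stated here, and a real service would typically use an index rather than a linear scan. The chunk IDs and vectors are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, stored, k):
    """Return the k (chunk_id, score) pairs most similar to the query."""
    scored = [(cid, cosine_similarity(query, vec)) for cid, vec in stored.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


stored = {
    "chunk-0": [1.0, 0.0, 0.0],
    "chunk-1": [0.0, 1.0, 0.0],
    "chunk-2": [0.9, 0.1, 0.0],
}
nearest = top_k([1.0, 0.0, 0.0], stored, k=2)
```

Here limit in the SDK call plays the role of k: it caps how many nearest results come back.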
Filter Embeddings
# By source
by_source = client.embeddings.filter(
    tenant_id="my-embeddings-tenant",
    sub_tenant_id="my-sub-tenant",
    source_id="my-doc-001",
    limit=50,
)
# By chunk IDs
by_chunks = client.embeddings.filter(
    tenant_id="my-embeddings-tenant",
    sub_tenant_id="my-sub-tenant",
    chunk_ids=["my-doc-001-chunk-0", "my-doc-001-chunk-1"],
)
Delete Embeddings
# Delete all embeddings for a source
client.embeddings.delete(
    tenant_id="my-embeddings-tenant",
    sub_tenant_id="my-sub-tenant",
    source_id="my-doc-001",
)
# Delete specific chunks
client.embeddings.delete(
    tenant_id="my-embeddings-tenant",
    sub_tenant_id="my-sub-tenant",
    chunk_ids=["my-doc-001-chunk-0"],
)
Async Usage
Every method has an async equivalent on AsyncHydraDB. Method names and parameters are identical:
import asyncio
from hydra_db import AsyncHydraDB
async_client = AsyncHydraDB(token="your-api-key")
async def main():
    result = await async_client.recall.full_recall(
        tenant_id="my-company",
        sub_tenant_id="my-sub-tenant",
        query="Which mode does the user prefer?",
        alpha=0.8,
        max_results=10,
    )
    print(result.chunks)

asyncio.run(main())
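The main benefit of the async client is running several requests concurrently. The sketch below shows the pattern with asyncio.gather. To stay runnable without network access it uses fake_full_recall, a stand-in coroutine invented for this example; in real code that argument would be a small coroutine wrapping async_client.recall.full_recall.

```python
import asyncio


async def fake_full_recall(query):
    """Stand-in for an awaited SDK call; returns a placeholder result."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"query": query, "chunks": []}


async def recall_many(recall, queries):
    """Run one recall per query concurrently; results keep the input order."""
    return await asyncio.gather(*(recall(q) for q in queries))


results = asyncio.run(
    recall_many(fake_full_recall, ["dark mode", "report format"])
)
```

asyncio.gather returns results in the order the awaitables were passed, so each result lines up with its query regardless of which request finishes first.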
SDK Method Reference
| Method | Description |
|---|---|
| client.tenant.create | Create a new tenant (standard or embeddings) |
| client.tenant.get_sub_tenant_ids | List all sub-tenant IDs within a tenant |
| client.tenant.get_infra_status | Check tenant infrastructure readiness |
| client.tenant.monitor | Get tenant usage and stats |
| client.tenant.delete_tenant | Permanently delete a tenant and all its data |
| client.upload.knowledge | Upload files to the knowledge base |
| client.upload.verify_processing | Poll indexing status of uploaded files |
| client.upload.add_memory | Index text, markdown, or conversation pairs as memories |
| client.upload.delete_memory | Delete a specific memory by ID |
| client.recall.full_recall | Hybrid semantic + keyword search |
| client.recall.recall_preferences | Search user memory / preference data only |
| client.recall.boolean_recall | Exact keyword / phrase / boolean search |
| client.recall.qna | LLM-powered question answering over indexed content |
| client.fetch.list_data | List all knowledge sources or memories |
| client.fetch.content | Fetch full content of a source by ID |
| client.fetch.graph_relations_by_source_id | Fetch graph relations for a source |
| client.data.delete | Delete sources or memories by ID |
| client.key.create_api_key | Create a scoped API key (requires dashboard session token) |
| client.embeddings.insert | Store raw vector embeddings (requires embeddings tenant) |
| client.embeddings.search | Vector similarity search |
| client.embeddings.filter | Retrieve embeddings by source or chunk IDs |
| client.embeddings.delete | Delete embeddings by source or chunk IDs |
Method Mapping:
client.<group>.<method> mirrors api.hydradb.com/<group>/<method>. For example:
client.upload.knowledge() → POST /ingestion/upload_knowledge
Type Safety & IDE Support
The SDK provides exact type parity with the API:
- Request parameters — every field (required, optional, type, validation) is reflected in method signatures
- Response objects — return types are fully typed Pydantic models matching the API JSON schema
- Nested objects — complex parameters and responses preserve their full structure
Your IDE will automatically provide autocompletion, type-checking, inline documentation, and static validation before the code runs. Just hit Cmd+Space / Ctrl+Space.
Links
- Homepage: hydradb.com
- Documentation: docs.hydradb.com
- API Reference: docs.hydradb.com/api-reference/introduction
Support
If you have any questions or need help, reach out at founders@hydradb.com.
File details
Details for the file hydradb_sdk-0.0.1.tar.gz.
File metadata
- Download URL: hydradb_sdk-0.0.1.tar.gz
- Upload date:
- Size: 83.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | bd4332fe24566655f554e2ec6f5692f7a9f6a67588eed88e0e2e77554e40bc3c |
| MD5 | a5416a6fc164ba229af266734450e752 |
| BLAKE2b-256 | c3cd9d33ed4e434e06d0ec19bb60c84a395ee2dc5132123015ec412f82c7f0b0 |
File details
Details for the file hydradb_sdk-0.0.1-py3-none-any.whl.
File metadata
- Download URL: hydradb_sdk-0.0.1-py3-none-any.whl
- Upload date:
- Size: 133.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d38771cf8a6770e14c82cea2815abbc0dfc3dd7c4977f73106408d023b8e7b61 |
| MD5 | 97e684a5183349ffef829e75a3c71d49 |
| BLAKE2b-256 | 1f7f420802f027baf18079f2576367610e0629dd0a48f70c24266177f039f563 |