Knowledge memory library for long-horizon AI agents — hybrid retrieval over documents, embeddings, and graph relationships
Khora
"Khora is the receptacle, the space, the matrix in which all things come to be." — Plato, Timaeus
Khora is a knowledge memory library for long-horizon AI agents, with pluggable retrieval engines and storage backends to fit different workloads. It stores knowledge as documents, embeddings, and graph relationships, and retrieves it through hybrid search (vector + graph + keyword), reranking, and temporal context.
Khora is a library, not an application. Tooling lives in sibling packages:
- khora-cli (to be released soon) — CLI tooling for extraction and search.
- khora-explorer (to be released soon) — tooling for ontology construction and exploration.
Install
pip install khora # core (PostgreSQL + pgvector)
pip install khora[sqlite-lance] # [experimental] embedded SQLite + LanceDB
pip install khora[surrealdb] # [experimental] unified SurrealDB (single store)
pip install khora[all-backends] # everything: Neo4j, SurrealDB, SQLite+LanceDB, Weaviate, AGE
See docs/configuration.md for the full extras list.
Production stack
The production-ready combination in v0.9.0 is PostgreSQL + pgvector + Neo4j:
- VectorCypher (default engine) — runs on PostgreSQL + pgvector + Neo4j.
- Chronicle — runs on PostgreSQL + pgvector (no graph DB required).
- Skeleton — available; PostgreSQL + pgvector (no graph DB required).
Set KHORA_DATABASE_URL and KHORA_NEO4J_URL, run uv run alembic upgrade head, then instantiate Khora() with no arguments:
import asyncio

from khora import Khora


async def main() -> None:
    async with Khora() as kb:  # reads KHORA_DATABASE_URL / KHORA_NEO4J_URL
        ns = await kb.create_namespace("demo")
        await kb.remember(
            "Marie Curie won the Nobel Prize in Physics in 1903.",
            namespace=ns.namespace_id,
        )
        result = await kb.recall("What did Curie win?", namespace=ns.namespace_id)
        print(result.context_text)


asyncio.run(main())
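Because Khora() takes its connection settings from the environment, a small preflight check can fail fast with a clearer message than a connection error. The sketch below is illustrative only: missing_settings is a hypothetical helper, not part of khora, and it only checks that the two variables named above are set, not that the URLs are valid or reachable.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Variables named in the production-stack instructions above.
REQUIRED_VARS = ("KHORA_DATABASE_URL", "KHORA_NEO4J_URL")


def missing_settings(env: Mapping[str, str] | None = None) -> list[str]:
    """Return the required variables that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings: " + ", ".join(missing))
    else:
        print("All required settings are present.")
```

Passing a plain dict instead of relying on os.environ keeps the check easy to unit-test.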
Batch processing
submit_batch() stages documents as PENDING and returns a BatchHandle immediately. A background processor picks them up and calls on_result per document as each completes.
The processor is opt-in. Call kb.start_pending_processor() after connect() on services that write documents. Read-only services do not need it. The processor can be stopped with await kb.stop_pending_processor() and restarted at any time.
async with Khora() as kb:
    kb.start_pending_processor()  # opt-in; write-path services only
    handle = await kb.submit_batch(
        [{"content": "doc 1"}, {"content": "doc 2"}],
        on_result=lambda completed, total, result: print(result),
        namespace=ns_id,  # a namespace ID, e.g. ns.namespace_id from create_namespace()
    )
    await handle.wait()
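To make the staging model concrete, here is a toy, in-memory imitation of the pattern using only asyncio. It is not khora's implementation: ToyBatchHandle, ToyProcessor, and the string result are made up for illustration. Only the (completed, total, result) callback shape and the start, submit, wait order are taken from the example above.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field


@dataclass
class ToyBatchHandle:
    """Hypothetical stand-in for a batch handle: tracks progress of one batch."""

    total: int
    completed: int = 0
    done: asyncio.Event = field(default_factory=asyncio.Event)

    async def wait(self) -> None:
        await self.done.wait()


class ToyProcessor:
    """Hypothetical background worker that drains a queue of pending documents."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        self._task: asyncio.Task | None = None

    def start(self) -> None:
        # Opt-in: nothing is processed until the worker task exists.
        self._task = asyncio.create_task(self._run())

    async def stop(self) -> None:
        if self._task is not None:
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass
            self._task = None

    async def submit_batch(self, docs, on_result) -> ToyBatchHandle:
        # Staging only enqueues work, so this returns right away.
        handle = ToyBatchHandle(total=len(docs))
        if not docs:
            handle.done.set()  # an empty batch is complete immediately
        for doc in docs:
            await self._queue.put((doc, handle, on_result))
        return handle

    async def _run(self) -> None:
        while True:
            doc, handle, on_result = await self._queue.get()
            result = f"processed: {doc['content']}"  # placeholder for real work
            handle.completed += 1
            on_result(handle.completed, handle.total, result)
            if handle.completed == handle.total:
                handle.done.set()


async def demo() -> list[tuple[int, int, str]]:
    seen: list[tuple[int, int, str]] = []
    processor = ToyProcessor()
    processor.start()
    handle = await processor.submit_batch(
        [{"content": "doc 1"}, {"content": "doc 2"}],
        on_result=lambda completed, total, result: seen.append((completed, total, result)),
    )
    await handle.wait()
    await processor.stop()
    return seen


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The point of the sketch is the division of labour: submitting only enqueues, the worker owns all processing, and the handle is the only thing the caller needs to await.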
Embedded options (experimental)
Khora ships two zero-infrastructure paths. Both are marked experimental in v0.9.0 — fine for demos, evaluation, tests, and small single-user CLIs; not yet stamped as a deployment story.
- SQLite + LanceDB (pip install khora[sqlite-lance], set KHORA_STORAGE_BACKEND=sqlite_lance): recommended embedded stack. Covers VectorCypher, Skeleton, and Chronicle via dialect-aware Alembic migrations and LanceDB-backed vector search. Documented scale ceiling: ~1M chunks, ~100k entities, ~500k edges, traversal depth ≤3. Known gaps: no point-in-time queries, partial atomicity in coordinator.transaction(), FTS on chunks only. See configuration.md.
- SurrealDB (pip install khora[surrealdb]): unified relational + vector + graph in one store. Python SDK is on the alpha track (>=2.0.0a1), and KNN (<|K|>) is unreliable in embedded mode (uses brute-force cosine + HNSW fallback). Suitable for experimentation; not recommended for production.
Quickstart caveat. A literal Khora("memory://") call passes "memory://" as the PostgreSQL URL, not as a backend selector; khora itself does not parse a memory:// URL scheme today. To use the embedded path, set KHORA_STORAGE_BACKEND=sqlite_lance (or surrealdb) and the corresponding db_path / connection settings. Routing a true memory:// URI to the SQLite+LanceDB stack is tracked for v0.10.
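Selecting the embedded stack therefore goes through the environment rather than the constructor argument. A minimal sketch, assuming the variable is read when Khora() is constructed: select_embedded_backend is a hypothetical helper, not part of khora, and the db_path setting is left out because its exact name is documented in configuration.md rather than here.

```python
import os

# Backend values named in this README.
EMBEDDED_BACKENDS = ("sqlite_lance", "surrealdb")


def select_embedded_backend(backend: str = "sqlite_lance") -> None:
    """Set KHORA_STORAGE_BACKEND for the current process (hypothetical helper)."""
    if backend not in EMBEDDED_BACKENDS:
        # Catches the "memory://" mistake: a URL is not a backend name.
        raise ValueError(f"unknown embedded backend: {backend!r}")
    os.environ["KHORA_STORAGE_BACKEND"] = backend


select_embedded_backend("sqlite_lance")
# After this, Khora() would be constructed with no URL argument.
```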
Observability
khora emits OpenTelemetry spans and metrics via Logfire and records structured LLMEvent / StorageEvent / PipelineEvent rows to PostgreSQL when a collector is configured. Both integrations are opt-in — without them, all instrumentation is a zero-cost no-op.
- Public surface is documented in docs/telemetry-contract.json (with explainer at docs/telemetry-contract.md). It lists every public span, metric, pipeline stage, event-type field, and khora.telemetry.__all__ export. Items tagged stability: public are part of khora's API surface and follow standard semver; breaking changes require a major version bump. Drift is enforced in CI via tests/unit/telemetry/test_contract.py.
- OTel semantic conventions apply to attributes: gen_ai.* for LLM calls, db.* for storage, code.* for stack info. Vendor-neutral over the OTel exporter chain.
- Logfire integration is opt-in via the [logfire] extra:

  pip install khora[logfire]

  import logfire
  from khora import Khora

  logfire.configure(service_name="my-service")
  # khora's @trace decorators and trace_span() context managers
  # now emit spans automatically; metrics like khora.memory.recall.duration,
  # khora.llm.tokens, khora.llm.cost_usd, khora.chronicle.abstention_signal
  # are exported on the standard OTel cadence.

  Without the logfire extra installed, trace_span() yields a no-op and metric_* registrations short-circuit.
- Structured event recording is opt-in via KHORA_TELEMETRY_DATABASE_URL (PostgreSQL). When set, TelemetryCollector writes LLMEvent / StorageEvent / PipelineEvent rows for downstream cost tracking and incident reconstruction. Without it, NoOpCollector is used (zero cost).
- Async logging caveat. Library consumers that import khora without configuring loguru sinks inherit the default sync stderr sink, which blocks the event loop on every log call inside async def. Either call khora.logging_config.setup_logging() (which configures sinks with enqueue=True and registers an atexit drain) or configure your own loguru sinks with enqueue=True explicitly.
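The no-op fallback described above for trace_span() is a common optional-dependency pattern. The following sketch shows the general idea with the standard library only. make_trace_span and its span_factory parameter are hypothetical and do not reflect khora's actual trace_span() signature or internals.

```python
from contextlib import contextmanager, nullcontext


def make_trace_span(span_factory=None):
    """Build a trace_span-like helper (hypothetical; not khora's API).

    span_factory is any callable returning a context manager, such as a
    tracing backend's span constructor. When it is None, every span is a no-op.
    """
    if span_factory is None:
        # No backend configured: hand out a do-nothing context manager.
        return lambda name, **attributes: nullcontext()

    @contextmanager
    def trace_span(name, **attributes):
        with span_factory(name, **attributes) as span:
            yield span

    return trace_span


# Usage with a fake backend that records span names.
recorded = []


@contextmanager
def fake_span(name, **attributes):
    recorded.append(name)
    yield name


noop = make_trace_span()
with noop("recall"):
    pass  # nothing is recorded

traced = make_trace_span(fake_span)
with traced("recall", namespace="demo"):
    pass  # fake_span records "recall"
```

Call sites look identical in both modes, which is what lets instrumentation stay in library code without forcing a tracing dependency on every consumer.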
Documentation
Start at docs/README.md. Key entry points:
- API reference: public Khora surface.
- Configuration: KHORA_* env vars and KhoraConfig.
- Architecture: how the pieces fit.
- Engines: VectorCypher, Skeleton, Chronicle.
- Migrations: Alembic workflow for library users.
- Downstream consumers: sibling packages and integration guide.
Development
make dev # start PostgreSQL + Neo4j (Docker)
make test # pytest with coverage
make format # ruff format + isort
make lint # ruff + ty typecheck
See CHANGELOG.md for release history.
License
Copyright 2026 AllTheData Inc.
Licensed under the Apache License, Version 2.0. See LICENSE and NOTICE.