Knowledge memory library for long-horizon AI agents — hybrid retrieval over documents, embeddings, and graph relationships
Khora
"Khora is the receptacle, the space, the matrix in which all things come to be." — Plato, Timaeus
Khora is a durable knowledge memory library for long-horizon AI agents, with pluggable retrieval engines and storage backends to fit different workloads. It stores what your agent learns — documents, entities, relationships, events, facts — and retrieves it through hybrid search that combines vector similarity, graph traversal, keyword matching, and temporal context. A scheduled "dream phase" then reorganizes the store offline so quality doesn't decay as it grows.
Khora is a library, not an application. You embed it in your agent's process; there is no server. Tooling lives in sibling packages (coming soon): khora-cli for extraction and search, khora-explorer for ontology construction.
Why khora?
Long-horizon agents — copilots, customer-support bots, research assistants, anything that has to remember across sessions — hit four problems that pure vector search doesn't solve:
- Ingest is more than chunking. Useful memory needs entities, relationships, and temporal anchors extracted from the raw text. Khora runs a 3-phase ingest pipeline (stage → enrich → expand) with selective LLM extraction (default 70% of chunks, configurable) and cross-batch entity resolution.
- Recall is more than cosine. Real questions mix semantic similarity, multi-hop entity reasoning, freshness, and keyword precision. Khora's engines combine all four — vector + Cypher graph traversal + BM25 + RRF fusion + temporal-anchored reranking — and route per query.
- Memory drifts. Thresholds calibrated on day one stop working at week ten. Duplicates accumulate from independent ingest batches. Soft-deleted facts pile up. Khora's dream phase audits the store, plans consolidation work (entity dedupe, centroid recompute, fact compaction, event clustering), and applies it under per-op transactions with snapshotted undo records.
- Production needs observability. Every recall emits OpenTelemetry spans and metrics through a stable, contract-tested surface; credential fields are SecretStr; free-text never leaks into span attributes. See docs/observability.md.
Khora's bet: instead of one opinionated memory model, ship pluggable engines for different access patterns, share the same storage substrate underneath them, and own the offline-maintenance gap that vector-store wrappers leave to operators.
Engines
Khora ships two production engines and one experimental one. They share the storage substrate (PostgreSQL + pgvector, optionally Neo4j) and the ingest pipeline — pick by access pattern.
VectorCypher (default) — hybrid graph + vector recall
The right choice for knowledge-graph-shaped data: documents that reference entities, entities that reference each other, queries that need multi-hop reasoning ("which engineers worked on projects that shipped with Acme?").
- Storage: PostgreSQL + pgvector + Neo4j (or Memgraph / Neptune / AGE).
- Retrieval: Vector similarity + Cypher graph traversal + BM25 keyword + RRF fusion + optional PPR (Personalized PageRank) + optional cross-encoder reranking.
- Extraction: Selective per-chunk (KET-RAG style; default 70% of chunks get full LLM extraction, the rest get co-occurrence edges).
- Best for: Multi-hop reasoning, entity-rich corpora, knowledge bases, "who knows whom"-style queries, anything where graph structure adds signal that flat embeddings miss.
- Status: Production-ready on PostgreSQL + Neo4j. Experimental on embedded backends.
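The RRF fusion step named in the retrieval list can be illustrated with a minimal sketch of Reciprocal Rank Fusion. This is not Khora's internal code: the function name `rrf_fuse`, the chunk IDs, and the constant `k=60` (the value commonly used in the RRF literature) are assumptions made for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists with Reciprocal Rank Fusion.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k=60 is an assumed default, not a
    documented Khora setting.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical per-channel results (best hit first).
vector_hits = ["c1", "c2", "c3"]
graph_hits = ["c3", "c1", "c4"]
bm25_hits = ["c2", "c3", "c5"]

fused = rrf_fuse([vector_hits, graph_hits, bm25_hits])
fused_ids = [doc_id for doc_id, _ in fused]
```

Because RRF only looks at ranks, the channels do not need comparable score scales, which is why it is a common choice for mixing vector, graph, and keyword results.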
Chronicle — temporal-first, no graph DB
The right choice for chat-shaped or event-stream data: conversational memory, support tickets, meeting transcripts, anything where "when" is as important as "what".
- Storage: PostgreSQL + pgvector. No graph database required.
- Retrieval: 4-channel parallel — semantic vector + BM25 keyword + temporal + entity — fused with abstention signals that flag low-confidence answers before they reach your LLM.
- Extraction: SVO events (subject-verb-object), entities, and facts via the same shared ingest pipeline.
- Time model: Triple timestamps (valid_from/valid_to/recorded_at) + Ebbinghaus forgetting-curve decay applied to relevance scores.
- Best for: Long conversations across sessions, recency-sensitive recall ("what did Alice say last week?"), benchmark-optimized retrieval (LongMemEval, LoCoMo, BEAM), deployments without a graph DB.
- Status: Production-ready on PostgreSQL + pgvector. Experimental on embedded backends.
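The forgetting-curve decay in the time model can be pictured with the classic Ebbinghaus form, retention = exp(-t / S). The sketch below is an illustration only: the function name `decayed_score`, the choice of `recorded_at` as the age reference, and the 30-day stability constant are assumptions, not Chronicle's documented behavior.

```python
import math
from datetime import datetime, timedelta, timezone


def decayed_score(
    score: float,
    recorded_at: datetime,
    now: datetime,
    stability_days: float = 30.0,
) -> float:
    """Scale a relevance score by exp(-age / stability).

    stability_days is a hypothetical tuning knob: larger values mean
    memories fade more slowly.
    """
    # Clamp negative ages (future-dated rows) to zero so they are not boosted.
    age_days = max((now - recorded_at).total_seconds() / 86400.0, 0.0)
    retention = math.exp(-age_days / stability_days)
    return score * retention


now = datetime(2025, 1, 31, tzinfo=timezone.utc)
fresh = decayed_score(1.0, now, now)
month_old = decayed_score(1.0, now - timedelta(days=30), now)
future = decayed_score(1.0, now + timedelta(days=5), now)
```

With these assumed settings, a memory exactly one stability period old keeps about 37% of its original score.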
Skeleton (experimental) — minimal-LLM ingestion
Lazy-extraction engine that runs LLM-based entity extraction only on ~10% of chunks ("skeleton core" by PageRank) and expands on demand. Experimental — feature-complete enough for cost-sensitive prototypes and evaluation, but not production-stamped. New work should start with VectorCypher or Chronicle; revisit Skeleton if LLM cost dominates and your queries are mostly time-windowed hybrid search rather than multi-hop entity reasoning.
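To make the "skeleton core by PageRank" idea concrete, here is a small sketch that ranks chunks in a co-occurrence graph with power-iteration PageRank and keeps roughly the top 10%. The graph construction, the damping factor of 0.85, and every function name are assumptions for illustration; Skeleton's actual selection logic may differ.

```python
import math


def build_graph(pairs: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Build an undirected adjacency map from chunk co-occurrence pairs."""
    graph: dict[str, set[str]] = {}
    for a, b in pairs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def pagerank(graph: dict[str, set[str]], damping: float = 0.85, iters: int = 50) -> dict[str, float]:
    """Plain power-iteration PageRank; dangling mass is spread uniformly."""
    nodes = sorted(graph)
    if not nodes:
        return {}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iters):
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            if graph[src]:
                share = damping * rank[src] / len(graph[src])
                for dst in graph[src]:
                    new_rank[dst] += share
        rank = new_rank
    return rank


def skeleton_core(graph: dict[str, set[str]], fraction: float = 0.10) -> list[str]:
    """Return the top `fraction` of nodes by PageRank (at least one node)."""
    ranks = pagerank(graph)
    keep = max(1, math.ceil(fraction * len(ranks)))
    ordered = sorted(ranks, key=lambda node: (-ranks[node], node))
    return ordered[:keep]


# Hypothetical corpus: chunk c0 co-occurs with nine other chunks.
graph = build_graph([("c0", f"c{i}") for i in range(1, 10)])
ranks = pagerank(graph)
core = skeleton_core(graph)
```

Only the chunks in `core` would receive LLM extraction up front in this sketch; the rest would be expanded on demand.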
| Engine | Multi-hop entity recall | Temporal recall | Graph DB needed | LLM cost (1k docs) | Status |
|---|---|---|---|---|---|
| VectorCypher | ✓ Native via Cypher | ✓ Temporal detection + reranking | ✓ Required | ~$0.10–0.20 | Production |
| Chronicle | Limited (entity channel only) | ✓ Native bi-temporal + Ebbinghaus decay | — Not required | ~$0.15–0.30 | Production |
| Skeleton | Limited (lazy expansion) | ✓ Bi-temporal | — Not required | ~$0.02–0.05 | Experimental |
See docs/engines/engine-comparison.md for the detailed comparison: full feature matrix, cost analysis per workload, hybrid-engine patterns, and migration recipes.
Install
pip install khora # core (PostgreSQL + pgvector)
pip install khora[neo4j] # + Neo4j for VectorCypher
pip install khora[sqlite-lance] # [experimental] embedded SQLite + LanceDB
pip install khora[surrealdb] # [experimental] unified SurrealDB (single store)
pip install khora[all-backends] # everything: Neo4j, SurrealDB, SQLite+LanceDB, Weaviate, AGE
See docs/configuration.md for the full extras list.
Production stack
The recommended production stack is PostgreSQL + pgvector + Neo4j — runs VectorCypher (default) and Chronicle from the same database. Set KHORA_DATABASE_URL and KHORA_NEO4J_URL, run uv run alembic upgrade head, then instantiate Khora() with no arguments:
import asyncio

from khora import Khora


async def main() -> None:
    async with Khora() as kb:  # reads KHORA_DATABASE_URL / KHORA_NEO4J_URL
        ns = await kb.create_namespace()  # keyword-only kwargs; no positional name
        await kb.remember(
            "Marie Curie won the Nobel Prize in Physics in 1903.",
            namespace=ns.namespace_id,
        )
        result = await kb.recall("What did Curie win?", namespace=ns.namespace_id)
        print(result.context_text)


asyncio.run(main())
Batch processing
submit_batch() stages documents as PENDING and returns a BatchHandle immediately. A background processor picks them up and calls on_result per document as each completes.
The processor is opt-in. Call kb.start_pending_processor() after connect() on services that write documents. Read-only services do not need it. The processor can be stopped with await kb.stop_pending_processor() and restarted at any time.
# Runs inside an async function; ns_id is an existing namespace ID
# (for example ns.namespace_id from create_namespace()).
async with Khora() as kb:
    kb.start_pending_processor()  # opt-in; write-path services only
    handle = await kb.submit_batch(
        [{"content": "doc 1"}, {"content": "doc 2"}],
        on_result=lambda completed, total, result: print(result),
        namespace=ns_id,
    )
    await handle.wait()
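For anything beyond printing, the on_result callback can be a small stateful object. The sketch below assumes only what the example above shows, namely a plain callable taking (completed, total, result); the class name `BatchProgress` and the string results are hypothetical, and the shape of the real result object is not specified here.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class BatchProgress:
    """Collects per-document results passed to an on_result callback."""

    completed: int = 0
    total: int = 0
    results: list[Any] = field(default_factory=list)

    def __call__(self, completed: int, total: int, result: Any) -> None:
        # Same positional signature as the lambda in the example above.
        self.completed = completed
        self.total = total
        self.results.append(result)

    @property
    def done(self) -> bool:
        return self.total > 0 and self.completed >= self.total


# Simulated callbacks, standing in for what the background processor would send.
progress = BatchProgress()
progress(1, 2, "result for doc 1")
halfway = progress.done
progress(2, 2, "result for doc 2")
```

An instance would then be passed as `on_result=progress` in place of the lambda.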
Embedded options (experimental)
Khora ships two zero-infrastructure paths. Both are marked experimental — fine for demos, evaluation, tests, and small single-user CLIs; not yet stamped as a deployment story.
- SQLite + LanceDB (pip install khora[sqlite-lance], set KHORA_STORAGE_BACKEND=sqlite_lance): recommended embedded stack. Covers VectorCypher, Skeleton, and Chronicle via dialect-aware Alembic migrations and LanceDB-backed vector search. Documented scale ceiling: ~1M chunks, ~100k entities, ~500k edges, traversal depth ≤3. Known gaps: no point-in-time queries, partial atomicity in coordinator.transaction(), FTS on chunks only. See configuration.md.
- SurrealDB (pip install khora[surrealdb]): unified relational + vector + graph in one store. Python SDK is on the alpha track (>=2.0.0a1), and KNN (<|K|>) is unreliable in embedded mode (uses brute-force cosine + HNSW fallback). Remote (WebSocket) mode supports atomic multi-statement transactions via conn.transaction() (v0.12.0); embedded / memory modes still operate per-statement-atomic. Suitable for experimentation; not recommended for production.
Quickstart caveat. A literal Khora("memory://") call passes "memory://" as the PostgreSQL URL, not as a backend selector; there is no memory:// URL scheme parsed by khora itself today. To use the embedded path, set KHORA_STORAGE_BACKEND=sqlite_lance (or surrealdb) and the corresponding db_path / connection settings.
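One way to avoid the caveat above is to select the backend through the environment before constructing Khora. The helper below is a hypothetical convenience, not part of khora: it only sets KHORA_STORAGE_BACKEND to one of the two values named in this section and leaves the db_path / connection settings to you.

```python
import os
from collections.abc import MutableMapping

# The two embedded backend values named in this README.
EMBEDDED_BACKENDS = {"sqlite_lance", "surrealdb"}


def select_embedded_backend(backend: str, env: MutableMapping[str, str] = os.environ) -> None:
    """Set KHORA_STORAGE_BACKEND, rejecting URL-style values such as 'memory://'."""
    if backend not in EMBEDDED_BACKENDS:
        raise ValueError(
            f"unsupported embedded backend {backend!r}; expected one of {sorted(EMBEDDED_BACKENDS)}"
        )
    env["KHORA_STORAGE_BACKEND"] = backend


# Demonstrated against a plain dict so the real environment is untouched.
demo_env: dict[str, str] = {}
select_embedded_backend("sqlite_lance", demo_env)

try:
    select_embedded_backend("memory://", demo_env)
    rejected = False
except ValueError:
    rejected = True
```

Called without the second argument, the helper writes to os.environ, after which `Khora()` can be instantiated with no URL argument.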
Integrations
Khora ships ready-made adapters for the major agentic frameworks. Each adapter is an opt-in optional extra — install only what you use, and the framework itself is imported lazily so importing khora never pulls in a framework you don't need.
| Framework | Install | Khora surface |
|---|---|---|
| CrewAI | pip install khora[crewai] | KhoraMemory: drop-in storage backend for CrewAI's unified Memory. |
| LangGraph | pip install khora[langgraph] | KhoraStore: BaseStore implementation for StateGraph semantic long-term memory. |
| Google ADK | pip install khora[google-adk] | KhoraMemoryService: BaseMemoryService drop-in for ADK Runner. |
| OpenAI Agents SDK | pip install khora[openai-agents] | KhoraSession (SessionABC), khora_recall_tool, KhoraMemoryHooks: compose for session memory, recall-as-tool, and auto-persist. |
| LlamaIndex | pip install khora[llamaindex] | KhoraRetriever (async BaseRetriever), KhoraMemoryBlock, and the deprecated KhoraChatStore. |
See docs/integrations/ for the full per-adapter docs and the "write your own" Protocol surface.
Maintenance: dream phase
Khora ships an offline maintenance pass ("dream phase") that audits an accumulated namespace and plans consolidation work — entity dedupe, fact compaction, event clustering. Run it on a schedule (cron, Temporal, k8s CronJob) and consume the structured reports through three independently-togglable sinks: file, semantic-event, or telemetry collector.
from khora import Khora, KhoraConfig, DreamConfig

kb = Khora(config=KhoraConfig(dream=DreamConfig(enabled=True)))

# Dry-run: pure observation/planning, zero writes.
result = await kb.dream(namespace_id, mode="dry-run")
for op in result.ops:
    print(op.op_type, op.decision, op.outputs)
Ten operations ship in v0.15.0 across both engines:
| Phase | Engine | Operation | Surfaces |
|---|---|---|---|
| 1 audit | chronicle | abstention-threshold drift | Configured thresholds vs observed p50/p90/p99 |
| 1 audit | chronicle | tombstone audit | Active / inactive / invalidated fact ratios + age distribution |
| 1 audit | vectorcypher | schema drift | New / unused / frequency-changed types vs ExpertiseConfig |
| 1 audit | vectorcypher | orphan PageRank | Bottom-5% PR entities flagged as archive_candidate |
| 1 audit | vectorcypher | source_chunk_ids audit | Dead UUID counts + array-length distribution |
| 2 planner | vectorcypher | cross-batch entity dedupe | Pairs above the per-type cosine threshold, planned merges |
| 2 planner | vectorcypher | centroid recompute | Per-cluster centroid / re_embed / skip_multimodal decisions |
| 2 planner | vectorcypher | source_chunk_ids GC | Per-entity rewrites dropping dead chunk UUIDs |
| 2 planner | chronicle | memory_facts compaction | Tombstoned rows past retention_days |
| 2 planner | chronicle | event clustering | Near-duplicate chronicle_events within a sliding window |
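As an illustration of the cross-batch entity dedupe row, the sketch below plans merges for same-type entity pairs whose embedding cosine similarity clears a per-type threshold. The tuple layout, the threshold values, and the function names are assumptions; the real planner's inputs and output format are described in docs/dream-phase.md, not here.

```python
import math
from itertools import combinations

Entity = tuple[str, str, list[float]]  # (entity_id, entity_type, embedding)


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def plan_merges(
    entities: list[Entity],
    thresholds: dict[str, float],
    default_threshold: float = 0.9,
) -> list[tuple[str, str, float]]:
    """Return (id_a, id_b, similarity) for same-type pairs above the threshold.

    This only plans; nothing is merged. The pairwise loop is O(n^2), so a
    real planner would need blocking or an ANN index for large namespaces.
    """
    plans = []
    for (id_a, type_a, vec_a), (id_b, type_b, vec_b) in combinations(entities, 2):
        if type_a != type_b:
            continue
        similarity = cosine(vec_a, vec_b)
        if similarity >= thresholds.get(type_a, default_threshold):
            plans.append((id_a, id_b, round(similarity, 4)))
    return plans


# Hypothetical entities from two independent ingest batches.
entities: list[Entity] = [
    ("e1", "person", [1.0, 0.0]),
    ("e2", "person", [0.99, 0.05]),
    ("e3", "org", [1.0, 0.0]),
    ("e4", "person", [0.0, 1.0]),
]
plans = plan_merges(entities, thresholds={"person": 0.95})
```

Note that e3 is never paired with e1 despite identical vectors, because the comparison is restricted to entities of the same type.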
Default is DreamConfig(enabled=False) — the master switch is opt-in. Both modes are live in v0.15.0: mode="dry-run" emits the plan only; mode="apply" runs the matching apply handler under a per-op transaction with the pre-state snapshotted into undo.json before each mutation. Five guardrails protect the apply path (7-day hard retention floor, KHORA_DREAM_DISABLE_APPLY kill-switch, chunk_id runtime assertion, snapshot-before-delete, advisory-lock-held-through-apply).
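The snapshot-before-mutation guardrail follows a general pattern that can be sketched independently of Khora: write the pre-state to an undo file first, mutate second. Everything below (the dict-backed store, the record layout, and the `undo_all` replay) is a hypothetical illustration rather than the format Khora writes to undo.json.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def apply_with_undo(store: dict[str, Any], key: str, new_value: Any, undo_path: Path) -> None:
    """Record the pre-state of `key` in the undo file, then mutate the store."""
    records = json.loads(undo_path.read_text()) if undo_path.exists() else []
    records.append({"key": key, "existed": key in store, "before": store.get(key)})
    undo_path.write_text(json.dumps(records, indent=2))  # snapshot is written first
    store[key] = new_value  # mutation happens only after the snapshot


def undo_all(store: dict[str, Any], undo_path: Path) -> None:
    """Replay undo records newest-first to restore the original state."""
    records = json.loads(undo_path.read_text())
    for record in reversed(records):
        if record["existed"]:
            store[record["key"]] = record["before"]
        else:
            store.pop(record["key"], None)
    undo_path.write_text("[]")


original = {"fact-1": "Alice works at Acme", "fact-2": "Bob works at Acme"}
facts = dict(original)
with tempfile.TemporaryDirectory() as tmp:
    undo_file = Path(tmp) / "undo.json"
    apply_with_undo(facts, "fact-2", None, undo_file)  # e.g. tombstone a fact
    apply_with_undo(facts, "fact-3", "Carol works at Acme", undo_file)
    mutated = dict(facts)
    undo_all(facts, undo_file)
```

Writing the snapshot before the mutation means a crash between the two steps leaves an extra undo record rather than an unrecoverable change.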
See docs/dream-phase.md for the full operator guide: research lineage, configuration surface, sink wiring, telemetry contract, storage substrate, and stability tags.
Observability
khora emits OpenTelemetry spans and metrics through the OTel API.
The export path is your choice: vanilla OTel SDK (pip install khora[otel]), Logfire
(pip install khora[logfire]), or nothing (zero-cost no-op). Khora
never installs a TracerProvider at import time and never sets
service.name — those belong to the host application.
pip install khora[otel]
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
export OTEL_SERVICE_NAME="my-app"
from khora.telemetry import configure_telemetry
configure_telemetry() # honors OTEL_* env vars
See docs/observability.md for the full env-var
contract, the precedence rules, vendor recipes (Honeycomb, Datadog,
Tempo, etc.), sampling guidance, and the troubleshooting checklist.
The complete telemetry surface lives in
docs/telemetry-contract.json with the
drift gate enforced by tests/unit/telemetry/test_contract.py.
Two separate observability channels live in khora.telemetry:
- Spans + metrics via the OTel API (this section).
- Structured LLMEvent/StorageEvent/PipelineEvent rows to a dedicated PostgreSQL database when KHORA_TELEMETRY_DATABASE_URL is set. Without it, a NoOpCollector is used (zero cost). Wired by init_telemetry(), independent of configure_telemetry().
Credential fields on KhoraConfig (DSNs, passwords) are
pydantic.SecretStr — repr() and config dumps render as
'**********'. Callers that need the cleartext must call
.get_secret_value() explicitly.
Async logging caveat. Library consumers that import khora without
configuring loguru sinks inherit the default sync stderr sink, which
blocks the event loop on every log call inside async def. Either
call khora.logging_config.setup_logging() (which configures sinks
with enqueue=True and registers an atexit drain) or configure
your own loguru sinks with enqueue=True explicitly.
Documentation
Start at docs/README.md. Key entry points:
- API reference: public Khora surface.
- Configuration: KHORA_* env vars and KhoraConfig.
- Observability: OTel spans/metrics, [otel]/[logfire] paths, configure_telemetry().
- Architecture: how the pieces fit.
- Engines: VectorCypher, Skeleton, Chronicle.
- Migrations: Alembic workflow for library users.
- Downstream consumers: sibling packages and integration guide.
Development
make dev # start PostgreSQL + Neo4j (Docker)
make test # pytest with coverage
make format # ruff format + isort
make lint # ruff + ty typecheck
See CHANGELOG.md for release history.
License
Copyright 2026 AllTheData Inc.
Licensed under the Apache License, Version 2.0. See LICENSE and NOTICE.