Knowledge memory library for long-horizon AI agents — hybrid retrieval over documents, embeddings, and graph relationships

Khora

"Khora is the receptacle, the space, the matrix in which all things come to be." — Plato, Timaeus

Khora is a durable knowledge memory library for long-horizon AI agents, with pluggable retrieval engines and storage backends to fit different workloads. It stores what your agent learns — documents, entities, relationships, events, facts — and retrieves it through hybrid search that combines vector similarity, graph traversal, keyword matching, and temporal context. A scheduled "dream phase" then reorganizes the store offline so quality doesn't decay as it grows.

Khora is a library, not an application. You embed it in your agent's process; there is no server. Tooling lives in sibling packages (coming soon: CLI, API/MCP, ontology construction SDK).

Why khora?

Long-horizon agents — copilots, customer-support bots, research assistants, anything that has to remember across sessions — hit four problems that pure vector search doesn't solve:

Ingest is more than chunking. Useful memory needs entities, relationships, and temporal anchors extracted from the raw text. Khora runs a 3-phase ingest pipeline (stage → enrich → expand) with selective LLM extraction (default 70% of chunks, configurable) and cross-batch entity resolution.
Recall is more than cosine. Real questions mix semantic similarity, multi-hop entity reasoning, freshness, and keyword precision. Khora's engines combine all four — vector + Cypher graph traversal + BM25 + RRF fusion + temporal-anchored reranking — and route per query.
Memory drifts. Thresholds calibrated on day one stop working at week ten. Duplicates accumulate from independent ingest batches. Soft-deleted facts pile up. Khora's dream phase audits the store, plans consolidation work (entity dedupe, centroid recompute, fact compaction, event clustering), and applies it under per-op transactions with snapshotted undo records.
Production needs observability. Every recall emits OpenTelemetry spans and metrics through a stable, contract-tested surface; credential fields are SecretStr; free-text never leaks into span attributes. See docs/observability.md.

Khora's bet: instead of one opinionated memory model, ship pluggable engines for different access patterns, share the same storage substrate underneath them, and own the offline-maintenance gap that vector-store wrappers leave to operators.

Engines

Khora ships two production engines and one experimental one. They share the storage substrate (PostgreSQL + pgvector, optionally Neo4j) and the ingest pipeline — pick by access pattern.

VectorCypher (default) — hybrid graph + vector recall

The right choice for knowledge-graph-shaped data: documents that reference entities, entities that reference each other, queries that need multi-hop reasoning ("which engineers worked on projects that shipped with Acme?").

Storage: PostgreSQL + pgvector + Neo4j (or Memgraph / Neptune / AGE).
Retrieval: Vector similarity + Cypher graph traversal + BM25 keyword + RRF fusion + optional PPR (Personalized PageRank) + optional cross-encoder reranking.
Extraction: Selective per-chunk (KET-RAG style; default 70% of chunks get full LLM extraction, the rest get co-occurrence edges).
Best for: Multi-hop reasoning, entity-rich corpora, knowledge bases, "who knows whom"-style queries, anything where graph structure adds signal that flat embeddings miss.
Status: Production-ready on PostgreSQL + Neo4j. Experimental on embedded backends.
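
As a rough illustration of the RRF fusion step named in the retrieval list above, the sketch below merges several ranked result lists with standard Reciprocal Rank Fusion. It is not Khora's implementation: the function name `rrf_fuse`, the constant `k=60` (the value commonly used for RRF), and the chunk IDs are assumptions made for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists with Reciprocal Rank Fusion.

    Each ID scores sum(1 / (k + rank)) over the lists that contain it,
    with rank starting at 1. k=60 is a conventional choice, not a
    documented Khora default.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical per-channel results (IDs are illustrative only).
vector_hits = ["c1", "c2", "c3"]
graph_hits = ["c3", "c1", "c4"]
bm25_hits = ["c2", "c3", "c5"]

fused = rrf_fuse([vector_hits, graph_hits, bm25_hits])
print([doc_id for doc_id, _ in fused])  # ['c3', 'c1', 'c2', 'c4', 'c5']
```

RRF only looks at ranks, never at raw scores, which is why it can combine channels whose scores are not comparable (cosine similarity, BM25, graph hops).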

Chronicle — temporal-first, no graph DB

The right choice for chat-shaped or event-stream data: conversational memory, support tickets, meeting transcripts, anything where "when" is as important as "what".

Storage: PostgreSQL + pgvector. No graph database required.
Retrieval: 4-channel parallel — semantic vector + BM25 keyword + temporal + entity — fused with abstention signals that flag low-confidence answers before they reach your LLM.
Extraction: SVO events (subject-verb-object), entities, and facts via the same shared ingest pipeline.
Time model: Triple timestamps (valid_from / valid_to / recorded_at) + Ebbinghaus forgetting-curve decay applied to relevance scores.
Best for: Long conversations across sessions, recency-sensitive recall ("what did Alice say last week?"), benchmark-optimized retrieval (LongMemEval, LoCoMo, BEAM), deployments without a graph DB.
Status: Production-ready on PostgreSQL + pgvector. Experimental on embedded backends.
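
To make the time model above concrete, here is a minimal sketch of a validity check over `valid_from` / `valid_to` and an Ebbinghaus-style decay, R = exp(-t / S), applied to a relevance score. The `Fact` dataclass, the choice of `recorded_at` as the decay clock, and the 30-day stability constant are all assumptions for illustration; Khora's actual schema and parameters may differ.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Fact:
    """Hypothetical record carrying the three timestamps named in the text."""
    text: str
    relevance: float               # raw retrieval score before decay
    valid_from: datetime
    valid_to: Optional[datetime]   # None means "still valid"
    recorded_at: datetime


def is_valid_at(fact: Fact, when: datetime) -> bool:
    """True if `when` falls inside the fact's validity interval."""
    return fact.valid_from <= when and (fact.valid_to is None or when < fact.valid_to)


def decayed_score(fact: Fact, now: datetime, stability_days: float = 30.0) -> float:
    """Scale relevance by the retention R = exp(-age / stability)."""
    age_days = max((now - fact.recorded_at).total_seconds() / 86400.0, 0.0)
    return fact.relevance * math.exp(-age_days / stability_days)


now = datetime(2026, 1, 31, tzinfo=timezone.utc)
fact = Fact(
    text="Alice prefers email",
    relevance=1.0,
    valid_from=now - timedelta(days=60),
    valid_to=None,
    recorded_at=now - timedelta(days=30),
)
score = decayed_score(fact, now)
print(is_valid_at(fact, now), round(score, 4))  # True 0.3679
```

With a stability of 30 days, a fact recorded 30 days ago keeps about 37% of its raw relevance; a larger stability value flattens the curve.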

Skeleton (experimental) — minimal-LLM ingestion

Lazy-extraction engine that runs LLM-based entity extraction only on ~10% of chunks ("skeleton core" by PageRank) and expands on demand. Experimental — feature-complete enough for cost-sensitive prototypes and evaluation, but not production-stamped. New work should start with VectorCypher or Chronicle; revisit Skeleton if LLM cost dominates and your queries are mostly time-windowed hybrid search rather than multi-hop entity reasoning.
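
The idea of a "skeleton core" chosen by PageRank can be sketched as follows: score chunks with a plain power-iteration PageRank over a chunk graph, then keep the top ~10%. What the edges between chunks represent, the damping factor, and the function names are assumptions for this example and are not taken from Khora's code.

```python
import math


def pagerank(edges: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Power-iteration PageRank over a directed graph given as adjacency lists."""
    nodes = set(edges) | {t for targets in edges.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not edges.get(node))
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for source, targets in edges.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


def skeleton_core(rank: dict[str, float], fraction: float = 0.10) -> list[str]:
    """Return the top `fraction` of nodes by rank (at least one node)."""
    keep = max(1, math.ceil(len(rank) * fraction))
    ordered = sorted(rank, key=lambda node: (-rank[node], node))
    return ordered[:keep]


# Ten hypothetical chunks; c1..c9 all link to c0.
chunk_graph = {f"c{i}": ["c0"] for i in range(1, 10)}
rank = pagerank(chunk_graph)
core = skeleton_core(rank)
print(core)  # ['c0']
```

Only the chunks in `core` would be sent to LLM extraction up front; the rest would be expanded on demand.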

Engine	Multi-hop entity recall	Temporal recall	Graph DB needed	LLM cost (1k docs)	Status
VectorCypher	✓ Native via Cypher	✓ Temporal detection + reranking	✓ Required	~$0.10–0.20	Production
Chronicle	Limited (entity channel only)	✓ Native bi-temporal + Ebbinghaus decay	— Not required	~$0.15–0.30	Production
Skeleton	Limited (lazy expansion)	✓ Bi-temporal	— Not required	~$0.02–0.05	Experimental

See docs/engines/engine-comparison.md for the detailed comparison: full feature matrix, cost analysis per workload, hybrid-engine patterns, and migration recipes.

Install

pip install khora                 # core (PostgreSQL + pgvector)
pip install khora[neo4j]          # + Neo4j for VectorCypher
pip install khora[sqlite-lance]   # [experimental] embedded SQLite + LanceDB
pip install khora[surrealdb]      # [experimental] unified SurrealDB (single store)
pip install khora[all-backends]   # everything: Neo4j, SurrealDB, SQLite+LanceDB, Weaviate, AGE

See docs/configuration.md for the full extras list.

Production stack

The recommended production stack is PostgreSQL + pgvector + Neo4j, which runs VectorCypher (default) and Chronicle from the same database. Set KHORA_DATABASE_URL and KHORA_NEO4J_URL, apply the migrations with uv run alembic upgrade head, then instantiate Khora() with no arguments:

import asyncio
from khora import Khora

async def main() -> None:
    async with Khora() as kb:  # reads KHORA_DATABASE_URL / KHORA_NEO4J_URL
        ns = await kb.create_namespace()  # keyword-only kwargs; no positional name
        await kb.remember(
            "Marie Curie won the Nobel Prize in Physics in 1903.",
            namespace=ns.namespace_id,
        )
        result = await kb.recall("What did Curie win?", namespace=ns.namespace_id)
        print(result.context_text)

asyncio.run(main())

Batch processing

submit_batch() stages documents as PENDING and returns a BatchHandle immediately. A background processor picks them up and calls on_result per document as each completes.

The processor is opt-in. Call kb.start_pending_processor() after connect() on services that write documents. Read-only services do not need it. The processor can be stopped with await kb.stop_pending_processor() and restarted at any time.

import asyncio
from khora import Khora

async def main() -> None:
    async with Khora() as kb:
        ns = await kb.create_namespace()
        kb.start_pending_processor()   # opt-in; write-path services only
        handle = await kb.submit_batch(
            [{"content": "doc 1"}, {"content": "doc 2"}],
            on_result=lambda completed, total, result: print(result),
            namespace=ns.namespace_id,
        )
        await handle.wait()

asyncio.run(main())
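
For anything beyond printing, a small callable object can stand in for the lambda passed as on_result. The sketch below assumes only what the example above shows: the callback is invoked with (completed, total, result). The `BatchProgress` class is hypothetical, the result is treated as an opaque value, and the two calls at the end merely simulate the processor.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class BatchProgress:
    """Collects per-document results; pass an instance as on_result."""
    results: list[Any] = field(default_factory=list)
    completed: int = 0
    total: int = 0

    def __call__(self, completed: int, total: int, result: Any) -> None:
        # Same positional signature as the lambda in the example above.
        self.completed, self.total = completed, total
        self.results.append(result)

    @property
    def done(self) -> bool:
        return self.total > 0 and self.completed >= self.total


progress = BatchProgress()
# Simulated callbacks; in real use the background processor makes these calls.
progress(1, 2, "result for doc 1")
progress(2, 2, "result for doc 2")
print(progress.done, len(progress.results))  # True 2
```

In real use you would pass `on_result=progress` to submit_batch() and inspect `progress.results` after the handle's wait() returns.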

Embedded options (experimental)

Khora ships two zero-infrastructure paths. Both are marked experimental — fine for demos, evaluation, tests, and small single-user CLIs; not yet stamped as a deployment story.

SQLite + LanceDB (pip install khora[sqlite-lance], set KHORA_STORAGE_BACKEND=sqlite_lance) — recommended embedded stack. Covers VectorCypher, Skeleton, and Chronicle via dialect-aware Alembic migrations and LanceDB-backed vector search. Documented scale ceiling: ~1M chunks, ~100k entities, ~500k edges, traversal depth ≤3. Known gaps: no point-in-time queries, partial atomicity in coordinator.transaction(), FTS on chunks only. See configuration.md.
SurrealDB (pip install khora[surrealdb]) — unified relational + vector + graph in one store. Python SDK is on the alpha track (>=2.0.0a1), and KNN (<|K|>) is unreliable in embedded mode (uses brute-force cosine + HNSW fallback). Remote (WebSocket) mode supports atomic multi-statement transactions via conn.transaction() (v0.12.0); embedded / memory modes still operate per-statement-atomic. Suitable for experimentation; not recommended for production.
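
A quick way to decide whether a workload fits the SQLite + LanceDB path is to compare its size against the ceilings quoted above. The helper below is a hypothetical planning aid, not part of Khora; the limits are the approximate figures from the text, so treat a near-miss as a warning rather than a hard failure.

```python
# Approximate ceilings quoted above for the SQLite + LanceDB stack.
EMBEDDED_CEILING = {
    "chunks": 1_000_000,
    "entities": 100_000,
    "edges": 500_000,
    "traversal_depth": 3,
}


def exceeded_limits(workload: dict[str, int]) -> list[str]:
    """Return one message per dimension that exceeds the documented ceiling."""
    return [
        f"{name}: {workload[name]:,} > {limit:,}"
        for name, limit in EMBEDDED_CEILING.items()
        if workload.get(name, 0) > limit
    ]


# Hypothetical corpus estimate.
estimate = {"chunks": 250_000, "entities": 120_000, "edges": 400_000, "traversal_depth": 2}
problems = exceeded_limits(estimate)
print(problems)  # ['entities: 120,000 > 100,000']
```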

Quickstart caveat. A literal Khora("memory://") call passes "memory://" as the PostgreSQL URL, not as a backend selector — there is no memory:// URL scheme parsed by khora itself today. To use the embedded path, set KHORA_STORAGE_BACKEND=sqlite_lance (or surrealdb) and the corresponding db_path / connection settings.

Integrations

Khora ships ready-made adapters for the major agentic frameworks. Each adapter is an opt-in optional extra — install only what you use, and the framework itself is imported lazily so importing khora never pulls in a framework you don't need.

Framework	Install	Khora surface
CrewAI	`pip install khora[crewai]`	`KhoraMemory` — drop-in storage backend for CrewAI's unified `Memory`.
LangGraph	`pip install khora[langgraph]`	`KhoraStore` — `BaseStore` implementation for `StateGraph` semantic long-term memory.
Google ADK	`pip install khora[google-adk]`	`KhoraMemoryService` — `BaseMemoryService` drop-in for ADK `Runner`.
OpenAI Agents SDK	`pip install khora[openai-agents]`	`KhoraSession` (`SessionABC`), `khora_recall_tool`, `KhoraMemoryHooks` — compose for session memory, recall-as-tool, and auto-persist.
LlamaIndex	`pip install khora[llamaindex]`	`KhoraRetriever` (async `BaseRetriever`), `KhoraMemoryBlock`, and the deprecated `KhoraChatStore`.

See docs/integrations/ for the full per-adapter docs and the "write your own" Protocol surface.

Maintenance: dream phase

Khora ships an offline maintenance pass ("dream phase") that audits an accumulated namespace and plans consolidation work — entity dedupe, fact compaction, event clustering. Run it on a schedule (cron, Temporal, k8s CronJob) and consume the structured reports through three independently-togglable sinks: file, semantic-event, or telemetry collector.

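from khora import Khora, KhoraConfig, DreamConfig

async def audit(namespace_id) -> None:
    config = KhoraConfig(dream=DreamConfig(enabled=True))
    async with Khora(config=config) as kb:
        # Dry-run: pure observation/planning, zero writes.
        result = await kb.dream(namespace_id, mode="dry-run")

        for op in result.ops:
            print(op.op_type, op.decision, op.outputs)

# Call from your own async code with the ID of an existing namespace:
#     await audit(ns.namespace_id)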

Ten operations ship in v0.15.0 across both engines:

Phase	Engine	Operation	Surfaces
1 audit	chronicle	abstention-threshold drift	Configured thresholds vs observed p50/p90/p99
1 audit	chronicle	tombstone audit	Active / inactive / invalidated fact ratios + age distribution
1 audit	vectorcypher	schema drift	New / unused / frequency-changed types vs `ExpertiseConfig`
1 audit	vectorcypher	orphan PageRank	Bottom-5% PR entities flagged as `archive_candidate`
1 audit	vectorcypher	source_chunk_ids audit	Dead UUID counts + array-length distribution
2 planner	vectorcypher	cross-batch entity dedupe	Pairs above the per-type cosine threshold, planned merges
2 planner	vectorcypher	centroid recompute	Per-cluster `centroid` / `re_embed` / `skip_multimodal` decisions
2 planner	vectorcypher	source_chunk_ids GC	Per-entity rewrites dropping dead chunk UUIDs
2 planner	chronicle	memory_facts compaction	Tombstoned rows past `retention_days`
2 planner	chronicle	event clustering	Near-duplicate `chronicle_events` within a sliding window
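
To show what "pairs above the per-type cosine threshold" could look like in practice, the sketch below plans merge candidates from entity embeddings. It is a simplified stand-in for the cross-batch entity dedupe planner: the tuple layout, the threshold values, the rule that only same-type entities are compared, and the O(n²) pairwise scan are assumptions for illustration.

```python
import math
from itertools import combinations


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def plan_merges(entities: list[tuple[str, str, list[float]]],
                thresholds: dict[str, float],
                default_threshold: float = 0.9) -> list[tuple[str, str, float]]:
    """Plan (id_a, id_b, similarity) merges for same-type pairs above threshold.

    Nothing is mutated here: like a dry run, this only produces a plan.
    """
    plan = []
    for (id_a, type_a, vec_a), (id_b, type_b, vec_b) in combinations(entities, 2):
        if type_a != type_b:
            continue  # assumption: entities of different types are never merged
        similarity = cosine(vec_a, vec_b)
        if similarity >= thresholds.get(type_a, default_threshold):
            plan.append((id_a, id_b, round(similarity, 4)))
    return plan


# Hypothetical entities: (id, type, embedding).
entities = [
    ("e1", "person", [1.0, 0.0]),
    ("e2", "person", [0.99, 0.1]),
    ("e3", "person", [0.0, 1.0]),
    ("e4", "org", [1.0, 0.0]),
]
plan = plan_merges(entities, thresholds={"person": 0.95})
print(plan)
```

Here only e1 and e2 are planned for merging: e3 points in a different direction, and e4 has an identical vector to e1 but a different type.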

Default is DreamConfig(enabled=False) — the master switch is opt-in. Both modes are live in v0.15.0: mode="dry-run" emits the plan only; mode="apply" runs the matching apply handler under a per-op transaction with the pre-state snapshotted into undo.json before each mutation. Five guardrails protect the apply path (7-day hard retention floor, KHORA_DREAM_DISABLE_APPLY kill-switch, chunk_id runtime assertion, snapshot-before-delete, advisory-lock-held-through-apply).
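
The snapshot-before-mutation guardrail can be pictured with a toy example: write the pre-state to an undo file first, apply the change second, and restore from the file if needed. This is a generic sketch of the pattern, with an in-memory dict standing in for the database and no real transaction; the undo record layout is hypothetical and is not Khora's undo.json format.

```python
import json
import tempfile
from pathlib import Path


def apply_with_undo(store: dict, key: str, new_value, undo_path: Path) -> None:
    """Snapshot the pre-state to undo_path, then mutate the store."""
    record = {"key": key, "existed": key in store, "before": store.get(key)}
    undo_path.write_text(json.dumps(record))  # snapshot first
    store[key] = new_value                    # mutate second


def undo(store: dict, undo_path: Path) -> None:
    """Restore the state captured by apply_with_undo."""
    record = json.loads(undo_path.read_text())
    if record["existed"]:
        store[record["key"]] = record["before"]
    else:
        store.pop(record["key"], None)


with tempfile.TemporaryDirectory() as tmp:
    undo_file = Path(tmp) / "undo.json"
    store = {"entity:42": {"name": "Acme Corp"}}
    apply_with_undo(store, "entity:42", {"name": "Acme"}, undo_file)
    after_apply = dict(store)
    undo(store, undo_file)

print(after_apply, store)
```

Writing the snapshot before the mutation means a crash between the two steps leaves an unused undo record rather than an unrecoverable change.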

See docs/dream-phase.md for the full operator guide: research lineage, configuration surface, sink wiring, telemetry contract, storage substrate, and stability tags.

Observability

Khora emits OpenTelemetry spans and metrics through the OTel API. The export path is your choice: vanilla OTel SDK (pip install khora[otel]), Logfire (pip install khora[logfire]), or nothing (zero-cost no-op). Khora never installs a TracerProvider at import time and never sets service.name; those belong to the host application.

pip install khora[otel]
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
export OTEL_SERVICE_NAME="my-app"

from khora.telemetry import configure_telemetry
configure_telemetry()      # honors OTEL_* env vars

See docs/observability.md for the full env-var contract, the precedence rules, vendor recipes (Honeycomb, Datadog, Tempo, etc.), sampling guidance, and the troubleshooting checklist. The complete telemetry surface lives in docs/telemetry-contract.json with the drift gate enforced by tests/unit/telemetry/test_contract.py.

Two separate observability channels live in khora.telemetry:

Spans + metrics via the OTel API (this section).
Structured LLMEvent / StorageEvent / PipelineEvent rows to a dedicated PostgreSQL database when KHORA_TELEMETRY_DATABASE_URL is set. Without it, a NoOpCollector is used (zero cost). Wired by init_telemetry(), independent of configure_telemetry().

Credential fields on KhoraConfig (DSNs, passwords) are pydantic.SecretStr — repr() and config dumps render as '**********'. Callers that need the cleartext must call .get_secret_value() explicitly.

Async logging caveat. Library consumers that import khora without configuring loguru sinks inherit the default sync stderr sink, which blocks the event loop on every log call inside async def. Either call khora.logging_config.setup_logging() (which configures sinks with enqueue=True and registers an atexit drain) or configure your own loguru sinks with enqueue=True explicitly.

Documentation

Start at docs/README.md. Key entry points:

API reference — public Khora surface.
Configuration — KHORA_* env vars and KhoraConfig.
Observability — OTel spans/metrics, [otel]/[logfire] paths, configure_telemetry().
Architecture — how the pieces fit.
Engines — VectorCypher, Skeleton, Chronicle.
Migrations — Alembic workflow for library users.
Downstream consumers — sibling packages and integration guide.

Development

make dev         # start PostgreSQL + Neo4j (Docker)
make test        # pytest with coverage
make format      # ruff format + isort
make lint        # ruff + ty typecheck

See CHANGELOG.md for release history.

License

Licensed under the Apache License, Version 2.0. See LICENSE and NOTICE.

This version: 0.15.3, released May 19, 2026.
