Knowledge memory library for long-horizon AI agents — hybrid retrieval over documents, embeddings, and graph relationships

Khora

Python 3.13+ · License: Apache 2.0

"Khora is the receptacle, the space, the matrix in which all things come to be." — Plato, Timaeus

Khora is a knowledge memory library for long-horizon AI agents, with pluggable retrieval engines and storage backends to fit different workloads. It stores knowledge as documents, embeddings, and graph relationships, and retrieves it through hybrid search (vector + graph + keyword), reranking, and temporal context.

Khora is a library, not an application. Tooling will live in sibling packages, which are not yet released:

  • khora-cli — CLI tooling for extraction and search.
  • khora-explorer — tooling for ontology construction and exploration.

Install

pip install khora                 # core (PostgreSQL + pgvector)
pip install khora[sqlite-lance]   # [experimental] embedded SQLite + LanceDB
pip install khora[surrealdb]      # [experimental] unified SurrealDB (single store)
pip install khora[all-backends]   # everything: Neo4j, SurrealDB, SQLite+LanceDB, Weaviate, AGE

See docs/configuration.md for the full extras list.

Production stack

The production stack is PostgreSQL + pgvector + Neo4j:

  • VectorCypher (default engine) — runs on PostgreSQL + pgvector + Neo4j.
  • Chronicle — runs on PostgreSQL + pgvector (no graph DB required).
  • Skeleton — available; PostgreSQL + pgvector (no graph DB required).

Set KHORA_DATABASE_URL and KHORA_NEO4J_URL, apply the migrations with uv run alembic upgrade head, then instantiate Khora() with no arguments:

import asyncio
from khora import Khora

async def main() -> None:
    async with Khora() as kb:  # reads KHORA_DATABASE_URL / KHORA_NEO4J_URL
        ns = await kb.create_namespace()  # keyword-only kwargs; no positional name
        await kb.remember(
            "Marie Curie won the Nobel Prize in Physics in 1903.",
            namespace=ns.namespace_id,
        )
        result = await kb.recall("What did Curie win?", namespace=ns.namespace_id)
        print(result.context_text)

asyncio.run(main())
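Because Khora() with no arguments reads its connection settings from the environment, a small preflight check can fail fast with a clear message before the library is touched. The sketch below uses only the two variable names mentioned above and assumes the default VectorCypher stack (engines that need no graph DB would presumably not require the Neo4j URL). The helper name missing_env is hypothetical and not part of khora.

```python
import os
from collections.abc import Mapping

# Variable names come from the text above; the helper itself is hypothetical.
REQUIRED_VARS = ("KHORA_DATABASE_URL", "KHORA_NEO4J_URL")


def missing_env(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or empty, in declared order."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

A service could call missing_env() at startup and exit with the returned names if the list is non-empty, rather than discovering the problem on the first database call.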

Batch processing

submit_batch() stages documents as PENDING and returns a BatchHandle immediately. A background processor picks them up and calls on_result per document as each completes.

The processor is opt-in. Call kb.start_pending_processor() after connect() on services that write documents. Read-only services do not need it. The processor can be stopped with await kb.stop_pending_processor() and restarted at any time.

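import asyncio
from khora import Khora

async def main() -> None:
    async with Khora() as kb:
        kb.start_pending_processor()   # opt-in; write-path services only
        ns = await kb.create_namespace()
        handle = await kb.submit_batch(
            [{"content": "doc 1"}, {"content": "doc 2"}],
            on_result=lambda completed, total, result: print(result),
            namespace=ns.namespace_id,
        )
        await handle.wait()            # block until the whole batch has completed

asyncio.run(main())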
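When a lambda is too limited, any callable with the same three arguments can serve as on_result. The following is a minimal sketch of a collector that records results and tracks progress. The class name BatchProgress is hypothetical, and the argument order (completed, total, result) is inferred from the lambda in the example above rather than from a documented signature.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class BatchProgress:
    """Collects per-document results; pass an instance as on_result."""

    results: list[Any] = field(default_factory=list)
    completed: int = 0
    total: int = 0

    def __call__(self, completed: int, total: int, result: Any) -> None:
        # Argument order mirrors the lambda above (an assumption about the
        # callback contract). max() keeps the counter monotonic even if
        # callbacks were to arrive out of order.
        self.completed = max(self.completed, completed)
        self.total = total
        self.results.append(result)

    @property
    def done(self) -> bool:
        """True once at least one callback fired and every document is counted."""
        return self.total > 0 and self.completed >= self.total
```

An instance would be passed as on_result=progress, and progress.results inspected after the batch handle's wait() returns.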

Embedded options (experimental)

Khora ships two zero-infrastructure paths. Both are experimental: fine for demos, evaluation, tests, and small single-user CLIs, but not yet supported as a production deployment path.

  • SQLite + LanceDB (pip install khora[sqlite-lance], set KHORA_STORAGE_BACKEND=sqlite_lance) — recommended embedded stack. Covers VectorCypher, Skeleton, and Chronicle via dialect-aware Alembic migrations and LanceDB-backed vector search. Documented scale ceiling: ~1M chunks, ~100k entities, ~500k edges, traversal depth ≤3. Known gaps: no point-in-time queries, partial atomicity in coordinator.transaction(), FTS on chunks only. See configuration.md.
  • SurrealDB (pip install khora[surrealdb]) — unified relational + vector + graph in one store. Python SDK is on the alpha track (>=2.0.0a1), and KNN (<|K|>) is unreliable in embedded mode (uses brute-force cosine + HNSW fallback). Remote (WebSocket) mode supports atomic multi-statement transactions via conn.transaction() (v0.12.0); embedded / memory modes still operate per-statement-atomic. Suitable for experimentation; not recommended for production.
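The documented scale ceiling for the SQLite + LanceDB stack can be turned into a quick sizing check. This is only a sketch: the figures are the approximate ones quoted above, and Workload and exceeded_limits are hypothetical names that do not exist in khora.

```python
from dataclasses import dataclass

# Approximate ceilings quoted above for the SQLite + LanceDB stack.
CEILING = {"chunks": 1_000_000, "entities": 100_000, "edges": 500_000, "depth": 3}


@dataclass(frozen=True)
class Workload:
    chunks: int
    entities: int
    edges: int
    depth: int  # maximum graph traversal depth the application needs


def exceeded_limits(workload: Workload) -> list[str]:
    """Return the dimensions of the workload that exceed the documented ceiling."""
    return [name for name, limit in CEILING.items() if getattr(workload, name) > limit]
```

An empty list suggests the workload fits the embedded stack on paper; any returned name is a hint to evaluate the PostgreSQL-based production stack instead.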

Quickstart caveat. A literal Khora("memory://") call passes "memory://" as the PostgreSQL URL, not as a backend selector — there is no memory:// URL scheme parsed by khora itself today. To use the embedded path, set KHORA_STORAGE_BACKEND=sqlite_lance (or surrealdb) and the corresponding db_path / connection settings.
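One way to avoid the memory:// mistake is to select the backend through the environment variable and reject URL-like values up front. The sketch below does only that. The function name select_embedded_backend is hypothetical, the two accepted values are the ones named in this section, and the db_path / connection settings still have to be supplied separately as described in configuration.md.

```python
import os
from collections.abc import MutableMapping

# Backend names taken from this section; other values may exist in khora.
EMBEDDED_BACKENDS = frozenset({"sqlite_lance", "surrealdb"})


def select_embedded_backend(
    name: str, env: MutableMapping[str, str] = os.environ
) -> None:
    """Set KHORA_STORAGE_BACKEND after validating the requested backend name."""
    if "://" in name:
        # A URL such as "memory://" is not a backend selector.
        raise ValueError(f"{name!r} looks like a URL, not a backend name")
    if name not in EMBEDDED_BACKENDS:
        raise ValueError(f"unknown embedded backend: {name!r}")
    env["KHORA_STORAGE_BACKEND"] = name
```

The variable would need to be set before Khora() is constructed, assuming the configuration is read from the environment at that point.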

Observability

khora emits OpenTelemetry spans and metrics through the OTel API. The export path is your choice: vanilla OTel SDK (pip install khora[otel]), Logfire (pip install khora[logfire]), or nothing (zero-cost no-op). Khora never installs a TracerProvider at import time and never sets service.name — those belong to the host application.

pip install khora[otel]
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
export OTEL_SERVICE_NAME="my-app"

from khora.telemetry import configure_telemetry
configure_telemetry()      # honors OTEL_* env vars

See docs/observability.md for the full env-var contract, the precedence rules, vendor recipes (Honeycomb, Datadog, Tempo, etc.), sampling guidance, and the troubleshooting checklist. The complete telemetry surface lives in docs/telemetry-contract.json with the drift gate enforced by tests/unit/telemetry/test_contract.py.

Two separate observability channels live in khora.telemetry:

  • Spans + metrics via the OTel API (this section).
  • Structured LLMEvent / StorageEvent / PipelineEvent rows to a dedicated PostgreSQL database when KHORA_TELEMETRY_DATABASE_URL is set. Without it, a NoOpCollector is used (zero cost). Wired by init_telemetry(), independent of configure_telemetry().

Credential fields on KhoraConfig (DSNs, passwords) are pydantic.SecretStr: repr() and config dumps render them as '**********'. Callers that need the cleartext must call .get_secret_value() explicitly.

Async logging caveat. Library consumers that import khora without configuring loguru sinks inherit the default sync stderr sink, which blocks the event loop on every log call inside async def. Either call khora.logging_config.setup_logging() (which configures sinks with enqueue=True and registers an atexit drain) or configure your own loguru sinks with enqueue=True explicitly.

Documentation

Start at docs/README.md.

Development

make dev         # start PostgreSQL + Neo4j (Docker)
make test        # pytest with coverage
make format      # ruff format + isort
make lint        # ruff + ty typecheck

See CHANGELOG.md for release history.

License

Copyright 2026 AllTheData Inc.

Licensed under the Apache License, Version 2.0. See LICENSE and NOTICE.
