Lode: fully local repository knowledge graph daemon and CLI.

Lode

Lode is a fully local repository knowledge graph for coding agents.

It is a CLI-first, Docker-runnable code intelligence service. Lode indexes local repositories into a fast SQLite hot path and projects the same facts into embedded Kuzu for graph traversal. It is designed for agents that need quick answers to: where is this symbol, what is connected to it, what should I read, and what breaks if I change it?

Goals

  • Fully local: no accounts, no hosted control plane, no remote API calls by default.
  • Agent-native: compact JSON, bounded output, confidence labels, and file:line citations.
  • CLI first: agents call lode directly; MCP can be a thin compatibility shim later.
  • Hybrid storage: SQLite for exact/FTS hot-path lookup, Kuzu for graph/Cypher/vector workloads.
  • Docker on login: run loded as a local service and keep indexes fresh.

Current state

This is an early open-source MVP. It currently provides:

  • lode index PATH for Python, TypeScript/JavaScript, Markdown, and config-ish files.
  • lode search QUERY --json over SQLite FTS5.
  • lode symbol NAME --json for exact-ish symbol lookup.
  • lode context QUERY --json --budget N for an agent context pack.
  • lode neighbors NODE_ID --json for direct graph neighbors.
  • loded local HTTP daemon with /health, /status, /index, /search, and /context.
  • Optional Kuzu projection code when the kuzu extra is installed.
  • Docker Compose with a local TEI embeddings service using Snowflake/snowflake-arctic-embed-s.
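
As a rough illustration of the SQLite FTS5 lookup that lode search is described as using, the sketch below builds a tiny in-memory full-text index with Python's standard sqlite3 module. The table name chunks, its columns, and both helper functions are hypothetical and are not Lode's actual schema or API; the sketch also assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3


def build_index(docs):
    """Create an in-memory FTS5 table and load (path, body) pairs into it."""
    conn = sqlite3.connect(":memory:")
    # `path` is stored but not tokenized; only `body` is searchable.
    conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(path UNINDEXED, body)")
    conn.executemany("INSERT INTO chunks (path, body) VALUES (?, ?)", docs)
    return conn


def search(conn, query, limit=5):
    """Return up to `limit` paths, best match first (FTS5's built-in rank)."""
    rows = conn.execute(
        "SELECT path FROM chunks WHERE chunks MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
    return [path for (path,) in rows]
```

The explicit LIMIT mirrors the bounded-output goal: a caller always knows the maximum number of rows it will receive.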

Install

The PyPI package is lode-kg; it installs the lode CLI and loded daemon commands.

uv tool install lode-kg
lode --help

Or install with the optional embedded Kuzu projection support:

uv tool install 'lode-kg[kuzu]'

Quick start

lode index ~/Projects/lode
lode search "knowledge graph" --json
lode context "how does indexing work" --json --budget 4000
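
An agent harness written in Python might wrap these commands roughly as follows. The helper names lode_search_argv and run_json are hypothetical and not part of Lode; only the lode search flags (--repo, --limit, --json) come from the command list in this README, and nothing is assumed about the shape of the JSON that comes back.

```python
import json
import subprocess


def lode_search_argv(query, repo=None, limit=None):
    """Build the argument vector for `lode search QUERY ... --json`."""
    argv = ["lode", "search", query]
    if repo is not None:
        argv += ["--repo", str(repo)]
    if limit is not None:
        argv += ["--limit", str(limit)]
    argv.append("--json")
    return argv


def run_json(argv, runner=subprocess.run):
    """Run a command and parse its stdout as JSON.

    `runner` is injectable so the wrapper can be exercised without
    the lode binary being installed.
    """
    proc = runner(argv, capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)
```

Passing the query as a single list element avoids shell quoting problems when the query contains spaces.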

Run the local daemon:

loded --host 127.0.0.1 --port 7979

Use Docker Compose:

docker compose up -d --build
curl http://127.0.0.1:7979/health

The loded container runs as ${LODE_UID:-1000}:${LODE_GID:-1000} so its SQLite file stays writable by the host CLI. Export LODE_UID=$(id -u) and LODE_GID=$(id -g) first if your user is not UID/GID 1000.

Index a mounted repo through the daemon:

curl -sS -X POST http://127.0.0.1:7979/index \
  -H 'content-type: application/json' \
  -d '{"path":"/repos/lode"}' | jq
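
The same request can be sketched with Python's standard library, for callers that prefer not to shell out to curl. The function names index_request and send are hypothetical; the URL, header, and body mirror the curl command above, and the response is assumed to be JSON only because that command pipes it to jq.

```python
import json
import urllib.request


def index_request(path, base_url="http://127.0.0.1:7979"):
    """Build a POST /index request for a repo path as seen by the daemon."""
    body = json.dumps({"path": path}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/index",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req, timeout=30):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the daemon running, send(index_request("/repos/lode")) would issue the call. The path is the one visible inside the container, not the host path.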

Architecture

agent / human
    |
  lode CLI
    |
localhost HTTP or direct DB
    |
  loded daemon
    |--------------------------|
    | scanner / parser         |
    | resolver                 |
    | context pack builder     |
    | embedding queue          |
    | Kuzu projector           |
    |--------------------------|
       |                   |
 SQLite hot index       Kuzu graph DB
       |                   |
       +---- fact projection ----+
               |
        TEI embeddings service

SQLite is the fast operational index. Kuzu is the graph analytics and Cypher projection. Facts should eventually be append-only and replayable so both projections can be rebuilt.
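
The replayable-facts idea can be sketched as one append-only list of facts that is projected into two independent stores. Everything here is hypothetical: the (source, relation, target) fact shape, the edges table, and the in-memory adjacency map, which merely stands in for a graph database and does not use Kuzu.

```python
import sqlite3
from collections import defaultdict

# Hypothetical fact log: (source node, relation, target node), append-only.
FACTS = [
    ("cli.py", "imports", "index.py"),
    ("index.py", "defines", "index_repo"),
    ("cli.py", "calls", "index_repo"),
]


def project_to_sqlite(facts):
    """Rebuild a relational projection from scratch by replaying every fact."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (src TEXT, rel TEXT, dst TEXT)")
    conn.executemany("INSERT INTO edges VALUES (?, ?, ?)", facts)
    return conn


def project_to_graph(facts):
    """Rebuild an adjacency-map projection from the same facts."""
    graph = defaultdict(set)
    for src, rel, dst in facts:
        graph[src].add((rel, dst))
    return graph
```

Because both projections derive from the same log, either one can be dropped and rebuilt without touching the other.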

CLI commands

lode index PATH [--data-dir DIR] [--sync-kuzu]
lode status [--json]
lode search QUERY [--repo PATH] [--limit N] [--json]
lode symbol NAME [--repo PATH] [--limit N] [--json]
lode context QUERY [--repo PATH] [--budget N] [--json]
lode neighbors NODE_ID [--json]
lode kuzu-sync
lode embed [--limit N] [--url URL] [--model MODEL] [--json]
lode serve --host 127.0.0.1 --port 7979

kg and kgd are temporary aliases for lode and loded while the project is young.

Storage layout

Default data directory:

~/.local/share/lode/
  lode.sqlite
  lode.kuzu/

Embeddings

The Docker Compose file starts Hugging Face Text Embeddings Inference with Snowflake/snowflake-arctic-embed-s. It exposes /embed on 127.0.0.1:7980 for local smoke tests and wires the daemon with LODE_EMBEDDINGS_URL=http://embeddings:80. Embeddings are intentionally secondary to exact search and graph traversal.

Model choice: web research (via Exa) found that snowflake-arctic-embed-s was the strongest 33M-parameter, 384-dimension small English retrieval model in the comparison set, with an MTEB retrieval NDCG@10 of 51.98 versus 51.68 for BAAI/bge-small-en-v1.5. It also ships ONNX artifacts and was smoke-tested successfully against the TEI CPU /embed endpoint.

Embed queued nodes after indexing:

docker compose up -d embeddings
LODE_EMBEDDINGS_URL=http://127.0.0.1:7980 \
  LODE_EMBEDDINGS_MODEL=Snowflake/snowflake-arctic-embed-s \
  uv run lode embed --limit 32 --json

The first attempted default, Qwen/Qwen3-Embedding-0.6B, is not a safe TEI CPU default here: the container downloads the model, reports missing ONNX artifacts, falls back to Candle CPU warmup, and restarts before /embed serves. BAAI/bge-small-en-v1.5 works, but Snowflake/snowflake-arctic-embed-s is the current small-model default because it is the same size class and scored slightly better in the retrieved benchmark data.
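
For a smoke test of the embeddings service outside of lode embed, the TEI /embed endpoint accepts a JSON body with an inputs field and returns one vector per input. The sketch below builds that request and checks the response shape against the 384 dimensions quoted above. The function names are hypothetical, and the default URL assumes the 127.0.0.1:7980 port mapping from the Compose file.

```python
import json
import urllib.request


def embed_request(texts, base_url="http://127.0.0.1:7980"):
    """Build a POST /embed request carrying a batch of input strings."""
    body = json.dumps({"inputs": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_vectors(vectors, expected_count, expected_dim=384):
    """Validate the shape: one vector per input, each of length expected_dim."""
    if len(vectors) != expected_count:
        raise ValueError(f"expected {expected_count} vectors, got {len(vectors)}")
    for vector in vectors:
        if len(vector) != expected_dim:
            raise ValueError(f"expected dimension {expected_dim}, got {len(vector)}")
    return vectors


def embed(texts, base_url="http://127.0.0.1:7980", timeout=60):
    """Send the batch to a running TEI instance and return validated vectors."""
    texts = list(texts)
    with urllib.request.urlopen(embed_request(texts, base_url), timeout=timeout) as resp:
        return check_vectors(json.loads(resp.read()), len(texts))
```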

Benchmarks

Latest local run: 2026-05-31 on an AMD Ryzen 9 8945HS (16 logical cores) with Python 3.13.9. Raw artifacts are under bench-results/20260531T184011Z/, which is ignored by version control. The SQLite hot path is the per-turn agent path; Kuzu sync is an optional batch/analytics projection.

| Workload | Files | Nodes | Edges | Cold index | Hot re-index | Search p50 | Symbol p50 | Context p50 | Neighbor p50 | Kuzu sync | Embeddings |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Lode repo | 19 | 814 | 1,166 | 161.970 ms | 9.866 ms | 0.348 ms | 0.491 ms | 2.038 ms | 0.457 ms | 8,587.591 ms | 32 @ 24.3/s, 384d |
| Medium app | 383 | 4,817 | 4,573 | 2,505.509 ms | 43.303 ms | 1.742 ms | 4.187 ms | 6.717 ms | 1.162 ms | 41,702.766 ms | 32 @ 33.5/s, 384d |
| Larger app (SQLite hot path) | 1,270 | 15,846 | 15,453 | 17,342.433 ms | 95.348 ms | 14.359 ms | 15.739 ms | 34.076 ms | 3.437 ms | n/a | n/a |

RepoBench-style retrieval, scored for retrieval quality only, using the first 100 real rows from the cross_file_first split of tianyang/repobench_python_v1.1:

| Samples | Mode | Mean retrieval | Hit@1 | Hit@3 | Hit@5 | Hit@10 | MRR |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 100 | context | 1.004 ms | 0.13 | 0.48 | 0.56 | 0.56 | 0.2985 |

RepoBench is an ICLR 2024 benchmark for repository-level code completion. This adapter scores only whether Lode ranks the gold cross-file snippet path, not code generation.

Lode includes two benchmark entrypoints:

# Local operational benchmark: cold/hot index, search, symbols, context, graph, optional Kuzu
uv run python scripts/bench_lode.py --repo . --include-kuzu --json

# If TEI is running locally, include embedding throughput/persistence
docker compose up -d embeddings
uv run python scripts/bench_lode.py --repo . --embed-url http://127.0.0.1:7980 --json

For RepoBench-style retrieval quality, export a RepoBench split to JSONL and run the adapter:

# Expected fields match tianyang/repobench_python_v1.1: context, cropped_code, file_path, gold_snippet_index
uv run python benchmarks/repobench_adapter.py --input repobench_cross_file_first.jsonl --limit 100 --json

The adapter materializes each sample as a tiny repository and reports hit_at_k plus MRR for whether Lode ranks the gold cross-file snippet path. It is intended as a retrieval benchmark, not a code-generation benchmark.
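
For readers unfamiliar with the metrics, the standard definitions of Hit@k and MRR can be sketched as below. This is an illustration of the usual formulas, not the adapter's code: the function names and the convention of passing None for a sample whose gold path was not retrieved are assumptions.

```python
def hit_at_k(ranks, k):
    """Fraction of samples whose gold item appears within the top k results.

    `ranks` holds the 1-based rank of the gold item per sample,
    or None when the gold item was not retrieved at all.
    """
    if not ranks:
        return 0.0
    hits = sum(1 for rank in ranks if rank is not None and rank <= k)
    return hits / len(ranks)


def mean_reciprocal_rank(ranks):
    """Mean of 1/rank over all samples; a missing gold item contributes 0."""
    if not ranks:
        return 0.0
    total = sum(1.0 / rank for rank in ranks if rank is not None)
    return total / len(ranks)
```

For example, with gold ranks [1, 3, None, 2], Hit@1 is 0.25, Hit@3 is 0.75, and MRR is (1 + 1/3 + 0 + 1/2) / 4, roughly 0.458.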

License

Apache-2.0.
