Intent-aware code-knowledge index for LLM agents, built on intent-db: index a repo once, query it by intent, cut the token cost of re-reading files.

intent-code

An intent-aware code-knowledge index for LLM agents, built on intent-db.

Index a repository once, then query it by intent so an agent reads only what it needs instead of re-reading the whole repo every session. The same search returns different results depending on the agent's current phase: debugging, extending, reviewing, or onboarding.

Why

LLM coding agents re-read files every session because they have no durable, queryable understanding of a codebase. intent-code builds that understanding once and keeps it fresh incrementally, so the expensive "read and synthesize" happens once per change and is reused across sessions.

It combines three established ideas:

  • A durable, LLM-maintained knowledge layer of gotcha and flow notes, in the spirit of Karpathy's LLM Wiki.
  • A tree-sitter symbol and dependency map ranked by PageRank, in the spirit of Aider's repo map.
  • Incremental indexing: only changed code is re-embedded, in the spirit of Cursor's Merkle indexing.

The retrieval engine is intent-db: documents are embedded once and a per-intent lens re-ranks results at query time, with a feedback loop that learns per-intent ranking from what the agent actually used.
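As a rough illustration of query-time re-ranking, the sketch below applies a per-intent lens to scores from a single embedding search and nudges them with usage feedback. It is not intent-db's API: the `CANDIDATES` data, the `LENSES` weights, and the `record_feedback` and `rerank` helpers are all hypothetical names chosen for this example.

```python
from collections import defaultdict

# Hypothetical results of one embedding search: (doc_id, layer, similarity).
CANDIDATES = [
    ("retry.py::with_backoff", "symbol", 0.80),
    ("notes/retry-gotchas.md", "note", 0.78),
    ("README.md#overview", "chunk", 0.79),
]

# Hypothetical lenses: each intent weights the document layers differently.
LENSES = {
    "debugging": {"symbol": 1.0, "note": 1.3, "chunk": 0.8},
    "onboarding": {"symbol": 0.8, "note": 1.0, "chunk": 1.3},
}

# Learned adjustments keyed by (intent, doc_id), updated from feedback.
feedback_boost = defaultdict(float)


def record_feedback(intent, doc_id, used, step=0.05):
    """Raise a document's score for an intent if the agent used it, else lower it."""
    feedback_boost[(intent, doc_id)] += step if used else -step


def rerank(candidates, intent):
    """Re-score stored similarities through the intent's lens; no re-embedding."""
    lens = LENSES[intent]
    scored = [
        (sim * lens[layer] + feedback_boost[(intent, doc_id)], doc_id)
        for doc_id, layer, sim in candidates
    ]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


print(rerank(CANDIDATES, "debugging"))
print(rerank(CANDIDATES, "onboarding"))
```

The point of the design is that the embeddings are computed once; only the cheap scoring step changes between intents, which is why the same query can be answered differently for each phase.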

How an agent uses it

One index, three ways to consume it (the same data, regenerated on every index):

  1. MCP (primary) for Claude Code and any tool-capable agent: code_map, code_search, code_read, code_context, code_flow, code_neighbors, note_put / note_get / note_list_stale, code_feedback, code_index. code_search finds where something is; code_read / code_context / code_flow answer how it works by returning full bodies and the call sequence, so the agent stops re-reading whole files.
  2. Committed markdown under docs/codemap/ (MAP.md, index.md, notes/): readable by any LLM or human, even without MCP, straight from a git clone.
  3. CLI --json for any agent that can run a shell command.

Install

uv tool install intent-code          # or: pipx install intent-code

Quickstart

cd your-repo
intent-code init .                   # builds the index, writes .mcp.json + protocol
# restart Claude Code -> the "code" MCP tools are available

Or use it directly:

intent-code index .
intent-code search "where is the retry handled" --intent debugging
intent-code map
intent-code neighbors your.module.Class.method --direction callers

# understand how code works, not just where it is
intent-code read your.module.handle_request           # full body, untruncated
intent-code context your.module.handle_request        # it + its callees, in call order
intent-code flow your.module.handle_request           # the ordered call sequence

Claude Code plugin

/plugin marketplace add harsharahul/intent-code
/plugin install intent-code

The plugin wires the MCP server, the /code-index and /code-note commands, and an optional PostToolUse freshness hook.

Use with GitHub Copilot or Gemini CLI

The same MCP server works with any agent that speaks MCP. init writes each agent's native config file and adds the protocol to that agent's instruction file:

intent-code init . --agent copilot   # .vscode/mcp.json + .github/copilot-instructions.md
intent-code init . --agent gemini    # .gemini/settings.json + GEMINI.md
intent-code init . --agent all        # claude + copilot + gemini at once

Writes are idempotent: instruction files get a marked managed block and JSON configs are key-merged, so re-running never clobbers your own content. The files init generates are excluded from the index, so the protocol is never indexed back into itself.
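The two idempotent write strategies described above can be sketched as follows. This is a minimal illustration, not the code init actually runs: the marker strings, the `upsert_managed_block` and `merge_keys` helpers, and the sample config values are assumptions made for the example.

```python
import json

# Hypothetical markers delimiting the tool-managed region of an instruction file.
BEGIN = "<!-- intent-code:begin -->"
END = "<!-- intent-code:end -->"


def upsert_managed_block(text, body):
    """Replace the managed block if present, otherwise append it once."""
    block = f"{BEGIN}\n{body}\n{END}"
    start, end = text.find(BEGIN), text.find(END)
    if start != -1 and end > start:
        # Only the span between the markers changes; user content is untouched.
        return text[:start] + block + text[end + len(END):]
    if not text or text.endswith("\n\n"):
        sep = ""
    else:
        sep = "\n" if text.endswith("\n") else "\n\n"
    return text + sep + block + "\n"


def merge_keys(existing, managed):
    """Merge managed keys into an existing JSON object, keeping unrelated keys."""
    merged = dict(existing)
    for key, value in managed.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_keys(merged[key], value)
        else:
            merged[key] = value
    return merged


# Illustrative values only; the real generated config may differ.
existing = {"mcpServers": {"mine": {"command": "my-server"}}}
managed = {"mcpServers": {"code": {"command": "intent-code"}}}
print(json.dumps(merge_keys(existing, managed), indent=2))
```

Running either helper a second time on its own output returns the same result, which is the property that makes re-running safe.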

Keeping the index fresh

Freshness does not require git. Re-indexing is incremental (only changed files re-parse), and you can trigger it three ways, in order of automation:

  • Claude Code: the plugin's PostToolUse hook marks edited files dirty.
  • Any agent: the protocol tells it to call code_index after edits.
  • Git repos: intent-code install-hooks . adds commit/merge/checkout hooks that flag the index stale, so the next query re-indexes on its own.
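Whichever trigger fires, the incremental step comes down to comparing stored content hashes with current ones. The sketch below shows that idea under stated assumptions: SHA-256 as the hash, in-memory dictionaries in place of the index, and the hypothetical helper names `content_hash` and `plan_reindex`.

```python
import hashlib


def content_hash(text):
    """Stable fingerprint of a file's content (SHA-256 is an assumption here)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reindex(stored_hashes, current_files):
    """Return (changed, removed) paths; unchanged files need no work."""
    changed = sorted(
        path
        for path, text in current_files.items()
        if stored_hashes.get(path) != content_hash(text)
    )
    removed = sorted(path for path in stored_hashes if path not in current_files)
    return changed, removed


# Hypothetical state: what the index saw last time versus the tree now.
stored = {
    "a.py": content_hash("def a(): return 1\n"),
    "b.py": content_hash("def b(): return 2\n"),
    "old.py": content_hash("# deleted later\n"),
}
current = {
    "a.py": "def a(): return 1\n",   # unchanged
    "b.py": "def b(): return 3\n",   # edited
    "c.py": "def c(): return 4\n",   # new
}
print(plan_reindex(stored, current))
```

Because the comparison needs only file contents, it works the same with or without git.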

Indexing multiple repositories

One index lives at the path you point init or serve-mcp at, and the file walk skips every nested .git/ directory at any depth. So:

  • Point at a single repo for that repo's index (run init inside each repo for isolated indexes).
  • Point at a parent folder of several repos for one combined, cross-repo index.

Git hooks are per-repository, so the combined-parent case relies on the code_index path for freshness rather than hooks.

How it works

The index is a single intent-db SQLite file under .intentdb/ (add it to your .gitignore). Documents are tagged by layer:

  • symbol: a signature card per function/class/method (tree-sitter), with line span, content hash, and import/call edges.
  • chunk: AST-aware chunks for text or grammar-less files.
  • note: durable, LLM-maintained gotcha and flow articles.

A dependency graph and PageRank ranking are derived from the symbol edges to produce the repo map and neighbors tracing. Re-indexing hashes each file and re-embeds only the symbols whose content changed.
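To make the ranking step concrete, here is a small power-iteration PageRank over a call graph. It is a generic textbook version, not the package's implementation; the `pagerank` function, the damping value, and the symbol names in `EDGES` are assumptions for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over directed (caller, callee) edges."""
    nodes = sorted({name for edge in edges for name in edge})
    out_links = {name: [] for name in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {name: 1.0 / len(nodes) for name in nodes}
    for _ in range(iterations):
        new_rank = {name: (1.0 - damping) / len(nodes) for name in nodes}
        for src in nodes:
            # Symbols with no outgoing calls spread their rank over all nodes.
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical call edges extracted from symbol cards.
EDGES = [
    ("api.handle_request", "core.retry"),
    ("cli.main", "core.retry"),
    ("core.retry", "core.sleep"),
    ("api.handle_request", "core.parse"),
]

for name, score in sorted(pagerank(EDGES).items(), key=lambda kv: -kv[1]):
    print(f"{score:.3f}  {name}")
```

Symbols that many others depend on accumulate rank, so shared helpers rise above entry points that nothing calls. That makes the ranking a reasonable ordering for a size-limited repo map.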

The default embedder is auto-detected: a local Ollama model (nomic-embed-text) is used if one is reachable, otherwise the zero-dependency hashing embedder. BM25 hybrid search is always on for exact symbol matches.
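For readers unfamiliar with hashing embedders, the sketch below shows the general feature-hashing technique using only the standard library. The text does not describe how the package's own hashing embedder works, so the tokenizer, the dimension, and the `hash_embed` and `cosine` names here are assumptions.

```python
import hashlib
import math
import re


def hash_embed(text, dim=64):
    """Map text to a fixed-size unit vector by hashing each token to a bucket."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9_]+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        # A hash-derived sign reduces the bias caused by bucket collisions.
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[bucket] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity of two vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))


query = hash_embed("where is the retry handled")
doc = hash_embed("def retry(request): handle the retry with backoff")
print(round(cosine(query, doc), 3))
```

An embedder of this kind needs no model download and is deterministic, but it only captures token overlap, which is consistent with the note that a local embedding model improves results.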

Benchmark

A token-spend benchmark ships in intent_code.eval. On the intent-db codebase, answering a five-question set used about 97% fewer input tokens than reading whole files to reach the answer (roughly 7.8k versus 227k), with the zero-dependency hashing embedder. A local embedding model improves which questions land in the top results.

python -m intent_code.eval.run /path/to/repo
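The "about 97%" figure follows directly from the two rounded token counts quoted above. The `token_savings` helper below is a hypothetical name used only to show the arithmetic.

```python
def token_savings(indexed_tokens, baseline_tokens):
    """Fraction of input tokens saved relative to reading whole files."""
    return 1.0 - indexed_tokens / baseline_tokens


# Rounded figures from the benchmark paragraph: roughly 7.8k versus 227k.
savings = token_savings(7_800, 227_000)
print(f"{savings:.1%}")  # prints 96.6%
```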

Security and supply chain

Minimal runtime dependencies, official tree-sitter grammars, version bounds plus a hash-pinned lockfile, dependency auditing in CI, SHA-pinned GitHub Actions, and PyPI trusted publishing. See SECURITY.md.

License

MIT


