SQLite-backed code index for Claude Code, exposed via MCP

code-index

A SQLite-backed code index for Claude Code, exposed via MCP. Replaces exploratory Read/Grep/Glob calls with targeted retrieval.

What it does

  • Parses your repo with tree-sitter (Python, TypeScript/JavaScript, Go, Rust).
  • Chunks per symbol; expands identifiers (getUserAuthToken → get user auth token).
  • Embeds locally with jina-embeddings-v2-base-code (768d) via sentence-transformers — no API key, no external services.
  • Stores symbols, chunks, vectors, and call/import edges in .claude/index.db.
  • Serves retrieval over MCP — 8 retrieval tools + 2 admin tools (see below).
  • Auto-updates via a Claude Code PostToolUse hook and an optional file watcher.
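
To make the identifier-expansion step concrete, here is a minimal sketch of how a name such as getUserAuthToken could be split into searchable words. The function name expand_identifier and the regular expression are illustrative assumptions, not the project's actual chunker code.

```python
import re

# Hypothetical helper: splits camelCase, PascalCase, snake_case and
# acronym runs (e.g. "HTTPServer") into words.
_WORD = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def expand_identifier(name: str) -> str:
    """Return the identifier as space-separated lowercase words."""
    return " ".join(part.lower() for part in _WORD.findall(name))


print(expand_identifier("getUserAuthToken"))  # get user auth token
print(expand_identifier("parse_json_file"))   # parse json file
```

Indexing the expanded form next to the raw identifier is one way to let a natural-language query like "auth token" match a symbol whose name never contains a space.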

Tools

Tool             Purpose
init             Build or refresh the project's index. Incremental by default; force=true rebuilds from scratch.
setup_check      Check whether the auto-reindex hook is wired in .claude/settings.json. Returns install instructions if not.
code_search      Hybrid (vector + FTS) search for conceptual queries (e.g., "auth flow", "where do we parse JSON").
symbol_lookup    Exact-name lookup of functions / classes / methods / types. Prefer over code_search for identifiers.
file_outline     Symbols (with signatures) in a file, in source order. Use instead of Read when you only need shape.
get_symbol_body  Full chunk for a symbol_id returned by symbol_lookup or code_search.
callers          Symbols that CALL the given symbol. depth (1-5) expands transitively.
callees          Symbols that the given symbol CALLS. depth (1-5) expands transitively.
dependents       Files that import the given file.
dependencies     Files that the given file imports.

All tools return bounded JSON; large bodies use get_symbol_body rather than inlining whole files.
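
As an illustration of what a depth-bounded callers query might look like over call edges stored in SQLite, the sketch below uses a recursive CTE. The edges(caller, callee) table and the callers function are assumptions made for this example; the real schema in db.py may differ.

```python
import sqlite3


def callers(conn: sqlite3.Connection, symbol: str, depth: int = 1):
    """Return (caller, distance) pairs up to `depth` hops, nearest first."""
    depth = max(1, min(depth, 5))  # clamp to the documented 1-5 range
    return conn.execute(
        """
        WITH RECURSIVE up(name, d) AS (
            SELECT caller, 1 FROM edges WHERE callee = ?
            UNION
            SELECT e.caller, up.d + 1
            FROM edges AS e JOIN up ON e.callee = up.name
            WHERE up.d < ?
        )
        SELECT name, MIN(d) FROM up GROUP BY name ORDER BY MIN(d), name
        """,
        (symbol, depth),
    ).fetchall()


# Hypothetical call graph: a -> b -> c and d -> c.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (caller TEXT, callee TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)", [("a", "b"), ("b", "c"), ("d", "c")]
)
print(callers(conn, "c", depth=2))
```

Because the recursion stops once the distance reaches the requested depth, the query terminates even when the call graph contains cycles. Swapping the two column roles gives the corresponding callees query.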

Requirements

Python with loadable SQLite extension support (required by sqlite-vec). Python 3.13 has this enabled by default. For 3.10–3.12, use either:

  • the python.org installer, or
  • pyenv: PYTHON_CONFIGURE_OPTS=--enable-loadable-sqlite-extensions pyenv install 3.12.x
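
To check whether a given interpreter meets this requirement before installing, a small standard-library probe like the following should be enough. The function name supports_loadable_extensions is an illustrative choice; the probe only tests whether the sqlite3 module was built with extension loading, not whether sqlite-vec itself is installed.

```python
import sqlite3


def supports_loadable_extensions() -> bool:
    """True if this Python's sqlite3 module can load SQLite extensions."""
    conn = sqlite3.connect(":memory:")
    try:
        # The method is missing entirely when Python was built without
        # --enable-loadable-sqlite-extensions.
        conn.enable_load_extension(True)
        conn.enable_load_extension(False)
        return True
    except (AttributeError, sqlite3.Error):
        return False
    finally:
        conn.close()


print("loadable extensions supported:", supports_loadable_extensions())
```

If this prints False, rebuild or reinstall Python using one of the two options above.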

Install

In Claude Code (primary)

Preview note: the commands below include uvx --refresh so the latest published version is fetched from PyPI on every Claude Code launch. If you already installed without it, run claude mcp remove code-index first and then re-run the install command. Drop --refresh once you want to pin to a stable version (cuts ~1s off startup).

One command, no API keys:

claude mcp add-json -s user code-index "$(cat <<'JSON'
{
  "type": "stdio",
  "command": "uvx",
  "args": ["--refresh", "--from", "mcp-code-index", "code-index-mcp"]
}
JSON
)"

Drop -s user to register only in the current project (writes to .claude/settings.json instead of ~/.claude.json).

First-run note: the first init downloads jina-embeddings-v2-base-code (~600 MB) into ~/.cache/huggingface. Subsequent runs are fully offline. If your environment blocks Hugging Face downloads, the first index will fail with a network error — pre-warm the cache from a machine that has access.

That's it. Open Claude Code in any repo and ask:

"Build the code index for this repo."

Claude calls the init MCP tool, which writes .claude/index.db for that project. Subsequent prompts can use code_search, symbol_lookup, callers, etc. — see Tools above for the full surface.

Or, with a permanent install (no uvx)

pip install mcp-code-index
claude mcp add -s user code-index -- code-index-mcp

Optional: keep the index live as you edit

Without a hook, the index drifts whenever files change, until you call init again. With one, every Edit / Write / MultiEdit Claude performs triggers an incremental reindex of the touched file. Changes made outside the agent (mv, git checkout, IDE saves) do not pass through the hook; use the optional file watcher or call init again to pick those up.

Easiest path: ask Claude. On first use in a new project, ask "is the auto-reindex hook installed?" — Claude calls setup_check, sees the gap, and offers to wire it up.

Manual install — add this block to the project's .claude/settings.json under hooks.PostToolUse:

{
  "matcher": "Edit|Write|MultiEdit",
  "hooks": [
    {
      "type": "command",
      "command": "python -c 'import json,os,subprocess,sys; p=json.load(sys.stdin) if not sys.stdin.isatty() else {}; fp=(p.get(\"tool_input\") or {}).get(\"file_path\"); cwd=p.get(\"cwd\") or os.getcwd(); fp and subprocess.Popen([\"code-index\",\"--root\",cwd,\"reindex\",\"--file\",fp],stdout=subprocess.DEVNULL,stderr=subprocess.DEVNULL,stdin=subprocess.DEVNULL,cwd=cwd,start_new_session=True)'"
    }
  ]
}
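
The one-line hook command is dense, so here is the same logic unrolled into a readable script as a sketch. The function names build_reindex_command and main are illustrative; the payload fields (tool_input.file_path, cwd) and the code-index arguments are taken directly from the command above.

```python
import json
import os
import subprocess
import sys


def build_reindex_command(payload: dict, fallback_cwd: str):
    """Return (argv, cwd) for the reindex call, or None if no file was touched."""
    file_path = (payload.get("tool_input") or {}).get("file_path")
    if not file_path:
        return None
    cwd = payload.get("cwd") or fallback_cwd
    argv = ["code-index", "--root", cwd, "reindex", "--file", file_path]
    return argv, cwd


def main() -> None:
    # Claude Code passes the hook payload as JSON on stdin.
    payload = json.load(sys.stdin) if not sys.stdin.isatty() else {}
    built = build_reindex_command(payload, os.getcwd())
    if built is None:
        return
    argv, cwd = built
    # Detach so the hook returns immediately and reindexing runs in the background.
    subprocess.Popen(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        stdin=subprocess.DEVNULL,
        cwd=cwd,
        start_new_session=True,
    )
```

The script form would need a call to main() at the bottom and a path the hook can reach; the inline one-liner avoids both, which is presumably why the settings block uses it.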

In other MCP-compatible agents

The server speaks standard MCP over stdio, so any client that supports MCP servers works (Cursor, Continue, Cody, Zed, etc.). Configure the client to launch uvx --refresh --from mcp-code-index code-index-mcp (or code-index-mcp after pip install mcp-code-index). Once connected, call the init tool from inside the client to bootstrap the index. Drop --refresh when you want to pin to a stable version instead of always pulling latest.

From source (development)

git clone https://github.com/achreftlili/code-index
cd code-index
pip install -e .
code-index init        # CLI alternative to the `init` MCP tool
code-index-mcp         # starts the MCP server on stdio (for manual wiring)

Configuration

Environment variables:

Var                      Default                              Notes
CODE_INDEX_DB            .claude/index.db                     SQLite path.
CODE_INDEX_EMBEDDER      jina                                 Only jina is supported (local sentence-transformers); the variable exists for future expansion.
CODE_INDEX_EMBED_MODEL   jinaai/jina-embeddings-v2-base-code  HuggingFace model id. Override only if dim-compatible.
CODE_INDEX_EMBED_DIM     768                                  Must match the model.
CODE_INDEX_EMBED_BATCH   32                                   Encode batch size. Lower (e.g. 8 or 4) if the GPU OOMs on large files.
CODE_INDEX_EMBED_DEVICE  auto                                 Torch device override: cpu, mps, cuda. Set cpu to avoid Apple-Silicon MPS OOMs.
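
For reference, the variables and defaults above could be resolved roughly as in this sketch. The IndexConfig dataclass and load_config function are hypothetical names used for illustration; only the variable names and default values come from the table.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexConfig:
    db_path: str
    embedder: str
    embed_model: str
    embed_dim: int
    embed_batch: int
    embed_device: str


def load_config(env=None) -> IndexConfig:
    """Read CODE_INDEX_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return IndexConfig(
        db_path=env.get("CODE_INDEX_DB", ".claude/index.db"),
        embedder=env.get("CODE_INDEX_EMBEDDER", "jina"),
        embed_model=env.get(
            "CODE_INDEX_EMBED_MODEL", "jinaai/jina-embeddings-v2-base-code"
        ),
        embed_dim=int(env.get("CODE_INDEX_EMBED_DIM", "768")),
        embed_batch=int(env.get("CODE_INDEX_EMBED_BATCH", "32")),
        embed_device=env.get("CODE_INDEX_EMBED_DEVICE", "auto"),
    )


# Example: smaller batches on CPU, e.g. to sidestep MPS out-of-memory errors.
cfg = load_config({"CODE_INDEX_EMBED_BATCH": "8", "CODE_INDEX_EMBED_DEVICE": "cpu"})
print(cfg.embed_batch, cfg.embed_device)
```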

Layout

src/code_index/
  db.py           SQLite schema, connection, sqlite-vec loading
  parser.py       Tree-sitter wrapper, symbol + edge extraction
  chunker.py      Per-symbol chunks, identifier expansion
  embedder.py     Local Jina (sentence-transformers) backend
  indexer.py      Pipeline: walk → parse → chunk → embed → write
  retriever.py    Hybrid search (vector + FTS5) with RRF
  watcher.py      File watcher (watchdog)
  mcp_server.py   10 MCP tools (8 retrieval + init/setup_check admin)
  cli.py          init / reindex / watch / stats
