RAG pipeline for Claude Code — indexes your codebase and exposes it as an MCP server
ccrag
A RAG pipeline for Claude Code. Indexes your codebase locally and exposes it as an MCP server so Claude Code can semantically search your code during sessions.
How it works
ccrag index .   →  AST-chunks your code + embeds with sentence-transformers
                   stores vectors in .ccrag/ (LanceDB, stays in your repo)
ccrag serve .   →  MCP server (stdio) that Claude Code connects to
                   exposes search_codebase("how does auth work?") as a tool
Claude Code     →  automatically calls search_codebase when it needs context
                   gets back file paths, line ranges, and code snippets
Install
pip install ccrag
Usage
1. Index your codebase
cd /your/project
ccrag index .
2. Get the MCP config snippet
ccrag mcp-config .
This prints a JSON block to paste into .claude/settings.json:
{
  "mcpServers": {
    "ccrag": {
      "command": "/path/to/ccrag",
      "args": ["serve", "/your/project"]
    }
  }
}
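For illustration, a block with this shape could be assembled in a few lines of Python. This is a minimal sketch, not ccrag's actual implementation: the function name build_mcp_config is hypothetical, and resolving the executable with shutil.which is an assumption about how the command path might be found.

```python
import json
import shutil
from pathlib import Path
from typing import Optional


def build_mcp_config(project_path: str, executable: Optional[str] = None) -> dict:
    """Build a config block that points Claude Code at `ccrag serve`."""
    # Assumption: fall back to the bare command name if ccrag is not on PATH.
    command = executable or shutil.which("ccrag") or "ccrag"
    return {
        "mcpServers": {
            "ccrag": {
                "command": command,
                # Resolve to an absolute path so the server does not depend
                # on the directory Claude Code happens to be started from.
                "args": ["serve", str(Path(project_path).resolve())],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_mcp_config("."), indent=2))
```

Using an absolute project path in args is the design point worth copying: the MCP server is launched by Claude Code, so a relative path would be resolved against whatever working directory the session uses.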
3. Start a Claude Code session
Claude Code will now automatically call search_codebase whenever it needs to understand the codebase. No changes to your prompts needed.
Commands
| Command | Description |
|---|---|
| ccrag index [PATH] | Index or re-index the codebase |
| ccrag index --force [PATH] | Drop and rebuild the index |
| ccrag serve [PATH] | Start the MCP server (used by Claude Code) |
| ccrag watch [PATH] | Watch for file changes and re-index incrementally |
| ccrag status [PATH] | Show index stats (files, chunks) |
| ccrag mcp-config [PATH] | Print the settings.json snippet |
How the index works
- Chunking: Uses tree-sitter to split code at function/class/method boundaries — not arbitrary line windows. Falls back to line-window chunking for unsupported languages.
- Embeddings: mixedbread-ai/mxbai-embed-large-v1 via sentence-transformers. 1024-dim, top-tier MTEB retrieval; runs entirely locally with no API keys and no remote code. Override with --model <name> (e.g. a lighter model like BAAI/bge-base-en-v1.5, or a code-specialized one like jinaai/jina-embeddings-v2-base-code, which needs trust_remote_code=True and a compatible transformers version).
- Model cache: weights are cached in .ccrag/models/ on first run and reused offline afterward, so they are downloaded only once.
- Storage: LanceDB in .ccrag/ inside your project. Add .ccrag/ to .gitignore.
- Search: Cosine similarity over dense embeddings. Returns top-K chunks with file path, line range, language, and source.
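To make the chunking idea concrete, here is a minimal sketch of splitting source at function/class/method boundaries. It uses Python's standard-library ast module as a stand-in for tree-sitter, so it only handles Python and is not how ccrag itself is implemented; the function name chunk_python_source and the chunk dictionary keys are hypothetical.

```python
import ast


def chunk_python_source(source: str) -> list[dict]:
    """Return one chunk per function, class, or method in `source`."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno and end_lineno are 1-indexed and inclusive.
            chunks.append({
                "name": node.name,
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                "text": "\n".join(lines[node.lineno - 1:node.end_lineno]),
            })
    # A class chunk overlaps the chunks of its own methods; a real indexer
    # would have to decide whether to keep both. This sketch keeps both.
    return sorted(chunks, key=lambda c: c["start_line"])


if __name__ == "__main__":
    sample = "def a():\n    return 1\n\nclass B:\n    def m(self):\n        return 2\n"
    for chunk in chunk_python_source(sample):
        print(chunk["name"], chunk["start_line"], chunk["end_line"])
```

Compared with fixed line windows, each chunk here is a complete syntactic unit, so an embedding never covers half of one function and half of the next.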
Supported languages
Python, JavaScript, TypeScript, TSX, Go, Rust, Java, C, C++, Ruby, PHP, C#, Swift, Kotlin, Scala, Lua, Elixir, Haskell, OCaml, Bash, YAML, JSON, TOML, Markdown, and more.
MCP tools exposed
| Tool | Description |
|---|---|
| search_codebase(query, n_results=8) | Semantic search over indexed code |
| codebase_stats() | Number of indexed files and chunks |
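The ranking behind a search_codebase-style tool can be sketched as a brute-force cosine-similarity scan. This is an illustration under stated assumptions, not ccrag's code: in the real tool the vectors come from the embedding model and the lookup goes through LanceDB, whereas here the vectors are tiny hand-written lists and the names top_k, "file", "lines", and "vector" are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunks, n_results=8):
    """Rank chunks by similarity to the query and keep the best n_results."""
    scored = [(cosine_similarity(query_vec, c["vector"]), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [
        {"score": score, "file": c["file"], "lines": c["lines"]}
        for score, c in scored[:n_results]
    ]


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings"; real ones would be 1024-dimensional.
    index = [
        {"file": "auth.py", "lines": (10, 42), "vector": [1.0, 0.0]},
        {"file": "billing.py", "lines": (1, 30), "vector": [0.0, 1.0]},
        {"file": "session.py", "lines": (5, 20), "vector": [1.0, 1.0]},
    ]
    for hit in top_k([1.0, 0.0], index, n_results=2):
        print(hit["file"], hit["lines"], round(hit["score"], 3))
```

The n_results parameter mirrors the tool's default of 8: it caps how many chunks are returned, not how many are scored.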
Incremental updates
ccrag index . is incremental. It keeps a manifest of per-file content hashes in .ccrag/manifest.json and on each run:
- skips unchanged files entirely — no re-chunking, no re-embedding (the expensive step);
- re-embeds only changed/new files;
- prunes chunks for deleted files;
- rebuilds from scratch if you switch --model (old vectors are incompatible).
So re-running it after editing a few files only touches those files:
$ ccrag index .
Found 20 files (1 new/changed, 19 unchanged, 0 removed)
Done. 3 chunks from 1 file(s) embedded, 0 file(s) removed.
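The hash-based bookkeeping described above can be sketched as follows. This is a minimal illustration, not ccrag's implementation: it assumes the manifest is a flat mapping from file path to SHA-256 hex digest, which may not match the actual layout of .ccrag/manifest.json, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_hash(path: Path) -> str:
    """Content hash of a file; any edit to the bytes changes the digest."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def diff_against_manifest(current: dict, manifest: dict):
    """Classify files by comparing current hashes with the stored manifest.

    Both arguments map a file path to its hash. Returns three sorted lists:
    files to (re-)embed, files to skip, and files whose chunks to prune.
    """
    changed = sorted(p for p, h in current.items() if manifest.get(p) != h)
    unchanged = sorted(p for p, h in current.items() if manifest.get(p) == h)
    removed = sorted(p for p in manifest if p not in current)
    return changed, unchanged, removed


if __name__ == "__main__":
    manifest = {"a.py": "h1", "b.py": "h2", "old.py": "h3"}
    current = {"a.py": "h1", "b.py": "edited", "new.py": "h4"}
    changed, unchanged, removed = diff_against_manifest(current, manifest)
    print(f"{len(changed)} new/changed, {len(unchanged)} unchanged, {len(removed)} removed")
```

Hashing content rather than comparing modification times means a file that is touched or checked out again without changes is still skipped.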
For continuous updates, run the watcher, which re-indexes each file on save (and keeps the manifest in sync):
ccrag watch .
License
MIT