X-ray your codebase — semantic search, code graphs, file skeletons, and MCP server
CodeRay — X-ray your codebase
CodeRay builds a local code index that gives AI agents a smarter way to explore a codebase — reading only what they need, not whole files.
Runs locally. No LLM. No network. No API key.
The problem
AI agents exploring a codebase default to reading whole files, even when one function is all that's needed. Every unnecessary line burns tokens and floods the context window, driving up API costs and noise with every read.
The root cause is simple: agents know the file paths but no finer location. Without knowing where in a file something lives, they have no choice but to read everything.
CodeRay fixes this. Every tool returns file paths with exact line ranges — so agents locate first, then read only the lines that matter.
How it works
CodeRay exposes three primitives, each returning paths + line ranges:
| Tool | Question it answers | What agents get |
|---|---|---|
| search | Where is the code that does X? | Relevant chunks with file paths and line ranges |
| skeleton | What's the shape of this file? | Signatures + docstrings only, each tagged with its line range |
| impact | What breaks if I change this? | Callers, imports, and inheritors — located by line range |
The two-phase flow
- Locate: run search, skeleton, or impact to find what's needed. Every result includes a file path and a symbol-level line range.
- Read precisely: use those line ranges to load only the relevant snippet. Skip the rest.
This keeps context windows lean and agent reasoning focused. CodeRay is not a replacement for grep — it fills the gap when exact names are unknown or a map is needed before reading.
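The "read precisely" step can be sketched in a few lines of Python. This is only an illustration of a ranged read, not part of CodeRay; the helper name `read_line_range` is hypothetical, and it assumes line ranges are 1-indexed and inclusive.

```python
from itertools import islice
from pathlib import Path


def read_line_range(path: str, start: int, end: int) -> str:
    """Return lines start..end (1-indexed, inclusive) of a text file."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    with Path(path).open(encoding="utf-8") as handle:
        # islice stops consuming the file once `end` lines have been read,
        # so the remainder of the file is never loaded.
        return "".join(islice(handle, start - 1, end))
```

An agent that has a path and a line range from the locate step would call something like `read_line_range(path, start, end)` instead of reading the whole file.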
Token savings (tiktoken, cl100k_base)
| File | Lines | Full read (tokens) | Skeleton (tokens) | Savings | % reduction |
|---|---|---|---|---|---|
| src/coderay/graph/impact.py | 249 | 2,333 | 693 | 3.4× | 70% |
| src/coderay/cli/commands.py | 584 | 4,327 | 1,906 | 2.3× | 56% |
| src/coderay/pipeline/indexer.py | 408 | 3,065 | 1,433 | 2.1× | 53% |
| Query | Search hit tokens | vs full indexer.py read |
|---|---|---|
| "how are files re-indexed on change" | 479 | ~6x cheaper |
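The savings figures follow directly from the token counts. The sketch below redoes that arithmetic with the counts quoted in the tables above; the `savings` helper is hypothetical, and the token counts themselves are taken as given rather than re-measured.

```python
def savings(full_tokens: int, reduced_tokens: int) -> tuple[float, int]:
    """Return (savings ratio, percent reduction) for two token counts."""
    ratio = round(full_tokens / reduced_tokens, 1)
    percent = round(100 * (1 - reduced_tokens / full_tokens))
    return ratio, percent


# (full read tokens, skeleton tokens) as quoted in the table.
rows = {
    "src/coderay/graph/impact.py": (2333, 693),
    "src/coderay/cli/commands.py": (4327, 1906),
    "src/coderay/pipeline/indexer.py": (3065, 1433),
}

for name, (full, skeleton_tokens) in rows.items():
    ratio, percent = savings(full, skeleton_tokens)
    print(f"{name}: {ratio}x, {percent}% reduction")
```

Applied to the search example, `savings(3065, 479)` gives a ratio of 6.4, which matches the "~6x cheaper" figure.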
Tools
Semantic search
Agents search by meaning, not by name — useful when the exact function or class is unknown. Results return file paths with line ranges pointing at relevant chunks. Treat them as candidates: confirm with skeleton or a ranged read before acting. Keep the index fresh with coderay watch or coderay build when the tree drifts.
Blast radius
Shows callers, imports, and inheritance for a symbol before it changes. Each result is tied to a file path and line range — combine with skeleton or ranged reads on those locations when bodies are needed.
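Conceptually, a blast radius is a bounded walk over reversed dependency edges. The sketch below shows that idea on a toy edge list; it is not CodeRay's implementation, and the `(source, dest)` edge format and the `blast_radius` function are assumptions made for illustration. The `max_depth` parameter mirrors the idea behind the `--max-depth` option.

```python
from collections import defaultdict, deque


def blast_radius(edges, target, max_depth=2):
    """Map each symbol that transitively depends on `target` to its depth."""
    dependents = defaultdict(set)
    for source, dest in edges:
        # Reverse the edge: record who points at `dest`.
        dependents[dest].add(source)

    depths = {}
    queue = deque([(target, 0)])
    while queue:
        symbol, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for dependent in sorted(dependents[symbol]):
            if dependent not in depths and dependent != target:
                depths[dependent] = depth + 1
                queue.append((dependent, depth + 1))
    return depths
```

With edges such as `("handler", "parse")` meaning "handler calls parse", `blast_radius(edges, "parse")` lists the direct callers at depth 1 and their callers at depth 2.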
Skeleton
Returns signatures and docstrings only — no function bodies. Every block is tagged with its path and line range so subsequent reads can be scoped to exactly those lines. A full file read should happen only when the skeleton isn't enough.
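For Python files, the idea of a skeleton can be approximated with the standard `ast` module. The sketch below is an illustration only: the `skeleton` function and its output shape are hypothetical, it handles Python source only, and it says nothing about how CodeRay parses code. It needs Python 3.9+ for `ast.unparse`.

```python
import ast


def skeleton(source: str) -> list[dict]:
    """List classes and functions with signature, docstring summary, and line range."""
    entries = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            signature = f"{prefix} {node.name}({ast.unparse(node.args)})"
        elif isinstance(node, ast.ClassDef):
            signature = f"class {node.name}"
        else:
            continue
        doc = ast.get_docstring(node) or ""
        entries.append({
            "signature": signature,
            # Keep only the first docstring line as a summary.
            "doc": doc.splitlines()[0] if doc else "",
            # 1-indexed, inclusive range covering the whole definition.
            "lines": (node.lineno, node.end_lineno),
        })
    return sorted(entries, key=lambda entry: entry["lines"])
```

Each entry carries the line range of the full definition, so a follow-up read can be scoped to just that block.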
Full read
Same file, raw source — for comparison:
First run
Run coderay init, then coderay build.
Status: coderay status reports chunks, branch, commit, and schema.
MCP
The same three tools are available over MCP: search, skeleton (paths and line ranges), and impact, so AI agents can narrow context before full-file reads. Point the server at a checkout whose root contains .coderay.toml (set through CODERAY_REPO_ROOT in the config below). For tool choice versus a plain read, see AGENTS.md.
which coderay-mcp
{
  "mcpServers": {
    "coderay": {
      "command": "/path/to/.venv/bin/coderay-mcp",
      "args": [],
      "env": { "CODERAY_REPO_ROOT": "${workspaceFolder}" }
    }
  }
}
CODERAY_REPO_ROOT must be the directory that contains .coderay.toml. More detail: mcp_server/README.md.
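A misconfigured root is easy to catch before starting the server. The following is a small pre-flight check one might run; it is not part of CodeRay, and the function name `resolve_repo_root` is hypothetical. It only encodes the rule stated above: the directory named by CODERAY_REPO_ROOT must contain .coderay.toml.

```python
import os
from pathlib import Path


def resolve_repo_root(env=None) -> Path:
    """Return CODERAY_REPO_ROOT as a Path, or raise if it lacks .coderay.toml."""
    env = os.environ if env is None else env
    raw = env.get("CODERAY_REPO_ROOT")
    if not raw:
        raise RuntimeError("CODERAY_REPO_ROOT is not set")
    root = Path(raw).expanduser().resolve()
    if not (root / ".coderay.toml").is_file():
        raise RuntimeError(f"{root} does not contain .coderay.toml")
    return root
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.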
Features
- Languages: Python, JavaScript, and TypeScript (parsing/README.md)
- Multi-repo / monorepo: roots, aliases, optional include subtrees (core/README.md)
- Hybrid search: vector + BM25 (RRF), optional boosting (retrieval/README.md)
- Embeddings: fastembed (CPU) or MLX on Apple Silicon (embedding/README.md)
- Watch: incremental re-index; .coderay.toml is the source of truth for what's indexed
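For readers unfamiliar with RRF (reciprocal rank fusion), the sketch below shows the general technique for merging a vector ranking with a BM25 ranking. It is a generic illustration, not CodeRay's retrieval code: the function name is hypothetical, and the constant `k = 60` is a commonly used default assumed here, not a value taken from the project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of ids; each list contributes 1 / (k + rank) per id."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))


vector_hits = ["a", "b", "c"]  # hypothetical chunk ids, best first
bm25_hits = ["b", "d", "a"]
print(reciprocal_rank_fusion([vector_hits, bm25_hits]))
```

Items that rank well in both lists rise to the top, which is why fusion works without needing the two scoring scales to be comparable.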
Install
pipx (no venv):
brew install pipx && pipx install coderay # macOS
# Linux: python3 -m pip install --user pipx && pipx install coderay
In a project:
python -m venv .venv && source .venv/bin/activate
pip install coderay
# Apple Silicon (optional): pip install "coderay[mlx]"
From source: pip install -e ".[all]" — see CONTRIBUTING.md.
Quick start
cd /path/to/your/project
coderay init
coderay watch
coderay search "how does authentication work"
coderay skeleton src/app/main.py
coderay impact some_symbol
CLI
| Command | Description |
|---|---|
| coderay init | Create .coderay.toml and .coderay/ |
| coderay watch [--quiet] | Re-index on file changes |
| coderay build [--full] | One-off or full rebuild |
| coderay search "query" | Semantic search (--top-k, --path-prefix, --no-tests) |
| coderay skeleton FILE | Signatures (--symbol) |
| coderay impact SYMBOL | Blast radius (--max-depth) |
| coderay graph | List edges (--from, --to, --kind) |
| coderay list | Chunks or per-file summary |
| coderay status | Index metadata |
| coderay maintain | Compact LanceDB |
Configuration
coderay init writes an annotated .coderay.toml: [index], [search], [graph], [embedder], [watcher]. See module READMEs linked from src/README.md.
Contributing
Accuracy and limitations
Semantic search is approximate (model and chunks matter). No warranty — MIT License. Evaluate on your own codebase.