X-ray your codebase — semantic search, code graphs, file skeletons, and MCP server

CodeRay — X-ray your codebase

CodeRay ships a local code index with semantic search, file skeletons (signatures and docstrings, no bodies), and blast radius (callers, imports, inheritance) — plus an MCP stdio server so agents can use the same tools. Ask by meaning, skim API shape, trace who calls what, then read implementation when it matters: fewer tokens, less noise, answers anchored to the right files.

No LLM inside CodeRay, no network, no API key – it runs on your machine.

Tools

CodeRay sits next to ripgrep, not instead of it. Use ripgrep when you know the string or symbol; use search, skeleton, and impact when you care about intent, structure, or dependencies. Then open the file when you need real implementation detail.

Semantic search is retrieval, not proof: hits can miss or rank oddly. Treat them as candidates, confirm with a skeleton or read, and keep the index fresh with coderay watch or coderay build when things drift.

Skeleton shows API shape and docstrings, not every branch. Use search and impact to narrow where to look, then read the file (or spans) when you need control flow or line-accurate edits. CodeRay trims noise on those round trips; it does not forbid them.

Semantic search

“How/where” questions, answered by meaning.

Blast radius

Callers and dependents (calls, imports, inheritance).
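To make "blast radius" concrete: it amounts to walking dependency edges backwards from a symbol. The sketch below illustrates that idea and is not CodeRay's implementation. The edge list, the symbol names, and the `blast_radius` function are all hypothetical, and `max_depth` only loosely mirrors the `--max-depth` option of `coderay impact`.

```python
from collections import deque

def blast_radius(edges, target, max_depth=2):
    """Return {dependent: depth} for symbols that reach `target` within max_depth hops.

    `edges` is a list of (source, destination) pairs meaning "source depends on
    destination" (a call, an import, or a subclass relation).
    """
    # Reverse adjacency: symbol -> set of its direct dependents.
    dependents = {}
    for src, dst in edges:
        dependents.setdefault(dst, set()).add(src)

    seen = {target: 0}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        depth = seen[node]
        if depth >= max_depth:
            continue  # do not expand past the depth limit
        for dep in sorted(dependents.get(node, ())):
            if dep not in seen:
                seen[dep] = depth + 1
                queue.append(dep)

    del seen[target]  # the target is not its own dependent
    return seen

# Hypothetical edges: (caller, callee)
EDGES = [
    ("api.login", "auth.check_token"),
    ("cli.main", "api.login"),
    ("tests.test_auth", "auth.check_token"),
    ("jobs.nightly", "cli.main"),
]

print(blast_radius(EDGES, "auth.check_token", max_depth=2))
```

With a depth limit of 2 the walk reaches `api.login` and `tests.test_auth` at depth 1 and `cli.main` at depth 2, and stops before `jobs.nightly`.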

Skeleton

Signatures and docstrings only; API surface without bodies.
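As a rough illustration of what a skeleton keeps and drops, here is a minimal sketch built on Python's standard `ast` module. It is an assumption-laden stand-in, not CodeRay's parser: it handles Python only, ignores decorators and return annotations, and keeps just the first docstring line. The `skeleton` function and the sample class are hypothetical.

```python
import ast

def skeleton(source: str) -> str:
    """Return signatures and first docstring lines, with all bodies removed."""
    lines = []

    def visit(node, indent):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                keyword = "async def" if isinstance(child, ast.AsyncFunctionDef) else "def"
                header = f"{keyword} {child.name}({ast.unparse(child.args)}):"
            elif isinstance(child, ast.ClassDef):
                header = f"class {child.name}:"
            else:
                continue  # statements and expressions are body, so skip them
            pad = "    " * indent
            lines.append(pad + header)
            doc = ast.get_docstring(child)
            if doc:
                first_line = doc.splitlines()[0]
                lines.append(f'{pad}    """{first_line}"""')
            visit(child, indent + 1)  # pick up methods and nested definitions

    visit(ast.parse(source), 0)
    return "\n".join(lines)

SAMPLE = '''
class Indexer:
    """Build the index."""

    def run(self, path, full=False):
        """Index one path."""
        for f in walk(path):
            self.add(f)
'''

print(skeleton(SAMPLE))
```

The loop body in `run` never appears in the output, which is the point: the reader sees what exists and how to call it, and opens the file only when the control flow matters.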

Full read

Same file as skeleton: raw source costs more tokens.

First run

coderay init and coderay build.

Status

coderay status: chunks, branch, commit, schema.

MCP

Same tools as above, exposed to the agent so it can search, sketch structure, and trace impact instead of vacuuming whole files by default. Point the server at a checkout whose root contains .coderay.toml (CODERAY_REPO_ROOT below). For choosing tools versus a plain read, see AGENTS.md.

which coderay-mcp   # prints the absolute path to use as "command"

{
  "mcpServers": {
    "coderay": {
      "command": "/path/to/.venv/bin/coderay-mcp",
      "args": [],
      "env": { "CODERAY_REPO_ROOT": "${workspaceFolder}" }
    }
  }
}

CODERAY_REPO_ROOT must be the directory that contains .coderay.toml. More detail: mcp_server/README.md.
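If you generate this config from a script, it can help to check the `.coderay.toml` requirement up front. The helper below is a minimal sketch under that assumption. The function names `mcp_config` and `to_json` are hypothetical and not part of CodeRay; only the JSON shape and the `CODERAY_REPO_ROOT` variable come from the example above.

```python
import json
from pathlib import Path

def mcp_config(server_path: str, repo_root: str) -> dict:
    """Build the MCP client config, refusing roots that lack .coderay.toml."""
    root = Path(repo_root)
    if not (root / ".coderay.toml").is_file():
        raise FileNotFoundError(f"{root} does not contain .coderay.toml")
    return {
        "mcpServers": {
            "coderay": {
                "command": server_path,  # e.g. the path printed by `which coderay-mcp`
                "args": [],
                "env": {"CODERAY_REPO_ROOT": str(root)},
            }
        }
    }

def to_json(config: dict) -> str:
    """Render the config as indented JSON, ready to paste into a client's settings."""
    return json.dumps(config, indent=2)
```

A typical call would pass the result of `shutil.which("coderay-mcp")` as `server_path` and the checkout directory as `repo_root`, then print `to_json(...)`.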

Why this matters

Noisy context windows make models confident about the wrong code. CodeRay front-loads intent (search), shape (skeleton), and dependencies (impact) so the expensive read happens after you have a map—not instead of ever reading implementation when control flow matters.

Token savings (tiktoken, cl100k_base)

Measured on this repo after a full index.

| File | Lines | Full read (tokens) | Skeleton (tokens) | Savings |
|---|---|---|---|---|
| src/coderay/pipeline/indexer.py | 400 | 3,024 | 757 | 4.0x |
| src/coderay/graph/code_graph.py | 500 | 4,261 | 1,022 | 4.2x |
| src/coderay/mcp_server/server.py | 316 | 2,268 | 1,313 | 1.7x |

| Query | Search hit (tokens) | vs full indexer.py read |
|---|---|---|
| "how are files re-indexed on change" | 479 | ~6x cheaper |

Not guarantees — model, chunks, and files affect counts.
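The savings figures are the full-read token count divided by the reduced count. A quick check of that arithmetic, using only the numbers reported above (the `savings` helper is illustrative, and no tokenizer is run here):

```python
# (full read tokens, skeleton tokens) as reported in the table above
ROWS = {
    "src/coderay/pipeline/indexer.py": (3024, 757),
    "src/coderay/graph/code_graph.py": (4261, 1022),
    "src/coderay/mcp_server/server.py": (2268, 1313),
}

def savings(full_tokens: int, reduced_tokens: int) -> float:
    """Ratio of full-read tokens to reduced tokens, rounded to one decimal."""
    return round(full_tokens / reduced_tokens, 1)

for path, (full, skel) in ROWS.items():
    print(f"{path}: {savings(full, skel)}x")

# Search hit (479 tokens) versus a full read of indexer.py (3,024 tokens)
print(f"search hit: {savings(3024, 479)}x")
```

The ratios come out to 4.0, 4.2, and 1.7 for the three files and about 6.3 for the search hit, matching the rounded values in the tables.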


Features

  • Languages — Python, JavaScript, and TypeScript — parsing/README.md
  • Multi-repo / monorepo — roots, aliases, optional include subtrees — core/README.md
  • Hybrid search — vector + BM25 (RRF), optional boosting — retrieval/README.md
  • Embeddings — fastembed (CPU) or MLX on Apple Silicon — embedding/README.md
  • Watch — incremental re-index; .coderay.toml is the source of truth for what’s indexed
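Reciprocal rank fusion (RRF), named in the hybrid search bullet above, merges several ranked lists by summing 1 / (k + rank) for each result. The sketch below shows the general technique, not CodeRay's code. The constant k = 60 is the value commonly used in the literature and is an assumption here, as are the chunk names and the `rrf` function itself.

```python
def rrf(rankings, k=60):
    """Fuse ranked lists of ids into one list, best first.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken alphabetically.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results for one query from two retrievers
vector_hits = ["indexer.py::reindex", "watcher.py::on_change", "graph.py::build"]
bm25_hits = ["watcher.py::on_change", "cli.py::watch", "indexer.py::reindex"]

print(rrf([vector_hits, bm25_hits]))
```

A result that ranks well in both lists (`watcher.py::on_change`, second and first) beats one that tops a single list but sits lower in the other, which is why fusion tends to be more robust than either retriever alone.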

Install

pipx (no venv):

brew install pipx && pipx install coderay   # macOS
# Linux: python3 -m pip install --user pipx && pipx install coderay

In a project:

python -m venv .venv && source .venv/bin/activate
pip install coderay
# Apple Silicon (optional): pip install "coderay[mlx]"

From source: pip install -e ".[all]" — see CONTRIBUTING.md.

Quick start

cd /path/to/your/project
coderay init
coderay watch
coderay search "how does authentication work"
coderay skeleton src/app/main.py
coderay impact some_symbol

CLI

| Command | Description |
|---|---|
| coderay init | Create .coderay.toml and .coderay/ |
| coderay watch [--quiet] | Re-index on file changes |
| coderay build [--full] | One-off or full rebuild |
| coderay search "query" | Semantic search (--top-k, --path-prefix, --no-tests) |
| coderay skeleton FILE | Signatures (--symbol) |
| coderay impact SYMBOL | Blast radius (--max-depth) |
| coderay graph | List edges (--from, --to, --kind) |
| coderay list | Chunks or per-file summary |
| coderay status | Index metadata |
| coderay maintain | Compact LanceDB |

Configuration

coderay init writes an annotated .coderay.toml: [index], [search], [graph], [embedder], [watcher]. See module READMEs linked from src/README.md.

Contributing

CONTRIBUTING.md

Accuracy and limitations

Semantic search is approximate (model and chunks matter). No warranty — MIT License. Evaluate on your own codebase.
