X-ray your codebase — semantic search, code graphs, file skeletons, and MCP server

CodeRay — X-ray your codebase

CodeRay builds a local code index that gives AI agents a smarter way to explore a codebase — reading only what they need, not whole files.

Runs locally. No LLM. No network. No API key.

The problem

AI agents exploring a codebase default to reading whole files, even when one function is all that's needed. Every unnecessary line burns tokens and floods the context window, driving up API costs and noise with every read.

The root cause is simple: agents know the file paths but no finer location. Without knowing where in a file something lives, they have no choice but to read everything.

CodeRay fixes this. Every tool returns file paths with exact line ranges — so agents locate first, then read only the lines that matter.

How it works

CodeRay exposes three primitives, each returning paths + line ranges:

| Tool | Question it answers | What agents get |
|---|---|---|
| search | Where is the code that does X? | Relevant chunks with file paths and line ranges |
| skeleton | What's the shape of this file? | Signatures + docstrings only, each tagged with its line range |
| impact | What breaks if I change this? | Callers, imports, and inheritors, located by line range |

The two-phase flow

  1. Locate — run search, skeleton, or impact to find what's needed. Every result includes a file path and a symbol-level line range.
  2. Read precisely — use those line ranges to load only the relevant snippet. Skip the rest.

This keeps context windows lean and agent reasoning focused. CodeRay is not a replacement for grep — it fills the gap when exact names are unknown or a map is needed before reading.
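
The "read precisely" step can be sketched in a few lines of Python. This is only an illustration of a ranged read on the agent side, not CodeRay's own code; the helper name `read_lines` and the throwaway demo file are assumptions made for the example.

```python
from itertools import islice
from pathlib import Path
import tempfile


def read_lines(path: str, start: int, end: int) -> str:
    """Return lines start..end of a file (1-indexed, inclusive)."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    with open(path, encoding="utf-8") as handle:
        # islice skips the first start-1 lines and stops reading after line `end`.
        return "".join(islice(handle, start - 1, end))


# Usage on a throwaway ten-line file (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.py"
    demo.write_text("".join(f"line {n}\n" for n in range(1, 11)), encoding="utf-8")
    snippet = read_lines(str(demo), 3, 5)
```

Given a line range reported by search, skeleton, or impact, a helper like this loads only that slice instead of the whole file.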

Token savings (tiktoken, cl100k_base)

| File | Lines | Full read (tokens) | Skeleton (tokens) | Savings | % reduction |
|---|---|---|---|---|---|
| src/coderay/graph/impact.py | 249 | 2,333 | 693 | 3.4× | 70% |
| src/coderay/cli/commands.py | 584 | 4,327 | 1,906 | 2.3× | 56% |
| src/coderay/pipeline/indexer.py | 408 | 3,065 | 1,433 | 2.1× | 53% |

| Query | Search hit (tokens) | vs full indexer.py read |
|---|---|---|
| "how are files re-indexed on change" | 479 | ~6× cheaper |
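
The Savings and % reduction columns follow directly from the two token counts. The sketch below recomputes them from the figures in the table; the function names are illustrative, and `count_tokens` shows one way such counts could be obtained with the tiktoken package (an optional third-party dependency that is only imported if the function is called).

```python
def savings(full_tokens: int, reduced_tokens: int) -> tuple[float, int]:
    """Return (savings ratio, percent reduction), rounded as in the table."""
    ratio = full_tokens / reduced_tokens
    reduction = (1 - reduced_tokens / full_tokens) * 100
    return round(ratio, 1), round(reduction)


def count_tokens(text: str) -> int:
    """Count cl100k_base tokens; requires `pip install tiktoken`."""
    import tiktoken  # imported lazily so the rest of the sketch has no dependencies

    return len(tiktoken.get_encoding("cl100k_base").encode(text))


# Token counts copied from the table: (full read, skeleton).
rows = {
    "src/coderay/graph/impact.py": (2333, 693),
    "src/coderay/cli/commands.py": (4327, 1906),
    "src/coderay/pipeline/indexer.py": (3065, 1433),
}
results = {path: savings(full, skel) for path, (full, skel) in rows.items()}

# Search hit (479 tokens) versus a full read of indexer.py (3,065 tokens).
search_ratio = round(3065 / 479, 1)
```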

Tools

Semantic search

Agents search by meaning, not by name — useful when the exact function or class is unknown. Results return file paths with line ranges pointing at relevant chunks. Treat them as candidates: confirm with skeleton or a ranged read before acting. Keep the index fresh with coderay watch or coderay build when the tree drifts.

[Image: coderay search demo]

Blast radius

Shows callers, imports, and inheritance for a symbol before it changes. Each result is tied to a file path and line range — combine with skeleton or ranged reads on those locations when bodies are needed.

[Image: coderay impact demo]
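
To make the idea of a blast radius concrete, here is a minimal sketch of a depth-limited walk over reversed dependency edges. It is not CodeRay's implementation: the edge list, the symbol names, and the `impact` function are all hypothetical, and the `max_depth` parameter only mirrors the spirit of the CLI's --max-depth option.

```python
from collections import deque


def impact(edges, symbol, max_depth=2):
    """Return {dependent: depth} for everything that reaches `symbol` within max_depth hops."""
    # Reverse the edges: for each target, who depends on it?
    dependents = {}
    for source, target in edges:
        dependents.setdefault(target, []).append(source)

    seen = {symbol: 0}
    queue = deque([symbol])
    while queue:
        current = queue.popleft()
        depth = seen[current]
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for dependent in dependents.get(current, []):
            if dependent not in seen:
                seen[dependent] = depth + 1
                queue.append(dependent)

    del seen[symbol]  # the symbol itself is not part of its own blast radius
    return seen


# Hypothetical (caller, callee) edges.
edges = [
    ("api.login", "auth.verify"),
    ("tests.test_auth", "auth.verify"),
    ("cli.main", "api.login"),
    ("app.run", "cli.main"),
]
direct = impact(edges, "auth.verify", max_depth=1)
two_hops = impact(edges, "auth.verify", max_depth=2)
```

A small depth keeps the result focused on direct callers; raising it surfaces transitive dependents at the cost of a longer list.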

Skeleton

Returns signatures and docstrings only — no function bodies. Every block is tagged with its path and line range so subsequent reads can be scoped to exactly those lines. A full file read should happen only when the skeleton isn't enough.

[Image: coderay skeleton demo]
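
As a rough illustration of what a skeleton contains, the sketch below uses Python's standard `ast` module to list signatures, docstrings, and line ranges for a Python source string. It is an approximation for explanation only: how CodeRay actually parses files is not shown here, and the `skeleton` function and sample source are assumptions. It needs Python 3.9+ for `ast.unparse`.

```python
import ast


def skeleton(source: str) -> list[dict]:
    """Return signature, docstring, and (start, end) line range for each def/class."""
    entries = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            signature = f"{prefix} {node.name}({ast.unparse(node.args)})"
        elif isinstance(node, ast.ClassDef):
            signature = f"class {node.name}"
        else:
            continue
        entries.append({
            "signature": signature,
            "docstring": ast.get_docstring(node),
            "lines": (node.lineno, node.end_lineno),
        })
    # ast.walk is breadth-first, so sort to restore source order.
    return sorted(entries, key=lambda entry: entry["lines"])


SAMPLE = '''\
def add(a, b=1):
    """Add two numbers."""
    total = a + b
    return total


class Greeter:
    """Says hello."""

    def greet(self, name):
        return f"hello {name}"
'''
entries = skeleton(SAMPLE)
```

Each entry carries the line range needed for a follow-up ranged read, while the function bodies themselves are never returned.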

Full read

Same file, raw source — for comparison:

[Image: head of the same file's raw source]

First run

Run coderay init, then coderay build.

[Image: coderay init and build]

Status: coderay status shows chunks, branch, commit, and schema.

MCP

The same three tools are exposed over MCP (search, skeleton, and impact), each returning paths and line ranges, so AI agents can narrow context before full-file reads. Point the server at a checkout whose root contains .coderay.toml (CODERAY_REPO_ROOT below). For tool choice versus a plain read, see AGENTS.md.

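Find the installed server binary:

which coderay-mcp

Then use that path as "command" in your MCP client configuration:
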
{
  "mcpServers": {
    "coderay": {
      "command": "/path/to/.venv/bin/coderay-mcp",
      "args": [],
      "env": { "CODERAY_REPO_ROOT": "${workspaceFolder}" }
    }
  }
}

CODERAY_REPO_ROOT must be the directory that contains .coderay.toml. More detail: mcp_server/README.md.
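
A small pre-flight check can catch a wrong CODERAY_REPO_ROOT before the MCP client starts the server. The sketch below builds the same configuration shape as the JSON above and refuses a root that lacks .coderay.toml. The helper name `mcp_config` is hypothetical and not part of CodeRay; the binary path is the placeholder from the example, and the repo root is a temporary directory created just for the demonstration.

```python
import json
import tempfile
from pathlib import Path


def mcp_config(server_binary: str, repo_root: str) -> dict:
    """Build an MCP client entry, verifying that repo_root contains .coderay.toml."""
    root = Path(repo_root)
    if not (root / ".coderay.toml").is_file():
        raise FileNotFoundError(f"{root} does not contain .coderay.toml")
    return {
        "mcpServers": {
            "coderay": {
                "command": server_binary,
                "args": [],
                "env": {"CODERAY_REPO_ROOT": str(root)},
            }
        }
    }


# Usage with a throwaway directory standing in for a real checkout.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / ".coderay.toml").write_text("", encoding="utf-8")
    config = mcp_config("/path/to/.venv/bin/coderay-mcp", tmp)
    expected_root = str(Path(tmp))
    rendered = json.dumps(config, indent=2)
```

In practice the first argument would be the path printed by which coderay-mcp (or `shutil.which("coderay-mcp")` from Python).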

Features

  • Languages — Python, JavaScript, and TypeScript — parsing/README.md
  • Multi-repo / monorepo — roots, aliases, optional include subtrees — core/README.md
  • Hybrid search — vector + BM25 (RRF), optional boosting — retrieval/README.md
  • Embeddings — fastembed (CPU) or MLX on Apple Silicon — embedding/README.md
  • Watch — incremental re-index; .coderay.toml is the source of truth for what’s indexed
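
The hybrid search entry above mentions RRF (reciprocal rank fusion), which merges ranked lists by summing 1 / (k + rank) for each item across the lists. The sketch below shows the textbook form only. The constant k = 60 is a common default, the chunk names are made up, and CodeRay's actual constant, weighting, and optional boosting may differ (see retrieval/README.md).

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked lists with reciprocal rank fusion; best item first."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by name for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical result lists from a vector index and a BM25 index.
vector_hits = ["chunk_a", "chunk_b", "chunk_c"]
bm25_hits = ["chunk_b", "chunk_d", "chunk_a"]
fused = rrf_fuse([vector_hits, bm25_hits])
```

Items that appear near the top of both lists (here chunk_b and chunk_a) outrank items that only one retriever found, without needing to compare raw vector and BM25 scores directly.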

Install

pipx (no venv):

brew install pipx && pipx install coderay   # macOS
# Linux: python3 -m pip install --user pipx && pipx install coderay

In a project:

python -m venv .venv && source .venv/bin/activate
pip install coderay
# Apple Silicon (optional): pip install "coderay[mlx]"

From source: pip install -e ".[all]" — see CONTRIBUTING.md.

Quick start

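cd /path/to/your/project
coderay init                                    # create .coderay.toml and .coderay/
coderay watch                                   # re-index on file changes
coderay search "how does authentication work"   # semantic search
coderay skeleton src/app/main.py                # signatures and docstrings with line ranges
coderay impact some_symbol                      # blast radius for a symbol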

CLI

| Command | Description |
|---|---|
| coderay init | Create .coderay.toml and .coderay/ |
| coderay watch [--quiet] | Re-index on file changes |
| coderay build [--full] | One-off or full rebuild |
| coderay search "query" | Semantic search (--top-k, --path-prefix, --no-tests) |
| coderay skeleton FILE | Signatures (--symbol) |
| coderay impact SYMBOL | Blast radius (--max-depth) |
| coderay graph | List edges (--from, --to, --kind) |
| coderay list | Chunks or per-file summary |
| coderay status | Index metadata |
| coderay maintain | Compact LanceDB |

Configuration

coderay init writes an annotated .coderay.toml: [index], [search], [graph], [embedder], [watcher]. See module READMEs linked from src/README.md.

Contributing

CONTRIBUTING.md

Accuracy and limitations

Semantic search is approximate: results depend on the embedding model and on how the code is chunked. No warranty (MIT License). Evaluate on your own codebase.
