X-ray your codebase — semantic search, code graphs, file skeletons, and MCP server

CodeRay — X-ray your codebase

CodeRay builds a local code index that gives AI agents a smarter way to explore a codebase — reading only what they need, not whole files.

Runs locally. No LLM. No network. No API key.

The problem

AI agents exploring a codebase default to reading whole files, even when one function is all that's needed. Every unnecessary line burns tokens and floods the context window, driving up API costs and noise with every read.

The root cause is simple: agents know the file paths but no finer location. Without knowing where in a file something lives, they have no choice but to read everything.

CodeRay fixes this. Every tool returns file paths with exact line ranges — so agents locate first, then read only the lines that matter.

How it works

CodeRay exposes three primitives, each returning paths + line ranges:

| Tool | Question it answers | What agents get |
|---|---|---|
| search | Where is the code that does X? | Relevant chunks with file paths and line ranges |
| skeleton | What's the shape of this file? | Signatures + docstrings only, each tagged with its line range |
| impact | What breaks if I change this? | Callers, imports, and inheritors, located by line range |

The two-phase flow

  1. Locate — run search, skeleton, or impact to find what's needed. Every result includes a file path and a symbol-level line range.
  2. Read precisely — use those line ranges to load only the relevant snippet. Skip the rest.

This keeps context windows lean and agent reasoning focused. CodeRay is not a replacement for grep — it fills the gap when exact names are unknown or a map is needed before reading.
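The "read precisely" step can be sketched in a few lines of Python. This is an illustrative helper, not part of CodeRay: the name `read_lines` and the assumption that line ranges are 1-indexed and inclusive are ours.

```python
from itertools import islice
from pathlib import Path


def read_lines(path: str, start: int, end: int) -> str:
    """Return lines start..end of a file (assumed 1-indexed, inclusive)."""
    if start < 1 or end < start:
        raise ValueError(f"invalid line range {start}-{end}")
    with Path(path).open(encoding="utf-8") as handle:
        # islice stops consuming the file once `end` is reached, so the
        # remainder of a large file is never read.
        return "".join(islice(handle, start - 1, end))
```

Given a result that points at, say, lines 40 to 62 of a file, `read_lines(path, 40, 62)` loads just that snippet instead of the whole file.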

Token savings (tiktoken, cl100k_base)

| File | Lines | Full read (tokens) | Skeleton (tokens) | Savings | Reduction |
|---|---|---|---|---|---|
| src/coderay/graph/impact.py | 249 | 2,333 | 693 | 3.4× | 70% |
| src/coderay/cli/commands.py | 584 | 4,327 | 1,906 | 2.3× | 56% |
| src/coderay/pipeline/indexer.py | 408 | 3,065 | 1,433 | 2.1× | 53% |

| Query | Search hit (tokens) | vs. full indexer.py read |
|---|---|---|
| "how are files re-indexed on change" | 479 | ~6× cheaper |

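The savings and reduction figures above follow directly from the token counts. A small sketch of the arithmetic (the function name `savings` is ours; the token counts are copied from the table):

```python
def savings(full_tokens: int, reduced_tokens: int) -> tuple[float, int]:
    """Return (ratio, percent reduction) of a reduced read vs a full read."""
    ratio = full_tokens / reduced_tokens
    percent = 100 * (1 - reduced_tokens / full_tokens)
    return round(ratio, 1), round(percent)


# (full read tokens, skeleton tokens) per file, from the table above.
rows = {
    "src/coderay/graph/impact.py": (2333, 693),
    "src/coderay/cli/commands.py": (4327, 1906),
    "src/coderay/pipeline/indexer.py": (3065, 1433),
}
for path, (full, skeleton) in rows.items():
    ratio, percent = savings(full, skeleton)
    print(f"{path}: {ratio}x, {percent}% reduction")

# The search hit (479 tokens) against a full read of indexer.py (3,065 tokens):
print(savings(3065, 479))  # (6.4, 84), i.e. roughly 6x cheaper
```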
Tools

Semantic search

Agents search by meaning, not by name — useful when the exact function or class is unknown. Results return file paths with line ranges pointing at relevant chunks. Treat them as candidates: confirm with skeleton or a ranged read before acting. Keep the index fresh with coderay watch or coderay build when the tree drifts.

[Demo: coderay search]

Blast radius

Shows callers, imports, and inheritance for a symbol before it changes. Each result is tied to a file path and line range — combine with skeleton or ranged reads on those locations when bodies are needed.
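Conceptually, a blast radius is a depth-limited walk over reverse dependency edges. The sketch below illustrates that idea with a breadth-first search; it is not CodeRay's implementation, and the graph, symbol names, and `blast_radius` function are hypothetical.

```python
from collections import deque


def blast_radius(
    reverse_edges: dict[str, list[str]], symbol: str, max_depth: int
) -> dict[str, int]:
    """Map each dependent of `symbol` to its hop distance, up to max_depth."""
    distances: dict[str, int] = {}
    queue = deque([(symbol, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for dependent in reverse_edges.get(current, []):
            if dependent != symbol and dependent not in distances:
                distances[dependent] = depth + 1
                queue.append((dependent, depth + 1))
    return distances


# Hypothetical graph: each key is used by the symbols in its list.
reverse_edges = {
    "parse_config": ["load_index", "cli_main"],
    "load_index": ["search", "watch"],
    "search": ["cli_main"],
}
print(blast_radius(reverse_edges, "parse_config", max_depth=1))
# {'load_index': 1, 'cli_main': 1}
print(blast_radius(reverse_edges, "parse_config", max_depth=2))
# {'load_index': 1, 'cli_main': 1, 'search': 2, 'watch': 2}
```

A depth limit like this keeps the result focused on direct and near-direct dependents, which are usually the ones worth reading first.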

[Demo: coderay impact]

Skeleton

Returns signatures and docstrings only — no function bodies. Every block is tagged with its path and line range so subsequent reads can be scoped to exactly those lines. A full file read should happen only when the skeleton isn't enough.
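To make the idea concrete, here is a minimal skeleton extractor for Python source built on the standard library `ast` module. It is a stand-in for illustration only: CodeRay's own parser, output format, and field names may differ, and the `skeleton` function and sample source below are ours.

```python
import ast


def skeleton(source: str) -> list[dict]:
    """List classes and functions with signature, docstring summary, and line range."""
    entries = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            signature = f"class {node.name}"
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            signature = f"{prefix} {node.name}({ast.unparse(node.args)})"
        else:
            continue
        doc = ast.get_docstring(node) or ""
        entries.append({
            "signature": signature,
            "doc": doc.splitlines()[0] if doc else "",
            # 1-indexed, inclusive range covering the whole definition.
            "lines": (node.lineno, node.end_lineno),
        })
    return sorted(entries, key=lambda entry: entry["lines"])


SOURCE = '''\
class Index:
    """In-memory index."""

    def add(self, path, text):
        """Store one file."""
        self.files[path] = text


def build(root):
    return Index()
'''

for entry in skeleton(SOURCE):
    start, end = entry["lines"]
    print(f"L{start}-{end}  {entry['signature']}  {entry['doc']}")
```

The function bodies never appear in the output; only the signature, the first docstring line, and the range needed for a follow-up ranged read.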

[Demo: coderay skeleton]

Full read

Same file, raw source — for comparison:

[Screenshot: head of the same file as raw source]

First run

Run coderay init, then coderay build.

[Demo: coderay init and build]

Status: coderay status reports chunks, branch, commit, and schema.

MCP

The same three tools are available over MCP: search, skeleton, and impact, each returning paths and line ranges, so AI agents can narrow context before full-file reads. Point the server at a checkout whose root contains .coderay.toml (CODERAY_REPO_ROOT below). For guidance on choosing a tool versus a plain read, see AGENTS.md.

which coderay-mcp   # prints the path to use as "command" in the config
{
  "mcpServers": {
    "coderay": {
      "command": "/path/to/.venv/bin/coderay-mcp",
      "args": [],
      "env": { "CODERAY_REPO_ROOT": "${workspaceFolder}" }
    }
  }
}

CODERAY_REPO_ROOT must be the directory that contains .coderay.toml. More detail: mcp_server/README.md.
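The config above can also be generated rather than hand-edited. The following sketch builds the same structure and checks the .coderay.toml requirement first; the helper name `mcp_config` is hypothetical, and where the resulting JSON should be saved depends on the MCP client, which is not covered here.

```python
import shutil
from pathlib import Path
from typing import Optional


def mcp_config(repo_root: str, command: Optional[str] = None) -> dict:
    """Build the mcpServers entry, validating the repo root first."""
    root = Path(repo_root)
    if not (root / ".coderay.toml").is_file():
        raise FileNotFoundError(f"{root} does not contain .coderay.toml")
    # Same lookup as `which coderay-mcp`; returns None if it is not on PATH.
    command = command or shutil.which("coderay-mcp")
    if command is None:
        raise FileNotFoundError("coderay-mcp not found on PATH")
    return {
        "mcpServers": {
            "coderay": {
                "command": command,
                "args": [],
                "env": {"CODERAY_REPO_ROOT": str(root.resolve())},
            }
        }
    }


# Example usage (requires an initialised checkout):
#   import json
#   print(json.dumps(mcp_config("/path/to/your/project"), indent=2))
```

Failing early on a missing .coderay.toml surfaces a wrong root before the server is launched.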

Features

  • Languages — Python, JavaScript, and TypeScript — parsing/README.md
  • Multi-repo / monorepo — roots, aliases, optional include subtrees — core/README.md
  • Hybrid search — vector + BM25 (RRF), optional boosting — retrieval/README.md
  • Embeddings — fastembed (CPU) or MLX on Apple Silicon; defaults to MiniLM L6 for speed — configure BGE in .coderay.toml for stronger (heavier) vectors — embedding/README.md
  • Watch — incremental re-index; .coderay.toml is the source of truth for what’s indexed
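For readers unfamiliar with RRF (reciprocal rank fusion), mentioned under hybrid search, the sketch below shows the general technique: each result list contributes 1 / (k + rank) per item, and items are ordered by the summed score. This is a generic illustration, not CodeRay's code; the constant k = 60 is a commonly used default that we assume here, and the chunk identifiers are made up.

```python
def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several ranked lists into one, best first."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by name for determinism.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical chunk IDs as returned by two retrievers.
vector_hits = ["indexer.py:120-158", "watcher.py:40-77", "cli.py:10-32"]
bm25_hits = ["watcher.py:40-77", "state.py:5-30", "indexer.py:120-158"]

for chunk, score in reciprocal_rank_fusion([vector_hits, bm25_hits]):
    print(f"{score:.4f}  {chunk}")
```

Because only ranks are used, the vector and BM25 scores never need to be put on a common scale, which is the main appeal of this fusion method.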

Install

pipx (no venv):

brew install pipx && pipx install coderay   # macOS
# Linux: python3 -m pip install --user pipx && pipx install coderay

In a project:

python -m venv .venv && source .venv/bin/activate
pip install coderay
# Apple Silicon (optional): pip install "coderay[mlx]"

From source: pip install -e ".[all]" — see CONTRIBUTING.md.

Quick start

cd /path/to/your/project
coderay init
coderay watch
coderay search "how does authentication work"
coderay skeleton src/app/main.py
coderay impact some_symbol

CLI

| Command | Description |
|---|---|
| coderay init | Create .coderay.toml and .coderay/ |
| coderay watch [--quiet] | Re-index on file changes |
| coderay build [--full] | One-off or full rebuild |
| coderay search "query" | Semantic search (--top-k, --path-prefix, --no-tests) |
| coderay skeleton FILE | Signatures (--symbol) |
| coderay impact SYMBOL | Blast radius (--max-depth) |
| coderay graph | List edges (--from, --to, --kind) |
| coderay list | Chunks or per-file summary |
| coderay status | Index metadata |
| coderay maintain | Compact LanceDB |

Configuration

coderay init writes an annotated .coderay.toml: [index], [search], [graph], [embedder], [watcher]. See module READMEs linked from src/README.md.

Contributing

CONTRIBUTING.md

Accuracy and limitations

Semantic search is approximate: result quality depends on the embedding model and on how the code is chunked. No warranty (MIT License). Evaluate on your own codebase.

Download files

Source Distribution

coderay-1.2.2.tar.gz (96.2 kB)

Built Distribution

coderay-1.2.2-py3-none-any.whl (125.1 kB, Python 3)

File details

Details for the file coderay-1.2.2.tar.gz.

File metadata

  • Download URL: coderay-1.2.2.tar.gz
  • Upload date:
  • Size: 96.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for coderay-1.2.2.tar.gz
Algorithm Hash digest
SHA256 3502553487df21a82d1cb731b38d9a92561bfd658639d84b5841d701052cda26
MD5 0de4b5eabd66c936f258c27a4713b590
BLAKE2b-256 64791fc3fb3e2c85fde19fe6f9bbf2c4714ff964ff8605b91b0217847498e02c

Provenance

The following attestation bundles were made for coderay-1.2.2.tar.gz:

Publisher: release.yml on bogdan-copocean/coderay

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file coderay-1.2.2-py3-none-any.whl.

File metadata

  • Download URL: coderay-1.2.2-py3-none-any.whl
  • Upload date:
  • Size: 125.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for coderay-1.2.2-py3-none-any.whl
Algorithm Hash digest
SHA256 61439bd88c2ef755b7c20579cb6410b16cd53db45342da5a6ce1b4983e39d468
MD5 b3a4ca531d02c91f9f0e04034aadee39
BLAKE2b-256 6032e903c194359a443ec9b11abbbb74b07068276dec07ae43301c443dd6062b

Provenance

The following attestation bundles were made for coderay-1.2.2-py3-none-any.whl:

Publisher: release.yml on bogdan-copocean/coderay

