Build knowledge graphs from your codebase for LLM context retrieval. Supports PHP, JavaScript, TypeScript, Python, CSS, and SCSS (plus Vue SFC), with framework detection and MCP server integration.
🧠 CodeRAG
Build knowledge graphs from your codebase for smarter AI coding assistants
CodeRAG parses your codebase using tree-sitter AST analysis, builds a rich knowledge graph of symbols and relationships, and serves that intelligence to AI coding assistants via an MCP server. It understands classes, functions, routes, components, cross-language connections, and framework patterns — giving your AI tools deep structural awareness instead of naive file reading.
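To make the idea of a "knowledge graph of symbols and relationships" concrete, here is a minimal sketch in Python. The `Node`, `Edge`, and `KnowledgeGraph` names, their fields, and the edge kinds are illustrative assumptions, not CodeRAG's actual schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """A symbol extracted from source code (hypothetical shape)."""
    id: str    # e.g. a fully qualified name
    kind: str  # e.g. "class", "function", "route", "component"
    file: str


@dataclass(frozen=True)
class Edge:
    """A directed relationship between two symbols (hypothetical shape)."""
    source: str
    target: str
    kind: str  # e.g. "calls", "extends", "imports"


@dataclass
class KnowledgeGraph:
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: list[Edge] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.id] = node

    def add_edge(self, source: str, target: str, kind: str) -> None:
        # Only connect symbols that are already in the graph.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append(Edge(source, target, kind))

    def usages(self, node_id: str) -> list[Edge]:
        """Return every edge that points at the given symbol."""
        return [e for e in self.edges if e.target == node_id]


graph = KnowledgeGraph()
graph.add_node(Node("App\\Models\\User", "class", "app/Models/User.php"))
graph.add_node(Node("App\\Http\\UserController", "class", "app/Http/UserController.php"))
graph.add_edge("App\\Http\\UserController", "App\\Models\\User", "imports")

print([e.source for e in graph.usages("App\\Models\\User")])
```

Answering "who uses this symbol?" then becomes a lookup over edges rather than a text search over files, which is the kind of structural awareness the description refers to.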
✨ Key Features
- 🌐 7 Languages — PHP, JavaScript, TypeScript, Python, CSS, SCSS + Vue SFC
- 🏗️ 11 Framework Detectors — Laravel, Symfony, React, Express, Next.js, Vue, Angular, Django, Flask, FastAPI, Tailwind CSS
- 🤖 MCP Server — 16 tools for Claude Code, Cursor, and Codex CLI integration
- 📊 Graph Analysis — PageRank, community detection, blast radius, dependency graphs
- 🔍 Hybrid Search — FTS5 full-text + FAISS vector semantic search
- 🔗 Cross-Language — Matches PHP routes to JS fetch calls, Python APIs to TS clients
- 💰 86% Token Savings — Proven cost reduction in AI coding sessions
- 🧠 Session Memory — Cross-session context persistence
- 🐳 Docker Ready — One-command deployment
- 📈 Battle-Tested — 7 dogfood sessions, 17 bugs found & fixed, 255K+ nodes parsed
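As an illustration of the Cross-Language feature, the sketch below pairs backend routes with frontend calls by reducing both paths to the same shape. It is a simplified guess at how such matching could work, not CodeRAG's matcher: `normalize_path`, `match_routes`, and the tuple layouts are hypothetical.

```python
import re


def normalize_path(path: str) -> str:
    """Collapse route parameters and interpolations into a single placeholder."""
    path = re.sub(r"\$\{[^}]*\}", "{}", path)  # JS template literal: ${userId}
    path = re.sub(r"\{[^}]*\}", "{}", path)    # Laravel-style parameter: {id}
    path = re.sub(r"(?<=/):\w+", "{}", path)   # Express-style parameter: :id
    path = path.split("?", 1)[0]               # drop any query string
    return path.rstrip("/") or "/"


def match_routes(routes, calls):
    """Pair backend routes with frontend calls that share method and path shape."""
    index = {}
    for method, path, handler in routes:
        index.setdefault((method.upper(), normalize_path(path)), []).append(handler)
    matches = []
    for method, url, caller in calls:
        for handler in index.get((method.upper(), normalize_path(url)), []):
            matches.append((caller, handler))
    return matches


# Example data, invented for this sketch.
routes = [
    ("GET", "/api/users/{id}", "UserController@show"),
    ("POST", "/api/users", "UserController@store"),
]
calls = [
    ("get", "/api/users/${userId}", "fetchUser"),
    ("post", "/api/users/", "createUser"),
    ("get", "/api/orders", "fetchOrders"),
]
print(match_routes(routes, calls))
# [('fetchUser', 'UserController@show'), ('createUser', 'UserController@store')]
```

A real matcher would also have to deal with base URLs, route prefixes, and HTTP clients other than `fetch`; those cases are left out here.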
🚀 Quick Start
Prerequisites
- Python 3.11+
- pip
Installation
# Clone and install
git clone https://github.com/dmnkhorvath/coderag.git
cd coderag
pip install -e '.[all]'
Or use the automated installer:
curl -fsSL https://raw.githubusercontent.com/dmnkhorvath/coderag/main/install-coderag.sh | sh
Parse a Codebase
# Parse your project
coderag parse /path/to/project
# Explore the graph
coderag info
coderag query MyClass
coderag find-usages UserController
coderag architecture
Launch with AI
# One command — parse, configure MCP, launch AI tool
coderag launch /path/to/project --tool claude-code
# Or start the MCP server standalone
coderag serve /path/to/project --watch
🔬 How It Works
- Parse — Tree-sitter AST extracts symbols (classes, functions, routes, components) across 7 languages
- Resolve — Cross-file references resolved with 4-strategy matching (exact, suffix, short name, placeholder)
- Serve — MCP server provides graph intelligence to AI coding tools with 16 specialized tools
Your Codebase   CodeRAG                AI Tool
┌──────────┐    ┌─────────────────┐    ┌──────────────────┐
│ .php     │    │ Tree-sitter AST │    │ Claude Code      │
│ .ts/.js  │───▶│ Knowledge Graph │───▶│ Cursor           │
│ .py      │    │ MCP Server      │    │ Codex CLI        │
│ .css     │    └─────────────────┘    └──────────────────┘
└──────────┘
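The Resolve step names four matching strategies (exact, suffix, short name, placeholder). The sketch below shows one plausible reading of that cascade. The `resolve` function, the uniqueness rule for suffix and short-name hits, and the `<unresolved:...>` naming are assumptions made for illustration, not CodeRAG's implementation.

```python
def resolve(reference: str, symbols: set[str], sep: str = "\\") -> tuple[str, str]:
    """Return (resolved_name, strategy), trying four strategies in order."""
    # 1. Exact: the reference is already a known fully qualified name.
    if reference in symbols:
        return reference, "exact"

    # 2. Suffix: exactly one known name ends with the reference
    #    at a namespace boundary.
    suffix_hits = sorted(s for s in symbols if s.endswith(sep + reference))
    if len(suffix_hits) == 1:
        return suffix_hits[0], "suffix"

    # 3. Short name: compare only the final segment of each name.
    short = reference.rsplit(sep, 1)[-1]
    short_hits = sorted(s for s in symbols if s.rsplit(sep, 1)[-1] == short)
    if len(short_hits) == 1:
        return short_hits[0], "short_name"

    # 4. Placeholder: keep the edge by pointing at a stand-in node,
    #    e.g. for symbols defined in external packages.
    return f"<unresolved:{reference}>", "placeholder"


# Example symbol table, invented for this sketch.
symbols = {
    "App\\Models\\User",
    "App\\Http\\Controllers\\UserController",
    "App\\Services\\Billing\\Invoice",
}
for ref in ["App\\Models\\User", "Controllers\\UserController",
            "Legacy\\Invoice", "Carbon\\Carbon"]:
    print(ref, "->", resolve(ref, symbols))
```

Ordering the strategies from strictest to loosest means a precise match is never overridden by a fuzzy one, and the placeholder fallback keeps the edge in the graph even when the target cannot be found.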
🌐 Supported Languages & Frameworks
| Language | Extensions | Node Types | Framework Detectors |
|---|---|---|---|
| PHP | .php, .blade.php | class, function, method, trait, interface, enum, constant | Laravel, Symfony |
| JavaScript | .js, .jsx, .mjs, .cjs | class, function, arrow_function, component, export | Express, React |
| TypeScript | .ts, .tsx, .mts, .cts, .vue | class, function, interface, type_alias, enum, component | Next.js, Vue, Angular |
| Python | .py | class, function, method, decorator | Django, Flask, FastAPI |
| CSS | .css | selector, variable, keyframes, media_query, layer | Tailwind CSS |
| SCSS | .scss | selector, variable, mixin, function, placeholder | - |
📋 CLI Reference
| Command | Description |
|---|---|
| `coderag parse <path>` | Parse codebase and build knowledge graph |
| `coderag query <symbol>` | Search for symbols by name |
| `coderag info [path]` | Show graph statistics |
| `coderag find-usages <symbol>` | Find all usages of a symbol |
| `coderag deps <symbol>` | Show dependency graph |
| `coderag analyze <symbol>` | Deep analysis with PageRank and centrality |
| `coderag architecture` | Show architecture overview |
| `coderag frameworks` | Show detected frameworks |
| `coderag cross-language` | Show cross-language connections |
| `coderag routes` | List all detected routes |
| `coderag impact <symbol>` | Blast radius analysis |
| `coderag file-context <file>` | Get context for a specific file |
| `coderag export` | Export graph in markdown/json/tree format |
| `coderag enrich` | Enrich graph with git history |
| `coderag watch <path>` | Watch for file changes and auto-reparse |
| `coderag serve` | Start MCP server for AI tools |
| `coderag launch <path>` | Smart launcher: parse, configure, launch AI tool |
| `coderag visualize <path>` | Generate interactive D3.js graph visualization |
| `coderag benchmark <path>` | Token cost benchmarking |
| `coderag session list` | List coding sessions |
| `coderag session show <id>` | Show session details |
| `coderag session context` | Show session context |
| `coderag update check` | Check for updates |
| `coderag update install` | Install latest version |
| `coderag init` | Initialize configuration |
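The `coderag analyze` command reports PageRank for a symbol. For readers unfamiliar with the metric, the sketch below is a textbook power-iteration PageRank over a small dependency graph. It is not CodeRAG's implementation; the edge direction (dependent to dependency), the damping factor, and the symbol names are assumptions chosen for the example.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over directed (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for source, target in edges:
        out_links[source].append(target)

    count = len(nodes)
    rank = {n: 1.0 / count for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {n: base for n in nodes}
        for source, targets in out_links.items():
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank


# Each edge reads "source depends on target" (invented example).
edges = [
    ("UserController", "User"),
    ("OrderController", "User"),
    ("OrderController", "Order"),
    ("Order", "User"),
]
rank = pagerank(edges)
print(max(rank, key=rank.get))  # User
```

With edges pointing from dependents to dependencies, symbols that much of the codebase relies on accumulate the highest scores, which is why the metric is useful for spotting central classes.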
🤖 MCP Server
CodeRAG includes a Model Context Protocol (MCP) server that exposes the knowledge graph to AI coding assistants. Start it with coderag serve or automatically via coderag launch.
Code Intelligence Tools (8)
| Tool | Description |
|---|---|
| `coderag_lookup_symbol` | Look up a symbol: definition, relationships, and context |
| `coderag_find_usages` | Find all usages: calls, imports, extensions, implementations |
| `coderag_impact_analysis` | Blast radius analysis for a symbol or file change |
| `coderag_file_context` | Get full context for a file: symbols, relationships, importance |
| `coderag_find_routes` | Find API routes and their frontend callers |
| `coderag_search` | Full-text and semantic search across the knowledge graph |
| `coderag_architecture` | High-level architecture overview with key metrics |
| `coderag_dependency_graph` | Dependency graph for a symbol or file |
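Blast radius analysis, as offered by `coderag_impact_analysis`, can be understood as a traversal of the dependency graph in reverse: start at the changed symbol and collect everything that depends on it, directly or transitively. The sketch below shows that idea with a breadth-first search. The `blast_radius` function, its `max_depth` parameter, and the example edges are hypothetical and only illustrate the concept.

```python
from collections import defaultdict, deque


def blast_radius(edges, changed, max_depth=None):
    """Return {symbol: distance} for everything that depends on `changed`."""
    dependents = defaultdict(set)
    for source, target in edges:  # "source depends on target"
        dependents[target].add(source)

    distance = {changed: 0}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        if max_depth is not None and distance[current] >= max_depth:
            continue  # do not expand beyond the requested depth
        for dependent in sorted(dependents[current]):
            if dependent not in distance:
                distance[dependent] = distance[current] + 1
                queue.append(dependent)
    del distance[changed]  # the changed symbol itself is not part of the result
    return distance


# Invented example graph.
edges = [
    ("UserController", "UserService"),
    ("UserService", "User"),
    ("AdminPanel", "UserController"),
    ("Invoice", "Order"),
]
print(blast_radius(edges, "User"))
# {'UserService': 1, 'UserController': 2, 'AdminPanel': 3}
```

The distance gives a rough priority order for review: symbols one hop away are the most likely to break when `User` changes.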
Session Memory Tools (8)
| Tool | Description |
|---|---|
| `session_log_read` | Log a file read event |
| `session_log_edit` | Log a file edit event |
| `session_log_decision` | Log an architectural decision |
| `session_log_task` | Log a task completion |
| `session_log_fact` | Log a discovered fact |
| `session_get_history` | Get session history |
| `session_get_hot_files` | Get frequently accessed files |
| `session_get_context` | Get session context for pre-loading |
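To show how logged events could turn into a "hot files" ranking, the sketch below scores files from a list of read and edit events. The event tuples mirror the `session_log_read` and `session_log_edit` tools in name only; the `hot_files` function and the choice to weight edits more heavily than reads are assumptions, not documented CodeRAG behavior.

```python
from collections import Counter


def hot_files(events, top_n=3, edit_weight=2):
    """Rank files by how often a session touched them."""
    scores = Counter()
    for kind, path in events:
        if kind == "read":
            scores[path] += 1
        elif kind == "edit":
            scores[path] += edit_weight  # assumption: edits signal more interest
    return scores.most_common(top_n)


# Invented event log: (event kind, file path).
events = [
    ("read", "app/Models/User.php"),
    ("read", "app/Models/User.php"),
    ("edit", "routes/api.php"),
    ("read", "resources/js/api.ts"),
    ("edit", "app/Models/User.php"),
]
print(hot_files(events, top_n=2))
# [('app/Models/User.php', 4), ('routes/api.php', 2)]
```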
Resources (3)
| Resource | Description |
|---|---|
| `coderag://summary` | Project summary with key metrics |
| `coderag://architecture` | Architecture overview |
| `coderag://file-map` | Complete file map of the project |
AI Tool Configuration
Claude Code — auto-configured via coderag launch --tool claude-code:
{
  "mcpServers": {
    "coderag": {
      "command": "coderag",
      "args": ["serve", ".", "--watch"]
    }
  }
}
Cursor — auto-configured via coderag launch --tool cursor:
{
  "mcpServers": {
    "coderag": {
      "command": "coderag",
      "args": ["serve", ".", "--watch"]
    }
  }
}
Codex CLI — auto-configured via coderag launch --tool codex:
{
  "mcpServers": {
    "coderag": {
      "command": "coderag",
      "args": ["serve", ".", "--watch"]
    }
  }
}
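If you prefer to add the entry by hand rather than through `coderag launch`, a small script can merge it into an existing JSON config without touching other servers. The sketch below is illustrative: `add_coderag_server` is a hypothetical helper, and the config file location is deliberately left to the caller because each tool keeps its own.

```python
import json
import tempfile
from pathlib import Path


def add_coderag_server(config_path: Path, project_dir: str = ".") -> dict:
    """Merge a `coderag` entry into an MCP config file, keeping other servers."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["coderag"] = {
        "command": "coderag",
        "args": ["serve", project_dir, "--watch"],
    }
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demonstrate on a throwaway file that already lists another (made-up) server.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "mcp.json"
    demo_path.write_text(
        json.dumps({"mcpServers": {"other": {"command": "other-tool", "args": []}}})
    )
    merged = add_coderag_server(demo_path)
    on_disk = json.loads(demo_path.read_text())
    print(sorted(merged["mcpServers"]))  # ['coderag', 'other']
```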
🐳 Docker
# Quick start
docker compose up -d
Services
| Service | Description | Port |
|---|---|---|
| `cli` | CodeRAG CLI for parsing and queries | - |
| `mcp` | MCP server for AI tool integration | 3000 |
| `watcher` | File watcher for auto-reparse | - |
See docs/docker.md for full Docker documentation.
📊 Performance Benchmarks
Automated Benchmarks (51 repositories)
| Category | Repos | Files | Nodes | Edges |
|---|---|---|---|---|
| PHP | 10 | 33,896 | 516,705 | 1,359,239 |
| JavaScript | 8 | 12,093 | 119,811 | 219,490 |
| TypeScript | 7 | 37,544 | 232,153 | 542,491 |
| Python | 10 | 5,968 | 206,563 | 448,830 |
| Mixed (PHP+JS/TS) | 4 | 8,212 | 81,178 | 185,371 |
| CSS/SCSS/Tailwind | 12 | 14,096 | 142,387 | 406,729 |
| Total | 51 | 111,809 | 1,298,797 | 3,162,150 |
Real-World Dogfood Sessions (7 sessions)
| Session | Project | Type | Files | Nodes | Edges | Bugs Fixed |
|---|---|---|---|---|---|---|
| 1 | koel | PHP + Vue | 1,592 | 13,384 | 36,709 | 2 |
| 2 | paperless-ngx | Django + Angular | 807 | 20,580 | 50,517 | 0 |
| 3 | saleor | Django + GraphQL | 4,220 | 111,076 | 260,654 | 3 |
| 4 | NocoDB | TypeScript + Vue | 1,823 | 24,367 | 74,284 | 4 |
| 5 | Cal.com | TS + React + Tailwind | 7,530 | 50,926 | 220,752 | 4 |
| 6 | koel (MCP) | MCP + Claude Code | — | 15,797 | 34,294 | 1 |
| 7 | Mealie | FastAPI + Vue | 1,008 | 18,895 | 53,380 | 2 |
| Total | | | 16,980 | 255,025 | 730,590 | 17 |
Token Cost Savings
| Metric | Without CodeRAG | With CodeRAG | Savings |
|---|---|---|---|
| Avg tokens/task | 17,617 | 2,400 | 86.4% |
| Monthly cost (Claude Sonnet) | $42.28 | $5.76 | $36.52 |
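The savings column follows directly from the other two columns. As a quick check, the arithmetic can be reproduced in a few lines of Python; the variable names are only for this example.

```python
tokens_without = 17_617
tokens_with = 2_400

# Relative reduction in average tokens per task.
savings_ratio = (tokens_without - tokens_with) / tokens_without
print(f"Token savings: {savings_ratio:.1%}")  # Token savings: 86.4%

cost_without = 42.28
cost_with = 5.76

# Absolute monthly saving in dollars.
monthly_saving = cost_without - cost_with
print(f"Monthly saving: ${monthly_saving:.2f}")  # Monthly saving: $36.52
```

The cost ratio (5.76 / 42.28) matches the token ratio (2,400 / 17,617) at about 13.6%, which is what you would expect if cost scales linearly with token count.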
🧪 Testing
- 4,563 tests passing
- 94% code coverage on critical modules (67% overall)
- 18 E2E integration tests (full workflow validation)
- CI/CD via GitHub Actions — Python 3.11, 3.12, 3.13
# Run tests
python -m pytest tests/ -q
# Run with coverage
python -m pytest tests/ --cov=src/coderag --cov-report=term-missing
# Run E2E tests
bash tests/e2e/test_full_workflow.sh
📈 Codebase Stats
| Metric | Value |
|---|---|
| Python source files | 120 |
| Total source lines | 43,312 |
| Test files | 139 |
| Total test lines | 59,235 |
| Tests passing | 4,563 |
| Language plugins | 6 (PHP, JS, TS, Python, CSS, SCSS) + Vue SFC |
| Framework detectors | 11 |
| MCP tools | 16 |
| MCP resources | 3 |
| Node types | 41 |
| Edge types | 50 |
| CLI commands | 25 |
📚 Documentation
| Guide | Description |
|---|---|
| Quick Start | Installation and first steps |
| Smart Launcher | One-command AI session setup |
| Session Memory | Cross-session context persistence |
| Cost Savings | Token cost benchmarking |
| AI Tool Setup | Claude Code, Cursor, Codex configuration |
| Docker | Container deployment |
| Architecture | System design |
| Research | Language parsing research |
🤝 Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Write tests for your changes
- Ensure all tests pass (`python -m pytest tests/ -q`)
- Run linting (`ruff check src/ tests/ && ruff format src/ tests/`)
- Submit a pull request
📄 License
MIT — see LICENSE for details.