The ultimate RAG for your monorepo. Query, understand, and edit multi-language codebases with the power of AI and knowledge graphs
Code-Graph-RAG
A graph-based RAG system that parses multi-language codebases with Tree-sitter, builds knowledge graphs in Memgraph, and enables natural language querying, editing, and optimization.
Install
pip install code-graph-rag
With all Tree-sitter grammars (Python, JS, TS, Rust, Go, Java, Scala, C++, Lua):
pip install 'code-graph-rag[treesitter-full]'
With semantic code search (UniXcoder embeddings):
pip install 'code-graph-rag[semantic]'
Prerequisites
- Python 3.12+
- Docker (for Memgraph)
- cmake (for building pymgclient)
- ripgrep (rg) (for shell command text searching)
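The prerequisites above can be checked from Python before installing. The following is a minimal sketch using only the standard library; the function name `missing_prerequisites` and the executable names `docker`, `cmake`, and `rg` are assumptions made for illustration and are not part of the cgr package (the `cgr doctor` command is the project's own setup check).

```python
import shutil
import sys

# Executable names assumed for the tools listed in the prerequisites.
REQUIRED_TOOLS = ("docker", "cmake", "rg")
MIN_PYTHON = (3, 12)


def missing_prerequisites(tools=REQUIRED_TOOLS, min_python=MIN_PYTHON,
                          which=shutil.which, version=None):
    """Return a list of problems; an empty list means every check passed."""
    version = sys.version_info[:2] if version is None else tuple(version)
    problems = []
    if version < tuple(min_python):
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")
    return problems
```

Calling `missing_prerequisites()` with no arguments inspects the current interpreter and PATH; the `which` and `version` parameters exist only so the check can be exercised without touching the real environment.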
CLI Quick Start
The package installs a cgr command.
Start Memgraph, parse a repo, and query it:
docker compose up -d # start Memgraph
cgr start --repo-path ./my-project \
    --update-graph --clean # parse & launch interactive chat
Index to protobuf for offline use:
cgr index -o ./index-output --repo-path ./my-project
Export knowledge graph to JSON:
cgr export -o graph.json
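The layout of the exported file is not described here, so a first look at it is easiest with a schema-agnostic helper. This sketch uses only the standard library and assumes nothing beyond the file being valid JSON; `describe_export` is a hypothetical helper, not part of the cgr package.

```python
import json
from pathlib import Path


def describe_export(path):
    """Summarize the top-level shape of a JSON file without assuming its schema."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Treat a non-object root as a single entry so the return type stays a dict.
    items = data.items() if isinstance(data, dict) else [("<root>", data)]
    shape = {}
    for key, value in items:
        if isinstance(value, (list, dict)):
            shape[key] = f"{type(value).__name__} ({len(value)} entries)"
        else:
            shape[key] = type(value).__name__
    return shape
```

Running `describe_export("graph.json")` after an export shows which top-level keys exist and how large each collection is, which is usually enough to decide how to process the file further.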
AI-guided optimization:
cgr optimize python --repo-path ./my-project
Run as an MCP server (for Claude Code):
cgr mcp-server
Check your setup:
cgr doctor
Python SDK
The cgr package provides short imports for programmatic use.
Load and query an exported graph
from cgr import load_graph
graph = load_graph("graph.json")
print(graph.summary())
functions = graph.find_nodes_by_label("Function")
for fn in functions[:5]:
    rels = graph.get_relationships_for_node(fn.node_id)
    print(f"{fn.properties['name']}: {len(rels)} relationships")
Query Memgraph with Cypher
from cgr import MemgraphIngestor
with MemgraphIngestor(host="localhost", port=7687) as db:
    rows = db.fetch_all("MATCH (f:Function) RETURN f.name LIMIT 10")
    for row in rows:
        print(row)
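If the connection fails, it can help to confirm that something is listening on the Memgraph port before debugging further. The sketch below is a plain TCP reachability check from the standard library; `memgraph_reachable` is a hypothetical helper, and a successful connection only shows that the port is open, not that the listener is Memgraph.

```python
import socket


def memgraph_reachable(host="localhost", port=7687, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unresolved host) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

The defaults mirror the host and port used in the example above.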
Generate Cypher from natural language
import asyncio
from cgr import CypherGenerator
async def main():
    gen = CypherGenerator()
    cypher = await gen.generate("Find all classes that inherit from BaseModel")
    print(cypher)
asyncio.run(main())
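Generation and execution can be chained: produce the Cypher, then pass it to a database handle. The sketch below assumes only the two calls shown in this README (`generate` as an awaitable returning a query string, and `fetch_all` taking that string); `ask` and the two stub classes are hypothetical and stand in for the real objects so the example runs without a model or a database.

```python
import asyncio


async def ask(question, generator, db):
    """Translate a question to Cypher with `generator`, then run it on `db`."""
    cypher = await generator.generate(question)
    rows = db.fetch_all(cypher)
    return cypher, rows


class StubGenerator:
    """Stand-in for a Cypher generator; always returns the same query."""

    async def generate(self, question):
        return "MATCH (f:Function) RETURN f.name LIMIT 10"


class StubDb:
    """Stand-in for a database handle; returns canned rows."""

    def fetch_all(self, query):
        return [{"f.name": "main"}]


cypher, rows = asyncio.run(ask("List some functions", StubGenerator(), StubDb()))
print(cypher)
print(rows)
```

With the real SDK, the stubs would presumably be replaced by `CypherGenerator()` and an open `MemgraphIngestor` context. Because the query text comes from a model, it is worth printing and reviewing it before running it against a graph you care about.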
Semantic code search
Requires the semantic extra.
from cgr import embed_code
embedding = embed_code("def authenticate(user, password): ...")
print(f"Embedding dimension: {len(embedding)}")
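Embeddings become useful for search once two of them are compared. A common choice is cosine similarity, sketched below in plain Python. It assumes `embed_code` returns a flat sequence of numbers (the example above only shows that the result has a length), and `cosine_similarity` is a hypothetical helper rather than something the package is known to export.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; report no similarity.
        return 0.0
    return dot / (norm_a * norm_b)
```

Ranking candidate snippets by `cosine_similarity(query_embedding, snippet_embedding)` in descending order gives a simple nearest-neighbour search over a small set of embeddings.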
Configuration
from cgr import settings
settings.set_orchestrator("openai", "gpt-4o", api_key="sk-...")
settings.set_cypher("google", "gemini-2.5-flash", api_key="your-key")
Environment Variables
Configure via .env or environment variables:
| Variable | Default | Description |
|---|---|---|
| MEMGRAPH_HOST | localhost | Memgraph hostname |
| MEMGRAPH_PORT | 7687 | Memgraph port |
| ORCHESTRATOR_PROVIDER | | Provider: google, openai, ollama |
| ORCHESTRATOR_MODEL | | Model ID (e.g. gpt-4o, gemini-2.5-pro) |
| ORCHESTRATOR_API_KEY | | API key for the provider (not needed for ollama) |
| CYPHER_PROVIDER | | Provider for Cypher generation |
| CYPHER_MODEL | | Model ID for Cypher generation (e.g. codellama, gpt-4o-mini) |
| CYPHER_API_KEY | | API key for Cypher provider (not needed for ollama) |
| TARGET_REPO_PATH | . | Default repository path |
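To see which values a process would pick up, the variables can be read directly with the defaults listed above. This is an illustrative sketch with the standard library only; `GraphEnv` and `read_env` are hypothetical names, and the package's own `settings` object may resolve values differently (for example, by also loading a .env file).

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GraphEnv:
    memgraph_host: str
    memgraph_port: int
    target_repo_path: str
    orchestrator_provider: Optional[str]
    orchestrator_model: Optional[str]


def read_env(env=None):
    """Read a subset of the documented variables, applying the documented defaults."""
    env = os.environ if env is None else env
    return GraphEnv(
        memgraph_host=env.get("MEMGRAPH_HOST", "localhost"),
        memgraph_port=int(env.get("MEMGRAPH_PORT", "7687")),
        target_repo_path=env.get("TARGET_REPO_PATH", "."),
        # No default is documented for these, so absence is reported as None.
        orchestrator_provider=env.get("ORCHESTRATOR_PROVIDER"),
        orchestrator_model=env.get("ORCHESTRATOR_MODEL"),
    )
```

Passing a plain dict instead of relying on `os.environ` makes it easy to check a configuration before exporting it to the shell.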
Documentation
Full documentation, architecture details, and contribution guide: docs.code-graph-rag.com
License
MIT