
The ultimate RAG for your monorepo. Query, understand, and edit multi-language codebases with the power of AI and knowledge graphs.

Code-Graph-RAG

A graph-based RAG system that parses multi-language codebases with Tree-sitter, builds knowledge graphs in Memgraph, and enables natural language querying, editing, and optimization.

Install

pip install code-graph-rag

With all Tree-sitter grammars (Python, JS, TS, Rust, Go, Java, Scala, C++, Lua):

pip install 'code-graph-rag[treesitter-full]'

With semantic code search (UniXcoder embeddings):

pip install 'code-graph-rag[semantic]'

Prerequisites

  • Python 3.12+
  • Docker (for Memgraph)
  • cmake (for building pymgclient)
  • ripgrep (rg) (for text search via shell commands)
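Before installing, the prerequisites above can be checked with a few lines of standard-library Python. This is a minimal sketch, not part of the package: the function name check_prerequisites and its parameters are hypothetical, and it only tests that the executables are on PATH, not that Docker is running or that the versions are suitable.

```python
import shutil
import sys

# Executables named in the prerequisites list; "rg" is the ripgrep binary.
REQUIRED_TOOLS = ("docker", "cmake", "rg")
MIN_PYTHON = (3, 12)


def check_prerequisites(which=shutil.which, version=None):
    """Return a list of problems found; an empty list means every check passed.

    `which` and `version` are injectable so the function can be exercised
    without touching the real PATH or interpreter (hypothetical helper).
    """
    if version is None:
        version = sys.version_info[:2]
    problems = []
    if tuple(version) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems
```

Calling check_prerequisites() with no arguments inspects the current interpreter and PATH. Once the package is installed, cgr doctor is the supported way to check a setup.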

CLI Quick Start

The package installs a cgr command.

Start Memgraph, parse a repo, and query it:

docker compose up -d                       # start Memgraph
cgr start --repo-path ./my-project \
          --update-graph --clean           # parse & launch interactive chat

Index to protobuf for offline use:

cgr index -o ./index-output --repo-path ./my-project

Export knowledge graph to JSON:

cgr export -o graph.json
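An exported file can also be inspected with nothing but the standard library. The layout of graph.json is not described here, so the sketch below rests on an assumption: a top-level "nodes" list whose entries each carry a "labels" list, mirroring the labels the SDK exposes. The helper names load_export and count_labels are hypothetical; check a real export before relying on this shape.

```python
import json
from collections import Counter


def load_export(path):
    """Parse an exported graph file into plain Python objects."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def count_labels(graph_data):
    """Count nodes per label.

    Assumes graph_data looks like {"nodes": [{"labels": [...], ...}, ...]};
    this layout is an assumption, not something the README confirms.
    Missing keys are treated as empty rather than raising.
    """
    counts = Counter()
    for node in graph_data.get("nodes", []):
        for label in node.get("labels", []):
            counts[label] += 1
    return counts
```

For example, count_labels(load_export("graph.json")).most_common(5) would list the five most frequent labels under that assumed layout.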

AI-guided optimization:

cgr optimize python --repo-path ./my-project

Run as an MCP server (for Claude Code):

cgr mcp-server

Check your setup:

cgr doctor

Python SDK

The cgr package provides short imports for programmatic use.

Load and query an exported graph

from cgr import load_graph

graph = load_graph("graph.json")
print(graph.summary())

functions = graph.find_nodes_by_label("Function")
for fn in functions[:5]:
    rels = graph.get_relationships_for_node(fn.node_id)
    print(f"{fn.properties['name']}: {len(rels)} relationships")

Query Memgraph with Cypher

from cgr import MemgraphIngestor

with MemgraphIngestor(host="localhost", port=7687) as db:
    rows = db.fetch_all("MATCH (f:Function) RETURN f.name LIMIT 10")
    for row in rows:
        print(row)
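Queries like the one above are easier to reuse when wrapped in a small function. The sketch below accepts any object with a fetch_all(query) method, so it runs without Memgraph. Two things are assumptions rather than documented behaviour: that each returned row can be indexed by the RETURN alias ("name"), and the helper and stub names (function_names, FakeDB), which are hypothetical.

```python
def function_names(db, limit=10):
    """Return up to `limit` function names via db.fetch_all(query).

    Assumes rows are dict-like and keyed by the RETURN alias; adjust the
    row access if the real client returns a different shape.
    """
    if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    # limit has been validated as an int, so interpolating it is safe here.
    query = f"MATCH (f:Function) RETURN f.name AS name LIMIT {limit}"
    return [row["name"] for row in db.fetch_all(query)]


class FakeDB:
    """Hypothetical stand-in that records queries and returns canned rows."""

    def __init__(self, rows):
        self.rows = rows
        self.queries = []

    def fetch_all(self, query):
        self.queries.append(query)
        return self.rows
```

With a live database, the same function could be called as function_names(db, 5) inside the with MemgraphIngestor(...) block, provided the row shape matches the assumption.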

Generate Cypher from natural language

import asyncio
from cgr import CypherGenerator

async def main():
    gen = CypherGenerator()
    cypher = await gen.generate("Find all classes that inherit from BaseModel")
    print(cypher)

asyncio.run(main())

Semantic code search

Requires the semantic extra (pip install 'code-graph-rag[semantic]').

from cgr import embed_code

embedding = embed_code("def authenticate(user, password): ...")
print(f"Embedding dimension: {len(embedding)}")
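Embeddings become useful once two of them are compared. The README does not say which similarity measure the project uses; cosine similarity is a common choice for embedding vectors, and the sketch below implements it in plain Python. It assumes embed_code returns a flat sequence of floats; the function name cosine_similarity is not part of the cgr package.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 when either vector has zero length, since the angle is
    undefined in that case.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Under that assumption, cosine_similarity(embed_code(snippet_a), embed_code(snippet_b)) gives a score where values closer to 1 indicate more similar snippets.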

Configuration

from cgr import settings

settings.set_orchestrator("openai", "gpt-4o", api_key="sk-...")
settings.set_cypher("google", "gemini-2.5-flash", api_key="your-key")

Environment Variables

Configure via a .env file or environment variables:

Variable               Default    Description
MEMGRAPH_HOST          localhost  Memgraph hostname
MEMGRAPH_PORT          7687       Memgraph port
ORCHESTRATOR_PROVIDER  (none)     Provider: google, openai, ollama
ORCHESTRATOR_MODEL     (none)     Model ID (e.g. gpt-4o, gemini-2.5-pro)
ORCHESTRATOR_API_KEY   (none)     API key for the provider (not needed for ollama)
CYPHER_PROVIDER        (none)     Provider for Cypher generation
CYPHER_MODEL           (none)     Model ID for Cypher generation (e.g. codellama, gpt-4o-mini)
CYPHER_API_KEY         (none)     API key for Cypher provider (not needed for ollama)
TARGET_REPO_PATH       .          Default repository path
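To make the defaults in the table concrete, here is a minimal sketch of reading a few of these variables with the standard library. It is not how the package loads its settings (that is not shown here); EnvConfig and read_env_config are hypothetical names, and .env parsing is left out.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EnvConfig:
    """Illustrative container for a subset of the documented variables."""

    memgraph_host: str
    memgraph_port: int
    orchestrator_provider: Optional[str]
    orchestrator_model: Optional[str]
    target_repo_path: str


def read_env_config(env=None):
    """Build an EnvConfig from a mapping, falling back to the documented defaults.

    Variables with no documented default come back as None when unset.
    """
    env = os.environ if env is None else env
    return EnvConfig(
        memgraph_host=env.get("MEMGRAPH_HOST", "localhost"),
        memgraph_port=int(env.get("MEMGRAPH_PORT", "7687")),
        orchestrator_provider=env.get("ORCHESTRATOR_PROVIDER"),
        orchestrator_model=env.get("ORCHESTRATOR_MODEL"),
        target_repo_path=env.get("TARGET_REPO_PATH", "."),
    )
```

Passing a plain dict instead of os.environ, as in read_env_config({"MEMGRAPH_PORT": "7688"}), makes it easy to see which values come from defaults and which are overridden.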

Documentation

Full documentation, architecture details, and contribution guide: docs.code-graph-rag.com

License

MIT
