Reasoning-based code intelligence for any codebase
Repolect
Semantic code intelligence powered by LLM reasoning.
Index any codebase into a hierarchical semantic tree plus a knowledge graph. Ask questions, trace execution flows, plan changes, and analyze impact, all local-first, with no vector database needed.
Who is this for?
Repolect is built primarily for:
- Developers exploring new code: Quickly understand a project's architecture and logic without reading thousands of lines of code.
- AI Coding Agent Users: Give agents (like Cursor, Claude Code) precise structural context to improve edit performance and reduce hallucinations.
- Local-First Enthusiasts: Index, query, and visualize your codebase's dependencies entirely locally.
- SLM Power Users: Get the most out of locally hosted Small Language Models (via Ollama) to autonomously analyze, edit, and update your codebases.
Features
- Hierarchical Semantic Tree: Every node (module, file, class, function) gets a bottom-up LLM-generated summary. The abstract meaning of your codebase is indexed, not just the raw text.
- Vectorless Search: Navigate the semantic tree using LLM reasoning in O(log N) steps. Finds actual answers and saves tokens compared to blind similarity searches.
- Knowledge Graph: Maps CALLS, IMPORTS, EXTENDS, and IMPLEMENTS relations across your codebase. Useful for tracing execution paths or finding the "blast radius" of a change.
- Full MCP Integration: Exposes 14 tools to AI editors (Cursor, Claude Code, etc.) out of the box, reducing token usage and round trips.
- Prescriptive Agent Context: Generates "Agent Skills" based on functional groups (Louvain communities) in your code to inject targeted context when and where it's needed.
- Local-First & SLM Optimized: Engineered to run on efficient local models like qwen3.5 or qwen2.5-coder via Ollama. No data leaves your machine unless you want it to.
How It Works
Repolect builds a hierarchical tree of your codebase where every node (module, file, class, function) gets an LLM-generated summary. Queries navigate this tree using LLM reasoning, finding relevant code in O(log N) steps without any vector similarity search.
RepoNode: "E-commerce backend in Python/FastAPI..."
├── ModuleNode src/auth: "JWT-based authentication layer..."
│   ├── FileNode jwt.py: "Token generation and validation..."
│   │   ├── ClassNode JWTService: "Manages token lifecycle..."
│   │   └── FunctionNode verify_token: "Validates Bearer tokens..."
│   └── DocNode README.md: "Auth module documentation..."
└── ModuleNode src/payments: "Stripe payment processing..."
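To make the bottom-up summarization concrete, here is a minimal sketch. It is not Repolect's implementation: the Node class, summarize_bottom_up, and fake_llm are hypothetical names, and fake_llm is a placeholder that truncates its prompt instead of calling a model. The point is only the traversal order: children are summarized first, and the parent's prompt is built from their summaries.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical stand-in for a tree node (module, file, class, function)."""
    kind: str
    name: str
    source: str = ""  # raw text, only used for leaf nodes
    children: List["Node"] = field(default_factory=list)
    summary: str = ""


def summarize_bottom_up(node: Node, llm: Callable[[str], str]) -> str:
    """Post-order traversal: children are summarized before their parent."""
    child_summaries = [summarize_bottom_up(c, llm) for c in node.children]
    if child_summaries:
        # Parents see only their children's summaries, never raw source.
        prompt = f"{node.kind} {node.name} contains: " + "; ".join(child_summaries)
    else:
        prompt = f"{node.kind} {node.name}: {node.source}"
    node.summary = llm(prompt)
    return node.summary


def fake_llm(prompt: str) -> str:
    """Placeholder for a real model call; it just truncates the prompt."""
    return prompt[:60]


verify = Node("FunctionNode", "verify_token", source="def verify_token(t): ...")
service = Node("ClassNode", "JWTService", source="class JWTService: ...")
jwt_file = Node("FileNode", "jwt.py", children=[service, verify])
summarize_bottom_up(jwt_file, fake_llm)
print(jwt_file.summary)
```

Swapping fake_llm for a function that calls your configured provider would keep the same traversal; only the summaries change.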
A knowledge graph runs alongside the tree, storing structural relations (CALLS, IMPORTS, EXTENDS, IMPLEMENTS) that power dependency analysis, impact tracing, and execution flow tracking.
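As an illustration of how CALLS edges can drive impact tracing, the sketch below walks the graph backwards from a symbol to find its callers within a hop limit. The edge list, the symbol names, and the blast_radius function are assumptions made for this example; Repolect's own storage and traversal may work differently.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple

# Hypothetical edges in (source, relation, target) form.
EDGES = [
    ("login_handler", "CALLS", "verify_token"),
    ("refresh_handler", "CALLS", "verify_token"),
    ("api_router", "CALLS", "login_handler"),
    ("verify_token", "CALLS", "decode_jwt"),
]


def blast_radius(symbol: str,
                 edges: Iterable[Tuple[str, str, str]],
                 max_hops: int = 3) -> Dict[str, int]:
    """Return {caller: hops} for every symbol that reaches `symbol` via CALLS."""
    callers: Dict[str, Set[str]] = {}
    for src, rel, dst in edges:
        if rel == "CALLS":
            callers.setdefault(dst, set()).add(src)

    seen: Dict[str, int] = {}
    queue = deque([(symbol, 0)])
    while queue:
        current, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for caller in sorted(callers.get(current, ())):
            if caller not in seen and caller != symbol:
                seen[caller] = hops + 1
                queue.append((caller, hops + 1))
    return seen


print(blast_radius("verify_token", EDGES))
```

Following the same edges forwards from an entry point, instead of backwards, gives an execution flow rather than a blast radius.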
Architecture
flowchart LR
subgraph indexing [Indexing Pipeline]
Scan[Scan Repo] --> Parse[Parse Files]
Parse --> Summarize[LLM Summarize]
Summarize --> Graph[Build Graph]
end
subgraph storage [Dual Storage]
Tree["tree.json\n(semantic tree)"]
GraphDB["graph.pkl / graph.db\n(knowledge graph)"]
end
subgraph query [Query Layer]
CLI[CLI Commands]
MCP[MCP Server]
end
Graph --> Tree
Graph --> GraphDB
Tree --> CLI
Tree --> MCP
GraphDB --> CLI
GraphDB --> MCP
Quick Start
Recommended: One-liner Installer
The interactive installer sets up Ollama, configures your LLM provider, and makes repolect available system-wide:
curl -fsSL https://raw.githubusercontent.com/Bibyutatsu/Repolect/main/install.sh | bash
The installer uses pipx (isolated environment, no dependency conflicts) with a pip --user fallback. It automatically updates your shell PATH via a Conda-style marker block in .zshrc/.bashrc.
Install via pipx (recommended for CLI tools)
pipx install repolect
pipx inject repolect ollama # for Ollama support
pipx inject repolect falkordblite # for FalkorDB graph backend
Install from PyPI
pip install repolect[all]
Install from source
git clone https://github.com/Bibyutatsu/Repolect.git
cd Repolect
pip install -e ".[all]"
Index and query
cd your-project/
repolect analyze # Index the codebase
repolect ask "how does authentication work?"
Requires an LLM provider. Repolect defaults to Ollama (local, free, private). See Configuration for other providers.
CLI Reference
| Command | Description | Key Flags |
|---|---|---|
| repolect analyze | Full index: semantic tree + knowledge graph + agent skills | --force, --all-branches, --skills, --graph-backend, --parse-workers, --num-workers, --no-git, --quiet |
| repolect sync | Incremental re-index (changed files only) | --parse-workers, --num-workers, --quiet, --no-cache |
| repolect ask "query" | Natural-language Q&A with citations | --max-results, --quiet |
| repolect why <path> | Explain why a file or symbol exists | --repo |
| repolect tree | Print the semantic tree | --depth (default 3) |
| repolect graph "MATCH ..." | Run Cypher queries on the knowledge graph | --repo |
| repolect impact <symbol> | Blast radius analysis | --max-hops (default 3) |
| repolect diff | Map git changes to affected symbols | --ref (default HEAD~1), --with-impact |
| repolect communities | Show functional clusters (Louvain) | --repo |
| repolect list | List all indexed repositories | - |
| repolect mcp | Configure editors + start MCP server | --serve (skip menu, start server directly), --scope global\|project |
| repolect viz | Launch Streamlit graph explorer | --port (default 8501) |
MCP Server Integration
The Model Context Protocol (MCP) lets AI editors use Repolect as a live code intelligence backend.
Auto-configure with repolect mcp
Running repolect mcp opens an interactive setup flow:
- Displays the config snippet you can copy into any editor manually
- Detects installed editors (Cursor, Claude Code, Antigravity, Windsurf, VS Code)
- Asks which to configure: select by number or press a for all
- Writes/merges the correct JSON config into each editor automatically
$ repolect mcp
Repolect MCP Server
────────────────────────────────────────────────────────
Add this to your editor's MCP config file:
{
"mcpServers": {
"repolect": {
"command": "/usr/local/bin/repolect",
"args": ["mcp", "--serve"]
}
}
}
Binary resolved to: /usr/local/bin/repolect
────────────────────────────────────────────────────────
Detected editors: [1] Cursor, [2] Antigravity (Gemini)
Enter numbers to auto-configure (e.g. 1,3), 'a' for all, or Enter to skip:
> a
Cursor → ~/.cursor/mcp.json [✓ written]
Antigravity (Gemini) → ~/.gemini/mcp.json [✓ written]
Done! Restart your editor for changes to take effect.
Manual config (all editors use the same format)
{
"mcpServers": {
"repolect": {
"command": "repolect",
"args": ["mcp", "--serve"]
}
}
}
| Editor | Config file |
|---|---|
| Cursor (global) | ~/.cursor/mcp.json |
| Cursor (project) | .cursor/mcp.json |
| Claude Code (global) | ~/.claude.json (under the mcpServers key) |
| Claude Code (project) | .mcp.json |
| Antigravity / Gemini | ~/.gemini/mcp.json |
| Windsurf | ~/.codeium/windsurf/mcp_config.json |
| VS Code (Copilot) | ~/.vscode/mcp.json (under the servers key) |
--serve flag: Use args: ["mcp", "--serve"] in your mcp.json. This skips the interactive menu and starts the stdio server directly, which is what editors need.
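If you prefer to script the manual step, the sketch below merges the repolect entry into an existing config file without touching other servers. The merge_mcp_config helper is hypothetical, not something Repolect ships, and the demo writes to a temporary directory, not to a real editor config. The key parameter is there because some editors use a different top-level key (see the table above).

```python
import json
import tempfile
from pathlib import Path

# Entry taken from the manual config shown above.
REPOLECT_ENTRY = {"command": "repolect", "args": ["mcp", "--serve"]}


def merge_mcp_config(config_path: Path, key: str = "mcpServers") -> dict:
    """Add or update the repolect server entry, keeping other servers intact."""
    config = {}
    if config_path.exists() and config_path.read_text().strip():
        config = json.loads(config_path.read_text())
    config.setdefault(key, {})["repolect"] = dict(REPOLECT_ENTRY)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file, not a real editor config.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / ".cursor" / "mcp.json"
    path.parent.mkdir(parents=True)
    path.write_text(json.dumps({"mcpServers": {"other": {"command": "other-tool"}}}))
    merged = merge_mcp_config(path)
    on_disk = json.loads(path.read_text())
    print(sorted(merged["mcpServers"]))
```

Running the helper twice leaves the file unchanged after the first run, which makes it safe to call from a setup script.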
MCP Tools
14 tools exposed via MCP:
| Tool | What It Does |
|---|---|
| tree_search | Semantic search: answers "how does X work?" using LLM tree reasoning |
| get_node | 360-degree symbol view: source code, callers, callees, relations |
| explain_node | LLM-powered explanation of why a symbol exists in the codebase |
| trace_flow | Follow CALLS edges from an entry point to build an execution flow |
| graph_query | Run raw Cypher queries against the knowledge graph |
| impact_analysis | Blast radius: what breaks if you change a given symbol |
| diff_analysis | Map git diff to affected symbols + downstream blast radius |
| plan_change | Structured change plan: ADD / MODIFY / READ_ONLY / TEST_AFTER |
| find_similar | Find an existing implementation to use as a template |
| get_conventions | Extract coding conventions from a module's neighborhood |
| scope_test | Find the minimal test set for modified nodes (MUST / SHOULD tiers) |
| rename | Multi-file rename plan with graph + text search, confidence tagging |
| repo_summary | Top-level codebase overview with stats and module descriptions |
| list_repos | Discover all indexed repositories |
Resources & Prompts
| Resource | Description |
|---|---|
| repolect://tree | Full semantic tree as JSON |
| repolect://summary | Top-level codebase overview |

| Prompt | Description |
|---|---|
| code_search_guide | Guided workflow: summary → search → node → trace |
| explain_codebase | Generate a codebase explanation from the tree |
Agent Skills & Context
Repolect influences AI agent behavior through three layers:
Layer 1: MCP Tools (what the agent can do)
The 14 tools listed above: plan_change, tree_search, impact_analysis, etc.
Layer 2: Prescriptive Context File (what the agent should do)
repolect analyze generates REPOLECT.md at the repo root with:
- "Always Do" rules: call plan_change before changes, find_similar before creating, get_conventions before modifying, diff_analysis before committing, scope_test after changes
- "Never Do" rules: never skip impact analysis on widely-used symbols, never commit without diff_analysis
- Debugging and Refactoring workflows: step-by-step tool chains
- Community map: Louvain-detected functional areas with key symbols
- Marker-based upsert: re-indexing replaces only the Repolect section, preserving any user-written content
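The marker-based upsert in the last bullet can be sketched as follows. The marker strings and the upsert_section function are assumptions for illustration; the markers Repolect actually writes into REPOLECT.md may differ.

```python
import re

# Hypothetical marker strings; the real markers used by Repolect may differ.
BEGIN = "<!-- repolect:begin -->"
END = "<!-- repolect:end -->"


def upsert_section(document: str, generated: str) -> str:
    """Replace the marked block if present, otherwise append it."""
    block = f"{BEGIN}\n{generated.strip()}\n{END}"
    pattern = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)
    if pattern.search(document):
        # A function replacement keeps backslashes in `block` from being
        # interpreted as escape sequences by re.sub.
        return pattern.sub(lambda _: block, document, count=1)
    separator = "" if not document or document.endswith("\n") else "\n"
    return f"{document}{separator}{block}\n"


original = "# My notes\nHand-written guidance.\n"
first = upsert_section(original, "Community map v1")
second = upsert_section(first, "Community map v2")
print(second)
```

Everything outside the two markers is left untouched, so hand-written notes survive repeated re-indexing.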
Layer 3: Workflow Skills (what the agent does in specific situations)
Static skills (installed on every repolect analyze run):
| Skill | Trigger |
|---|---|
| repolect-exploring | Navigating unfamiliar code, "how does X work?" |
| repolect-planning | Before implementing any feature or change |
| repolect-debugging | Tracing bugs, investigating errors |
| repolect-refactoring | Renaming, extracting, restructuring |
| repolect-reviewing | Pre-commit safety checks, code review |
Generated community skills (repolect analyze --skills):
Per-community skill files describing each functional area of the codebase: key files, entry points, cross-community connections, associated tests, and LLM-synthesized descriptions of what each area does.
Skills are auto-installed into detected editors:
- Cursor: .cursor/rules/repolect-*.mdc
- Claude Code: .claude/skills/repolect/*.md
Configuration
Repolect reads from ~/.repolect/config.yaml:
# LLM Provider
provider: ollama # or "openai-compatible"
base_url: http://localhost:11434 # or your API endpoint
model_name: qwen3.5:4b # your preferred model
api_key: "" # empty for Ollama
# Embeddings (optional; enables hybrid vector+tree search)
embedding_provider: ollama
embedding_model: qwen3-embedding:0.6b
Using an OpenAI-compatible API
provider: openai-compatible
base_url: https://api.openai.com/v1
model_name: gpt-4o-mini
api_key: sk-...
embedding_provider: openai-compatible
embedding_model: text-embedding-3-small
embedding_api_key: sk-...
Environment variables override config: REPOLECT_PROVIDER, REPOLECT_BASE_URL, REPOLECT_MODEL, REPOLECT_API_KEY, REPOLECT_EMBEDDINGS (1/0).
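The precedence described here (defaults, then the config file, then environment variables) can be sketched as below. The resolve_config function, the mapping from variable names to config keys, and the "embeddings" key are assumptions for illustration; only the variable names and the sample values come from this README.

```python
import os
from typing import Dict, Mapping, Optional

# Defaults mirror the sample config.yaml above; "embeddings" is a
# hypothetical key used here to hold the REPOLECT_EMBEDDINGS switch.
DEFAULTS: Dict[str, object] = {
    "provider": "ollama",
    "base_url": "http://localhost:11434",
    "model_name": "qwen3.5:4b",
    "api_key": "",
    "embeddings": False,
}

# Assumed mapping from environment variable to config key.
ENV_TO_KEY = {
    "REPOLECT_PROVIDER": "provider",
    "REPOLECT_BASE_URL": "base_url",
    "REPOLECT_MODEL": "model_name",
    "REPOLECT_API_KEY": "api_key",
}


def resolve_config(file_values: Mapping[str, object],
                   env: Optional[Mapping[str, str]] = None) -> Dict[str, object]:
    """Layer defaults, then file values, then environment overrides."""
    env = os.environ if env is None else env
    config = {**DEFAULTS, **file_values}
    for var, key in ENV_TO_KEY.items():
        if var in env:
            config[key] = env[var]
    if "REPOLECT_EMBEDDINGS" in env:
        config["embeddings"] = env["REPOLECT_EMBEDDINGS"] == "1"
    return config


cfg = resolve_config(
    {"model_name": "gpt-4o-mini"},
    env={"REPOLECT_MODEL": "qwen2.5-coder", "REPOLECT_EMBEDDINGS": "1"},
)
print(cfg["model_name"], cfg["embeddings"])
```

Passing env explicitly, as in the demo, makes the override behavior easy to check without modifying your shell.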
Why Vectorless?
Vector similarity finds files that are similar to your query, not files that answer it.
"How does payment work?" doesn't semantically resemble stripe_adapter.py. LLM reasoning over a structured tree can still connect the two.
Repolect's tree search operates in O(log N) LLM calls: probe the root, pick the most relevant branch, descend until you reach the answer. Every node has a pre-computed summary, so the LLM reasons about meaning, not similarity.
Embeddings are optional; enable them for hybrid search when you want both approaches.
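The descent loop described above (probe the root, pick a branch, repeat) can be sketched as follows. This is a toy under stated assumptions: Node, tree_search, and keyword_overlap are hypothetical names, and keyword_overlap is a crude stand-in for the LLM's choice so that the example runs offline. In a real setup the pick step would be a model call that reads the child summaries.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    name: str
    summary: str
    children: List["Node"] = field(default_factory=list)


def keyword_overlap(query: str, summary: str) -> int:
    """Toy relevance score standing in for an LLM's judgment."""
    return len(set(query.lower().split()) & set(summary.lower().split()))


def tree_search(root: Node, query: str,
                pick: Callable[[str, str], int] = keyword_overlap
                ) -> Tuple[Node, int]:
    """Descend from the root, choosing one child per level."""
    node, steps = root, 0
    while node.children:
        node = max(node.children, key=lambda c: pick(query, c.summary))
        steps += 1
    return node, steps


root = Node("repo", "e-commerce backend", [
    Node("src/auth", "jwt authentication and token validation"),
    Node("src/payments", "stripe payment processing and refunds", [
        Node("stripe_adapter.py", "creates stripe payment charges"),
        Node("refunds.py", "issues refunds for orders"),
    ]),
])

leaf, steps = tree_search(root, "how does payment work")
print(leaf.name, steps)
```

The number of decisions equals the depth of the path taken, not the number of nodes, which is where the O(log N) figure comes from for a reasonably balanced tree.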
MCP Performance Analysis
Benchmarked across 8 complex real-world coding scenarios on Repolect's own codebase (807 nodes, 28 files).
Summary
| Metric | Without MCP Tools | With MCP Tools | Improvement |
|---|---|---|---|
| Input tokens | 330,363 | 10,964 | 97% reduction |
| Tool calls | 87 | 17 | 5.1x fewer |
| Round trips | 34 | 9 | 3.8x fewer |
| Tokens saved | - | - | 319,399 |
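The Improvement column follows directly from the two measured columns. The short check below recomputes it from the figures in the table; the variable names are just for this example.

```python
# Figures copied from the summary table above.
without_mcp = {"input_tokens": 330_363, "tool_calls": 87, "round_trips": 34}
with_mcp = {"input_tokens": 10_964, "tool_calls": 17, "round_trips": 9}

tokens_saved = without_mcp["input_tokens"] - with_mcp["input_tokens"]
reduction_pct = 100 * tokens_saved / without_mcp["input_tokens"]
call_ratio = without_mcp["tool_calls"] / with_mcp["tool_calls"]
trip_ratio = without_mcp["round_trips"] / with_mcp["round_trips"]

print(tokens_saved)            # 319399
print(round(reduction_pct))    # 97
print(round(call_ratio, 1))    # 5.1
print(round(trip_ratio, 1))    # 3.8
```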
Tool Tier Ranking
Tier 1 - Transformative (use on every task):
| Tool | Value |
|---|---|
| plan_change | Replaces 15+ calls with 1 structured roadmap |
| tree_search | Answers "how does X work?" without reading any file |
| trace_flow | 82-node call graph impossible to build manually |
| diff_analysis | Pre-commit safety net in 1 call vs 14+ |

Tier 2 - High Value (use frequently):
| Tool | Value |
|---|---|
| find_similar | Template + copy/replace/match advice |
| impact_analysis | Multi-hop blast radius with test tagging |
| rename | Graph + text confidence tagging |
| scope_test | Specific test names with MUST/SHOULD tiers |
| get_node | 360-degree symbol view replaces 4+ calls |

Tier 3 - Useful (for specific tasks):
| Tool | Value |
|---|---|
| get_conventions | 8 convention categories from neighboring code |
| graph_query | Structural questions impossible without a graph |
| explain_node | LLM-powered context for unfamiliar symbols |
| repo_summary | Quick orientation for first interaction |
In Practice
For a typical coding session with 5-10 tasks, Repolect MCP tools save approximately:
- 150,000-300,000 input tokens ($0.45-$0.90 per session at $0.003/1K tokens)
- 30-50 tool calls reduced to 8-15
- 15-25 round trips reduced to 5-8 (each round trip = 2-5 seconds of latency)
- 30-120 seconds of latency eliminated from fewer round trips
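The dollar range in the first bullet is plain arithmetic on the token range and the quoted rate. The helper below reproduces it; session_savings_usd is a name made up for this example, and you can pass your own provider's price in its place.

```python
PRICE_PER_1K_TOKENS = 0.003  # USD, the rate quoted in the list above


def session_savings_usd(tokens_saved: int,
                        price_per_1k: float = PRICE_PER_1K_TOKENS) -> float:
    """Convert a number of saved input tokens into dollars."""
    return tokens_saved / 1000 * price_per_1k


low = session_savings_usd(150_000)
high = session_savings_usd(300_000)
print(f"${low:.2f} to ${high:.2f}")  # $0.45 to $0.90
```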
Project Structure
repolect/
├── __init__.py # Package exports and version
├── cli.py # Click CLI commands (analyze, ask, sync, mcp, ...)
├── config.py # Config loading (~/.repolect/config.yaml)
├── embedder.py # Optional vector embeddings (Ollama, OpenAI)
├── git_utils.py # Git operations (branch, diff, hash, etc.)
├── graph_db.py # Knowledge graph (NetworkX + FalkorDB backends)
├── mcp_server.py # MCP server with 14 tools, 2 resources, 2 prompts
├── models.py # Core data models (CodeNode, Relation, TreeMeta)
├── parser.py # Hybrid parser (tree-sitter + regex enhancer)
├── search.py # Tree search, explanation, flow tracing
├── skill_installer.py # Agent skill installer (static + generated community skills)
├── skills/ # Static workflow skills (exploring, planning, debugging, ...)
├── storage.py # Persistence (tree.json, meta.json, REPOLECT.md)
├── summarizer.py # Bottom-up LLM summarization pipeline
└── tree_builder.py # Indexing orchestrator (scan → parse → link → graph)
License