Reasoning-based code intelligence for any codebase

Repolect

Semantic code intelligence powered by LLM reasoning.

Index any codebase into a hierarchical semantic tree + knowledge graph. Ask questions, trace execution flows, plan changes, and analyze impact, all local-first with no vector database needed.

Python 3.10+ · License: MIT


Who is this for?

Repolect is built primarily for:

  • 🧠 Developers exploring new code: Quickly understand a project's architecture and logic without reading thousands of lines of code.
  • 🤖 AI Coding Agent Users: Supercharge agents (like Cursor, Claude Code) with precise structural context to improve edit performance and significantly reduce hallucinations.
  • 📊 Local-First Enthusiasts: Index, query, and visualize your codebase's dependencies entirely locally.
  • ⚡ SLM Power Users: Maximize the potential of locally hosted Small Language Models (via Ollama) to autonomously analyze, edit, and update your codebases.

Features

  • 🌳 Hierarchical Semantic Tree: Every node (module, file, class, function) gets a bottom-up LLM-generated summary. The abstract meaning of your codebase is indexed, not just the raw text.
  • 🎯 Vectorless Search: Navigate the semantic tree using LLM reasoning (in O(log N) steps). Finds actual answers and saves tokens compared to blind similarity searches.
  • 🕸️ Knowledge Graph: Maps CALLS, IMPORTS, EXTENDS, and IMPLEMENTS relations across your codebase. Useful for tracing execution paths or finding the "blast radius" of a change.
  • 🔌 Full MCP Integration: Exposes 14 tools to AI editors (Cursor, Claude Code, etc.) out of the box, reducing token usage and round trips.
  • 🛡️ Prescriptive Agent Context: Generates "Agent Skills" based on functional groups (Louvain communities) in your code to inject targeted context when and where it's needed.
  • 🔒 Local-First & SLM Optimized: Engineered to run well on efficient local models like qwen3.5 or qwen2.5-coder via Ollama. No data leaves your machine unless you want it to.

How It Works

Repolect builds a hierarchical tree of your codebase where every node (module, file, class, function) gets an LLM-generated summary. Queries navigate this tree using LLM reasoning, finding relevant code in O(log N) steps without any vector similarity search.

RepoNode: "E-commerce backend in Python/FastAPI..."
├── ModuleNode src/auth: "JWT-based authentication layer..."
│   ├── FileNode jwt.py: "Token generation and validation..."
│   │   ├── ClassNode JWTService: "Manages token lifecycle..."
│   │   └── FunctionNode verify_token: "Validates Bearer tokens..."
│   └── DocNode README.md: "Auth module documentation..."
└── ModuleNode src/payments: "Stripe payment processing..."
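
A minimal sketch of how bottom-up summaries like the ones in this tree could be produced. The `Node` class, `summarize_bottom_up`, and `fake_llm` are hypothetical names for illustration, not Repolect's actual data model, and the LLM call is replaced by a deterministic stand-in so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One node of a hypothetical semantic tree (module, file, class, function)."""
    kind: str
    name: str
    source: str = ""  # raw text, only used for leaves
    children: List["Node"] = field(default_factory=list)
    summary: str = ""


def summarize_bottom_up(node: Node, summarize: Callable[[str], str]) -> str:
    """Post-order traversal: children are summarized before their parent."""
    child_summaries = [summarize_bottom_up(c, summarize) for c in node.children]
    # A leaf is summarized from its source, an inner node from its children's summaries.
    material = "\n".join(child_summaries) if child_summaries else node.source
    node.summary = summarize(f"{node.kind} {node.name}:\n{material}")
    return node.summary


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call: keeps the header and counts the input lines."""
    header, _, body = prompt.partition("\n")
    return f"{header} <{len(body.splitlines())} input line(s)>"


tree = Node("ModuleNode", "src/auth", children=[
    Node("FileNode", "jwt.py", children=[
        Node("ClassNode", "JWTService", source="class JWTService: ..."),
        Node("FunctionNode", "verify_token", source="def verify_token(token): ..."),
    ]),
    Node("DocNode", "README.md", source="# Auth module"),
])

summarize_bottom_up(tree, fake_llm)
print(tree.summary)
```

The point of the ordering is that a parent's prompt only ever contains its children's summaries, never their full source, which keeps each call small regardless of repository size.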

A knowledge graph runs alongside the tree, storing structural relations (CALLS, IMPORTS, EXTENDS, IMPLEMENTS) that power dependency analysis, impact tracing, and execution flow tracking.
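
To make the idea of impact tracing concrete, here is a small sketch of a blast-radius walk over CALLS edges. It assumes the graph is a plain list of (source, relation, target) triples and uses invented symbol names; Repolect's own storage and traversal may differ.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str, str]  # (source, relation, target)

# Made-up symbols for illustration only.
EDGES = [
    ("login_route", "CALLS", "verify_token"),
    ("refresh_route", "CALLS", "verify_token"),
    ("verify_token", "CALLS", "decode_jwt"),
    ("admin_panel", "CALLS", "login_route"),
    ("jwt.py", "IMPORTS", "settings.py"),
]


def blast_radius(symbol: str, edges: Iterable[Edge], max_hops: int = 3) -> Dict[str, int]:
    """Return {caller: hop distance} for every caller reachable within max_hops."""
    # Reverse index: for each callee, who calls it?
    callers: Dict[str, Set[str]] = {}
    for src, rel, dst in edges:
        if rel == "CALLS":
            callers.setdefault(dst, set()).add(src)

    seen: Dict[str, int] = {}
    queue = deque([(symbol, 0)])
    while queue:
        current, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for caller in sorted(callers.get(current, ())):
            if caller != symbol and caller not in seen:
                seen[caller] = hops + 1
                queue.append((caller, hops + 1))
    return seen


print(blast_radius("decode_jwt", EDGES, max_hops=2))
```

Walking the edges in reverse (callee to caller) answers "what might break if I change this?", while walking them forward from an entry point gives an execution flow.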

Architecture

flowchart LR
    subgraph indexing [Indexing Pipeline]
        Scan[Scan Repo] --> Parse[Parse Files]
        Parse --> Summarize[LLM Summarize]
        Summarize --> Graph[Build Graph]
    end
 
    subgraph storage [Dual Storage]
        Tree["tree.json\n(semantic tree)"]
        GraphDB["graph.pkl / graph.db\n(knowledge graph)"]
    end
 
    subgraph query [Query Layer]
        CLI[CLI Commands]
        MCP[MCP Server]
    end
 
    Graph --> Tree
    Graph --> GraphDB
    Tree --> CLI
    Tree --> MCP
    GraphDB --> CLI
    GraphDB --> MCP

Quick Start

Recommended: One-liner Installer

The interactive installer sets up Ollama, configures your LLM provider, and makes repolect available system-wide:

curl -fsSL https://raw.githubusercontent.com/Bibyutatsu/Repolect/main/install.sh | bash

The installer uses pipx (isolated environment, no dependency conflicts) with a pip --user fallback. It automatically updates your shell PATH via a Conda-style marker block in .zshrc/.bashrc.

Install via pipx (recommended for CLI tools)

pipx install repolect
pipx inject repolect ollama          # for Ollama support
pipx inject repolect falkordblite    # for FalkorDB graph backend

Install from PyPI

pip install "repolect[all]"

Install from source

git clone https://github.com/Bibyutatsu/Repolect.git
cd Repolect
pip install -e ".[all]"

Index and query

cd your-project/
repolect analyze          # Index the codebase
repolect ask "how does authentication work?"

Requires an LLM provider. Repolect defaults to Ollama (local, free, private). See Configuration for other providers.


CLI Reference

| Command | Description | Key Flags |
| --- | --- | --- |
| repolect analyze | Full index: semantic tree + knowledge graph + agent skills | --force, --all-branches, --skills, --graph-backend, --parse-workers, --num-workers, --no-git, --quiet |
| repolect sync | Incremental re-index (changed files only) | --parse-workers, --num-workers, --quiet, --no-cache |
| repolect ask "query" | Natural-language Q&A with citations | --max-results, --quiet |
| repolect why <path> | Explain why a file or symbol exists | --repo |
| repolect tree | Print the semantic tree | --depth (default 3) |
| repolect graph "MATCH ..." | Run Cypher queries on the knowledge graph | --repo |
| repolect impact <symbol> | Blast radius analysis | --max-hops (default 3) |
| repolect diff | Map git changes to affected symbols | --ref (default HEAD~1), --with-impact |
| repolect communities | Show functional clusters (Louvain) | --repo |
| repolect list | List all indexed repositories | (none) |
| repolect mcp | Configure editors + start MCP server | --serve (skip menu, start server directly), --scope (global or project) |
| repolect viz | Launch Streamlit graph explorer | --port (default 8501) |

MCP Server Integration

The Model Context Protocol (MCP) lets AI editors use Repolect as a live code intelligence backend.

Auto-configure with repolect mcp

Running repolect mcp opens an interactive setup flow:

  1. Displays the config snippet you can copy into any editor manually
  2. Detects installed editors (Cursor, Claude Code, Antigravity, Windsurf, VS Code)
  3. Asks which to configure: select by number or press a for all
  4. Writes/merges the correct JSON config into each editor automatically

$ repolect mcp

  🔌 Repolect MCP Server
  ────────────────────────────────────────────────────────

  Add this to your editor's MCP config file:

    {
      "mcpServers": {
        "repolect": {
          "command": "/usr/local/bin/repolect",
          "args": ["mcp", "--serve"]
        }
      }
    }

  Binary resolved to: /usr/local/bin/repolect

  ────────────────────────────────────────────────────────
  Detected editors:  [1] Cursor  ,  [2] Antigravity (Gemini)

  Enter numbers to auto-configure (e.g. 1,3), 'a' for all, or Enter to skip:
  → a

  Cursor  →  ~/.cursor/mcp.json  [✓ written]
  Antigravity (Gemini)  →  ~/.gemini/mcp.json  [✓ written]

  ✅ Done! Restart your editor for changes to take effect.

Manual config (all editors use the same format)

{
  "mcpServers": {
    "repolect": {
      "command": "repolect",
      "args": ["mcp", "--serve"]
    }
  }
}

| Editor | Config file |
| --- | --- |
| Cursor (global) | ~/.cursor/mcp.json |
| Cursor (project) | .cursor/mcp.json |
| Claude Code (global) | ~/.claude.json → mcpServers |
| Claude Code (project) | .mcp.json |
| Antigravity / Gemini | ~/.gemini/mcp.json |
| Windsurf | ~/.codeium/windsurf/mcp_config.json |
| VS Code (Copilot) | ~/.vscode/mcp.json → servers |

--serve flag: Use args: ["mcp", "--serve"] in your mcp.json. This skips the interactive menu and starts the stdio server directly, which is what editors need.
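
If you prefer to script the manual step, the following sketch shows one way to merge the entry into an existing JSON config without disturbing other servers. `merge_mcp_entry` is a hypothetical helper, not part of Repolect, and it only handles files that use the top-level "mcpServers" key. The demo writes to a temporary directory rather than a real editor config.

```python
import json
import tempfile
from pathlib import Path


def merge_mcp_entry(config_path: Path, command: str = "repolect") -> dict:
    """Add or update the "repolect" server entry, keeping other entries intact."""
    config = {}
    if config_path.exists() and config_path.read_text().strip():
        config = json.loads(config_path.read_text())

    servers = config.setdefault("mcpServers", {})
    servers["repolect"] = {"command": command, "args": ["mcp", "--serve"]}

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file so no real editor config is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / ".cursor" / "mcp.json"
    path.parent.mkdir(parents=True)
    path.write_text(json.dumps({"mcpServers": {"other-server": {"command": "other"}}}))

    result = merge_mcp_entry(path)
    print(sorted(result["mcpServers"]))
```

Reading the file first and updating a single key is what makes the operation a merge rather than an overwrite.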

MCP Tools

14 tools exposed via MCP:

| Tool | What It Does |
| --- | --- |
| tree_search | Semantic search: answers "how does X work?" using LLM tree reasoning |
| get_node | 360-degree symbol view: source code, callers, callees, relations |
| explain_node | LLM-powered explanation of why a symbol exists in the codebase |
| trace_flow | Follow CALLS edges from an entry point to build an execution flow |
| graph_query | Run raw Cypher queries against the knowledge graph |
| impact_analysis | Blast radius: what breaks if you change a given symbol |
| diff_analysis | Map git diff to affected symbols + downstream blast radius |
| plan_change | Structured change plan: ADD / MODIFY / READ_ONLY / TEST_AFTER |
| find_similar | Find an existing implementation to use as a template |
| get_conventions | Extract coding conventions from a module's neighborhood |
| scope_test | Find the minimal test set for modified nodes (MUST / SHOULD tiers) |
| rename | Multi-file rename plan with graph + text search, confidence tagging |
| repo_summary | Top-level codebase overview with stats and module descriptions |
| list_repos | Discover all indexed repositories |
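
For orientation, MCP clients invoke tools with a JSON-RPC 2.0 request whose method is "tools/call". The sketch below builds such a message for tree_search. The argument name "query" is an assumption made for illustration; the real parameter names come from the tool schema the server advertises, and an editor normally constructs and sends these messages for you.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def tool_call_request(tool: str, arguments: dict) -> str:
    """Serialize an MCP "tools/call" request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# "query" is a hypothetical argument name, not taken from Repolect's schema.
request = tool_call_request("tree_search", {"query": "how does authentication work?"})
print(request)
```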

Resources & Prompts

| Resource | Description |
| --- | --- |
| repolect://tree | Full semantic tree as JSON |
| repolect://summary | Top-level codebase overview |

| Prompt | Description |
| --- | --- |
| code_search_guide | Guided workflow: summary → search → node → trace |
| explain_codebase | Generate a codebase explanation from the tree |
Agent Skills & Context

Repolect influences AI agent behavior through three layers:

Layer 1: MCP Tools (what the agent can do)

The 14 tools listed above: plan_change, tree_search, impact_analysis, etc.

Layer 2: Prescriptive Context File (what the agent should do)

repolect analyze generates REPOLECT.md at the repo root with:

  • "Always Do" rules: call plan_change before changes, find_similar before creating, get_conventions before modifying, diff_analysis before committing, scope_test after changes
  • "Never Do" rules: never skip impact analysis on widely-used symbols, never commit without diff_analysis
  • Debugging and Refactoring workflows: step-by-step tool chains
  • Community map: Louvain-detected functional areas with key symbols
  • Marker-based upsert: re-indexing replaces only the Repolect section, preserving any user-written content
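
The marker-based upsert in the last bullet can be sketched as follows. The marker strings and the `upsert_section` helper are hypothetical; the actual markers Repolect writes into REPOLECT.md are not specified here.

```python
# Hypothetical markers; the real ones used in REPOLECT.md may differ.
BEGIN = "<!-- repolect:begin -->"
END = "<!-- repolect:end -->"


def upsert_section(document: str, generated: str) -> str:
    """Replace the text between the markers, or append a new marked section."""
    block = f"{BEGIN}\n{generated.strip()}\n{END}"
    start = document.find(BEGIN)
    end = document.find(END, start + len(BEGIN)) if start != -1 else -1

    if start == -1 or end == -1:
        # No managed section yet: append one and leave existing text untouched.
        prefix = document.rstrip("\n")
        return (prefix + "\n\n" if prefix else "") + block + "\n"

    # Managed section exists: swap only the span between the markers.
    return document[:start] + block + document[end + len(END):]


notes = "# My notes\nHand-written guidance stays here.\n"
first = upsert_section(notes, "Community map v1")
second = upsert_section(first + "More hand-written text.\n", "Community map v2")
print(second)
```

Because only the span between the two markers is rewritten, anything a user adds before or after it survives a re-index.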

Layer 3: Workflow Skills (what the agent does in specific situations)

Static skills (installed on every repolect analyze run):

Skill Trigger
repolect-exploring Navigating unfamiliar code, "how does X work?"
repolect-planning Before implementing any feature or change
repolect-debugging Tracing bugs, investigating errors
repolect-refactoring Renaming, extracting, restructuring
repolect-reviewing Pre-commit safety checks, code review

Generated community skills (repolect analyze --skills):

Per-community skill files describing each functional area of the codebase: key files, entry points, cross-community connections, associated tests, and LLM-synthesized descriptions of what each area does.

Skills are auto-installed into detected editors:

  • Cursor: .cursor/rules/repolect-*.mdc
  • Claude Code: .claude/skills/repolect/*.md

Configuration

Repolect reads from ~/.repolect/config.yaml:

# LLM Provider
provider: ollama                    # or "openai-compatible"
base_url: http://localhost:11434    # or your API endpoint
model_name: qwen3.5:4b              # your preferred model
api_key: ""                         # empty for Ollama

# Embeddings (optional; enables hybrid vector+tree search)
embedding_provider: ollama
embedding_model: qwen3-embedding:0.6b

Using an OpenAI-compatible API

provider: openai-compatible
base_url: https://api.openai.com/v1
model_name: gpt-4o-mini
api_key: sk-...

embedding_provider: openai-compatible
embedding_model: text-embedding-3-small
embedding_api_key: sk-...

Environment variables override config: REPOLECT_PROVIDER, REPOLECT_BASE_URL, REPOLECT_MODEL, REPOLECT_API_KEY, REPOLECT_EMBEDDINGS (1/0).
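
The override order can be illustrated with a short sketch. Only the environment variable names come from the line above; the mapping of each variable to a config key (for example REPOLECT_MODEL to model_name) and the `embeddings_enabled` key are assumptions, and `effective_config` is a hypothetical helper rather than Repolect's loader.

```python
import os
from typing import Mapping

# Env var -> config key. The key names on the right are assumed from the
# config.yaml example; the exact mapping is not documented here.
ENV_TO_KEY = {
    "REPOLECT_PROVIDER": "provider",
    "REPOLECT_BASE_URL": "base_url",
    "REPOLECT_MODEL": "model_name",
    "REPOLECT_API_KEY": "api_key",
}


def effective_config(file_config: Mapping[str, object],
                     env: Mapping[str, str] = os.environ) -> dict:
    """Start from the file values, then let any set environment variable win."""
    merged = dict(file_config)
    for var, key in ENV_TO_KEY.items():
        if var in env:
            merged[key] = env[var]
    # REPOLECT_EMBEDDINGS is documented as a 1/0 switch.
    if "REPOLECT_EMBEDDINGS" in env:
        merged["embeddings_enabled"] = env["REPOLECT_EMBEDDINGS"] == "1"
    return merged


file_values = {"provider": "ollama", "model_name": "qwen3.5:4b", "api_key": ""}
overrides = {"REPOLECT_MODEL": "qwen2.5-coder", "REPOLECT_EMBEDDINGS": "0"}
print(effective_config(file_values, overrides))
```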


Why Vectorless?

Vector similarity finds files that are similar to your query, not necessarily files that answer it.

"How does payment work?" doesn't semantically resemble stripe_adapter.py, so a similarity search can miss it. LLM reasoning over a structured tree of summaries can make that connection.

Repolect's tree search operates in O(log N) LLM calls: probe the root, pick the most relevant branch, and descend until you reach the answer. Every node has a pre-computed summary, so the LLM reasons about meaning, not similarity.

Embeddings are optional. Enable them for hybrid search when you want both approaches.
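
The descent described above can be sketched as a loop that makes one choice per tree level. Everything here is illustrative: `Node`, `descend`, and `keyword_picker` are hypothetical names, and the picker is a trivial word-overlap stand-in for the LLM so the example runs offline. The number of choices equals the depth of the path taken, which is logarithmic in the node count only when the tree is reasonably balanced.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    name: str
    summary: str
    children: List["Node"] = field(default_factory=list)


Picker = Callable[[str, List[Node]], Node]


def descend(root: Node, query: str, pick_child: Picker) -> Tuple[Node, int]:
    """Walk from the root to a leaf, asking the picker once per level."""
    node, calls = root, 0
    while node.children:
        node = pick_child(query, node.children)
        calls += 1
    return node, calls


def keyword_picker(query: str, children: List[Node]) -> Node:
    """Stand-in for the LLM: choose the child whose summary shares the most words."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    return max(children, key=lambda c: len(words & set(re.findall(r"[a-z]+", c.summary.lower()))))


root = Node("repo", "E-commerce backend", children=[
    Node("src/auth", "JWT authentication layer", children=[
        Node("jwt.py", "Token generation and validation"),
    ]),
    Node("src/payments", "Stripe payment processing", children=[
        Node("stripe_adapter.py", "Creates payment charges via Stripe"),
        Node("refunds.py", "Issues refunds"),
    ]),
])

leaf, calls = descend(root, "How does payment work?", keyword_picker)
print(leaf.name, calls)
```

With a real LLM as the picker, the choice at each level is made on the meaning of the summaries rather than on shared words, which is the distinction this section draws.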


MCP Performance Analysis

Benchmarked across 8 complex real-world coding scenarios on Repolect's own codebase (807 nodes, 28 files).

Summary

| Metric | Without MCP Tools | With MCP Tools | Improvement |
| --- | --- | --- | --- |
| Input tokens | 330,363 | 10,964 | 97% reduction |
| Tool calls | 87 | 17 | 5.1x fewer |
| Round trips | 34 | 9 | 3.8x fewer |
| Tokens saved | - | - | 319,399 |
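
The Improvement column follows directly from the two measured columns. A short check of the arithmetic, using only the figures from the table (the variable names are illustrative):

```python
without = {"input_tokens": 330_363, "tool_calls": 87, "round_trips": 34}
with_mcp = {"input_tokens": 10_964, "tool_calls": 17, "round_trips": 9}

# Absolute and relative token savings.
tokens_saved = without["input_tokens"] - with_mcp["input_tokens"]
reduction_pct = 100 * tokens_saved / without["input_tokens"]

# "Nx fewer" is the ratio of the baseline to the MCP-assisted count.
call_ratio = without["tool_calls"] / with_mcp["tool_calls"]
trip_ratio = without["round_trips"] / with_mcp["round_trips"]

print(tokens_saved)             # 319399
print(round(reduction_pct, 1))  # 96.7, reported as 97% in the table
print(round(call_ratio, 1))     # 5.1
print(round(trip_ratio, 1))     # 3.8
```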

Tool Tier Ranking

Tier 1 - Transformative (use on every task):

| Tool | Value |
| --- | --- |
| plan_change | Replaces 15+ calls with 1 structured roadmap |
| tree_search | Answers "how does X work?" without reading any file |
| trace_flow | 82-node call graph impossible to build manually |
| diff_analysis | Pre-commit safety net in 1 call vs 14+ |

Tier 2 - High Value (use frequently):

| Tool | Value |
| --- | --- |
| find_similar | Template + copy/replace/match advice |
| impact_analysis | Multi-hop blast radius with test tagging |
| rename | Graph + text confidence tagging |
| scope_test | Specific test names with MUST/SHOULD tiers |
| get_node | 360-degree symbol view replaces 4+ calls |

Tier 3 - Useful (for specific tasks):

| Tool | Value |
| --- | --- |
| get_conventions | 8 convention categories from neighboring code |
| graph_query | Structural questions impossible without a graph |
| explain_node | LLM-powered context for unfamiliar symbols |
| repo_summary | Quick orientation for first interaction |

In Practice

For a typical coding session with 5–10 tasks, Repolect MCP tools save approximately:

  • ~150,000–300,000 input tokens (~$0.45–$0.90 per session at $0.003/1K tokens)
  • 30–50 tool calls reduced to 8–15
  • 15–25 round trips reduced to 5–8 (each round trip = 2–5 seconds of latency)
  • 30–120 seconds of latency eliminated from fewer round trips
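
The dollar range in the first bullet is a straight multiplication of tokens saved by the quoted rate. A small sketch for plugging in your own provider's price (the function name is illustrative):

```python
PRICE_PER_1K_TOKENS = 0.003  # USD, the rate quoted in the list above


def session_savings_usd(tokens_saved: int, price_per_1k: float = PRICE_PER_1K_TOKENS) -> float:
    """Convert a number of saved input tokens into dollars, rounded to cents."""
    return round(tokens_saved / 1000 * price_per_1k, 2)


low = session_savings_usd(150_000)
high = session_savings_usd(300_000)
print(f"${low:.2f} to ${high:.2f} per session")
```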

Project Structure

repolect/
├── __init__.py          # Package exports and version
├── cli.py               # Click CLI commands (analyze, ask, sync, mcp, ...)
├── config.py            # Config loading (~/.repolect/config.yaml)
├── embedder.py          # Optional vector embeddings (Ollama, OpenAI)
├── git_utils.py         # Git operations (branch, diff, hash, etc.)
├── graph_db.py          # Knowledge graph (NetworkX + FalkorDB backends)
├── mcp_server.py        # MCP server with 14 tools, 2 resources, 2 prompts
├── models.py            # Core data models (CodeNode, Relation, TreeMeta)
├── parser.py            # Hybrid parser (tree-sitter + regex enhancer)
├── search.py            # Tree search, explanation, flow tracing
├── skill_installer.py   # Agent skill installer (static + generated community skills)
├── skills/              # Static workflow skills (exploring, planning, debugging, ...)
├── storage.py           # Persistence (tree.json, meta.json, REPOLECT.md)
├── summarizer.py        # Bottom-up LLM summarization pipeline
└── tree_builder.py      # Indexing orchestrator (scan → parse → link → graph)

License

MIT
