Local RAG indexer and MCP server for AI coding agents (Claude, GPT, Gemini, Cursor, Factory Droid, and more).

repo-rag

Local RAG indexer and MCP server for AI coding agents. Run it once per repo, and every MCP-compatible agent on your machine searches your code through the same hybrid keyword + vector index instead of grepping blind. That covers Claude Code, Claude Desktop, Cursor, Windsurf, Codex CLI/Desktop, Gemini CLI, Factory Droid, MiniMax Agent, Antigravity, Aider, Cline, Continue.dev, Zed, and any future AGENTS.md-aware tool.

Quickstart

# Recommended: isolated global CLI via pipx
pipx install repo-rag

# Or inside an existing project venv
pip install repo-rag

cd /path/to/your/repo
rag init
rag rebuild

rag agents setup --all      # writes rules files + MCP configs for every detected agent
rag hooks install           # keep the index fresh on every commit / merge / checkout

That's it. Open Claude Code, Cursor, or any other supported agent and ask "where is auth configured" - the agent will call repo_rag_search first.

repo-rag pulls in lancedb, pyarrow, model2vec, fastembed, tree-sitter, and onnxruntime, so expect roughly 500 MB on disk for the dependency stack regardless of install method. pipx keeps that footprint in one isolated environment instead of every project venv.

What you get

  • One index, every agent. Indexed under ~/.repo-rag/<repo-id>/ and shared across every MCP client. No per-tool re-embedding.
  • Hybrid retrieval. SQLite FTS5 BM25 keyword search plus LanceDB vector search, merged with configurable weights.
  • AST-aware chunking by default. Code is split with tree-sitter definition boundaries when the language parser succeeds, with regex chunking as the fallback for unsupported files.
  • Local by default. The default model2vec backend runs code-specialized static embeddings on CPU (with fastembed as a transformer fallback for prose/docs); your code never leaves your machine and no API keys are needed.
  • Memory across sessions. repo_rag_remember lets agents persist architectural decisions, gotchas, and invariants that survive rag rebuild.
  • Background-mode git hooks. Re-indexing happens off the critical path with truncating per-run logs you can --follow.
  • Hardware-aware throttling. On Windows, BELOW_NORMAL_PRIORITY_CLASS plus non-P-core affinity keeps your laptop responsive while indexing.

Supported agents

| Agent | Rules | MCP auto-write | Docs |
|---|---|---|---|
| Factory Droid | ~/.factory/AGENTS.md, <repo>/AGENTS.md | yes | docs/clients/factory.md |
| Claude (CLI + Desktop) | ~/.claude/CLAUDE.md, <repo>/CLAUDE.md | yes (CLI + Desktop) | docs/clients/claude.md |
| Codex CLI/Desktop | ~/.codex/AGENTS.md, <repo>/AGENTS.md | yes (TOML) | docs/clients/codex.md |
| Cursor | ~/.cursor/rules/repo-rag.mdc, <repo>/.cursor/rules/repo-rag.mdc | yes | docs/clients/cursor.md |
| Windsurf | ~/.codeium/windsurf/global_rules.md, <repo>/.windsurfrules | yes | docs/clients/windsurf.md |
| Cline | <repo>/AGENTS.md | manual (VS Code settings) | docs/clients/cline.md |
| Continue.dev | ~/.continue/AGENTS.md, <repo>/AGENTS.md | yes | docs/clients/continue.md |
| Gemini CLI | ~/.gemini/GEMINI.md, <repo>/GEMINI.md | yes | docs/clients/gemini.md |
| Google Antigravity | ~/.antigravity/AGENTS.md, <repo>/AGENTS.md | yes | docs/clients/antigravity.md |
| Aider | ~/.aider/CONVENTIONS.md, <repo>/CONVENTIONS.md | reference YAML | docs/clients/aider.md |
| MiniMax Agent | <repo>/AGENTS.md | yes | docs/clients/minimax.md |
| Zed | <zed-config>/.rules, <repo>/.rules | yes (context_servers) | docs/clients/zed.md |
| OpenCode (CLI + Desktop) | ~/.config/opencode/CLAUDE.md, <repo>/CLAUDE.md | yes | docs/clients/opencode.md |
| Universal (AGENTS.md) | ~/.config/repo-rag/AGENTS.md, <repo>/AGENTS.md | n/a | docs/clients/universal.md |

Run rag agents list for a live table of what is detected on your machine.

MCP tools

The server exposed by rag mcp-server advertises five tools (full reference in docs/mcp-tools.md):

| Tool | Purpose |
|---|---|
| repo_rag_search | Primary hybrid search; use instead of Grep / ripgrep / Glob. |
| repo_rag_get_context | Markdown context pack for a multi-step task. |
| repo_rag_remember | Persist a durable note for future sessions. |
| repo_rag_forget | Remove a note by id. |
| repo_rag_status | Index health summary. |

Read-only tools are annotated readOnlyHint=true, idempotentHint=true, openWorldHint=false so MCP clients can auto-approve them in strict trust modes.
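
To illustrate how a client might act on these hints, here is a small sketch. The descriptors are shaped like entries in an MCP tools/list response, but which repo-rag tools are treated as read-only is an assumption made for the example, and `can_auto_approve` is a hypothetical policy rather than the behavior of any particular client.

```python
# Hint values taken from the sentence above.
READ_ONLY = {"readOnlyHint": True, "idempotentHint": True, "openWorldHint": False}

# Hypothetical descriptors; the read-only/write split below is an assumption.
tools = [
    {"name": "repo_rag_search", "annotations": dict(READ_ONLY)},
    {"name": "repo_rag_status", "annotations": dict(READ_ONLY)},
    {"name": "repo_rag_remember", "annotations": {}},  # persists a note, so not read-only
]


def can_auto_approve(tool: dict) -> bool:
    """Hypothetical strict-mode policy: approve only read-only, idempotent, closed-world tools."""
    hints = tool.get("annotations", {})
    return (
        hints.get("readOnlyHint") is True
        and hints.get("idempotentHint") is True
        and hints.get("openWorldHint") is False
    )


auto_approved = [tool["name"] for tool in tools if can_auto_approve(tool)]
print(auto_approved)
```

Missing hints are treated as "not safe" here, so a tool without annotations always falls back to asking the user.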

Performance highlights

  • Fast indexing with the default model2vec static model, which needs no transformer forward pass; the fastembed transformer fallback runs at ~3-10 chunks/sec.
  • Embedding cache keyed by (provider, model, dim, sha256(content)), so interrupted rebuilds resume cheaply and switching providers does not invalidate rows cached for other providers.
  • --window-size, --pace-sec, --sequential, --full-speed, and --threads cover every tuning knob from "go as fast as possible" to "do not interfere with anything I am doing".
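
The cache key described above can be sketched as follows. This is an in-memory illustration under stated assumptions, not the project's storage code: `EmbeddingCache`, `fake_embed`, and the model names `model-a` and `model-b` are hypothetical, and the real cache lives in metadata.sqlite.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

CacheKey = Tuple[str, str, int, str]


def cache_key(provider: str, model: str, dim: int, content: str) -> CacheKey:
    """Build the (provider, model, dim, sha256(content)) key."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return (provider, model, dim, digest)


class EmbeddingCache:
    """Hypothetical in-memory stand-in for the on-disk embedding cache."""

    def __init__(self) -> None:
        self._rows: Dict[CacheKey, List[float]] = {}
        self.misses = 0

    def get_or_embed(
        self,
        provider: str,
        model: str,
        dim: int,
        content: str,
        embed: Callable[[str], List[float]],
    ) -> List[float]:
        key = cache_key(provider, model, dim, content)
        if key not in self._rows:
            self.misses += 1  # only unseen (provider, model, dim, content) tuples are embedded
            self._rows[key] = embed(content)
        return self._rows[key]


def fake_embed(text: str) -> List[float]:
    """Placeholder for a real embedding call."""
    return [float(len(text)), 0.0]


cache = EmbeddingCache()
chunk = "def f(): pass"
cache.get_or_embed("model2vec", "model-a", 2, chunk, fake_embed)  # miss
cache.get_or_embed("model2vec", "model-a", 2, chunk, fake_embed)  # hit: same key
cache.get_or_embed("fastembed", "model-b", 2, chunk, fake_embed)  # miss: different provider
```

Because the provider and model are part of the key, switching back to the first provider finds its rows still in place.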

See docs/performance.md for the full guide.

Configuration

The global config lives at ~/.repo-rag/config.toml. Override per-repo at ~/.repo-rag/<repo-id>/config.toml. Every value can also be set with an environment variable (RAG_EMBEDDING_PROVIDER, REPO_RAG_INDEX_DIR, ...).
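
The layering described above could be resolved roughly like this. The precedence order (global, then per-repo, then environment) is an assumption, since the text does not say which source wins; the config key `embedding_provider` and the `ENV_TO_KEY` mapping are also hypothetical. Only `RAG_EMBEDDING_PROVIDER` and `exclude_globs` are names that appear in this README.

```python
import os
from typing import Any, Dict, Mapping

# Hypothetical mapping from environment variable to config key.
ENV_TO_KEY = {"RAG_EMBEDDING_PROVIDER": "embedding_provider"}


def resolve_config(
    global_cfg: Mapping[str, Any],
    repo_cfg: Mapping[str, Any],
    env: Mapping[str, str] = os.environ,
) -> Dict[str, Any]:
    """Layer settings in an assumed order: global < per-repo < environment."""
    resolved = dict(global_cfg)
    resolved.update(repo_cfg)  # per-repo values replace global ones
    for var, key in ENV_TO_KEY.items():
        if var in env:
            resolved[key] = env[var]  # environment wins last (assumption)
    return resolved


# In practice these dicts would be parsed from the two config.toml files.
global_cfg = {"embedding_provider": "model2vec", "exclude_globs": []}
repo_cfg = {"exclude_globs": ["vendor/**"]}
settings = resolve_config(
    global_cfg, repo_cfg, env={"RAG_EMBEDDING_PROVIDER": "fastembed"}
)
print(settings)
```

Running `rag config show --path /path/to/repo` is the reliable way to see what the tool itself resolved.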

Before a large rebuild, preview the exact inventory and likely cache/embed work:

rag preview
rag preview --all --sort embed
rag preview /path/to/repo --json

For repo-only exclusions, initialize the repo config and edit its exclude_globs:

rag config repo-init
rag config show --path /path/to/repo

Full reference: docs/configuration.md.

Storage layout

~/.repo-rag/
  registry.json
  config.toml
  <repo-id>/
    metadata.sqlite      # files, chunks, FTS5, notes, embedding cache
    lancedb/             # vector store
    cache/
    logs/

Nothing is written inside your repo apart from optional AGENTS.md, CLAUDE.md, etc. (which you can .gitignore or commit, your choice).

Docker

docker pull ghcr.io/ramanan-bala/repo-rag:latest

Then register the container as an MCP server in your client's config:

{
  "mcpServers": {
    "repo-rag": {
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "-v", "~/.repo-rag:/data/.repo-rag",
        "ghcr.io/ramanan-bala/repo-rag:latest"
      ]
    }
  }
}
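
Some MCP clients launch the command directly rather than through a shell, in which case the `~` in the volume argument may not be expanded. Whether that applies to your client is an assumption worth checking. If the mount fails, one option is to generate the same entry with an absolute path, as in this sketch:

```python
import json
from pathlib import Path

# Absolute host path to the index directory, used in place of "~/.repo-rag".
index_dir = Path.home() / ".repo-rag"

config = {
    "mcpServers": {
        "repo-rag": {
            "command": "docker",
            "args": [
                "run", "-i", "--rm",
                "-v", f"{index_dir}:/data/.repo-rag",
                "ghcr.io/ramanan-bala/repo-rag:latest",
            ],
        }
    }
}

# Paste the printed JSON into your MCP client's config file.
print(json.dumps(config, indent=2))
```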

Troubleshooting

Common gotchas (Windows model load hang, AV interaction on corporate machines, hybrid-CPU thread tuning, MCP server PATH issues, etc.) live in docs/troubleshooting.md.

Contributing

Pull requests welcome. See CONTRIBUTING.md for setup, test, lint, and release-process notes. To add a new agent plugin, follow docs/development.md.

License

MIT. See LICENSE.

Code of Conduct: CODE_OF_CONDUCT.md. Security policy: SECURITY.md.
