
Local-first codebase-aware AI: index any repo and ask questions about it.


RepoScope

A local-first tool for querying codebases in plain English. Point it at any repository, build an index, and ask questions — no cloud sync, no background services, no mandatory API keys.


How it works

Without --embed (default):

repo → file discovery → chunking → JSON index → BM25 retrieval → LLM answer

With --embed (semantic search):

repo → file discovery → chunking → JSON index + .npy embeddings → BM25 + vector → RRF merge → LLM answer

RepoScope walks your project and breaks files into chunks. Python, JS/TS, and C# files are split at function, class, or method boundaries (using regex-based detection); everything else is split into fixed line windows. Chunks are stored in a local JSON index and ranked with BM25 scoring at query time. If you run index --embed, it also generates sentence embeddings and merges the BM25 and vector rankings with Reciprocal Rank Fusion for better semantic matches.
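The Reciprocal Rank Fusion step can be illustrated with a minimal sketch. This is not RepoScope's own code: the function name `rrf_merge`, the chunk IDs, and the constant `k=60` (a conventional default for RRF) are assumptions made for illustration.

```python
from collections import defaultdict


def rrf_merge(rankings, k=60):
    """Fuse several ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each ID scores sum(1 / (k + rank)) over the lists that contain it,
    with ranks starting at 1. k=60 is an assumed, conventional default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical chunk IDs, ranked by two independent retrievers.
bm25_ranking = ["auth.py:login", "db.py:connect", "routes.py:index"]
vector_ranking = ["session.py:refresh", "auth.py:login", "db.py:connect"]

merged = rrf_merge([bm25_ranking, vector_ranking])
print(merged)
```

A chunk that appears in both lists accumulates two reciprocal-rank terms, so it tends to outrank a chunk that only one retriever placed first.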

If an LLM key is configured, ask feeds the top-ranked chunks to the model and returns a cited answer. If not, it falls back to showing the top matches with previews.


Installation

RepoScope is not yet on PyPI. Install from source:

git clone https://github.com/nirakar24/RepoScope.git
cd RepoScope
pip install -e .

To add optional features:

pip install -e ".[embed]"   # semantic search (downloads ~80 MB model on first use)
pip install -e ".[claude]"  # Anthropic Claude for LLM answers
pip install -e ".[gemini]"  # Google Gemini for LLM answers
pip install -e ".[openai]"  # OpenAI for LLM answers and embeddings
pip install -e ".[api]"     # FastAPI server
pip install -e ".[all]"     # everything

Setting up an LLM API key

index and search work with no API key. ask needs one to generate answers.

Interactive setup (recommended)

repointel configure

The command prompts for your provider and key (input is hidden), then saves the key to ~/.config/repointel/.env. From then on, the key is picked up automatically in every session and directory.

Which provider do you want to use for 'ask' answers?
  1) Anthropic Claude
  2) Google Gemini
  3) OpenAI

Enter 1, 2, or 3: 2

Google Gemini API key (input hidden):
Saved GEMINI_API_KEY to ~/.config/repointel/.env

Environment variable

Add to your shell profile (~/.bashrc, ~/.zshrc, etc.):

export ANTHROPIC_API_KEY="sk-ant-..."
export GEMINI_API_KEY="AIza..."
export OPENAI_API_KEY="sk-..."

.env file

Create a .env in your project directory or any parent. RepoScope walks up from the current directory automatically:

GEMINI_API_KEY=AIza...
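The walk-up lookup described above could work roughly like the sketch below. The helper names `find_dotenv` and `parse_dotenv` are hypothetical, and the parsing rules (simple KEY=VALUE lines, optional quotes, `#` comments) are assumptions; RepoScope's real loader may behave differently.

```python
from pathlib import Path


def find_dotenv(start):
    """Return the nearest .env at or above `start`, or None if there is none."""
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        candidate = directory / ".env"
        if candidate.is_file():
            return candidate
    return None


def parse_dotenv(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes, if any (assumed convention).
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


env_path = find_dotenv(Path.cwd())
settings = parse_dotenv(env_path.read_text()) if env_path else {}
```

Because the search starts at the current directory and moves toward the filesystem root, a .env placed closer to where you run the command takes precedence in this sketch.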

Provider priority and model overrides

If multiple keys are present, RepoScope uses the first provider that has a key, in this order: Claude → Gemini → OpenAI.

Override the default model with environment variables:

REPOSCOPE_CLAUDE_MODEL=claude-sonnet-4-6
REPOSCOPE_GEMINI_MODEL=gemini-2.5-flash
REPOSCOPE_OPENAI_MODEL=gpt-4.1-mini
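As an illustration of the priority rule and the model overrides, here is a minimal sketch. The variable names come from this README, but the function `pick_provider` and its return shape are hypothetical, and returning `None` for the model when no override is set is an assumption.

```python
import os

# (provider, API key variable, model override variable), in priority order.
PROVIDERS = [
    ("claude", "ANTHROPIC_API_KEY", "REPOSCOPE_CLAUDE_MODEL"),
    ("gemini", "GEMINI_API_KEY", "REPOSCOPE_GEMINI_MODEL"),
    ("openai", "OPENAI_API_KEY", "REPOSCOPE_OPENAI_MODEL"),
]


def pick_provider(env=None):
    """Return (provider, model_override) for the first provider with a key.

    model_override is None when the override variable is unset or empty.
    Returns None when no key is configured at all.
    """
    env = os.environ if env is None else env
    for name, key_var, model_var in PROVIDERS:
        if env.get(key_var):
            return name, env.get(model_var) or None
    return None


# Both Gemini and OpenAI keys are set; Gemini wins because it is listed first.
choice = pick_provider({
    "GEMINI_API_KEY": "AIza...",
    "OPENAI_API_KEY": "sk-...",
    "REPOSCOPE_GEMINI_MODEL": "gemini-2.5-flash",
})
print(choice)
```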

Where to get keys

Provider          Free tier  Key page
Google Gemini     Yes        https://aistudio.google.com/apikey
Anthropic Claude  No         https://console.anthropic.com
OpenAI            No         https://platform.openai.com/api-keys

Quick start

# Index your project
repointel index /path/to/project

# Search (instant, no LLM)
repointel search "where is authentication handled"

# Ask a question (requires an LLM key)
repointel ask "how does the database schema relate to the API routes?"

Commands

configure

Interactive first-time setup. Saves your LLM API key to ~/.config/repointel/.env.

repointel configure

index

Walks a directory, chunks its files, and writes a JSON index.

repointel index /path/to/project

Add --embed to generate sentence embeddings alongside the index. Once present, search and ask automatically switch to hybrid retrieval — no extra flag needed at query time.

repointel index /path/to/project --embed

Use --index-file (before the subcommand) to control where the index is written. Useful for keeping separate indexes per project:

repointel --index-file .reposcope/backend.json index ./backend --embed

The default path is .reposcope/index.json in the current directory.


search

Retrieves the most relevant chunks for a query. Instant — no network call.

Uses BM25 by default. Automatically switches to hybrid BM25 + vector search if embeddings exist for the current index.

repointel search "JWT token validation"
repointel search "database migration" --top-k 5
repointel search "controller routes" --json

Flag     Default  Description
--top-k  8        Number of results to return
--json   off      Emit results as a JSON array
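For readers unfamiliar with BM25, the sketch below shows one common formulation (Okapi BM25) over a handful of in-memory chunks. It is an illustration only: the tokenizer, the parameters `k1=1.5` and `b=0.75`, and the function name `bm25_rank` are assumptions, not RepoScope's actual implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word-like tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def bm25_rank(query, chunks, k1=1.5, b=0.75, top_k=8):
    """Return indices of the best-matching chunks, highest score first."""
    docs = [tokenize(chunk) for chunk in chunks]
    n = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n if n else 0.0
    # Number of chunks that contain each term at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    query_terms = set(tokenize(query))

    scored = []
    for index, doc in enumerate(docs):
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in term_freq:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            tf = term_freq[term]
            # Length normalization: long chunks are penalized via b.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        if score > 0:
            scored.append((score, index))

    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for _, index in scored[:top_k]]


chunks = [
    "def validate_jwt(token): return decode(token)",
    "def run_migration(db): db.apply()",
    "class UserController: routes = []",
]
print(bm25_rank("JWT token validation", chunks))
```

With this tokenizer, `validate_jwt` stays a single token, so only the literal term `token` matches the first chunk. Purely lexical misses like this are the kind of case hybrid retrieval is meant to help with.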

ask

Retrieves top chunks and sends them to an LLM for a cited answer.

repointel ask "how does authentication work?"
repointel ask "what entities exist in the database?" --top-k 12

Falls back to listing top matches with text previews if no LLM key is set.


stats

Prints a breakdown of the current index.

repointel stats
{
  "files_indexed": 72,
  "chunks_indexed": 428,
  "languages": { "csharp": 160, "javascript": 25, "json": 221 },
  "kinds": { "method": 117, "block": 247, "class": 25 },
  "embeddings": ".reposcope/index.npy"
}

Multiple projects

Use --index-file to maintain separate indexes. The flag goes before the subcommand.

repointel --index-file .reposcope/frontend.json index ./frontend
repointel --index-file .reposcope/backend.json  index ./backend

repointel --index-file .reposcope/frontend.json ask "how is routing configured?"
repointel --index-file .reposcope/backend.json  ask "what database tables exist?"

Optional REST API

pip install -e ".[api]"
uvicorn reposcope.api:app --reload

Method  Endpoint  Body
POST    /index    { "path": "/abs/path/to/repo", "embed": false }
POST    /search   { "query": "...", "top_k": 8 }
POST    /ask      { "query": "...", "top_k": 8 }
GET     /stats    (none)

Interactive API docs are available at http://localhost:8000/docs.
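A client for these endpoints needs nothing beyond the standard library. The sketch below assumes the server is running on uvicorn's default address (http://localhost:8000). The helper names are hypothetical, and the response schema is not documented here, so the result is returned as parsed JSON without further interpretation.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed: uvicorn's default host and port


def build_request(endpoint, payload):
    """Build a JSON POST request for one of the endpoints listed above."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(endpoint, payload, timeout=30):
    """Send the request and return the decoded JSON response."""
    request = build_request(endpoint, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with the server running:
#   results = post_json("/search", {"query": "JWT token validation", "top_k": 5})
#   answer = post_json("/ask", {"query": "how does authentication work?", "top_k": 8})
```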


Supported languages

Language                                                                Chunking
Python                                                                  Regex-based: splits at def / async def / class boundaries
JavaScript / TypeScript / JSX / TSX                                     Regex-based: splits at function, class, and arrow function boundaries
C#                                                                      Regex-based: splits at class, interface, record, struct, and method boundaries
SQL, Markdown, JSON, YAML, TOML, CSS, SCSS, HTML, Dockerfile, Makefile  Fixed 80-line windows with 15-line overlap
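To make the fixed-window scheme concrete, here is a minimal sketch of 80-line windows with a 15-line overlap. The function name `window_chunks` and the handling of the final, shorter window are assumptions; RepoScope's real chunker may differ in those details.

```python
def window_chunks(text, window=80, overlap=15):
    """Split text into line windows that share `overlap` lines with the next.

    Returns (start_line, end_line, chunk_text) tuples with 1-based,
    inclusive line numbers.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must be non-negative and smaller than window")
    lines = text.splitlines()
    step = window - overlap  # each new window starts 65 lines after the last
    chunks = []
    for start in range(0, len(lines), step):
        end = min(start + window, len(lines))
        chunks.append((start + 1, end, "\n".join(lines[start:end])))
        if end == len(lines):
            break  # the file is fully covered; avoid a redundant tail window
    return chunks


sample = "\n".join(f"line {i}" for i in range(1, 201))  # a 200-line file
spans = [(start, end) for start, end, _ in window_chunks(sample)]
print(spans)
```

The overlap means a statement that straddles a window boundary still appears whole in at least one chunk, at the cost of some duplicated text in the index.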

Files over 750 KB and generated lock files (package-lock.json, yarn.lock, pnpm-lock.yaml) are skipped.


Ignored directories

node_modules, .git, dist, build, bin, obj, .venv, __pycache__, .next, .nuxt, coverage, target, temp, vendor, and other standard build/cache directories are excluded automatically.
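The discovery rules above (ignored directories, skipped lock files, and the size cap) can be sketched with `os.walk`. The names below are hypothetical, the ignore set contains only the directories listed explicitly in this README, and treating 750 KB as 750 * 1024 bytes is an assumption.

```python
import os

IGNORED_DIRS = {
    "node_modules", ".git", "dist", "build", "bin", "obj", ".venv",
    "__pycache__", ".next", ".nuxt", "coverage", "target", "temp", "vendor",
}
LOCK_FILES = {"package-lock.json", "yarn.lock", "pnpm-lock.yaml"}
MAX_BYTES = 750 * 1024  # assumed interpretation of "750 KB"


def discover_files(root):
    """Yield paths under `root` that pass the ignore, lock-file, and size rules."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Pruning dirnames in place stops os.walk from descending into them.
        dirnames[:] = sorted(d for d in dirnames if d not in IGNORED_DIRS)
        for name in sorted(filenames):
            if name in LOCK_FILES:
                continue
            path = os.path.join(dirpath, name)
            try:
                if os.path.getsize(path) > MAX_BYTES:
                    continue
            except OSError:
                continue  # unreadable entry or broken symlink
            yield path


# Usage: list(discover_files("/path/to/project"))
```

Pruning during the walk, rather than filtering paths afterwards, avoids traversing large trees such as node_modules at all.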


Roadmap

  • Tree-sitter chunking for true AST-level boundaries (replacing the regex approach)
  • Incremental re-indexing on file change
  • Cross-repo index merging for monorepos
  • Qdrant / Chroma backend for repositories with >50k chunks

License

MIT
