Local-first codebase context engine — ask plain English questions about any Python codebase

repolix

Ask plain English questions about any Python codebase. Get answers with exact file and line citations. Runs entirely on your machine.

$ repolix index ./myrepo
Indexing /path/to/myrepo
Indexing  100% ████████████████████ 24/24
╭──────── Index Complete ─────────╮
│ Files found:    24              │
│ Files indexed:  22              │
│ Files skipped:  2 (unchanged)   │
│ Chunks stored:  183             │
╰─────────────────────────────────╯

$ repolix query "how does authentication work"
Searching...
Generating answer...
╭──────────────────────── Answer ──────────────────────────╮
│ authenticate_user() in auth/validators.py validates       │
│ credentials by calling validate_token() [1], which checks │
│ expiry and signature. On success it creates a session via │
│ SessionService.create() [2].                              │
╰───────────────────────────────────────────────────────────╯
──────────────────────── Citations ────────────────────────
  [1] auth/validators.py:14-28  (validate_token)
  [2] auth/session.py:45-67     (SessionService.create)

confidence: high

Your index is stored on your machine. No server. No accounts beyond an OpenAI API key, which is used for embeddings and answer generation.


Quickstart

Requirements

Node.js is not required for end users. The web UI is pre-built and bundled inside the package.

Install

pip install repolix

Set your API key

export OPENAI_API_KEY=sk-your-key-here
# or add it to a .env file in your working directory

Index a repo

repolix index ./path/to/repo

Ask a question

repolix query "how does authentication work"

# Raw chunks without LLM (useful for debugging retrieval)
repolix query "where is UserService defined" --no-llm

# Force re-index all files, not just changed ones
repolix index ./path/to/repo --force

Web UI

uvicorn repolix.api:app --port 8000
# Open http://localhost:8000

Why repolix

Getting dropped into an unfamiliar codebase is painful. Documentation is outdated. Grep finds strings, not meaning. LLM chatbots hallucinate file names and function signatures because they have no access to your actual code.

repolix indexes your code locally using AST-based chunking — every retrieved chunk is a complete function or class, never an arbitrary line slice. It runs entirely on your machine.


How it works

1. AST chunking Tree-sitter parses each file into a syntax tree. repolix splits only at function and class boundaries — every chunk is semantically complete. Methods are tracked with their parent class for disambiguation.
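To make the chunking idea concrete, here is a minimal sketch that splits a Python file at function and class boundaries. It deliberately swaps Tree-sitter for Python's standard-library `ast` module so it runs without extra dependencies; the `Chunk` dataclass, the `chunk_source` function, and the choice to emit both a class chunk and one chunk per method are assumptions for illustration, not repolix's actual internals.

```python
import ast
from dataclasses import dataclass


@dataclass
class Chunk:
    name: str        # qualified name, e.g. "SessionService.create"
    kind: str        # "function" or "class"
    start_line: int  # 1-indexed, inclusive
    end_line: int    # 1-indexed, inclusive
    text: str


def chunk_source(source: str) -> list[Chunk]:
    """Split source at top-level function/class boundaries, plus methods.

    Hypothetical helper: uses the stdlib ast module, not Tree-sitter.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks: list[Chunk] = []

    def add(node, name, kind):
        text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
        chunks.append(Chunk(name, kind, node.lineno, node.end_lineno, text))

    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)
    for node in tree.body:
        if isinstance(node, func_types):
            add(node, node.name, "function")
        elif isinstance(node, ast.ClassDef):
            add(node, node.name, "class")
            # Methods carry their parent class name for disambiguation.
            for child in node.body:
                if isinstance(child, func_types):
                    add(child, f"{node.name}.{child.name}", "function")
    return chunks


SAMPLE = '''\
def validate_token(token):
    return bool(token)


class SessionService:
    def create(self, user):
        return {"user": user}
'''

chunks = chunk_source(SAMPLE)
for c in chunks:
    print(f"{c.name}: lines {c.start_line}-{c.end_line}")
```

Nested functions are not split out in this sketch, so they stay inside their parent chunk.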

2. Hybrid search Queries run against OpenAI embeddings (vector search) and exact token matching (keyword search) simultaneously. Results are merged using Reciprocal Rank Fusion, a ranking algorithm that rewards consistency across search methods over dominance in just one.
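Reciprocal Rank Fusion can be sketched in a few lines. Each result list contributes 1 / (k + rank) to a chunk's score, so a chunk that appears in both lists tends to beat one that tops only a single list. The function name, the chunk identifiers, and the constant k = 60 (the value commonly used in the literature) are assumptions here; the value repolix uses is not stated.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of ids into one list, best first.

    k=60 is an assumed default, not a documented repolix setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id so output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical result lists from the two search methods.
vector_hits = ["auth.validate_token", "auth.login", "db.connect"]
keyword_hits = ["utils.slugify", "auth.validate_token", "auth.login"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

In this example `utils.slugify` is the top keyword hit but appears in only one list, so it ends up below `auth.login`, which ranks second and third across the two lists.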

3. Call graph expansion After initial retrieval, repolix inspects each chunk's call graph and fetches called functions that didn't rank highly on their own. This surfaces implementation details that live one function call away from the entry point.
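A one-hop expansion of this kind might look like the sketch below. It assumes the call graph is available as a plain dict mapping each chunk id to the ids it calls; `expand_with_callees`, the `max_extra` limit, and the `decode_jwt` name are hypothetical and only illustrate the idea.

```python
def expand_with_callees(ranked_ids, call_graph, max_extra=5):
    """Append direct callees of retrieved chunks that were not retrieved.

    Only one hop is followed: callees of newly added chunks are ignored.
    """
    seen = set(ranked_ids)
    expanded = list(ranked_ids)
    extra = 0
    for chunk_id in ranked_ids:
        for callee in call_graph.get(chunk_id, []):
            if callee in seen:
                continue
            if extra >= max_extra:
                return expanded
            seen.add(callee)
            expanded.append(callee)
            extra += 1
    return expanded


# Hypothetical call graph: caller -> list of callees.
call_graph = {
    "authenticate_user": ["validate_token", "SessionService.create"],
    "validate_token": ["decode_jwt"],
}

result = expand_with_callees(["authenticate_user", "validate_token"], call_graph)
print(result)
```

Capping the number of added chunks keeps the context sent to the LLM bounded even when a retrieved function calls many helpers.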

4. Metadata re-ranking Retrieved chunks are re-ranked using function names, file paths, docstrings, and call graph signals before being sent to the LLM.

5. Cited answers The top chunks go to the LLM with instructions to answer directly and cite every claim. Citations map back to exact file paths and line numbers.


Output format

Each query produces:

  • A prose answer with inline citations [1], [2], etc.
  • A citations section with exact file paths and line ranges. Citations marked [truncated] mean the function exceeded the 300-token chunk cap.
  • A confidence label (high / medium / low) based on how strongly the retrieved chunks matched the query across function names, file paths, docstrings, and call graph signals.
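As an illustration of the citations section, the sketch below renders citation records into the `[n] path:start-end  (symbol)` layout shown in the example output. The `Citation` dataclass, the `format_citations` function, and the placement of the `[truncated]` marker at the end of the line are assumptions, not repolix's actual code.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    path: str
    start_line: int
    end_line: int
    symbol: str
    truncated: bool = False  # assumed flag for chunks that hit the token cap


def format_citations(citations):
    """Render numbered citations with the location column aligned."""
    locations = [f"{c.path}:{c.start_line}-{c.end_line}" for c in citations]
    width = max((len(loc) for loc in locations), default=0)
    lines = []
    for number, (c, loc) in enumerate(zip(citations, locations), start=1):
        line = f"  [{number}] {loc.ljust(width)}  ({c.symbol})"
        if c.truncated:
            line += " [truncated]"
        lines.append(line)
    return "\n".join(lines)


report = format_citations([
    Citation("auth/validators.py", 14, 28, "validate_token"),
    Citation("auth/session.py", 45, 67, "SessionService.create"),
])
print(report)
```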

Cost

Action                          Approximate cost
Index a 30k-line repo           ~$0.02 (one-time)
Re-index after a small change   ~$0.001 (changed files only)
Each query                      ~$0.001

Incremental indexing means only changed files are re-embedded on subsequent runs.
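One common way to detect changed files is to compare content hashes between runs. The sketch below assumes that approach; whether repolix uses hashes, modification times, or something else is not stated, and `file_digest` and `files_to_reindex` are hypothetical names. The `force` argument mirrors the behavior of the `--force` flag.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file's bytes, as a hex string."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def files_to_reindex(repo_root, known_hashes, force=False):
    """Return (changed_paths, updated_hashes) for .py files under repo_root.

    known_hashes maps repo-relative paths to digests from the previous run.
    """
    root = Path(repo_root)
    changed = []
    updated = {}
    for path in sorted(root.rglob("*.py")):
        rel = path.relative_to(root).as_posix()
        digest = file_digest(path)
        updated[rel] = digest
        if force or known_hashes.get(rel) != digest:
            changed.append(rel)
    return changed, updated


with tempfile.TemporaryDirectory() as repo:
    (Path(repo) / "a.py").write_text("x = 1\n")
    (Path(repo) / "b.py").write_text("y = 2\n")
    first, hashes = files_to_reindex(repo, {})        # nothing known yet
    second, hashes = files_to_reindex(repo, hashes)   # no edits since
    (Path(repo) / "b.py").write_text("y = 3\n")
    third, hashes = files_to_reindex(repo, hashes)    # one file edited
    forced, hashes = files_to_reindex(repo, hashes, force=True)

print(first, second, third, forced)
```

Only the paths in the returned list would need new embeddings, which is why re-indexing after a small change costs far less than the initial index.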


Stack

Layer          Choice
AST parsing    Tree-sitter
Embeddings     text-embedding-3-small
Vector store   ChromaDB (local, no server needed)
LLM            gpt-5.4-mini
Backend        FastAPI
Frontend       React + TypeScript
CLI            Click + Rich

Install from source

git clone https://github.com/TheAsianFish/repolix
cd repolix
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

For frontend development (requires Node.js 18+):

cd frontend && npm install && cd ..
bash start.sh
# Backend: http://localhost:8000  |  Frontend: http://localhost:3000

Limitations

  • Python repos only (TypeScript support planned for V2)
  • Best on repos up to ~30k lines
  • Deeply nested functions are included in their parent chunk
  • Large functions (>300 tokens) are truncated at the chunk cap
  • Complex cross-file reasoning may require rephrasing the query

Roadmap

V2 — TypeScript/JavaScript support, VS Code extension, dependency graph visualization

V3 — GitHub webhook re-indexing, multi-repo support, Slack bot


Contributing

Bug reports and pull requests are welcome. Please open an issue before submitting a large change so we can discuss the approach.


License

MIT © 2026 Patrick Chung
