Lightning-fast, deterministic repository indexing and retrieval for AI agents
repo-index: AST-Aware Repository Intelligence
repo-index is a lightning-fast, local-first repository indexing and retrieval daemon. It serves as Layer 1 (Structural Intelligence) of the AI Infra Layer, utilizing Tree-sitter for robust Abstract Syntax Tree (AST) parsing, SQLite + FTS5 for structured storage and full-text search, and NetworkX for call graph analysis.
✨ Key Features
- AST-Powered Indexing: Extracts precise symbol definitions (functions, classes, methods, modules) and their exact line ranges across Python codebases.
- Relational Call Graphs: Tracks CALLS and IMPORTS relationships between symbols to understand codebase topology.
- Incremental & Branch-Aware: Uses SHA-1 content hashing and git branch tracking to update only what changes, avoiding expensive full re-indexing.
- Live Filesystem Watcher: Background daemon that instantly updates the AST index on file CREATED, MODIFIED, DELETED, or MOVED events.
- Rich CLI & Retrieval Engine: Provides beautiful, terminal-native outputs for dependency tracing, blast radius analysis, and LLM context assembly.
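To make the AST-powered indexing feature concrete, here is a minimal sketch of symbol extraction with exact line ranges. It deliberately swaps in Python's standard-library `ast` module instead of Tree-sitter so that it runs without extra dependencies; the `Symbol` dataclass and `extract_symbols` function are hypothetical names for illustration, not part of repo-index.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str
    kind: str        # "function", "class", or "method"
    start_line: int  # 1-based, inclusive
    end_line: int    # 1-based, inclusive


def extract_symbols(source: str) -> list[Symbol]:
    """Collect function, class, and method definitions with their line ranges."""
    symbols: list[Symbol] = []

    def visit(node: ast.AST, inside_class: bool) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                symbols.append(Symbol(child.name, "class", child.lineno, child.end_lineno))
                visit(child, inside_class=True)
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                kind = "method" if inside_class else "function"
                symbols.append(Symbol(child.name, kind, child.lineno, child.end_lineno))
                # Definitions nested inside a function are treated as functions.
                visit(child, inside_class=False)
            else:
                visit(child, inside_class)

    visit(ast.parse(source), inside_class=False)
    return symbols


# Hypothetical input used only for this illustration.
EXAMPLE = '''\
class Greeter:
    def greet(self):
        return "hi"

def main():
    Greeter().greet()
'''

for sym in extract_symbols(EXAMPLE):
    print(sym)
```

A Tree-sitter based extractor would walk a concrete syntax tree instead, but the output shape (name, kind, start line, end line) matches the columns listed in the schema section.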
🛠️ Installation
Ensure you have Python 3.10+ installed.
Via pip (Recommended)
pip install jude-repo-index
From source (for development)
git clone https://github.com/aryanwalia/ai-infra.git
cd ai-infra/repo-index
pip install -e .
Database Location
By default, repo-index stores its SQLite database at:
~/.local/share/repo-index/index.db
(You can override this location using the --db CLI option or the REPO_INDEX_DB environment variable).
💻 CLI Reference
repo-index provides a comprehensive suite of commands for indexing, querying, and analyzing your codebase.
1. Index Management & Daemon
- repo-index build [PATH] [--db PATH]: Scans and indexes a repository root. Calculates file content hashes, parses ASTs, and populates the SQLite index.
- repo-index watch [PATH] [--skip-build] [--db PATH]: Starts the live filesystem watcher daemon. Listens for changes and incrementally updates the index in milliseconds.
- repo-index stats: Displays overall index statistics, including total files, symbols, and relations broken down by kind and language.
- repo-index branch [PATH]: Shows the currently active git branch and displays per-branch index statistics (files and symbols indexed per branch).
- repo-index files: Lists all indexed files along with their detected language and SHA-1 content hash.
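The idea behind hash-based incremental indexing is simple: compare each file's current SHA-1 digest with the one stored at the last build, and re-parse only on a mismatch. The sketch below illustrates that idea with the standard library; `sha1_of_file` and `files_to_reindex` are hypothetical names, and the real tool may organize this differently.

```python
import hashlib
import tempfile
from pathlib import Path


def sha1_of_file(path: Path) -> str:
    """Return the hex SHA-1 digest of a file's bytes, read in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_reindex(paths: list[Path], known_hashes: dict[str, str]) -> list[Path]:
    """Return only the files whose current hash differs from the stored one."""
    return [p for p in paths if known_hashes.get(str(p)) != sha1_of_file(p)]


# Demonstration on two throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    a = Path(tmp) / "a.py"
    b = Path(tmp) / "b.py"
    a.write_text("x = 1\n")
    b.write_text("y = 2\n")
    known = {str(a): sha1_of_file(a), str(b): sha1_of_file(b)}

    b.write_text("y = 3\n")  # simulate an edit after the first build
    changed = files_to_reindex([a, b], known)
    print([p.name for p in changed])
```

Files that are new to the index have no stored hash, so `known_hashes.get` returns `None` and they are always selected.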
2. Query & Inspection
- repo-index search <query> [--kind function/class/method/module] [--limit N]: Performs an FTS5 BM25 full-text search across all indexed symbol names and file paths.
- repo-index symbol <name>: Looks up exact symbol details, displaying its AST kind, file path, line ranges, and language.
- repo-index callers <name>: Finds all direct callers of a specific function or method across the codebase.
- repo-index imports <file_path>: Lists all module-level imports within a specific file.
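A direct-callers lookup maps naturally onto the symbols and relations tables described in the schema section. The query below is a sketch under that assumption: the table and column names follow the schema diagram, while the sample rows and the `direct_callers` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE symbols (
    id INTEGER PRIMARY KEY, name TEXT, kind TEXT, file_path TEXT,
    start_line INTEGER, end_line INTEGER
);
CREATE TABLE relations (
    id INTEGER PRIMARY KEY,
    from_id INTEGER REFERENCES symbols(id),
    relation TEXT, to_name TEXT
);
-- Hypothetical sample data.
INSERT INTO symbols VALUES
    (1, 'main', 'function', 'app.py', 1, 5),
    (2, 'run', 'function', 'app.py', 7, 12),
    (3, 'open_db', 'function', 'db.py', 1, 9);
INSERT INTO relations (from_id, relation, to_name) VALUES
    (1, 'CALLS', 'run'),
    (2, 'CALLS', 'open_db'),
    (1, 'CALLS', 'open_db'),
    (1, 'IMPORTS', 'db');
""")


def direct_callers(conn: sqlite3.Connection, name: str) -> list[tuple]:
    """Return (caller name, file path, start line) for symbols that call `name`."""
    return conn.execute(
        """
        SELECT DISTINCT s.name, s.file_path, s.start_line
        FROM relations AS r
        JOIN symbols AS s ON s.id = r.from_id
        WHERE r.relation = 'CALLS' AND r.to_name = ?
        ORDER BY s.file_path, s.start_line
        """,
        (name,),
    ).fetchall()


for row in direct_callers(conn, "open_db"):
    print(row)
```

Because callees are stored by name (`to_name`) rather than by id, two symbols that share a name cannot be told apart by this query alone.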
3. Advanced AI & Graph Retrieval
- repo-index deps <name> [--depth N]: Performs a Breadth-First Search (BFS) forward through the call graph to show what <name> transitively calls up to depth N.
- repo-index impact <name> [--depth N]: Performs a BFS backward through the call graph to analyze the blast radius (what transitively calls <name> up to depth N).
- repo-index context <name> [--depth N]: The Crown Jewel for AI Agents: Assembles the full retrieval context for a symbol (direct calls, callers, file imports, transitive call graph, and blast radius) into a single, structured summary ready to be injected into LLM prompts.
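The forward (deps) and backward (impact) traversals differ only in edge direction. The following sketch shows a depth-limited BFS over a plain dictionary using only the standard library; the sample edges and the helper names `reverse_edges` and `bfs_within_depth` are hypothetical, and repo-index itself is described as using NetworkX for this.

```python
from collections import deque

# Hypothetical CALLS edges: caller -> list of callees.
CALLS = {
    "main": ["load_config", "run"],
    "run": ["open_db", "search"],
    "search": ["open_db"],
}


def reverse_edges(edges: dict[str, list[str]]) -> dict[str, list[str]]:
    """Flip caller -> callee edges into callee -> caller edges."""
    rev: dict[str, list[str]] = {}
    for caller, callees in edges.items():
        for callee in callees:
            rev.setdefault(callee, []).append(caller)
    return rev


def bfs_within_depth(edges: dict[str, list[str]], start: str, depth: int) -> dict[str, int]:
    """Return {symbol: hop count} for symbols reachable from start in <= depth hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == depth:
            continue  # do not expand past the depth limit
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    del seen[start]  # the start symbol is not its own dependency
    return seen


deps = bfs_within_depth(CALLS, "main", depth=2)                       # forward
impact = bfs_within_depth(reverse_edges(CALLS), "open_db", depth=1)   # backward
print("deps:", deps)
print("impact:", impact)
```

Tracking visited symbols in `seen` also keeps the traversal from looping on recursive or mutually recursive calls.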
🗄️ Database Schema
The underlying SQLite database uses WAL journal mode and enforces foreign key constraints.
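For readers unfamiliar with these SQLite settings, the sketch below shows how a connection can enable both with the standard `sqlite3` module. `open_index` is a hypothetical helper for illustration and is not the library's `db.open_db`.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_index(path: Path) -> sqlite3.Connection:
    """Open a SQLite file with WAL journaling and foreign key enforcement."""
    conn = sqlite3.connect(path)
    # WAL is persistent: it is recorded in the database file itself.
    conn.execute("PRAGMA journal_mode = WAL")
    # Foreign key enforcement is per connection and off by default.
    conn.execute("PRAGMA foreign_keys = ON")
    return conn


with tempfile.TemporaryDirectory() as tmp:
    conn = open_index(Path(tmp) / "index.db")
    journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    foreign_keys = conn.execute("PRAGMA foreign_keys").fetchone()[0]
    conn.close()

print(journal_mode, foreign_keys)
```

WAL mode requires an on-disk database, which is why the demonstration uses a temporary file rather than `:memory:`.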
erDiagram
    FILES ||--o{ SYMBOLS : contains
    SYMBOLS ||--o{ RELATIONS : "from_id (caller)"
    SYMBOLS ||--o{ RELATIONS : "to_name (callee)"
    FILES {
        string path PK
        string content_hash
        string branch
        string language
        int last_indexed_at
    }
    SYMBOLS {
        int id PK
        string name
        string kind
        string file_path FK
        int start_line
        int end_line
        string hash
        string language
    }
    RELATIONS {
        int id PK
        int from_id FK
        string relation
        string to_name
    }
    META {
        string key PK
        string value
    }
- symbols_fts (Virtual Table): FTS5 table over name, kind, and file_path. Automatically kept in sync with the symbols table via SQLite AFTER INSERT and AFTER DELETE triggers.
- meta: Key-value store maintaining repository state, such as current_branch.
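The trigger-based sync can be pictured with a small in-memory example. This is a sketch, not the project's actual DDL: the trigger names, the trimmed-down symbols table, the sample rows, and the `search` helper are all hypothetical, and it assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE symbols (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    kind TEXT NOT NULL,
    file_path TEXT NOT NULL
);
CREATE VIRTUAL TABLE symbols_fts USING fts5(name, kind, file_path);

-- Mirror every inserted symbol into the full-text index.
CREATE TRIGGER symbols_ai AFTER INSERT ON symbols BEGIN
    INSERT INTO symbols_fts(rowid, name, kind, file_path)
    VALUES (new.id, new.name, new.kind, new.file_path);
END;

-- Drop the mirrored row when the symbol is deleted.
CREATE TRIGGER symbols_ad AFTER DELETE ON symbols BEGIN
    DELETE FROM symbols_fts WHERE rowid = old.id;
END;
""")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO symbols (name, kind, file_path) VALUES (?, ?, ?)",
    [
        ("open_db", "function", "repo_index/db.py"),
        ("search", "function", "repo_index/retrieval.py"),
        ("Indexer", "class", "repo_index/indexer.py"),
    ],
)


def search(conn, query, kind=None, limit=10):
    """Full-text search ranked by BM25 (lower scores rank first in FTS5)."""
    sql = "SELECT name, kind, file_path FROM symbols_fts WHERE symbols_fts MATCH ?"
    params = [query]
    if kind is not None:
        sql += " AND kind = ?"
        params.append(kind)
    sql += " ORDER BY bm25(symbols_fts) LIMIT ?"
    params.append(limit)
    return conn.execute(sql, params).fetchall()


print(search(conn, "retrieval"))
```

With only insert and delete triggers, an in-place `UPDATE` of a symbol would leave the FTS row stale, so this design fits an indexer that replaces a file's symbols by deleting and re-inserting them.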
🐍 Python API Reference
You can import repo-index directly into custom Python scripts, AI orchestration workflows, or MCP servers:
from pathlib import Path

from repo_index import db, retrieval, graph

# 1. Connect to the SQLite Index DB
conn = db.open_db(Path.home() / ".local/share/repo-index/index.db")

# 2. Perform an FTS5 Search
results = retrieval.search(conn, query="auth", kind="function", limit=5)
for r in results:
    print(f"Found {r.kind} {r.name} at {r.file_path}:{r.start_line}")

# 3. Assemble Full LLM Retrieval Context
ctx = retrieval.get_context(conn, name="open_db", callgraph_depth=2)
if ctx:
    print(f"Symbol: {ctx.name} ({ctx.kind}) in {ctx.file_path}")
    print(f"Direct Calls: {ctx.calls}")
    print(f"Called By: {ctx.called_by}")
    print(f"File Imports: {ctx.file_imports}")
    print(f"Transitive Call Graph: {ctx.callgraph}")
    print(f"Blast Radius (Impact): {ctx.impact}")

# 4. Advanced Graph Traversal
G = graph.build_call_graph(conn)
# G is a NetworkX DiGraph where edges represent CALLS relations.
# Use standard NetworkX algorithms for custom architectural analysis.
🧪 Development & Testing
The codebase includes a robust pytest test suite covering AST parsing, database operations, incremental watching, and graph retrieval.
# Run the full test suite
pytest tests/ -v