grounded-index
Offline source code indexer with token-bounded context queries.
grounded-index walks a repository, parses sources with tree-sitter, and writes
files, symbols, imports, and references to a SQLite database. From there it answers
focused questions — what symbols exist, who calls them, which tests reference them,
what does this look like in 4 000 tokens — without your tools ever needing to read
the full repository.
It is designed for AI coding agents, code-review tooling, and bank-friendly review workflows: no network calls, no model calls in the indexing path, no source mutation.
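The parse, store, and query flow described above can be sketched in a few lines. This is only an illustration under stated assumptions, not grounded-index's implementation: it uses Python's standard `ast` module as a stand-in for tree-sitter, and the table names (`symbols`, `refs`), their columns, and the helpers `index_source` and `callers_of` are hypothetical.

```python
import ast
import sqlite3

# Hypothetical two-table schema; the real grounded-index schema may differ.
SCHEMA = """
CREATE TABLE symbols (name TEXT, kind TEXT, file TEXT, line INTEGER, col INTEGER);
CREATE TABLE refs (caller TEXT, callee TEXT, file TEXT, line INTEGER);
"""


def index_source(conn, file, source):
    """Record function/class definitions and direct name calls from one file."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            conn.execute(
                "INSERT INTO symbols VALUES (?, ?, ?, ?, ?)",
                (node.name, kind, file, node.lineno, node.col_offset),
            )
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Calls anywhere inside the function body, attributed to this function.
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    conn.execute(
                        "INSERT INTO refs VALUES (?, ?, ?, ?)",
                        (node.name, inner.func.id, file, inner.lineno),
                    )


def callers_of(conn, symbol):
    """Return the sorted names of functions that call `symbol`."""
    rows = conn.execute(
        "SELECT DISTINCT caller FROM refs WHERE callee = ? ORDER BY caller", (symbol,)
    )
    return [row[0] for row in rows]


SAMPLE = '''
def parse(text):
    return text.split()

def load(path):
    return parse(open(path).read())

def test_parse():
    assert parse("a b") == ["a", "b"]
'''

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
index_source(conn, "sample.py", SAMPLE)
print(callers_of(conn, "parse"))  # ['load', 'test_parse']
```

Once the facts are in SQLite, a "who calls this?" question becomes a single indexed query rather than a scan of every source file, which is the property the description above relies on.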
Why
Language-model coding assistants waste tokens (and accuracy) when they have to `cat` whole
files to discover basic repository structure. grounded-index exposes the same
evidence a developer uses to navigate — symbols, callers, tests, imports — as
compact CLI output and a small Python API.
Features
- Seven languages on tree-sitter: Python, Rust, TypeScript, JavaScript, Java, C, C++.
- Symbols, imports, references extracted per language with stable line/col spans.
- Test detection (`is_test` column) by path heuristics, decorators / attributes, and naming conventions: pytest `test_*` / `Test*`, Rust `#[test]` / `#[cfg(test)]`, Jest `*.test.ts` / `*.spec.ts`, Java `@Test` annotations, C/C++ `_test.c` suffix.
- SQLite schema with FTS5 symbol search and proper indexes.
- Token-bounded context packs for AI prompts (`context` command + `BudgetEnforcer`).
- JSON / Markdown / human output for each command.
- Read-only by default; indexing requires the explicit `--write` flag.
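To make the test-detection bullet concrete, here is a minimal sketch of name-based detection covering a few of the conventions listed above. The patterns and the helpers `is_test_path` and `is_test_symbol` are hypothetical; the rules grounded-index actually applies are likely broader (for example, `#[cfg(test)]` modules) and may differ in detail.

```python
import re
from pathlib import PurePosixPath

# Hypothetical file-name patterns derived from the conventions listed above.
TEST_FILE_PATTERNS = [
    re.compile(r"^test_.*\.py$"),         # pytest: test_*.py
    re.compile(r".*\.(test|spec)\.ts$"),  # Jest: *.test.ts / *.spec.ts
    re.compile(r".*_test\.c$"),           # C: _test.c suffix
]


def is_test_path(path):
    """Return True if the file name matches one of the naming conventions."""
    name = PurePosixPath(path).name
    return any(p.match(name) for p in TEST_FILE_PATTERNS)


def is_test_symbol(name, decorators=()):
    """Return True for pytest-style names or a Java/Rust test marker."""
    if name.startswith("test_") or name.startswith("Test"):
        return True
    return any(d in ("@Test", "#[test]") for d in decorators)


print(is_test_path("tests/test_parser.py"))      # True
print(is_test_path("src/parser.py"))             # False
print(is_test_symbol("parse_all", ["#[test]"]))  # True
```

Combining a path check with a symbol-level check lets a query such as "which tests reference this symbol" work even when tests live next to production code.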
Install
pip install grounded-index
Or from a local checkout:
git clone <repo> grounded-index
cd grounded-index
pip install -e .
30-second example
# Index this repository
grounded-index --write index
# List symbols matching a name
grounded-index symbols --name parse
# Show who calls a symbol
grounded-index references --symbol parse_references --direction in
# Build a 4 000-token context pack
grounded-index context --symbol Indexer --budget 4000 --include-callers
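The `--budget 4000` option in the last command caps the size of the context pack. How grounded-index counts tokens and chooses what to keep is not described here, so the following is only a sketch of one plausible approach: greedy packing in priority order with a rough four-characters-per-token estimate. The names `estimate_tokens` and `pack_context` are hypothetical and are not the `BudgetEnforcer` API.

```python
def estimate_tokens(text):
    """Rough token estimate: about four characters per token (an assumption)."""
    return max(1, len(text) // 4)


def pack_context(snippets, budget):
    """Greedily keep snippets, in priority order, while they fit the budget.

    `snippets` is a list of (label, text) pairs, most important first.
    Returns the kept labels and the estimated tokens used.
    """
    kept, used = [], 0
    for label, text in snippets:
        cost = estimate_tokens(text)
        if used + cost > budget:
            continue  # skip what does not fit; a smaller later snippet still may
        kept.append(label)
        used += cost
    return kept, used


snippets = [
    ("definition", "x" * 400),  # ~100 tokens
    ("callers", "y" * 2000),    # ~500 tokens
    ("tests", "z" * 200),       # ~50 tokens
]
print(pack_context(snippets, budget=200))  # (['definition', 'tests'], 150)
```

Skipping an oversized snippet instead of stopping at it keeps the pack as full as possible, at the cost of sometimes dropping a higher-priority item in favor of a smaller one.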
Tier 2 — external CFG/ICFG tools (optional)
grounded-index ships two standalone scripts that produce bytecode-precise
control-flow graphs and inter-procedural call graphs by delegating to LLVM
(C/C++) and Soot (Java). Both scripts emit the same JSON shape for all three languages.
| Script | Languages | Backend |
|---|---|---|
| grounded_clang_cfg.py | C, C++ | clang -emit-llvm + opt -passes='dot-callgraph,dot-cfg' + pydot |
| grounded_java_cfg.py | Java | javac + Soot 4.7.1 (BriefBlockGraph + CHA call graph) |
# Install the optional dependency
pip install "grounded-index[external-cfg]"
# C / C++ (requires clang + opt on PATH)
python grounded_clang_cfg.py src/calculator.c
# Java (requires JDK 11+; first run downloads Soot ~12 MB)
tools/download-soot.sh
python grounded_java_cfg.py src/Calculator.java --class-name Calculator
These tools are not imported by the core indexer — they're operator-facing utilities for ICFG-aware analyses. See MANUAL.md for usage and docs/external-cfg-tools.md for the tier model and limitations.
Documentation
| Doc | For |
|---|---|
| MANUAL.md | End-user / operator guide — full CLI reference, workflows, troubleshooting. |
| API.md | Programmatic Python API — Indexer, QueryEngine, parsers, schema. |
| docs/external-cfg-tools.md | Tier 2 external CFG/ICFG tools — design, dependencies, JSON schema, limitations. |
| CHANGELOG.md | Release history. |
| vision.md | Product principles and positioning. |
Status
Alpha — schema v2, seven languages, 195 tests. Tier 2 external CFG/ICFG tools ship alongside the indexer. CLI surface is stable enough for internal use; expect occasional additions before 1.0.
License
GPL-3.0-only. See pyproject.toml for the canonical declaration.