Offline source code indexer with token-bounded context queries

grounded-index

grounded-index walks a repository, parses sources with tree-sitter, and writes files, symbols, imports, and references to a SQLite database. From there it answers focused questions — what symbols exist, who calls them, which tests reference them, what does this look like in 4 000 tokens — without your tools ever needing to read the full repository.
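
To make the storage model concrete, here is a minimal sketch of the kind of lookup a SQLite-backed index makes cheap. The table and column names (`symbols`, `refs`, `target`, `path`, `line`) and both helper functions are hypothetical, chosen only for illustration; they are not the actual grounded-index schema or API, which API.md describes.

```python
import sqlite3

# Hypothetical miniature schema for illustration only.
# This is NOT the real grounded-index schema.
SCHEMA = """
CREATE TABLE symbols (id INTEGER PRIMARY KEY, name TEXT, kind TEXT, path TEXT, line INTEGER);
CREATE TABLE refs (id INTEGER PRIMARY KEY, target TEXT, path TEXT, line INTEGER);
CREATE INDEX idx_refs_target ON refs(target);
"""


def build_demo_index() -> sqlite3.Connection:
    """Create an in-memory database holding a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO symbols (name, kind, path, line) VALUES (?, ?, ?, ?)",
        [
            ("parse", "function", "src/parser.py", 10),
            ("Indexer", "class", "src/indexer.py", 5),
        ],
    )
    conn.executemany(
        "INSERT INTO refs (target, path, line) VALUES (?, ?, ?)",
        [
            ("parse", "src/indexer.py", 42),
            ("parse", "tests/test_parser.py", 7),
        ],
    )
    conn.commit()
    return conn


def callers_of(conn: sqlite3.Connection, name: str) -> list:
    """Return (path, line) pairs for every recorded reference to `name`."""
    rows = conn.execute(
        "SELECT path, line FROM refs WHERE target = ? ORDER BY path, line",
        (name,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    demo = build_demo_index()
    for path, line in callers_of(demo, "parse"):
        print(f"{path}:{line}")
```

The point of the sketch is that a "who calls this?" question becomes a single indexed query over a few rows, so a tool never has to open the source files to answer it.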

It is designed for AI coding agents, code-review tooling, and bank-friendly review workflows: no network calls, no model calls in the indexing path, no source mutation.


Why

Language-model coding assistants waste tokens (and accuracy) when they have to cat whole files to discover basic repository structure. grounded-index exposes the same evidence a developer uses to navigate — symbols, callers, tests, imports — as compact CLI output and a small Python API.

Features

  • Seven languages on tree-sitter: Python, Rust, TypeScript, JavaScript, Java, C, C++.
  • Symbols, imports, references extracted per language with stable line/col spans.
  • Test detection (is_test column) by path heuristics, decorators / attributes, and naming conventions — pytest test_* / Test*, Rust #[test] / #[cfg(test)], Jest *.test.ts / *.spec.ts, Java @Test annotations, C/C++ _test.c suffix.
  • SQLite schema with FTS5 symbol search and proper indexes.
  • Token-bounded context packs for AI prompts (context command + BudgetEnforcer).
  • JSON / Markdown / human output for each command.
  • Read-only by default; indexing requires the explicit --write flag.
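
As a rough illustration of the path and naming heuristics listed above, the sketch below classifies a file by name alone. `looks_like_test_path` is a hypothetical helper, not the grounded-index implementation, and it covers only the filename conventions mentioned (pytest `test_*`, Jest `*.test.ts` / `*.spec.ts`, and the `_test.c` suffix). Decorator and attribute checks such as `#[test]` or `@Test` need a parsed syntax tree and are out of scope here.

```python
from pathlib import PurePosixPath


def looks_like_test_path(path: str) -> bool:
    """Filename-only heuristic (hypothetical; the real rules may differ)."""
    name = PurePosixPath(path).name
    # pytest convention: modules named test_*.py
    if name.startswith("test_") and name.endswith(".py"):
        return True
    # Jest convention: *.test.ts and *.spec.ts
    if name.endswith((".test.ts", ".spec.ts")):
        return True
    # C convention: *_test.c
    if name.endswith("_test.c"):
        return True
    return False


if __name__ == "__main__":
    for candidate in ("tests/test_parser.py", "src/app.spec.ts", "src/parser.py"):
        print(candidate, looks_like_test_path(candidate))
```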

Install

pip install grounded-index

Or from a local checkout:

git clone <repo> grounded-index
cd grounded-index
pip install -e .

30-second example

# Index this repository
grounded-index --write index

# List symbols matching a name
grounded-index symbols --name parse

# Show who calls a symbol
grounded-index references --symbol parse_references --direction in

# Build a 4 000-token context pack
grounded-index context --symbol Indexer --budget 4000 --include-callers
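
The idea behind a token-bounded context pack can be sketched as greedy selection under a budget. This is an illustration only: `estimate_tokens` (roughly four characters per token) and `pack_within_budget` are hypothetical names, and the real `context` command and BudgetEnforcer may count tokens and rank snippets differently.

```python
def estimate_tokens(text: str) -> int:
    """Crude estimate: about four characters per token (an assumption)."""
    return max(1, (len(text) + 3) // 4)


def pack_within_budget(snippets, budget):
    """Keep snippets, in the given priority order, while they fit the budget.

    Returns the chosen snippets and the estimated tokens they use.
    """
    chosen, used = [], 0
    for snippet in snippets:
        cost = estimate_tokens(snippet)
        if used + cost > budget:
            # Skip anything that would overflow; a later, smaller snippet may still fit.
            continue
        chosen.append(snippet)
        used += cost
    return chosen, used


if __name__ == "__main__":
    definition = "def parse(source): ..."
    caller = "tree = parse(read_file(path))"
    picked, used = pack_within_budget([definition, caller], budget=4000)
    print(f"{len(picked)} snippets, about {used} tokens")
```

Ordering the input by priority (definition first, then callers, then tests, for example) means the most important evidence is the last thing to be dropped when the budget is tight.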

Tier 2 — external CFG/ICFG tools (optional)

grounded-index ships two standalone scripts that produce bytecode-precise control-flow graphs and inter-procedural call graphs by delegating to LLVM (C/C++) and Soot (Java). Both scripts emit the same JSON shape for all three languages (C, C++, and Java).

| Script | Languages | Backend |
| --- | --- | --- |
| grounded_clang_cfg.py | C, C++ | clang -emit-llvm + opt -passes='dot-callgraph,dot-cfg' + pydot |
| grounded_java_cfg.py | Java | javac + Soot 4.7.1 (BriefBlockGraph + CHA call graph) |

# Install the optional dependency
pip install "grounded-index[external-cfg]"

# C / C++ (requires clang + opt on PATH)
python grounded_clang_cfg.py src/calculator.c

# Java (requires JDK 11+; first run downloads Soot ~12 MB)
tools/download-soot.sh
python grounded_java_cfg.py src/Calculator.java --class-name Calculator

These tools are not imported by the core indexer — they're operator-facing utilities for ICFG-aware analyses. See MANUAL.md for usage and docs/external-cfg-tools.md for the tier model and limitations.
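
Because the C/C++ script depends on `clang` and `opt` being on PATH, a small preflight check can save a confusing failure. The sketch below uses only the standard library; `missing_tools` is a hypothetical helper and is not part of grounded-index.

```python
import shutil


def missing_tools(required=("clang", "opt")):
    """Return the names in `required` that cannot be found on PATH."""
    return [tool for tool in required if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("clang and opt found on PATH.")
```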

Documentation

| Doc | For |
| --- | --- |
| MANUAL.md | End-user / operator guide: full CLI reference, workflows, troubleshooting. |
| API.md | Programmatic Python API: Indexer, QueryEngine, parsers, schema. |
| docs/external-cfg-tools.md | Tier 2 external CFG/ICFG tools: design, dependencies, JSON schema, limitations. |
| CHANGELOG.md | Release history. |
| vision.md | Product principles and positioning. |

Status

Alpha — schema v2, seven languages, 195 tests. Tier 2 external CFG/ICFG tools ship alongside the indexer. CLI surface is stable enough for internal use; expect occasional additions before 1.0.

License

GPL-3.0-only. See pyproject.toml for the canonical declaration.
