Tree-sitter backed symbol search and inspection for codebases.

code-symbol-index

Tree-sitter backed symbol index and code navigation for tools that need fast, bounded, LLM-friendly answers over a local codebase.

It provides a small Python API and a single CLI command:

code-symbol-index

The default CLI output is readable text. Add --json to query commands when a machine-readable response is a better fit.

Features

  • Disk-backed SQLite index at .code-symbol-index/index.sqlite
  • Incremental indexing by mtime_ns + size
  • .gitignore aware file discovery
  • UTF-8 text file filtering
  • Mainstream language parsing through tree-sitter-language-pack
  • Symbol search, inspect, references, implementors, file outline, and index status
  • Bounded outputs designed for coding LLM context windows

This is syntactic code navigation, not a language server. It does not provide type-aware rename safety or full semantic call graph accuracy.
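
The incremental indexing listed above relies on comparing a file's mtime_ns and size against what was recorded at index time. The following is a minimal sketch of that comparison, not the package's implementation; FileStamp, stamp, and needs_reindex are hypothetical names used only for illustration.

```python
import os
import tempfile
from typing import NamedTuple, Optional


class FileStamp(NamedTuple):
    """Hypothetical record of what was stored for a file at index time."""
    mtime_ns: int
    size: int


def stamp(path: str) -> FileStamp:
    """Read the current modification time (nanoseconds) and size of a file."""
    st = os.stat(path)
    return FileStamp(st.st_mtime_ns, st.st_size)


def needs_reindex(path: str, recorded: Optional[FileStamp]) -> bool:
    """Return True when the file is new or its stamp differs from the record."""
    if recorded is None:
        return True
    return stamp(path) != recorded


# Usage: write a file, record its stamp, then append to it.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "app.py")
    with open(target, "w", encoding="utf-8") as fh:
        fh.write("x = 1\n")
    recorded = stamp(target)
    unchanged = needs_reindex(target, recorded)  # nothing touched the file yet
    with open(target, "a", encoding="utf-8") as fh:
        fh.write("y = 2\n")
    changed = needs_reindex(target, recorded)    # the size now differs
```

Comparing both fields avoids reading or hashing file contents and still catches edits that leave the timestamp unchanged but alter the size.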

Install

Install the CLI as a uv tool:

uv tool install code-symbol-index

Or install from a local checkout:

uv tool install .

For local development with editable imports and tests:

uv venv .venv
uv pip install --python .venv/bin/python -e '.[dev]'

Then verify the install:

code-symbol-index --version

Quick Start

Build or refresh the index:

code-symbol-index index --root /path/to/repo

Check whether the index is ready for queries:

code-symbol-index status --root /path/to/repo
code-symbol-index status --root /path/to/repo --check

Search symbols:

code-symbol-index search Tool --root /path/to/repo --limit 20
code-symbol-index search Tool Agent Runner --root /path/to/repo
code-symbol-index search Tool --root /path/to/repo --kind class,function --path src --exact-only

Inspect one symbol:

code-symbol-index inspect Tool --root /path/to/repo
code-symbol-index inspect Tool.method_name --root /path/to/repo
code-symbol-index inspect Tool --root /path/to/repo --anchors

Outline a file:

code-symbol-index outline src/app.py --root /path/to/repo
code-symbol-index outline src/app.py --root /path/to/repo --symbol Tool

CLI

code-symbol-index languages
code-symbol-index --version
code-symbol-index version
code-symbol-index index --root /path/to/repo
code-symbol-index status --root /path/to/repo
code-symbol-index status --root /path/to/repo --check
code-symbol-index status --root /path/to/repo --check --max-pending-files 20
code-symbol-index search Tool --root /path/to/repo
code-symbol-index search Tool Agent Runner --root /path/to/repo
code-symbol-index search Tool --root /path/to/repo --kind class,function --path src --exact-only
code-symbol-index inspect Tool --root /path/to/repo
code-symbol-index inspect Tool --root /path/to/repo --path src --exact-only
code-symbol-index inspect Tool --root /path/to/repo --anchors
code-symbol-index outline src/app.py --root /path/to/repo
code-symbol-index outline src/app.py --root /path/to/repo --symbol Tool
code-symbol-index refs Tool --root /path/to/repo --limit 20 --offset 0
code-symbol-index impls Greeter --root /path/to/repo --kind trait --limit 20 --offset 0
code-symbol-index clean --root /path/to/repo
code-symbol-index install-skill

JSON is available for structured consumers:

code-symbol-index search Tool --root /path/to/repo --json
code-symbol-index inspect Tool --root /path/to/repo --json
code-symbol-index inspect Tool --root /path/to/repo --anchors --json
code-symbol-index outline src/app.py --root /path/to/repo --json
code-symbol-index refs Tool --root /path/to/repo --json
code-symbol-index impls Tool --root /path/to/repo --json
code-symbol-index status --root /path/to/repo --json
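
For structured consumers that drive the CLI from Python, the commands above can be assembled as an argument list. This sketch uses only flags shown in this README; build_search_argv and run_json are hypothetical helpers, and it assumes the JSON payload is written to stdout.

```python
import json
import subprocess
from typing import List, Optional, Sequence


def build_search_argv(
    queries: Sequence[str],
    root: str,
    *,
    kinds: Optional[Sequence[str]] = None,
    path: Optional[str] = None,
    exact_only: bool = False,
    limit: Optional[int] = None,
) -> List[str]:
    """Assemble a `code-symbol-index search ... --json` command line."""
    argv = ["code-symbol-index", "search", *queries, "--root", root]
    if kinds:
        argv += ["--kind", ",".join(kinds)]  # comma-separated kinds
    if path:
        argv += ["--path", path]
    if exact_only:
        argv.append("--exact-only")
    if limit is not None:
        argv += ["--limit", str(limit)]
    argv.append("--json")
    return argv


def run_json(argv: List[str]):
    """Run the CLI and decode stdout as JSON (assumes the tool is on PATH)."""
    done = subprocess.run(argv, check=True, capture_output=True, text=True)
    return json.loads(done.stdout)


argv = build_search_argv(
    ["Tool"], "/path/to/repo",
    kinds=["class", "function"], path="src", exact_only=True,
)
```

Passing a list to subprocess.run avoids shell quoting problems when symbol names or paths contain spaces.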

Output Formats

Search returns candidates only, never source:

query: Tool
count: 2
limit: 20
has_more: false

symbols:
  - id: python:class:Tool:nanocode.py:1284:1330
    name: Tool
    kind: class
    file: nanocode.py
    range: 1284:1330
    signature: class Tool:
    score: exact
    language: python
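
The id values in these examples appear to follow the layout language:kind:name:file:start:end. That layout is inferred from the sample output, not from a documented contract, so the parser below is a sketch under that assumption. SymbolId and parse_symbol_id are hypothetical names.

```python
from typing import NamedTuple


class SymbolId(NamedTuple):
    language: str
    kind: str
    name: str
    file: str
    start: int
    end: int


def parse_symbol_id(symbol_id: str) -> SymbolId:
    """Split an id, assuming the layout language:kind:name:file:start:end."""
    # Peel the two leading fields and the two trailing numbers first, so that
    # only the name/file boundary in the middle is left to decide.
    language, kind, rest = symbol_id.split(":", 2)
    name_and_file, start, end = rest.rsplit(":", 2)
    # Assumption: the name itself contains no ":"; names that do would be
    # ambiguous under this layout.
    name, file = name_and_file.split(":", 1)
    return SymbolId(language, kind, name, file, int(start), int(end))


parsed_id = parse_symbol_id("python:class:Tool:nanocode.py:1284:1330")
```

When the name, kind, file, and range fields are available separately in the output, reading them directly is more robust than splitting the id.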

For multiple search queries:

queries:
  - Tool
  - Agent
count: 2
limit: 20
has_more: false

symbols:
  - id: python:class:Tool:nanocode.py:1284:1330
    name: Tool
    kind: class
    file: nanocode.py
    range: 1284:1330
    signature: class Tool:
    score: exact
    matched_query: Tool

Inspect returns bounded source with stable 0-based line ranges:

symbol:
  id: python:function:foo:src/app.py:120:123
  name: foo
  kind: function
  file: src/app.py
  range: 120:123
  signature: def foo():
summary:
  imports: 2
  members: 0
  callers: 1
  callees: 1
  references: 3
  implementors: 0
imports:
  - range: 0:1
    statement: import os
source:
  status: full
  range: 120:123
  shown_range: 120:123
  total_lines: 3

  120 |def foo():
  121 |    if ok:
  122 |        return 1

Use inspect --anchors or inspect_text(..., anchors=True) to emit hashline source anchors from the current file contents:

source:
  status: full
  range: 120:123
  shown_range: 120:123
  total_lines: 3
  note: Use line:hash as edit anchor; code starts after |

120:a1b2c3d4|def foo():
121:d4e5f6a7|    if ok:
122:f6a7b8c9|        return 1

JSON inspect with anchors=True includes source_anchor with path, start_line, end_line, start_anchor, end_anchor, and lines[{line, hash, text}]. Hashes are computed from current file contents at output time.
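
The text form of an anchored line is line:hash|code, as the note in the output states. A small sketch for splitting such lines follows; AnchoredLine and parse_anchored_line are hypothetical names, and the hash values are copied from the sample output, not computed.

```python
from typing import NamedTuple


class AnchoredLine(NamedTuple):
    line: int    # 0-based line number
    digest: str  # the hash part of the anchor
    text: str    # source code after the "|"


def parse_anchored_line(raw: str) -> AnchoredLine:
    """Split one `line:hash|code` row into its three parts."""
    # Split on the first "|" only, so a "|" inside the code is preserved.
    anchor, text = raw.split("|", 1)
    line, digest = anchor.split(":", 1)
    return AnchoredLine(int(line), digest, text)


sample = """120:a1b2c3d4|def foo():
121:d4e5f6a7|    if ok:
122:f6a7b8c9|        return 1"""

anchored_rows = [parse_anchored_line(row) for row in sample.splitlines()]
```

Because hashes are computed from current file contents at output time, an editing tool can compare the anchor it holds against a fresh inspect before applying a change.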

Outline returns file structure without source or ids:

file: nanocode.py
range: 0:9060
count: 142

outline:
1284:1330 | class Tool:
1289:1292 |     def cli_args(cls, args):
1312:1325 |     def tool_schema(cls):
9023:9060 | def main(argv=None):

Status is fast by default and does not scan the directory:

index:
  status: ready
  root: /path/to/repo
  files: 128
  symbols: 4820
  languages: python, typescript
  language_breakdown:
    - python: 80 files (62.5%)
    - typescript: 48 files (37.5%)
  pending_changes: unknown

Use --check to scan the directory and compute staleness:

index:
  status: stale
  root: /path/to/repo
  files: 128
  symbols: 4820
  pending_changes: 3
  pending_files:
    - src/app.py
    - src/new_feature.py
  reason: files changed after last index update

pending_files is bounded by --max-pending-files and is only computed with --check.

Query Rules

inspect accepts only symbol-like input:

  • ClassName
  • function_name
  • ClassName.method_name
  • symbol_prefix

It rejects natural language, file paths, and directory paths. Use outline for file paths.

search accepts A|B|C as a non-regex OR shorthand. --kind accepts one kind or comma-separated kinds, --path filters to a file or directory, and --exact-only disables prefix/fuzzy matches. The same filters are available in the Python API as kind=, path=, and exact_only=True.

For Python files, the index records top-level constants, top-level variables, and top-level dictionary keys as symbols. Dictionary keys use kind=dict_key, with the parent assignment as their container.

All line ranges are start:end, 0-based, with end exclusive.
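
A 0-based range with an exclusive end maps directly onto Python list slicing. The sketch below shows that mapping and the conversion to the 1-based, inclusive numbering most editors display. The helper names are hypothetical.

```python
from typing import List, Tuple


def parse_range(text: str) -> Tuple[int, int]:
    """Parse "start:end" into integers."""
    start, end = text.split(":")
    return int(start), int(end)


def slice_lines(lines: List[str], text_range: str) -> List[str]:
    """Return the lines covered by a 0-based, end-exclusive range."""
    start, end = parse_range(text_range)
    return lines[start:end]


def to_editor_lines(text_range: str) -> Tuple[int, int]:
    """Convert to 1-based, inclusive first and last line numbers."""
    start, end = parse_range(text_range)
    return start + 1, end


source = ["import os", "", "def foo():", "    if ok:", "        return 1", ""]
body = slice_lines(source, "2:5")
editor_span = to_editor_lines("120:123")
```

With this convention the number of lines in a range is simply end - start, which matches total_lines: 3 for range 120:123 in the inspect example.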

Python API

import code_symbol_index as csi

csi.index("/path/to/repo")
csi.update(["src/app.py", "src/lib.py"], root="/path/to/repo")

print(csi.status_text("/path/to/repo"))
print(csi.search_text("Tool", root="/path/to/repo"))
print(csi.search_text("Tool|Agent", root="/path/to/repo", kind="class,function", path="src"))
print(csi.inspect_text("Tool", root="/path/to/repo"))
print(csi.inspect_text("Tool", root="/path/to/repo", path="src", exact_only=True))
print(csi.inspect_text("Tool", root="/path/to/repo", anchors=True))
print(csi.outline_text("src/app.py", root="/path/to/repo"))
print(csi.outline_text("src/app.py", root="/path/to/repo", symbol="Tool"))

symbols = csi.search("Tool", root="/path/to/repo", format="object")
symbols = csi.search(["Tool", "Agent", "Runner"], root="/path/to/repo")
search_payload = csi.search("Tool", root="/path/to/repo", format="json")
search_text = csi.search("Tool", root="/path/to/repo", format="text")
inspection = csi.inspect("Tool", root="/path/to/repo")
anchored = csi.inspect("Tool", root="/path/to/repo", format="json", anchors=True)
references = csi.refs("Tool", root="/path/to/repo", limit=20, offset=0)

For repeated queries, reuse a repository handle:

repo = csi.Repository("/path/to/repo")
repo.update(["src/app.py"])
print(repo.search_text("Tool"))
print(repo.inspect_text("Tool"))
print(repo.outline_text("src/app.py"))

Refresh and update accept an optional progress callback:

def on_progress(event, *, done=0, total=0, path=None):
    print(event, done, total, path)

repo = csi.Repository("/path/to/repo", progress=on_progress)
repo.refresh()
repo.update(["src/app.py"], progress=on_progress)

Stable progress events are scan, start, file, and finish.
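
A callback can also be a callable object, which is convenient when the events should be collected instead of printed. The sketch below follows the callback signature shown above; the order and payload of the simulated events are assumptions, and ProgressLog is a hypothetical name.

```python
from typing import List, Optional


class ProgressLog:
    """Collects progress events; an instance can be passed as the callback."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    def __call__(self, event: str, *, done: int = 0, total: int = 0,
                 path: Optional[str] = None) -> None:
        if event == "file" and total:
            percent = 100 * done // total
            self.lines.append(f"[{percent:3d}%] {path}")
        else:
            # scan, start, finish: record the event name only
            self.lines.append(event)


# Simulated event sequence; a real run supplies its own values.
log = ProgressLog()
log("scan")
log("start", total=2)
log("file", done=1, total=2, path="src/app.py")
log("file", done=2, total=2, path="src/lib.py")
log("finish", done=2, total=2)
```

Keeping the defaults done=0, total=0, and path=None means the callback tolerates events that omit some keyword arguments.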

To refresh the index at application startup without blocking it:

thread = csi.refresh_async("/path/to/repo", progress=on_progress)

refresh_async creates its own Repository inside the background thread. Do not share a Repository instance across threads.
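
One way to follow the no-sharing rule in a multi-threaded application is to keep one handle per thread. The sketch below uses threading.local for that; per_thread is a hypothetical helper, and FakeRepository is a stand-in so the example runs without an index.

```python
import threading
from typing import Callable, Dict, TypeVar

T = TypeVar("T")
_local = threading.local()


def per_thread(factory: Callable[[], T]) -> T:
    """Return the calling thread's handle, creating it on first use."""
    if not hasattr(_local, "handle"):
        _local.handle = factory()
    return _local.handle


class FakeRepository:
    """Stand-in for a real repository handle (illustration only)."""


seen: Dict[str, FakeRepository] = {}


def worker(name: str) -> None:
    first = per_thread(FakeRepository)
    second = per_thread(FakeRepository)
    assert first is second  # the handle is reused within one thread
    seen[name] = first


threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

In real code the factory would be something like lambda: csi.Repository("/path/to/repo"), so each worker thread builds and keeps its own instance.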

Queries require an existing index. Run code-symbol-index index or code_symbol_index.index() first. Queries do not sync automatically unless called with --sync or sync=True. After external file edits, call code_symbol_index.update(paths, root=...) or Repository.update(paths) to refresh only those files; deleted or newly ignored paths are removed from the index.

Install the Codex skill:

code-symbol-index install-skill
code-symbol-index install-skill --codex-home ~/.codex --force

The command writes SKILL.md to $CODEX_HOME/skills/code-symbol-index/, or ~/.codex/skills/code-symbol-index/ when CODEX_HOME is not set.
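
The destination rule above can be expressed as a short path computation. This is a sketch of the documented rule only; skill_path is a hypothetical helper, and giving an explicit codex_home priority over the environment variable is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def skill_path(codex_home: Optional[str] = None,
               env: Mapping[str, str] = os.environ) -> Path:
    """Compute where SKILL.md would be written under the documented rule."""
    # Assumed precedence: explicit argument, then CODEX_HOME, then ~/.codex.
    base = codex_home or env.get("CODEX_HOME") or str(Path.home() / ".codex")
    return Path(base).expanduser() / "skills" / "code-symbol-index" / "SKILL.md"


from_env = skill_path(env={"CODEX_HOME": "/opt/codex"})
fallback = skill_path(env={})
```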

Top-level query APIs accept format="object" | "text" | "json":

  • object returns Python dataclasses/lists and is the default.
  • text returns the same readable format as the *_text helpers.
  • json returns JSON-safe Python dict/list data.

search accepts one query, A|B|C, or a list of symbol names/prefixes. Multiple queries are OR-ed, are not regexes, and share one total limit. Search text and JSON formats include has_more when more matches exist beyond limit.
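
The OR semantics and shared limit described above can be illustrated over a plain list of names. This is a toy model, not the library's matching or ranking logic: it covers only exact and prefix matches, keeps input order, and normalize_queries and search_names are hypothetical names.

```python
from typing import Dict, List, Sequence, Union


def normalize_queries(query: Union[str, Sequence[str]]) -> List[str]:
    """Turn "A|B|C" or a list into a list of literal (non-regex) queries."""
    parts = query.split("|") if isinstance(query, str) else list(query)
    return [part.strip() for part in parts if part.strip()]


def search_names(names: Sequence[str], query: Union[str, Sequence[str]], *,
                 limit: int = 20, exact_only: bool = False) -> Dict[str, object]:
    """Match names against OR-ed queries under one shared limit."""
    queries = normalize_queries(query)
    hits: List[Dict[str, str]] = []
    for name in names:
        for q in queries:
            if name == q or (not exact_only and name.startswith(q)):
                hits.append({"name": name, "matched_query": q})
                break  # OR semantics: one matching query is enough
    shown = hits[:limit]
    return {"symbols": shown, "count": len(shown), "has_more": len(hits) > limit}


names = ["Tool", "ToolRunner", "Agent", "Runner"]
page = search_names(names, "Tool|Agent", limit=2)
exact = search_names(names, ["Tool", "Agent"], exact_only=True)
```

Here page holds two symbols with has_more set, because a third match (Agent) falls beyond the shared limit, while exact returns only Tool and Agent.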

Development

make install
make check
make smoke
make clean

Python API List

Index lifecycle:

  • index(root=".", *, language=None, progress=None) -> Repository
  • update(paths, *, root=".", language=None, progress=None) -> Repository
  • refresh_async(root=".", *, language=None, db_path=None, progress=None, daemon=True) -> threading.Thread
  • install_skill(*, target="codex", codex_home=None, force=False) -> Path
  • clean(root=".") -> None
  • status(root=".", *, language=None, db_path=None, check=False, max_pending_files=50, format="object") -> IndexStatus | str | dict
  • status_text(root=".", *, language=None, db_path=None, check=False, max_pending_files=50) -> str

Queries:

  • search(query: str | list[str], *, root=".", kind=None, language=None, path=None, exact_only=False, limit=20, sync=False, format="object") -> list[Symbol] | str | dict
  • search_text(query: str | list[str], *, root=".", kind=None, language=None, path=None, exact_only=False, limit=20, sync=False) -> str
  • inspect(query, *, root=".", kind=None, language=None, path=None, exact_only=False, limit=20, anchors=False, sync=False, format="object", ...) -> Inspection | str | dict
  • inspect_text(query, *, root=".", kind=None, language=None, path=None, exact_only=False, anchors=False, sync=False, ...) -> str
  • refs(query, *, root=".", kind=None, language=None, path=None, exact_only=False, limit=20, offset=0, sync=False, format="object") -> Page | str | dict
  • impls(query, *, root=".", kind=None, language=None, path=None, exact_only=False, limit=20, offset=0, sync=False, format="object") -> Page | str | dict
  • outline(path, *, root=".", symbol=None, max_symbols=200, sync=False, format="object") -> Page | str | dict
  • outline_text(path, *, root=".", symbol=None, max_symbols=200, sync=False) -> str

Repository handle:

  • Repository(root=".", *, languages=None, include=None, exclude=None, db_path=None)
  • Repository.refresh(*, progress=None) -> Repository
  • Repository.update(paths=None, *, progress=None) -> Repository
  • Repository.search(...), search_text(...)
  • Repository.inspect(...), inspect_text(...)
  • Repository.refs(...), impls(...)
  • Repository.outline(...), outline_text(...)
  • Repository.clean() -> None

Data classes:

  • Symbol
  • Reference
  • Page
  • Inspection
  • InspectOptions
  • IndexStatus
