
Tree-sitter backed symbol search and inspection for codebases.


code-symbol-index

Tree-sitter backed symbol index and code navigation for tools that need fast, bounded, LLM-friendly answers over a local codebase.

It provides a small Python API and a single CLI command:

code-symbol-index

The default CLI output is readable text. Add --json on query commands when a machine-readable response is better.

Features

  • Disk-backed SQLite index at .code-symbol-index/index.sqlite
  • Incremental indexing by mtime_ns + size
  • .gitignore aware file discovery
  • UTF-8 text file filtering
  • Mainstream language parsing through tree-sitter-language-pack
  • Symbol search, inspect, references, implementors, file outline, and index status
  • Bounded outputs designed for coding LLM context windows

This is syntactic code navigation, not a language server. It does not provide type-aware rename safety or full semantic call graph accuracy.
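The "incremental indexing by mtime_ns + size" feature can be pictured with a small standalone sketch. This is not the package's implementation: the function names and the snapshot dictionary are hypothetical, and .gitignore handling and UTF-8 filtering are left out. It only shows how a (mtime_ns, size) pair per file is enough to decide which files need re-parsing.

```python
import os
from pathlib import Path


def file_signature(path):
    """Return the (mtime_ns, size) pair used here as a cheap change fingerprint."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def plan_incremental_update(root, previous):
    """Compare files under root against a previous {relative_path: signature} map.

    Returns (changed, removed, current): changed lists new or modified files,
    removed lists files that no longer exist, current is the new snapshot.
    """
    root = Path(root)
    current = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            current[path.relative_to(root).as_posix()] = file_signature(path)
    changed = [p for p, sig in current.items() if previous.get(p) != sig]
    removed = sorted(p for p in previous if p not in current)
    return changed, removed, current
```

Only the files in changed would need to be parsed again; everything else can keep its existing index rows.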

Install

Install the CLI as a uv tool:

uv tool install code-symbol-index

Or install from a local checkout:

uv tool install .

For local development with editable imports and tests:

uv venv .venv
uv pip install --python .venv/bin/python -e '.[dev]'

Then:

code-symbol-index --version

Quick Start

Build or refresh the index:

code-symbol-index index --root /path/to/repo

Check whether the index exists and, with --check, whether it is stale:

code-symbol-index status --root /path/to/repo
code-symbol-index status --root /path/to/repo --check

Search symbols:

code-symbol-index search Tool --root /path/to/repo --limit 20
code-symbol-index search Tool Agent Runner --root /path/to/repo
code-symbol-index search Tool --root /path/to/repo --kind class,function --path src --exact-only

Inspect one symbol:

code-symbol-index inspect Tool --root /path/to/repo
code-symbol-index inspect Tool.method_name --root /path/to/repo
code-symbol-index inspect Tool --root /path/to/repo --anchors

Outline a file:

code-symbol-index outline src/app.py --root /path/to/repo
code-symbol-index outline src/app.py --root /path/to/repo --symbol Tool

CLI

code-symbol-index languages
code-symbol-index --version
code-symbol-index version
code-symbol-index index --root /path/to/repo
code-symbol-index update src/app.py src/lib.py --root /path/to/repo
code-symbol-index status --root /path/to/repo
code-symbol-index status --root /path/to/repo --check
code-symbol-index status --root /path/to/repo --check --max-pending-files 20
code-symbol-index search Tool --root /path/to/repo
code-symbol-index search Tool Agent Runner --root /path/to/repo
code-symbol-index search Tool --root /path/to/repo --kind class,function --path src --exact-only
code-symbol-index inspect Tool --root /path/to/repo
code-symbol-index inspect Tool --root /path/to/repo --path src --exact-only
code-symbol-index inspect Tool --root /path/to/repo --anchors
code-symbol-index outline src/app.py --root /path/to/repo
code-symbol-index outline src/app.py --root /path/to/repo --symbol Tool
code-symbol-index refs Tool --root /path/to/repo --limit 20 --offset 0
code-symbol-index impls Greeter --root /path/to/repo --kind trait --limit 20 --offset 0
code-symbol-index clean --root /path/to/repo
code-symbol-index install-skill

JSON is available for structured consumers:

code-symbol-index search Tool --root /path/to/repo --json
code-symbol-index inspect Tool --root /path/to/repo --json
code-symbol-index inspect Tool --root /path/to/repo --anchors --json
code-symbol-index outline src/app.py --root /path/to/repo --json
code-symbol-index refs Tool --root /path/to/repo --json
code-symbol-index impls Tool --root /path/to/repo --json
code-symbol-index status --root /path/to/repo --json
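For structured consumers written in Python that prefer to shell out to the CLI, a thin wrapper might look like the sketch below. The helper names build_command and run_json are hypothetical. The sketch assumes the CLI is on PATH, exits non-zero on failure, and prints a single JSON document on stdout; the shape of the decoded payload is not specified here.

```python
import json
import subprocess


def build_command(subcommand, *args, root=".", as_json=True):
    """Assemble an argv list for a code-symbol-index query command."""
    cmd = ["code-symbol-index", subcommand, *args, "--root", str(root)]
    if as_json:
        cmd.append("--json")
    return cmd


def run_json(subcommand, *args, root="."):
    """Run the CLI and decode stdout as JSON (assumes the CLI is installed)."""
    cmd = build_command(subcommand, *args, root=root)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

For example, run_json("search", "Tool", root="/path/to/repo") would execute the first command listed above.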

Output Formats

Search returns candidates only, never source:

query: Tool
count: 2
limit: 20
has_more: false

symbols:
  - id: python:class:Tool:nanocode.py:1284:1330
    name: Tool
    kind: class
    file: nanocode.py
    range: 1284:1330
    signature: class Tool:
    score: exact
    language: python

For multiple search queries:

queries:
  - Tool
  - Agent
count: 2
limit: 20
has_more: false

symbols:
  - id: python:class:Tool:nanocode.py:1284:1330
    name: Tool
    kind: class
    file: nanocode.py
    range: 1284:1330
    signature: class Tool:
    score: exact
    matched_query: Tool
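The id values in the search output above look like language:kind:name:file:start:end. That layout is inferred from the examples, not from a documented contract, so treat the parser below as a sketch. SymbolId and parse_symbol_id are hypothetical names, and the code assumes the file path itself contains no colon.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymbolId:
    language: str
    kind: str
    name: str
    file: str
    start: int
    end: int


def parse_symbol_id(symbol_id):
    """Split an id such as 'python:class:Tool:nanocode.py:1284:1330'.

    Assumed layout: language:kind:name:file:start:end. The name may contain
    colons (for example a C++ qualified name); the file path may not.
    """
    language, kind, rest = symbol_id.split(":", 2)
    rest, start, end = rest.rsplit(":", 2)
    name, file = rest.rsplit(":", 1)
    return SymbolId(language, kind, name, file, int(start), int(end))
```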

Inspect returns bounded source with stable 0-based line ranges:

symbol:
  id: python:function:foo:src/app.py:120:123
  name: foo
  kind: function
  file: src/app.py
  range: 120:123
  signature: def foo():
summary:
  imports: 2
  members: 0
  callers: 1
  callees: 1
  references: 3
  implementors: 0
imports:
  - range: 0:1
    statement: import os
source:
  status: full
  range: 120:123
  shown_range: 120:123
  total_lines: 3

  120 |def foo():
  121 |    if ok:
  122 |        return 1

Use inspect --anchors or inspect_text(..., anchors=True) to emit hashline source anchors from the current file contents:

source:
  status: full
  range: 120:123
  shown_range: 120:123
  total_lines: 3
  note: Use line:hash as edit anchor; code starts after |

120:a1b2c3d4|def foo():
121:d4e5f6a7|    if ok:
122:f6a7b8c9|        return 1

JSON inspect with anchors=True includes source_anchor with path, start_line, end_line, start_anchor, end_anchor, and lines[{line, hash, text}]. Hashes are computed from current file contents at output time.
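The idea behind line:hash anchors can be sketched in a few lines: hash each line of the current file, emit line:hash|text, and later reject an edit if the hash at that line no longer matches. The hash function used by the package is not documented here; the sketch assumes the first 8 hex digits of SHA-256 over the line text, and all function names are hypothetical.

```python
import hashlib


def line_hash(text, length=8):
    """Short content hash for one line (assumption: truncated SHA-256)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:length]


def format_anchored(lines, start):
    """Render lines as 'line:hash|text', numbering from the 0-based start."""
    return [f"{start + i}:{line_hash(t)}|{t}" for i, t in enumerate(lines)]


def anchor_still_valid(anchor, file_lines):
    """Check a 'line:hash' anchor against the current file contents."""
    line_str, _, expected = anchor.partition(":")
    index = int(line_str)
    return 0 <= index < len(file_lines) and line_hash(file_lines[index]) == expected
```

An editing tool would call anchor_still_valid just before applying a change, so that an edit aimed at stale line numbers fails instead of silently landing on the wrong line.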

Outline returns file structure without source or ids:

file: nanocode.py
range: 0:9060
count: 142

outline:
1284:1330 | class Tool:
1289:1292 |     def cli_args(cls, args):
1312:1325 |     def tool_schema(cls):
9023:9060 | def main(argv=None):
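A consumer that wants structure from the text outline can recover it from the ranges alone. The sketch below assumes each outline line has the form "start:end | text" as in the sample above; parse_outline_line and children_of are hypothetical helpers, not part of the package.

```python
def parse_outline_line(line):
    """Parse 'start:end | text' into a dict with range, indent, and text."""
    range_part, _, text = line.partition(" | ")
    start, end = (int(n) for n in range_part.strip().split(":"))
    stripped = text.lstrip(" ")
    return {
        "start": start,
        "end": end,
        "indent": len(text) - len(stripped),
        "text": stripped,
    }


def children_of(entries, parent):
    """Return entries whose range lies inside the parent's range."""
    return [
        e for e in entries
        if e is not parent
        and parent["start"] <= e["start"]
        and e["end"] <= parent["end"]
    ]
```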

Status is fast by default and does not scan the directory:

index:
  status: ready
  root: /path/to/repo
  files: 128
  symbols: 4820
  languages: python, typescript
  language_breakdown:
    - python: 80 files (62.5%)
    - typescript: 48 files (37.5%)
  pending_changes: unknown

Use --check to scan the directory and compute staleness:

index:
  status: stale
  root: /path/to/repo
  files: 128
  symbols: 4820
  pending_changes: 3
  pending_files:
    - src/app.py
    - src/new_feature.py
  reason: files changed after last index update

pending_files is bounded by --max-pending-files and is only computed with --check.

Query Rules

inspect accepts only symbol-like input:

  • ClassName
  • function_name
  • ClassName.method_name
  • symbol_prefix

It rejects natural language, file paths, and directory paths. Use outline for file paths.

search accepts A|B|C as a non-regex OR shorthand. --kind accepts one kind or comma-separated kinds, --path filters to a file or directory, and --exact-only disables prefix/fuzzy matches. The same filters are available in the Python API as kind=, path=, and exact_only=True.

For Python files, the index treats top-level constants, top-level variables, and top-level dictionary keys as symbols. Dictionary keys use kind=dict_key, with the parent assignment as their container.

All line ranges are start:end, 0-based, with end exclusive.
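Because ranges are 0-based with an exclusive end, they map directly onto Python slices. The helpers below are a hypothetical sketch of that mapping, including the conversion to the 1-based inclusive numbers most editors display.

```python
def parse_range(text):
    """Parse 'start:end' into a (start, end) tuple of ints."""
    start_str, end_str = text.split(":")
    start, end = int(start_str), int(end_str)
    if start < 0 or end < start:
        raise ValueError(f"invalid range: {text!r}")
    return start, end


def slice_lines(source, range_text):
    """Return the lines covered by a 0-based, end-exclusive range."""
    start, end = parse_range(range_text)
    return source.splitlines()[start:end]


def to_editor_lines(range_text):
    """Convert to 1-based inclusive (first, last) line numbers.

    Only meaningful for non-empty ranges.
    """
    start, end = parse_range(range_text)
    return start + 1, end
```

With these definitions, the range 120:123 covers three lines, which an editor would show as lines 121 through 123.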

Python API

import code_symbol_index as csi

csi.index("/path/to/repo")
csi.update(["src/app.py", "src/lib.py"], root="/path/to/repo")

print(csi.status_text("/path/to/repo"))
print(csi.search_text("Tool", root="/path/to/repo"))
print(csi.search_text("Tool|Agent", root="/path/to/repo", kind="class,function", path="src"))
print(csi.inspect_text("Tool", root="/path/to/repo"))
print(csi.inspect_text("Tool", root="/path/to/repo", path="src", exact_only=True))
print(csi.inspect_text("Tool", root="/path/to/repo", anchors=True))
print(csi.outline_text("src/app.py", root="/path/to/repo"))
print(csi.outline_text("src/app.py", root="/path/to/repo", symbol="Tool"))

symbols = csi.search("Tool", root="/path/to/repo", format="object")
symbols = csi.search(["Tool", "Agent", "Runner"], root="/path/to/repo")
search_payload = csi.search("Tool", root="/path/to/repo", format="json")
search_text = csi.search("Tool", root="/path/to/repo", format="text")
inspection = csi.inspect("Tool", root="/path/to/repo")
anchored = csi.inspect("Tool", root="/path/to/repo", format="json", anchors=True)
references = csi.refs("Tool", root="/path/to/repo", limit=20, offset=0)

For repeated queries, reuse a repository handle:

repo = csi.Repository("/path/to/repo")
repo.update(["src/app.py"])
print(repo.search_text("Tool"))
print(repo.inspect_text("Tool"))
print(repo.outline_text("src/app.py"))

Refresh and update accept an optional progress callback:

def on_progress(event, *, done=0, total=0, path=None):
    print(event, done, total, path)

repo = csi.Repository("/path/to/repo")
repo.refresh(progress=on_progress)
repo.update(["src/app.py"], progress=on_progress)

Stable progress events are scan, start, file, and finish.
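A callback does not have to be a plain function. The sketch below is a hypothetical callable class that matches the on_progress signature shown above and keeps enough state to report a percentage. It assumes that done and total are meaningful on start and file events, which the text above does not spell out.

```python
class ProgressTracker:
    """Collects progress events; usable wherever a progress callback is accepted."""

    def __init__(self):
        self.events = []
        self.done = 0
        self.total = 0
        self.finished = False

    def __call__(self, event, *, done=0, total=0, path=None):
        self.events.append(event)
        if event in ("start", "file"):
            # Assumption: these events carry the running counters.
            self.done, self.total = done, total
        elif event == "finish":
            self.finished = True

    def percent(self):
        """Completion percentage, or 0.0 before a total is known."""
        return 100.0 * self.done / self.total if self.total else 0.0
```

An instance can be passed as progress=tracker and polled from a UI loop via tracker.percent().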

To refresh the index during application startup without blocking startup:

thread = csi.refresh_async("/path/to/repo", progress=on_progress)

refresh_async creates its own Repository inside the background thread. Do not share a Repository instance across threads.
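The one-handle-per-thread rule is the same pattern commonly used with SQLite connections in Python. The sketch below illustrates that general pattern with the standard sqlite3 module; it is an analogy, not a description of how Repository is implemented. The table name files and the helper name are hypothetical.

```python
import sqlite3
import threading


def count_rows_in_background(db_path, results):
    """Start a thread that opens its own connection, queries, and closes it."""

    def worker():
        # The connection is created and used only inside this thread.
        conn = sqlite3.connect(db_path)
        try:
            row = conn.execute("SELECT COUNT(*) FROM files").fetchone()
            results.append(row[0])
        finally:
            conn.close()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The caller keeps only the returned thread (to join it if needed) and never touches the worker's connection, which mirrors how refresh_async hands back a threading.Thread instead of a Repository.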

Queries require an existing index. Run code-symbol-index index or code_symbol_index.index() first. Queries do not sync automatically unless called with --sync or sync=True. After external file edits, call code_symbol_index.update(paths, root=...) or Repository.update(paths) to refresh only those files; deleted or newly ignored paths are removed from the index.

Install the Codex skill:

code-symbol-index install-skill
code-symbol-index install-skill --codex-home ~/.codex --force

The command writes SKILL.md to $CODEX_HOME/skills/code-symbol-index/, or ~/.codex/skills/code-symbol-index/ when CODEX_HOME is not set.
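The destination rule can be written out as a short sketch. The function name skill_path is hypothetical, and the precedence of an explicit --codex-home value over the CODEX_HOME environment variable is an assumption; the text above only states the environment variable and the ~/.codex fallback.

```python
import os
from pathlib import Path


def skill_path(codex_home=None, env=None):
    """Resolve where SKILL.md would be written.

    Assumed precedence: explicit codex_home, then $CODEX_HOME, then ~/.codex.
    """
    env = os.environ if env is None else env
    base = codex_home or env.get("CODEX_HOME") or "~/.codex"
    return Path(base).expanduser() / "skills" / "code-symbol-index" / "SKILL.md"
```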

Top-level query APIs accept format="object" | "text" | "json":

  • object returns Python dataclasses/lists and is the default.
  • text returns the same readable format as the *_text helpers.
  • json returns JSON-safe Python dict/list data.

search accepts one query, A|B|C, or a list of symbol names/prefixes. Multiple queries are OR-ed, are not regexes, and share one total limit. Search text and JSON formats include has_more when more matches exist beyond limit.
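The multi-query rules above (literal terms, OR-ed, one shared limit, has_more) can be illustrated with a toy matcher over a plain list of names. This is a sketch of the described behavior, not the package's search code: normalize_queries and search_names are hypothetical, matching is reduced to a simple prefix test, and the meaning of count as "number of returned symbols" is an assumption.

```python
def normalize_queries(query):
    """Turn 'A|B|C' or a list into a list of literal, de-duplicated terms."""
    parts = query.split("|") if isinstance(query, str) else list(query)
    seen, terms = set(), []
    for part in parts:
        term = part.strip()
        if term and term not in seen:
            seen.add(term)
            terms.append(term)
    return terms


def search_names(names, query, limit=20):
    """OR the terms together as plain prefixes and apply one shared limit."""
    terms = normalize_queries(query)
    matches = []
    for name in names:
        # str.startswith is a literal comparison, so no regex semantics apply.
        term = next((t for t in terms if name.startswith(t)), None)
        if term is not None:
            matches.append({"name": name, "matched_query": term})
    shown = matches[:limit]
    return {
        "queries": terms,
        "count": len(shown),
        "limit": limit,
        "has_more": len(matches) > limit,
        "symbols": shown,
    }
```

Because the limit applies to the combined result, a broad first term can crowd out later terms; has_more signals when that may have happened.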

Development

make install
make check
make smoke
make clean

Python API List

Index lifecycle:

  • index(root=".", *, language=None, progress=None) -> Repository
  • update(paths, *, root=".", language=None, progress=None) -> Repository (CLI: code-symbol-index update <paths...> --root <repo>)
  • refresh_async(root=".", *, language=None, db_path=None, progress=None, daemon=True) -> threading.Thread
  • install_skill(*, target="codex", codex_home=None, force=False) -> Path
  • clean(root=".") -> None
  • status(root=".", *, language=None, db_path=None, check=False, max_pending_files=50, format="object") -> IndexStatus | str | dict
  • status_text(root=".", *, language=None, db_path=None, check=False, max_pending_files=50) -> str

Queries:

  • search(query: str | list[str], *, root=".", kind=None, language=None, path=None, exact_only=False, limit=20, sync=False, format="object") -> list[Symbol] | str | dict
  • search_text(query: str | list[str], *, root=".", kind=None, language=None, path=None, exact_only=False, limit=20, sync=False) -> str
  • inspect(query, *, root=".", kind=None, language=None, path=None, exact_only=False, limit=20, anchors=False, sync=False, format="object", ...) -> Inspection | str | dict
  • inspect_text(query, *, root=".", kind=None, language=None, path=None, exact_only=False, anchors=False, sync=False, ...) -> str
  • refs(query, *, root=".", kind=None, language=None, path=None, exact_only=False, limit=20, offset=0, sync=False, format="object") -> Page | str | dict
  • impls(query, *, root=".", kind=None, language=None, path=None, exact_only=False, limit=20, offset=0, sync=False, format="object") -> Page | str | dict
  • outline(path, *, root=".", symbol=None, max_symbols=200, sync=False, format="object") -> Page | str | dict
  • outline_text(path, *, root=".", symbol=None, max_symbols=200, sync=False) -> str

Repository handle:

  • Repository(root=".", *, languages=None, include=None, exclude=None, db_path=None)
  • Repository.refresh(*, progress=None) -> Repository
  • Repository.update(paths=None, *, progress=None) -> Repository
  • Repository.search(...), search_text(...)
  • Repository.inspect(...), inspect_text(...)
  • Repository.refs(...), impls(...)
  • Repository.outline(...), outline_text(...)
  • Repository.clean() -> None

Data classes:

  • Symbol
  • Reference
  • Page
  • Inspection
  • InspectOptions
  • IndexStatus
