
MCP code-intelligence server for AI agents — beats CodeGraph on 6-repo head-to-head benchmark median. 63 MCP tools, 13 curated skills, TOON output, 100% local.


🌳 Tree-sitter Analyzer

English | 日本語 | 简体中文

The MCP code-intelligence server for AI agents — fewer tokens, fewer tool calls, 100 % local. Pre-indexed AST cache + 63 MCP tools + 13 curated agent skills + TOON-compressed output. Beats CodeGraph on 6-repo head-to-head median (−11 % cost vs CodeGraph's −4 %), with a strict CLI superset. Now with BM25-ranked symbol search across all 63 tools — results sorted by relevance, not file path.



Get Started

One-line install for Claude Code:

claude mcp add tree-sitter-analyzer \
  --env TREE_SITTER_PROJECT_ROOT="$PWD" \
  -- uvx --from "tree-sitter-analyzer[mcp]" tree-sitter-analyzer-mcp

Restart your agent, then say: "Set the project root to my repo and run codegraph_status."

Other agents (Cursor, Copilot, Cline, Continue, Claude Desktop, Roo Code) →


Why Tree-sitter Analyzer

  • Token-efficient by default. Every MCP response uses TOON — a tabular JSON variant that cuts payload by ~50-70 % vs raw JSON.
  • Verdict envelopes. Every response carries verdict: SAFE | CAUTION | UNSAFE | INFO | WARN | ERROR | NOT_FOUND, so orchestrators branch on outcomes without re-prompting.
  • Project health grading (A–F). No other open-source tool grades your whole project on size / complexity / coverage / duplication / dependencies / structure / git-hotspots in one call.
  • 13 curated workflows (Skills). Pre-baked tool subsets for "find symbol", "trace call chain", "score health", "safe-to-edit before refactor", "PR review", etc.
  • 5 layers of safety. safe_to_edit + modification_guard + constraint DSL + change_impact + verdict envelopes — designed so agents know before they touch.
  • Beats the leading competitor (CodeGraph) on multiple head-to-head benchmarks. See below.
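
As a rough illustration of the verdict-envelope point above, an orchestrator could branch on the verdict field as sketched below. The verdict names come from the list above. The envelope layout (a dict with verdict, agent_summary, and data keys), the grouping of verdicts into stop / review / proceed, and the next_action helper are assumptions made for this sketch, not the server's documented policy.

```python
from typing import Any

# Verdict names are taken from the README. How they are grouped here is an
# assumption for illustration, not the server's documented behaviour.
BLOCKING = {"UNSAFE", "ERROR"}
NEEDS_REVIEW = {"CAUTION", "WARN", "NOT_FOUND"}


def next_action(envelope: dict[str, Any]) -> str:
    """Map a response envelope to a coarse orchestrator decision."""
    # A missing verdict is treated as an error so the agent never proceeds blindly.
    verdict = envelope.get("verdict", "ERROR")
    if verdict in BLOCKING:
        return "stop"
    if verdict in NEEDS_REVIEW:
        return "review"
    return "proceed"  # SAFE and INFO


# Hypothetical envelope; the field values are made up for the example.
example = {"verdict": "CAUTION", "agent_summary": "3 callers affected", "data": {}}
print(next_action(example))  # review
```

Because the decision depends only on a small closed set of strings, the orchestrator can act without another model call to interpret the response.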

Benchmark Results

Headless Claude Code (Haiku 4.5) asked one architecture question per repo. 3 arms: no-MCP / CodeGraph MCP / Tree-sitter Analyzer MCP. Single run per arm — indicative, not statistically settled.

| Codebase | Lang / files | Baseline | CodeGraph | TSA | Winner |
|---|---|---|---|---|---|
| Gin | Go / 99 | $0.164 | $0.094 (−43 %) | $0.080 (−51 %) | TSA |
| Alamofire | Swift / 98 | $0.201 | $0.219 (+9 %) | $0.147 (−27 %) | TSA |
| Excalidraw | TS / 603 | $0.204 | $0.179 (−12 %) | $0.212 (+4 %) | CodeGraph |
| Django | Py / 2 910 | $0.162 | $0.106 (−35 %) | $0.205 (+27 %) | CodeGraph |
| Tokio | Rust / 778 | $0.214 | $0.285 (+33 %) | $0.303 (+42 %) | both lose |
| OkHttp | Java / 596 | $0.169 | $0.200 (+18 %) | $0.178 (+5 %) | both lose |
| Median Δ vs baseline | | | −4 % | −11 % | TSA |

TSA wins outright on 2 of 6 repos, has the larger median cost saving (−11 % vs CodeGraph's −4 %), and matches CodeGraph's reported direction on every repo where indexer-class tools should help.

Why the median diverges from CodeGraph's published −35 % claim: we used Haiku for cost control; they used Opus + 4-run median. See docs/internal/CODEGRAPH_BENCHMARK_FINAL_2026-05-24.md for raw envelopes + reproducer scripts.

Post-benchmark improvements (2026-05-30): (1) BM25 pre-filter narrows 40k symbols to ~400 before cosine rerank — a 133× speedup in semantic search. (2) Min-max BM25 normalization: relevance_score now properly differentiates strong matches (1.0) from weak (0.0) across all search paths. (3) semantic().sort(by='confidence') now works end-to-end. These improvements were not in the benchmark run; repos with large symbol counts (Django, Excalidraw) should see improved token efficiency in re-runs.
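
The two search changes described above (min-max normalization and a BM25 pre-filter followed by a cosine rerank) can be sketched in a few lines. This is a minimal illustration, not the project's implementation: it assumes BM25 scores have already been computed and that a higher score means a better match, and the function names, symbol names, and vectors are all hypothetical.

```python
import math


def min_max_normalize(scores: list[float]) -> list[float]:
    """Rescale scores so the best becomes 1.0 and the weakest 0.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:  # all scores equal: avoid dividing by zero
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def prefilter_then_rerank(
    bm25: dict[str, float],
    vectors: dict[str, list[float]],
    query_vec: list[float],
    keep: int,
) -> list[str]:
    """Keep the `keep` best BM25 candidates, then order them by cosine similarity."""
    shortlist = sorted(bm25, key=bm25.get, reverse=True)[:keep]
    return sorted(shortlist, key=lambda name: cosine(vectors[name], query_vec), reverse=True)


# Made-up scores and embeddings for three symbols.
bm25 = {"parse_file": 7.2, "parse_tree": 5.1, "render": 0.4}
vectors = {"parse_file": [1.0, 0.0], "parse_tree": [0.6, 0.8], "render": [0.0, 1.0]}

print(min_max_normalize(list(bm25.values())))
print(prefilter_then_rerank(bm25, vectors, [0.6, 0.8], keep=2))
```

The cheap lexical pass bounds how many vectors the expensive similarity step has to touch, which is where a speedup of the kind described above would come from.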


Key Features

Pre-indexed code intelligence (CodeGraph parity + superset)

| Capability | TSA tool | Status |
|---|---|---|
| Symbol search (FTS5 + BM25 ranked) | codegraph_symbol_search | ahead: results sorted by relevance score, not file path |
| Go-to-def / find-refs / call hierarchy in one call | codegraph_navigate | PRIMARY entry point |
| Bulk-fetch N related symbols + relationship map | codegraph_explore | parity |
| Function-level blast radius + risk score | codegraph_impact | parity + risk score |
| Who-calls-X / what-X-calls | codegraph_callers / codegraph_callees | parity |
| Index health at-a-glance (+ edge count) | codegraph_status | ahead: reports total_edges for graph density signal |
| Pre-built call graph cache | codegraph_autoindex / codegraph_full_index / codegraph_incremental_sync | parity |
| Tests affected by a change (CLI) | --affected FILE... | parity |

Tree-sitter Analyzer exclusive

| Capability | TSA tool | Note |
|---|---|---|
| BM25-ranked symbol search | all search tools | relevance_score on every result (min-max normalized: best=1.0, weakest=0.0); sort(by='confidence') in DSL |
| Semantic search (133× faster) | codegraph_query semantic() | BM25 pre-filter narrows 40k symbols to ~400 before cosine rerank |
| Project A–F health grading | check_project_health | 7 dimensions (size/complexity/deps/coverage/duplication/structure/git-hotspot); no competitor offers this |
| TOON output | every tool, output_format: "toon" (default) | 50-70 % token saving |
| Verdict envelopes | every tool | SAFE/CAUTION/UNSAFE/INFO/WARN/ERROR/NOT_FOUND |
| Safe-to-edit gate | safe_to_edit + modification_guard | refuses high-risk edits before they happen |
| Architectural constraint DSL | check_constraints | "module A cannot import B" → enforced |
| Code health (file-level) | check_file_health | block/long-method/smell detection |
| Class hierarchy | codegraph_class_hierarchy | type-inheritance tree |
| Dependency matrix | codegraph_dependency_matrix | module-coupling matrix |
| Dead code | codegraph_dead_code | transitive unreachable analysis |
| Complexity heatmap | codegraph_complexity_heatmap | per-fn cyclomatic + project view |
| AST-structural clone detection | codegraph_similarity | beyond text similarity |
| Mermaid call-graph export | codegraph_visualize | paste-ready in docs |
| UML Mermaid export | codegraph_uml | class / package / component / sequence diagrams |
| PR review | codegraph_pr_review | AST-diff + semantic classify + blast radius |
| agent_summary | every response | next-step hint baked into the envelope |
| Synapse cross-file resolver | internal | import-aware, beats regex guessing |
| Temporal activation | symbol_lineage | per-symbol git-modification frequency |
| One-shot file orientation | smart_context | health + exports + deps + edit-risk in one call (replaces 3-4 calls) |
| Architectural decision journal | decision_journal | persists reasoning across sessions; no competitor exposes this |
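
To make the "module A cannot import B" rule from the table concrete, here is a minimal sketch of such a check for a single Python file. It uses Python's standard ast module rather than tree-sitter, and it does not reproduce the check_constraints DSL, whose syntax is not shown here. The forbidden_imports helper and the app.* module names are hypothetical.

```python
import ast


def forbidden_imports(source: str, forbidden_prefix: str) -> list[str]:
    """Return imported module names in `source` that fall under `forbidden_prefix`."""
    found: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        # Match the package itself or any of its submodules, but not lookalikes
        # such as "app.dbx".
        found.extend(
            n for n in names
            if n == forbidden_prefix or n.startswith(forbidden_prefix + ".")
        )
    return found


# Hypothetical rule: UI code must not import from the database layer.
ui_source = "import app.db.models\nfrom app.ui import widgets\n"
print(forbidden_imports(ui_source, "app.db"))  # ['app.db.models']
```

A real checker would also need to resolve relative imports and apply the rule per module pair across the whole project.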

Skills (13 curated workflows)

CodeGraph has zero skills. We ship 13 under .claude/skills/tsa-*/:

tsa-landing, tsa-find, tsa-graph, tsa-structure, tsa-deps, tsa-index, tsa-health-watch, tsa-edit-safety, tsa-edit-then-verify, tsa-constraints, tsa-pr-review, tsa-refactor-queue, tsa-temporal.

Each skill ships an allowed-tools subset + procedure recipe + decision-surface schema, so the agent doesn't have to triage 63 tools on every question.

258 CLI flags

Strict superset of CodeGraph's 15-command CLI. Highlights:

tree-sitter-analyzer --table full <file>          # method/signature/complexity table
tree-sitter-analyzer --partial-read --start-line N --end-line M <file>
tree-sitter-analyzer --project-health             # A-F grade across the project
tree-sitter-analyzer --callers <symbol>           # who-calls
tree-sitter-analyzer --codegraph-impact <fn>      # blast radius + risk
tree-sitter-analyzer --affected <file...>         # tests transitively affected
tree-sitter-analyzer --dead-code                  # transitive unreachable
tree-sitter-analyzer --check-constraints          # architectural rules
tree-sitter-analyzer --safe-to-edit <file>        # refuse if risky
tree-sitter-analyzer --uml class                  # Mermaid UML class diagram

See docs/CODEMAPS/cli.md for the full surface.


Quick Start

1. Install dependencies

# uv (required)
curl -LsSf https://astral.sh/uv/install.sh | sh        # macOS / Linux
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"  # Windows

# fd + ripgrep (required for search)
brew install fd ripgrep                                # macOS
winget install sharkdp.fd BurntSushi.ripgrep.MSVC      # Windows

2. Install Tree-sitter Analyzer

uv add "tree-sitter-analyzer[all,mcp]"

3. Hook it into your agent

See Supported Agents. Most clients want this MCP server entry:

{
  "mcpServers": {
    "tree-sitter-analyzer": {
      "command": "uvx",
      "args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
      "env": { "TREE_SITTER_PROJECT_ROOT": "/absolute/path/to/your/project" }
    }
  }
}

After restart: "Set the project root to my repo and call codegraph_status."


How It Works

Source code → tree-sitter parse → SQLite + FTS5 index (.ast-cache/index.db)
                                         ↓
        codegraph_navigate / codegraph_explore / codegraph_callers / ...
                                         ↓
                            TOON-compressed envelope
                            (verdict + agent_summary + data)
                                         ↓
                              MCP client / CLI consumer

The index is built lazily on the first query and refreshed on file change via a content-hash diff (codegraph_incremental_sync). All 63 tools read from the same .ast-cache/, so a query and its follow-up share work.
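
The content-hash diff mentioned above can be sketched as follows. This only illustrates the general technique: the choice of SHA-256, the in-memory dictionaries, and the diff_by_hash helper are assumptions, and the real codegraph_incremental_sync may store and compare hashes differently.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hash file contents; SHA-256 is an assumption, not the project's stated choice."""
    return hashlib.sha256(data).hexdigest()


def diff_by_hash(
    previous: dict[str, str], current_files: dict[str, bytes]
) -> tuple[list[str], list[str], dict[str, str]]:
    """Compare stored hashes with current contents.

    Returns (paths to re-index, paths to drop from the index, new hash table).
    """
    new_hashes = {path: content_hash(data) for path, data in current_files.items()}
    changed = sorted(p for p, h in new_hashes.items() if previous.get(p) != h)
    removed = sorted(p for p in previous if p not in new_hashes)
    return changed, removed, new_hashes


# First pass: everything is new. Second pass: only the edited file is re-indexed.
_, _, stored = diff_by_hash({}, {"a.py": b"x = 1\n", "b.py": b"y = 2\n"})
changed, removed, stored = diff_by_hash(stored, {"a.py": b"x = 1\n", "c.py": b"z = 3\n"})
print(changed, removed)  # ['c.py'] ['b.py']
```

Comparing hashes rather than modification times means a file that was touched but not changed is not parsed again.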


Supported Agents

📘 Claude Code (recommended)

claude mcp add tree-sitter-analyzer \
  --env TREE_SITTER_PROJECT_ROOT="$PWD" \
  -- uvx --from "tree-sitter-analyzer[mcp]" tree-sitter-analyzer-mcp

Verify: claude mcp list. The 13 tsa-* skills auto-discover from .claude/skills/.

📗 Claude Desktop

Edit claude_desktop_config.json (macOS: ~/Library/Application Support/Claude/, Windows: %APPDATA%\Claude\, Linux: ~/.config/Claude/):

{
  "mcpServers": {
    "tree-sitter-analyzer": {
      "command": "uvx",
      "args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
      "env": { "TREE_SITTER_PROJECT_ROOT": "/absolute/path/to/your/project" }
    }
  }
}

📙 GitHub Copilot (VS Code)

Create .vscode/mcp.json (note: servers, not mcpServers):

{
  "servers": {
    "tree-sitter-analyzer": {
      "type": "stdio",
      "command": "uvx",
      "args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
      "env": { "TREE_SITTER_PROJECT_ROOT": "${workspaceFolder}" }
    }
  }
}

🖱 Cursor / Cline / Continue / Roo Code

All read the same mcpServers schema as Claude Desktop. Cursor: Settings → MCP. Cline: MCP panel → Edit settings. Continue: ~/.continue/config.json under experimental.modelContextProtocolServers. Roo Code: MCP panel → Edit MCP Settings.

⚠️ TREE_SITTER_PROJECT_ROOT must be absolute. The server enforces a security boundary against escapes via SecurityBoundaryManager.
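
Since the project root must be absolute, one way to avoid a relative path slipping into a config file is to generate the entry programmatically. The sketch below builds the same mcpServers entry shown for Claude Desktop. The mcp_server_entry helper is hypothetical and not part of the package.

```python
import json
from pathlib import Path


def mcp_server_entry(project_root: str) -> dict:
    """Build the mcpServers entry with TREE_SITTER_PROJECT_ROOT made absolute."""
    # expanduser() handles "~"; resolve() turns a relative path into an absolute one.
    root = Path(project_root).expanduser().resolve()
    return {
        "mcpServers": {
            "tree-sitter-analyzer": {
                "command": "uvx",
                "args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
                "env": {"TREE_SITTER_PROJECT_ROOT": str(root)},
            }
        }
    }


# "." resolves against the current working directory.
print(json.dumps(mcp_server_entry("."), indent=2))
```

For VS Code, the top-level key would be servers instead of mcpServers, as noted in the Copilot section.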


Supported Languages

21 language plugins: 13 are fully wired into the indexer (full symbol + call graph), 5 data/markup languages are reachable via the single-file CLI path, and 3 are scaffolds (plugin exists, indexer wiring pending). The 2026-05-24 patch unblocked Swift, Kotlin, Ruby, PHP, and C#, which had been silently skipped for months.

| Tier | Languages |
|---|---|
| Full index + symbol + call graph | Python · Java · JavaScript · TypeScript · Go · Rust · C · C++ · C# · Swift · Kotlin · Ruby · PHP |
| Single-file analysis (CLI) | HTML · CSS · Markdown · SQL · YAML |
| Scaffold (plugin exists, indexer wiring pending) | bash · scala · json |

CodeGraph supports a similar set. The only popular languages neither tool ships yet are Dart, Vue, Svelte, and Lua (on the next-sprint backlog).


Configuration

Mostly nothing. The defaults are designed so you can hook it into your agent and forget:

  • Output format: TOON. Override per-call with output_format: "json".
  • Project root: TREE_SITTER_PROJECT_ROOT (env var, MCP) or --project-root (CLI).
  • Cache location: <project>/.ast-cache/. Safe to delete — auto-rebuilds.
  • Optional: TREE_SITTER_OUTPUT_PATH for large-output write target.
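
To give a feel for why a tabular output format is smaller than JSON, the sketch below encodes a list of uniform records as one header line plus one row per record and compares the size with json.dumps. It is written in the spirit of TOON but is not the TOON specification or this project's encoder: the to_tabular helper is hypothetical, values containing commas are not escaped, and character counts are only a rough proxy for tokens.

```python
import json


def to_tabular(name: str, rows: list[dict]) -> str:
    """Encode uniform dicts as a header line plus one comma-separated line per row."""
    fields = list(rows[0])  # assumes every row has the same keys in the same order
    header = f"{name}[{len(rows)}]{{{','.join(fields)}}}:"
    body = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *body])


# Made-up symbol records for the comparison.
symbols = [
    {"name": "parse_file", "kind": "function", "line": 12},
    {"name": "Parser", "kind": "class", "line": 40},
    {"name": "render", "kind": "function", "line": 88},
]

as_json = json.dumps(symbols)
as_table = to_tabular("symbols", symbols)
print(as_table)
print(len(as_table), "characters vs", len(as_json), "for JSON")
```

The saving comes from writing each key once in the header instead of repeating it in every record, so it grows with the number of rows.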

Quality & Testing

| Metric | Value |
|---|---|
| Tests passed | 17,456 ✅ |
| Coverage | (coverage badge) |
| Type safety | 100 % mypy |
| Platforms | macOS · Linux · Windows |
| Pre-commit gates | bandit · mypy · pyupgrade · detect-secrets · codemap-sync · smell-ratchet |

uv run pytest -q                                # full suite
uv run python check_quality.py --new-code-only  # quality gate

Troubleshooting

| Symptom | Fix |
|---|---|
| unsupported language on .swift / .kt / .rb / .php / .cs | Update to ≥ 1.12.x; the 5-language gap was patched in commit 50e99a8f. |
| MCP server doesn't appear in client | TREE_SITTER_PROJECT_ROOT must be absolute; restart the client after config edit. |
| database is locked | Stop any other process holding .ast-cache/index.db; if persistent, rm -rf .ast-cache && tree-sitter-analyzer --autoindex. |
| Slow first call | First call builds the index. Subsequent calls are sub-second. Run --full-index upfront to amortise. |
| Agent picks the wrong tool | Use a tsa-* skill (/tsa-graph, /tsa-find, ...); each skill restricts the visible tool set to one workflow. |

Development

git clone https://github.com/aimasteracc/tree-sitter-analyzer.git
cd tree-sitter-analyzer
uv sync --extra all --extra mcp
uv run pytest -q

See docs/CONTRIBUTING.md for the development guide.


Contributing & License

  • ⭐ A GitHub star helps surface this tool to other AI-agent users.
  • 💖 Sponsor — supports continued MCP / Skills development.
  • Lead sponsor: @o93.
  • MIT licensed — see LICENSE.
  • Release history: CHANGELOG.md.
