MCP code-intelligence server for AI agents — beats CodeGraph on 6-repo head-to-head benchmark median. 63 MCP tools, 13 curated skills, TOON output, 100% local.
Project description
🌳 Tree-sitter Analyzer
The MCP code-intelligence server for AI agents — fewer tokens, fewer tool calls, 100 % local. Pre-indexed AST cache + 8 MCP tools (down from 63) + 13 curated agent skills + TOON-compressed output. ~80% less tool-definition overhead vs v1.x — the only code-intel MCP that is both rich-output (verdict + TOON) and Roo/Cursor-safe. Beats CodeGraph on 6-repo head-to-head median (−11 % cost vs CodeGraph's −4 %), with a strict CLI superset. BM25-ranked symbol search across all 8 facades — results sorted by relevance, not file path.
Competing tool count: CodeGraph ~12 · Rhizome 1 · TSA 8 (rich-output) · TSA v1.x was 63. Upgrading from v1.x? See docs/MIGRATION.md.
Get Started
One-line install for Claude Code:
claude mcp add tree-sitter-analyzer \
--env TREE_SITTER_PROJECT_ROOT="$PWD" \
-- uvx --from "tree-sitter-analyzer[mcp]" tree-sitter-analyzer-mcp
Restart your agent, then say: "Set the project root to my repo and run codegraph_status."
Other agents (Cursor, Copilot, Cline, Continue, Claude Desktop, Roo Code) →
Why Tree-sitter Analyzer
- Token-efficient by default. Every MCP response uses TOON — a tabular JSON variant that cuts payload by ~50-70 % vs raw JSON.
- Verdict envelopes. Every response carries
verdict: SAFE | CAUTION | UNSAFE | INFO | WARN | ERROR | NOT_FOUND, so orchestrators branch on outcomes without re-prompting. - Project health grading (A–F). No other open-source tool grades your whole project on size / complexity / coverage / duplication / dependencies / structure / git-hotspots in one call.
- 13 curated workflows (Skills). Pre-baked tool subsets for "find symbol", "trace call chain", "score health", "safe-to-edit before refactor", "PR review", etc.
- 5 layers of safety.
safe_to_edit+modification_guard+ constraint DSL +change_impact+ verdict envelopes — designed so agents know before they touch. - Beats the leading competitor (CodeGraph) on multiple head-to-head benchmarks. See below.
Benchmark Results
Headless Claude Code (Haiku 4.5) asked one architecture question per repo. 3 arms: no-MCP / CodeGraph MCP / Tree-sitter Analyzer MCP. Single run per arm — indicative, not statistically settled.
| Codebase | Lang / files | Baseline | CodeGraph | TSA | Winner |
|---|---|---|---|---|---|
| Gin | Go / 99 | $0.164 | $0.094 (−43 %) | $0.080 (−51 %) | TSA ⭐ |
| Alamofire | Swift / 98 | $0.201 | $0.219 (+9 %) | $0.147 (−27 %) | TSA ⭐ |
| Excalidraw | TS / 603 | $0.204 | $0.179 (−12 %) | $0.212 (+4 %) | CodeGraph |
| Django | Py / 2 910 | $0.162 | $0.106 (−35 %) | $0.205 (+27 %) | CodeGraph |
| Tokio | Rust / 778 | $0.214 | $0.285 (+33 %) | $0.303 (+42 %) | both lose |
| OkHttp | Java / 596 | $0.169 | $0.200 (+18 %) | $0.178 (+5 %) | both lose |
| Median Δ vs baseline | −4 % | −11 % | TSA |
TSA wins outright on 2 of 6 repos, has a lower median cost saving (−11 %), and matches CodeGraph's reported direction on every repo where the indexer-class tools should help.
Why the median diverges from CodeGraph's published −35 % claim: we used Haiku for cost control; they used Opus + 4-run median. See
docs/internal/CODEGRAPH_BENCHMARK_FINAL_2026-05-24.mdfor raw envelopes + reproducer scripts.Post-benchmark improvements (2026-05-30): (1) BM25 pre-filter narrows 40k symbols to ~400 before cosine rerank — a 133× speedup in semantic search. (2) Min-max BM25 normalization: relevance_score now properly differentiates strong matches (1.0) from weak (0.0) across all search paths. (3)
semantic().sort(by='confidence')now works end-to-end. These improvements were not in the benchmark run; repos with large symbol counts (Django, Excalidraw) should see improved token efficiency in re-runs.
Key Features
Pre-indexed code intelligence (CodeGraph parity + superset)
| Capability | TSA tool | Status |
|---|---|---|
| Symbol search (FTS5 + BM25 ranked) | codegraph_symbol_search |
ahead — results sorted by relevance score, not file path |
| Go-to-def / find-refs / call hierarchy in one call | codegraph_navigate |
PRIMARY entry point |
| Bulk-fetch N related symbols + relationship map | codegraph_explore |
parity |
| Function-level blast radius + risk score | codegraph_impact |
parity + risk score |
| Who-calls-X / what-X-calls | codegraph_callers / codegraph_callees |
parity |
| Index health at-a-glance (+ edge count) | codegraph_status |
ahead — reports total_edges for graph density signal |
| Pre-built call graph cache | codegraph_autoindex / codegraph_full_index / codegraph_incremental_sync |
parity |
| Tests affected by a change (CLI) | --affected FILE... |
parity |
Tree-sitter Analyzer exclusive
| Capability | TSA tool | Note |
|---|---|---|
| BM25-ranked symbol search | all search tools | relevance_score on every result (min-max normalized: best=1.0, weakest=0.0); sort(by='confidence') in DSL |
| Semantic search (133× faster) | codegraph_query semantic() |
BM25 pre-filter narrows 40k symbols to ~400 before cosine rerank |
| Project A–F health grading | check_project_health |
7 dimensions (size/complexity/deps/coverage/duplication/structure/git-hotspot), no competitor offers this |
| TOON output | every tool, output_format: "toon" (default) |
50-70 % token saving |
| Verdict envelopes | every tool | SAFE/CAUTION/UNSAFE/INFO/WARN/ERROR/NOT_FOUND |
| Safe-to-edit gate | safe_to_edit + modification_guard |
refuses high-risk edits before they happen |
| Architectural constraint DSL | check_constraints |
"module A cannot import B" → enforced |
| Code health (file-level) | check_file_health |
block/long-method/smell detection |
| Class hierarchy | codegraph_class_hierarchy |
type-inheritance tree |
| Dependency matrix | codegraph_dependency_matrix |
module-coupling matrix |
| Dead code | codegraph_dead_code |
transitive unreachable analysis |
| Complexity heatmap | codegraph_complexity_heatmap |
per-fn cyclomatic + project view |
| AST-structural clone detection | codegraph_similarity |
beyond text similarity |
| Mermaid call-graph export | codegraph_visualize |
paste-ready in docs |
| UML Mermaid export | codegraph_uml |
class / package / component / sequence diagrams |
| PR review | codegraph_pr_review |
AST-diff + semantic classify + blast radius |
| agent_summary | every response | next-step hint baked into the envelope |
| Synapse cross-file resolver | internal | import-aware, beats regex guessing |
| Temporal activation | symbol_lineage |
per-symbol git-modification frequency |
| One-shot file orientation | smart_context |
health + exports + deps + edit-risk in one call (replaces 3-4 calls) |
| Architectural decision journal | decision_journal |
persists reasoning across sessions — no competitor exposes this |
Skills (13 curated workflows)
CodeGraph has zero skills. We ship 13 under .claude/skills/tsa-*/:
tsa-landing, tsa-find, tsa-graph, tsa-structure, tsa-deps, tsa-index, tsa-health-watch, tsa-edit-safety, tsa-edit-then-verify, tsa-constraints, tsa-pr-review, tsa-refactor-queue, tsa-temporal.
Each skill ships an allowed-tools subset + procedure recipe + decision-surface schema, so the agent doesn't have to triage 8 tools on every question.
263 CLI flags
Strict superset of CodeGraph's 15-command CLI. Highlights:
tree-sitter-analyzer --table full <file> # method/signature/complexity table
tree-sitter-analyzer --partial-read --start-line N --end-line M <file>
tree-sitter-analyzer --project-health # A-F grade across the project
tree-sitter-analyzer --callers <symbol> # who-calls
tree-sitter-analyzer --codegraph-impact <fn> # blast radius + risk
tree-sitter-analyzer --affected <file...> # tests transitively affected
tree-sitter-analyzer --dead-code # transitive unreachable
tree-sitter-analyzer --check-constraints # architectural rules
tree-sitter-analyzer --safe-to-edit <file> # refuse if risky
tree-sitter-analyzer --uml class # Mermaid UML class diagram
See docs/CODEMAPS/cli.md for the full surface.
Quick Start
1. Install dependencies
# uv (required)
curl -LsSf https://astral.sh/uv/install.sh | sh # macOS / Linux
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex" # Windows
# fd + ripgrep (required for search)
brew install fd ripgrep # macOS
winget install sharkdp.fd BurntSushi.ripgrep.MSVC # Windows
2. Install Tree-sitter Analyzer
uv add "tree-sitter-analyzer[all,mcp]"
3. Hook it into your agent
See Supported Agents. Most clients want this MCP server entry:
{
"mcpServers": {
"tree-sitter-analyzer": {
"command": "uvx",
"args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
"env": { "TREE_SITTER_PROJECT_ROOT": "/absolute/path/to/your/project" }
}
}
}
After restart: "Set the project root to my repo and call codegraph_status."
How It Works
Source code → tree-sitter parse → SQLite + FTS5 index (.ast-cache/index.db)
↓
codegraph_navigate / codegraph_explore / codegraph_callers / ...
↓
TOON-compressed envelope
(verdict + agent_summary + data)
↓
MCP client / CLI consumer
The index is built lazily on first query, refreshed on file change via a content-hash diff (codegraph_incremental_sync). All 8 tools read from the same .ast-cache/, so a query and its follow-up share work.
Supported Agents
📘 Claude Code (recommended)
claude mcp add tree-sitter-analyzer \
--env TREE_SITTER_PROJECT_ROOT="$PWD" \
-- uvx --from "tree-sitter-analyzer[mcp]" tree-sitter-analyzer-mcp
Verify: claude mcp list. The 13 tsa-* skills auto-discover from .claude/skills/.
📗 Claude Desktop
Edit claude_desktop_config.json (macOS: ~/Library/Application Support/Claude/, Windows: %APPDATA%\Claude\, Linux: ~/.config/Claude/):
{
"mcpServers": {
"tree-sitter-analyzer": {
"command": "uvx",
"args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
"env": { "TREE_SITTER_PROJECT_ROOT": "/absolute/path/to/your/project" }
}
}
}
📙 GitHub Copilot (VS Code)
Create .vscode/mcp.json (note: servers, not mcpServers):
{
"servers": {
"tree-sitter-analyzer": {
"type": "stdio",
"command": "uvx",
"args": ["--from", "tree-sitter-analyzer[mcp]", "tree-sitter-analyzer-mcp"],
"env": { "TREE_SITTER_PROJECT_ROOT": "${workspaceFolder}" }
}
}
}
🖱 Cursor / Cline / Continue / Roo Code
All read the same mcpServers schema as Claude Desktop. Cursor: Settings → MCP. Cline: MCP panel → Edit settings. Continue: ~/.continue/config.json under experimental.modelContextProtocolServers. Roo Code: MCP panel → Edit MCP Settings.
⚠️
TREE_SITTER_PROJECT_ROOTmust be absolute. The server enforces a security boundary against escapes viaSecurityBoundaryManager.
Supported Languages
21 language plugins; 13 fully wired into the indexer (full symbol + call graph) + 5 (data/markup) reachable via the single-file CLI path + 3 scaffold (plugin exists, indexer wiring pending). The 2026-05-24 patch unblocked Swift / Kotlin / Ruby / PHP / C# that had been silently skipped for months.
| Tier | Languages |
|---|---|
| Full index + symbol + call graph | Python · Java · JavaScript · TypeScript · Go · Rust · C · C++ · C# · Swift · Kotlin · Ruby · PHP |
| Single-file analysis (CLI) | HTML · CSS · Markdown · SQL · YAML |
| Scaffold (plugin exists, indexer wiring pending) | bash · scala · json |
CodeGraph supports a similar set; the only popular code languages neither tool ships yet are Dart, Vue, Svelte, Lua (next-sprint backlog).
Configuration
Mostly nothing. The defaults are designed so you can hook it into your agent and forget:
- Output format: TOON. Override per-call with
output_format: "json". - Project root:
TREE_SITTER_PROJECT_ROOT(env var, MCP) or--project-root(CLI). - Cache location:
<project>/.ast-cache/. Safe to delete — auto-rebuilds. - Optional:
TREE_SITTER_OUTPUT_PATHfor large-output write target.
Quality & Testing
| Metric | Value |
|---|---|
| Tests passed | 17,456 ✅ |
| Coverage | |
| Type safety | 100 % mypy |
| Platforms | macOS · Linux · Windows |
| Pre-commit gates | bandit · mypy · pyupgrade · detect-secrets · codemap-sync · smell-ratchet |
uv run pytest -q # full suite
uv run python check_quality.py --new-code-only # quality gate
Troubleshooting
| Symptom | Fix |
|---|---|
unsupported language on .swift / .kt / .rb / .php / .cs |
Update to ≥ 1.12.x — the 5-language gap was patched in commit 50e99a8f. |
| MCP server doesn't appear in client | TREE_SITTER_PROJECT_ROOT must be absolute; restart the client after config edit. |
database is locked |
Stop any other process holding .ast-cache/index.db; if persistent, rm -rf .ast-cache && tree-sitter-analyzer --autoindex. |
| Slow first call | First call builds the index. Subsequent calls are sub-second. Run --full-index upfront to amortise. |
| Agent picks the wrong tool | Use a tsa-* skill (/tsa-graph, /tsa-find, ...) — each skill restricts the visible tool set to one workflow. |
Development
git clone https://github.com/aimasteracc/tree-sitter-analyzer.git
cd tree-sitter-analyzer
uv sync --extra all --extra mcp
uv run pytest -q
See docs/CONTRIBUTING.md for the development guide.
Contributing & License
- ⭐ A GitHub star helps surface this tool to other AI-agent users.
- 💖 Sponsor — supports continued MCP / Skills development.
- Lead sponsor: @o93.
- MIT licensed — see LICENSE.
- Release history: CHANGELOG.md.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file tree_sitter_analyzer-1.20.0.tar.gz.
File metadata
- Download URL: tree_sitter_analyzer-1.20.0.tar.gz
- Upload date:
- Size: 2.8 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8065cd26bae9679f5cda43a8b800e8eb963957a852621c9cf6d93447e23476c6
|
|
| MD5 |
fb05366f95191dcb3effd76f27f47016
|
|
| BLAKE2b-256 |
fae93319c3c9859523b81079649801618fe744e0cbe2a3bd3e502828ec715165
|
File details
Details for the file tree_sitter_analyzer-1.20.0-py3-none-any.whl.
File metadata
- Download URL: tree_sitter_analyzer-1.20.0-py3-none-any.whl
- Upload date:
- Size: 1.7 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.15
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
b746d7f1d4169aa9a50449ef66f66dc1db5f31de1786a8625f93f12fd11f459d
|
|
| MD5 |
44430495ea9ba4f078b5e9699e2e3718
|
|
| BLAKE2b-256 |
b94406eb599250fac00fd554dc9a305931cdf8578f7c441e8aea0be057ba32cd
|