tool-rot

Semantic tool routing for MCP. Keep every tool callable. Stop sending every schema.

tool-rot is a transparent MCP proxy for teams running many MCP servers. It indexes all upstream tools locally, returns only the most relevant tool schemas on each turn, and forwards tool calls to the original server unchanged.

$ tool-rot bench --k 3

  Queries tested       100
  K (tools injected)   3
  Recall@K             91.3%   (correct tool was in top-3)
  Avg token reduction  94.6%
  Avg routing latency  8.7ms
  Index build time     420ms

Why This Exists

MCP tool schemas are useful context, but they get expensive fast. A few servers can add thousands of schema tokens before the model sees the user's actual request. Most of those tools are irrelevant on any given turn.

That creates three problems:

  • Cost: repeated schema tokens are paid for every turn.
  • Attention: the model has to scan unused tools before solving the task.
  • Accuracy: large tool menus make wrong tool selection more likely.

tool-rot treats tool schemas like a retrieval problem. The full toolset stays available behind the proxy, while the model only sees the schemas that are likely to matter right now.

What It Guarantees

  • Execution is never blocked by routing. Filtering only affects tools/list; tools/call still forwards to the upstream server.
  • Tool names are namespaced. Exposed tools use MCP-compliant server.tool names, so duplicate names across servers do not collide.
  • Resources and prompts pass through. Tool schemas are routed; resources and prompts are aggregated and forwarded without semantic filtering.
  • Routing is local. MiniLM ONNX embeddings and FAISS run locally; no routing API key is required.
  • Schemas are slimmed. Nonessential display metadata is removed before indexing and injection.
  • Misses are observable. Tool-call hits and misses are logged, and tool-rot status can recommend a higher k.
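The namespacing guarantee can be illustrated with a minimal sketch. The helper names below are hypothetical and are not part of tool-rot's API; the sketch also assumes that server names contain no dots, so the first dot separates the server from the tool.

```python
def namespaced(server: str, tool: str) -> str:
    """Build the exposed server.tool name (hypothetical helper)."""
    return f"{server}.{tool}"


def split_namespaced(name: str) -> tuple[str, str]:
    """Recover (server, tool) so a call can be forwarded upstream.

    Assumption: server names contain no dots, so we split on the first one.
    """
    server, sep, tool = name.partition(".")
    if not sep or not server or not tool:
        raise ValueError(f"not a namespaced tool name: {name!r}")
    return server, tool


# Two servers can both expose a tool called "search" without colliding.
exposed = {namespaced("github", "search"), namespaced("slack", "search")}
```

Because the server prefix is recoverable from the exposed name, a proxy built this way can forward a tools/call request without consulting the routing index.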

Install

pip install tool-rot

For SSE transport support:

pip install "tool-rot[sse]"

Quick Start

One-line Cursor integration:

pip install tool-rot && tool-rot init cursor --filesystem .

This creates tool-rot.toml and .cursor/mcp.json with local filesystem MCP routing enabled.

To configure the proxy manually instead, create tool-rot.toml:

[proxy]
transport     = "stdio"
k             = 3
max_k         = 8
auto_tune_k   = false
always_inject = ["filesystem.read_file"]

[routing]
embedder     = "all-MiniLM-L6-v2"
index_type   = "flat"
ranking_mode = "hybrid"
cache_dir    = ".tool-rot"

[logging]
session_log = ".tool-rot/session.jsonl"
verbose     = false

[[server]]
name    = "filesystem"
command = "npx"
args    = ["-y", "@modelcontextprotocol/server-filesystem", "."]

[[server]]
name    = "github"
command = "npx"
args    = ["-y", "@modelcontextprotocol/server-github"]
env     = { GITHUB_TOKEN = "${GITHUB_TOKEN}" }

Generate client config:

# Claude Code
tool-rot cc-snippet

# Cursor
tool-rot cursor-snippet

Then start using your MCP client normally. The client connects to tool-rot; tool-rot connects to your real MCP servers.

Architecture

MCP client
  |
  | tools/list, tools/call
  v
tool-rot proxy
  |
  | list_tools / call_tool
  +--> filesystem MCP
  +--> github MCP
  +--> slack MCP
  +--> ...

Routing sidecar:
  tool schemas -> schema slimming -> embeddings -> FAISS index
  user query   -> hybrid semantic + lexical retrieval -> top-K schemas

Turn flow:

  1. A client asks for tools/list.
  2. tool-rot routes against the current query context.
  3. It returns the top-K namespaced schemas plus always_inject tools.
  4. If the model calls a tool, tool-rot forwards the call to the matching upstream server.
  5. Hits, misses, token estimates, and latency are logged to .tool-rot/session.jsonl.
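Step 3 of the turn flow can be sketched as follows. This is an illustration under assumptions, not tool-rot's implementation: select_tools is a hypothetical function, and the relevance scores are taken as given rather than computed from embeddings.

```python
def select_tools(scores: dict[str, float], k: int, always_inject: list[str]) -> list[str]:
    """Return the top-k tool names by score, then append always_inject tools.

    Ties are broken by name so the result is deterministic.
    """
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    selected = ranked[:k]
    for name in always_inject:
        if name not in selected:
            selected.append(name)
    return selected


scores = {
    "github.create_pull_request": 0.9,
    "github.list_issues": 0.5,
    "filesystem.write_file": 0.4,
    "filesystem.read_file": 0.1,
}
tools = select_tools(scores, k=2, always_inject=["filesystem.read_file"])
```

Here filesystem.read_file is returned even though it scored lowest, which is the behavior the always_inject setting describes.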

Claude Code can provide per-turn query context through the packaged tool-rot-hook. Other hosts can POST {"query": "..."} to http://127.0.0.1:4748/query before tools/list. Without query context, tool-rot falls back to default routing.
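For hosts without the packaged hook, posting query context might look like the sketch below. The URL and the {"query": "..."} body come from the text above; the Content-Type header, the Authorization: Bearer header for context_api_token, and the function names are assumptions.

```python
import json
import urllib.request

CONTEXT_URL = "http://127.0.0.1:4748/query"


def build_query_request(query: str, token: str = "") -> urllib.request.Request:
    """Build the POST request carrying the latest user query."""
    body = json.dumps({"query": query}).encode("utf-8")
    headers = {"Content-Type": "application/json"}  # assumed content type
    if token:
        # Assumption: the optional context token is sent as a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(CONTEXT_URL, data=body, headers=headers, method="POST")


def send_query(query: str, token: str = "", timeout: float = 2.0) -> int:
    """Send the query context and return the HTTP status code."""
    with urllib.request.urlopen(build_query_request(query, token), timeout=timeout) as resp:
        return resp.status
```

Call send_query(...) just before the client issues tools/list so the proxy has the current query when it routes.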

CLI

# Start the proxy
tool-rot serve --config tool-rot.toml

# Show session stats and K recommendations
tool-rot status
tool-rot status --json

# Benchmark Recall@K and token reduction
tool-rot bench --k 3
tool-rot bench --tools tools.json --queries queries.jsonl --k 3 --json

# Compare routing against all-tools and compression baselines
tool-rot eval-report --k 3
tool-rot eval-report --k 3 --output eval.json

# Regenerate real open-source MCP savings reports
tool-rot mcp-smoke-report --output-dir reports

# Inspect or rebuild the local index
tool-rot index show
tool-rot index show --server github
tool-rot index rebuild

# Generate client snippets
tool-rot cc-snippet
tool-rot cursor-snippet

See docs/CLIENT_SETUP.md for exact Cursor and Claude Code configuration examples.

Benchmark And Eval

bench answers: "If I inject K tools, how often is the correct tool included?"

tool-rot bench --k 3
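The Recall@K metric itself is simple to state in code. The sketch below is a generic definition, not the bench command's implementation; recall_at_k and its input shape are hypothetical.

```python
def recall_at_k(results: list[tuple[list[str], str]], k: int) -> float:
    """Fraction of queries whose correct tool appears in the top-k ranking.

    Each item is (ranked_tool_names, correct_tool_name).
    """
    if not results:
        return 0.0
    hits = sum(1 for ranked, correct in results if correct in ranked[:k])
    return hits / len(results)


results = [
    (["github.create_pull_request", "github.list_issues", "filesystem.read_file"],
     "github.create_pull_request"),
    (["github.list_issues", "filesystem.write_file", "filesystem.read_file"],
     "filesystem.read_file"),
    (["filesystem.write_file", "filesystem.read_file"],
     "filesystem.read_file"),
]
recall_2 = recall_at_k(results, k=2)  # second query misses at k=2
recall_3 = recall_at_k(results, k=3)  # every correct tool is within the top 3
```

Raising K can only keep or increase recall, which is why it trades directly against token savings.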

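Custom eval files are passed with --tools and --queries.

tools.json is a JSON array of tool definitions:

[
  {
    "server_name": "github",
    "name": "create_pull_request",
    "description": "Create a pull request in a GitHub repository.",
    "inputSchema": {
      "type": "object",
      "properties": {
        "repo": { "type": "string" },
        "title": { "type": "string" }
      }
    }
  }
]

queries.jsonl holds one JSON object per line, pairing a query with the namespaced tool that should be retrieved:

{"query":"open a pull request for this branch","correct_tool":"github.create_pull_request"}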

eval-report compares:

  • all_tools: sends every schema, 100% tool visibility, no token reduction.
  • compression: slims every schema, still sends all tools.
  • semantic: vector retrieval only.
  • hybrid: vector retrieval plus lexical/BM25-like scoring.

Use this before rolling out a large MCP setup. Real tool names, descriptions, and query patterns matter.
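To make the difference between the semantic and hybrid modes concrete, here is a toy scoring sketch. It is not tool-rot's ranking code: the vectors stand in for MiniLM embeddings, plain query-token overlap stands in for the BM25-like lexical score, and the alpha weight and function names are assumptions.

```python
import math
import re


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def lexical_overlap(query: str, text: str) -> float:
    """Fraction of distinct query tokens that also occur in the tool text."""
    q = set(re.findall(r"[a-z0-9]+", query.lower()))
    t = set(re.findall(r"[a-z0-9]+", text.lower()))
    return len(q & t) / len(q) if q else 0.0


def hybrid_score(query_vec, tool_vec, query, tool_text, alpha=0.7):
    """Weighted mix of the semantic and lexical signals (alpha is assumed)."""
    return alpha * cosine(query_vec, tool_vec) + (1 - alpha) * lexical_overlap(query, tool_text)
```

The lexical term helps when a query names a tool or parameter almost verbatim, a case where embedding similarity alone can rank a near-synonym higher.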

Real MCP Results

These results were measured against official open-source MCP servers from npm. Token counts use len(minified_json) // 4, so treat them as comparable estimates rather than provider billing numbers.

MCP setup                 Direct tools   tool-rot tools   Tokens saved   Reduction
Filesystem                14             3                2,094          79.2%
Memory                    9              2                1,736          75.5%
Sequential Thinking       1              1                10             0.9%
Everything                13             2                1,267          87.3%
Combined official stack   37             7                5,436          71.6%

Full reports live in reports/. The single-tool Sequential Thinking server is a useful negative control: routing cannot save much when a server only exposes one tool.
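The len(minified_json) // 4 estimate used for these token counts can be reproduced with a short sketch. The function names are hypothetical, and minifying with json.dumps and compact separators is an assumption about how the reports serialize schemas.

```python
import json


def estimate_tokens(obj) -> int:
    """Estimate tokens as len(minified_json) // 4."""
    minified = json.dumps(obj, separators=(",", ":"))  # no spaces after , or :
    return len(minified) // 4


def reduction_percent(direct_tokens: int, routed_tokens: int) -> float:
    """Percentage of tokens saved relative to sending every schema."""
    if direct_tokens == 0:
        return 0.0
    return 100.0 * (direct_tokens - routed_tokens) / direct_tokens


all_tools = [{"name": f"tool_{i}", "description": "x" * 40} for i in range(10)]
routed = all_tools[:3]
saved = estimate_tokens(all_tools) - estimate_tokens(routed)
```

A character-count heuristic like this is consistent across runs, which makes it suitable for comparing setups even though it will not match a provider's tokenizer.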

To reproduce these results, run:

tool-rot mcp-smoke-report --output-dir reports

See docs/REPRODUCING_RESULTS.md for details.

Configuration Reference

[proxy]
transport     = "stdio"          # stdio | sse
port          = 4747             # SSE listener port
context_api_port = 4748          # Local query-context API
context_api_token = ""           # Optional bearer token; env TOOL_ROT_CONTEXT_TOKEN also works
k             = 3                # Initial tools injected per turn
max_k         = 8                # Upper bound for auto-tuned K
auto_tune_k   = false            # Increase effective K after routing misses
always_inject = ["server.tool"] # Tools always returned by tools/list

[routing]
embedder     = "all-MiniLM-L6-v2" # all-MiniLM-L6-v2 | tfidf
index_type   = "flat"             # flat | hnsw
ranking_mode = "hybrid"           # hybrid | semantic
cache_dir    = ".tool-rot"

[logging]
session_log = ".tool-rot/session.jsonl"
verbose     = false
log_query_preview = false        # Avoid logging prompt text by default

[[server]]
name    = "local-server"
command = "npx"
args    = ["-y", "@modelcontextprotocol/server-name"]
env     = { API_KEY = "${API_KEY}" }

[[server]]
name = "remote-server"
url  = "https://example.com/sse"

Production Notes

  • Use namespaced always_inject values such as filesystem.read_file.
  • Keep tool descriptions short but specific; routing quality depends on schema quality.
  • Start with k = 3, run bench, then adjust based on Recall@K and token savings.
  • Enable auto_tune_k when correctness matters more than maximum token savings.
  • Watch .tool-rot/session.jsonl or tool-rot status for routing misses.
  • Upstream notifications/tools/list_changed events trigger a tool refresh and index rebuild.
  • See docs/ROUTING_GUIDE.md for recommended k values by tool count.
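The interaction between k, max_k, and auto_tune_k can be sketched as a bounded increase after each routing miss. The step size of one and the next_k function are assumptions made for illustration; the actual tuning policy may differ.

```python
def next_k(current_k: int, max_k: int, missed: bool, step: int = 1) -> int:
    """Raise the effective K after a routing miss, never exceeding max_k."""
    if not missed:
        return current_k
    return min(max_k, current_k + step)


# Starting from k = 3 with max_k = 8, repeated misses saturate at 8.
k = 3
for _ in range(10):
    k = next_k(k, max_k=8, missed=True)
```

The cap matters because every increase in K gives back some of the token savings that routing was meant to provide.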

Client Setup

Claude Code

tool-rot cc-snippet

Add the generated MCP config and hook config to .claude/settings.json. The hook sends the latest user query to the local context API so routing is accurate from the first turn.

Cursor

tool-rot cursor-snippet

Add the generated JSON to .cursor/mcp.json or your user-level Cursor MCP config. Cursor can use tool-rot as a normal MCP server over stdio or SSE.

Development

python3.12 -m venv .venv
.venv/bin/python -m pip install -e ".[dev,sse]"
.venv/bin/python -m pytest tests -q

Focused checks:

.venv/bin/python -m pytest tests/test_prod_hardening.py tests/test_p0_p1.py -q
.venv/bin/python -m compileall tool_rot tests -q

Release steps are documented in docs/RELEASE_CHECKLIST.md.

Status

tool-rot is intended for local and team MCP workflows where large tool menus are creating measurable context overhead. It is designed to be conservative: route schemas aggressively, but keep execution passthrough intact.

Known limits:

  • Query context is best when the host can send the latest user message before tools/list.
  • Resource and prompt passthrough is aggregation-only; semantic routing is currently tools-only.
  • HNSW is available as a config option, but you should benchmark it on your own tool corpus before using it for large deployments.

License

Apache-2.0
