Runtime semantic discovery for agent tools.
Semantic Tool Router
Dynamic runtime tool discovery and retrieval-augmented routing for AI agents.
Semantic Tool Router is a dependency-light library designed to manage the "Many-Tool" problem in LLM and agentic workflows. Instead of exposing every available tool or Model Context Protocol (MCP) server schema in the model's context window (which increases cost and degrades accuracy), it embeds each tool from its description and dynamically retrieves a focused candidate set (the top-k matches) for the current task.
When to use this
| Use Semantic Tool Router when… | Skip it when… |
|---|---|
| You have 20+ tools or multiple MCP servers | You have fewer than ~10 tools — pass them all |
| Prompt token cost or context limits matter | You need guaranteed correctness without retrieval risk |
| You want measurable routing before trusting an agent | Every tool must always be visible to the model |
| You need permission-aware filtering (read, write, destructive) | Tool schemas are identical and interchangeable |
This is a preprocessing layer for LangChain, LlamaIndex, or custom agent loops — not another orchestration framework.
How It Works
graph LR
Query[Task Query] --> Router(Tool Router)
Registry[Tool Registry] --> Router
Router --> Filters{Filters}
Filters --> LLM[LLM Context]
- Tool Indexing: Tool descriptions, schemas, tags, examples, and permissions are compiled into search strings and vectorized.
- Semantic Matching: The user query is embedded and compared against the indexed tools using cosine similarity.
- Metadata Filtering: Results are filtered by permission layers (e.g. read-only vs destructive commands) or specific tags.
- Context Injection: Only the top $k$ relevant tool schemas are injected into the LLM system prompt, preserving context tokens.
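The four steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the library's implementation: the `embed`, `cosine`, and `route` helpers, the 256-bucket vector size, and the three example tools are all hypothetical, and MD5 is used purely as a stable token hash.

```python
import hashlib
import math
import re

DIM = 256  # illustrative vector size, not the library's setting


def embed(text: str) -> list[float]:
    """Hash each token into a fixed-size bucket vector, then L2-normalize."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    # Both vectors are unit length, so the dot product equals the cosine.
    return sum(x * y for x, y in zip(a, b))


# Hypothetical tool specs; a real registry would also carry schemas and tags.
TOOLS = [
    {"name": "read_text_file", "description": "Read the contents of a text file", "permission": "read"},
    {"name": "delete_file", "description": "Delete a file from disk", "permission": "destructive"},
    {"name": "search_web", "description": "Search the web for pages", "permission": "network"},
]


def route(query, tools, allowed_permissions, top_k=3):
    # 1. Index: build a search string per tool and vectorize it.
    index = [(tool, embed(f"{tool['name']} {tool['description']}")) for tool in tools]
    # 2. Match: embed the query and score every tool by cosine similarity.
    q = embed(query)
    # 3. Filter: drop tools whose permission is not allowed.
    scored = [(cosine(q, vec), tool) for tool, vec in index
              if tool["permission"] in allowed_permissions]
    # 4. Inject: keep only the top-k names for the prompt.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool["name"] for _, tool in scored[:top_k]]


selected = route("read the project README file", TOOLS, {"read", "network"}, top_k=2)
print(selected)
```

Because the permission filter runs before ranking, a destructive tool can never enter the candidate set here, however similar its description is to the query.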
Features
- ⚡ Zero-Dependency Hashing Baseline: Comes with a local token-hashing vectorizer (HashingEmbeddingProvider) that runs instantly without external APIs or PyTorch downloads.
- 🔌 First-Class MCP Client: Connects to live Stdio MCP servers, imports schemas automatically, and executes selected tools under expectation guards.
- 🏷️ Metadata-Aware Filtering: Apply rigid tag filters or restrict tools based on security permissions (read, write, execute, destructive, network).
- 📈 Evaluation Suite: Measure retrieval metrics (hit_rate@k, top_1_accuracy, MRR, context_tokens_saved) against reproducible benchmark files.
- 🧠 Swappable Embedders: Easily swap the hashing provider for local Hugging Face SentenceTransformers or cloud APIs (OpenAI).
- 🔀 Hybrid BM25 + embeddings: Fuses lexical and semantic scores (default 40% BM25) for tool names that do not overlap with the query.
- 🛡️ Read-query safety penalties: Demotes destructive and write-only tools when the task looks read-only.
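To make the hybrid scoring idea concrete, here is a minimal sketch of weighted score fusion with a 40% lexical weight. The min-max normalization, the `fuse` helper, and the sample scores are assumptions chosen for illustration; the library may normalize or combine scores differently.

```python
def min_max(scores):
    """Rescale a name -> score mapping into the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {name: 0.0 for name in scores}
    return {name: (s - lo) / (hi - lo) for name, s in scores.items()}


def fuse(bm25_scores, embedding_scores, bm25_weight=0.4):
    """Weighted sum of normalized lexical and semantic scores."""
    lexical = min_max(bm25_scores)
    semantic = min_max(embedding_scores)
    return {
        name: bm25_weight * lexical[name] + (1.0 - bm25_weight) * semantic[name]
        for name in bm25_scores
    }


# Made-up scores: generate_image shares no words with the query (BM25 = 0)
# but is still semantically close to it.
bm25 = {"read_text_file": 7.2, "write_file": 3.1, "generate_image": 0.0}
dense = {"read_text_file": 0.61, "write_file": 0.48, "generate_image": 0.55}

fused = fuse(bm25, dense)
ranking = sorted(fused, key=fused.get, reverse=True)
print(ranking)
```

In this example the tool with no lexical overlap still ranks second, because the semantic component carries 60% of the fused score.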
Installation
Install the core package (includes standard hashing retriever):
pip install semantic-tool-router
Optional extras for advanced embeddings:
# Local models via SentenceTransformers (quotes keep shells such as zsh from expanding the brackets)
pip install "semantic-tool-router[sentence-transformers]"
# OpenAI hosted embedding models
pip install "semantic-tool-router[openai]"
Quick Start
1. Basic Tool Discovery
Query a local JSON registry of tool specs:
python -m semantic_tool_router discover "read the project README file" --registry examples/tools.json
For production-quality routing, use the quality profile (MiniLM embeddings + cross-encoder reranking):
python -m semantic_tool_router discover "generate a mock logo" \
--registry examples/tools.json \
--profile quality
Or configure embedders manually:
python -m semantic_tool_router discover "generate a mock logo" \
--registry examples/tools.json \
--embedder sentence-transformers \
--embedding-model all-MiniLM-L6-v2 \
--reranker cross-encoder
| Profile | Stack | Best for |
|---|---|---|
| fast (default) | Hashing + BM25 | CI, air-gapped, zero-deps |
| quality | MiniLM + cross-encoder | Balanced production routing |
| bge | BGE-small embeddings | Best live MCP accuracy (94.1% hit@3) |
2. Live MCP Routing
Connect to a live filesystem MCP server, dynamically retrieve the top-3 candidate tools matching your task, and execute the selected tool with safety parameters:
python -m semantic_tool_router mcp-discover \
"read the first lines of the project README" \
--top-k 3 \
--profile quality \
--allow-permission read \
--expect-tool read_text_file \
--call-argument "path=README.md" \
--call-argument "head=8" \
--server npx -y @modelcontextprotocol/server-filesystem .
Integrations
Use the router as a preprocessing step inside standard orchestrator loops to save prompt tokens:
- LangChain Agent Integration: See the langchain_integration.py template.
- LlamaIndex Agent Integration: See the llamaindex_integration.py template.
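For a custom agent loop, the preprocessing step might look roughly like the sketch below. Everything here is hypothetical: `prepare_tools`, `fake_router`, and the four-characters-per-token estimate are placeholders for illustration and are not part of the package's API or of the integration templates.

```python
import json


def estimate_tokens(text: str) -> int:
    # Crude heuristic (about 4 characters per token); not a real tokenizer.
    return max(1, len(text) // 4)


def prepare_tools(all_tools, select_tools, task, top_k=3):
    """Return only the routed tool schemas plus a rough token-savings estimate."""
    names = select_tools(task, top_k)  # hypothetical router callable
    by_name = {tool["name"]: tool for tool in all_tools}
    selected = [by_name[n] for n in names if n in by_name]
    full_cost = estimate_tokens(json.dumps(all_tools))
    routed_cost = estimate_tokens(json.dumps(selected))
    return selected, full_cost - routed_cost


# Twenty placeholder tool schemas standing in for a large registry.
ALL_TOOLS = [
    {"name": f"tool_{i}", "description": f"Example tool number {i}", "parameters": {"type": "object"}}
    for i in range(20)
]


def fake_router(task, top_k):
    # Stand-in for a real router: always returns the first top_k tool names.
    return [tool["name"] for tool in ALL_TOOLS[:top_k]]


tools_for_prompt, tokens_saved = prepare_tools(ALL_TOOLS, fake_router, "summarize a file", top_k=3)
print(len(tools_for_prompt), tokens_saved)
```

The orchestrator then receives `tools_for_prompt` instead of the full registry, so the rest of the agent loop stays unchanged.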
Benchmarking & Evaluation
Evaluate your router configuration on fixture datasets:
python -m semantic_tool_router benchmark \
--registry examples/tools.json \
--tasks benchmarks/tasks.json \
--top-k 3
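As a rough guide to what the retrieval metrics measure, the sketch below computes hit rate at k, top-1 accuracy, and mean reciprocal rank (MRR) from ranked results. The `evaluate` function and its input format are assumptions for illustration, not the benchmark's actual code or file schema; here MRR is truncated at k, which the real suite may or may not do.

```python
def evaluate(results, k=3):
    """results: list of (expected_tool, ranked_tool_names) pairs."""
    hits = top1 = 0
    reciprocal_ranks = 0.0
    for expected, ranked in results:
        top = ranked[:k]
        if expected in top:
            hits += 1
            # Reciprocal rank: 1 for rank 1, 1/2 for rank 2, and so on.
            reciprocal_ranks += 1.0 / (top.index(expected) + 1)
        if ranked and ranked[0] == expected:
            top1 += 1
    n = len(results)
    return {
        "hit_rate@k": hits / n,
        "top_1_accuracy": top1 / n,
        "mrr": reciprocal_ranks / n,
    }


# Made-up routing outcomes for four tasks.
runs = [
    ("read_text_file", ["read_text_file", "list_directory", "write_file"]),  # rank 1
    ("write_file", ["read_text_file", "write_file", "edit_file"]),           # rank 2
    ("search_files", ["read_text_file", "write_file", "edit_file"]),         # miss
    ("create_entities", ["create_entities", "read_graph", "search_nodes"]),  # rank 1
]

metrics = evaluate(runs, k=3)
print(metrics)  # {'hit_rate@k': 0.75, 'top_1_accuracy': 0.5, 'mrr': 0.625}
```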
Compare retrievers on the frozen fixture and live MCP suites:
python -m semantic_tool_router compare-retrievers \
--registry examples/tools.json \
--tasks benchmarks/tasks.json \
--suite benchmarks/live_mcp_suite.json \
--markdown-output benchmarks/results/comparison.md
Retrieval input ablation:
python -m semantic_tool_router ablation \
--registry examples/tools.json \
--tasks benchmarks/tasks.json \
--markdown-output benchmarks/results/ablation.md
Downstream agent evaluation:
python -m semantic_tool_router agent-eval \
--registry examples/tools.json \
--tasks benchmarks/tasks.json \
--profile quality \
--selector rank1
# Live MCP suite (51 tasks)
python -m semantic_tool_router agent-eval \
--live \
--suite benchmarks/live_mcp_suite.json \
--profile bge \
--markdown-output benchmarks/results/agent_eval_live.md
Use --fixture-only for a fast CI-friendly run without MCP servers.
Latest results: benchmarks/results/comparison.md — 51 live MCP tasks with --profile bge + tool enrichment: 98.0% hit@3, 92.2% top-1.
Research artifacts:
To run the reproducible baseline benchmark suite across four official live MCP reference servers (Filesystem, Memory, Sequential Thinking, and Everything):
python -m semantic_tool_router mcp-benchmark \
--suite benchmarks/live_mcp_suite.json \
--workspace . \
--markdown-output benchmarks/results/live_mcp_baseline.md
Testing
Run unit tests locally across mock registry and MCP environments:
python -m unittest discover -s tests
Contributing & Development
Contributions are welcome! See CONTRIBUTING.md, docs/benchmark-contributing.md, and docs/research-plan.md.
- Fork the repo and clone it locally.
- Set up the test environment: python -m pip install -e ".[sentence-transformers,openai]"
- Ensure CI checks pass: python -m unittest discover -s tests
- If you change retrieval behavior, run compare-retrievers and include before/after numbers in your PR.
License
This project is licensed under the MIT License - see the LICENSE file for details.
File details
Details for the file semantic_tool_router-0.3.0.tar.gz.
File metadata
- Download URL: semantic_tool_router-0.3.0.tar.gz
- Upload date:
- Size: 37.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c5d8d8882f28c1d6ee5a7f2eb9f1031db6751a382a42d8fcc2fe6aba3a0fa1bf |
| MD5 | c3d5d2231169dcdc1495db669caa73b7 |
| BLAKE2b-256 | 089dcae9f370d390b4c9881de018308f17e1c90c833e72ee9ec29dbf74dd67df |
File details
Details for the file semantic_tool_router-0.3.0-py3-none-any.whl.
File metadata
- Download URL: semantic_tool_router-0.3.0-py3-none-any.whl
- Upload date:
- Size: 34.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 363f400c94e131af69fa33ed1afb49a7dc08b7ebc6e8d52ec934a667303ba59d |
| MD5 | 601e39ae6d6466d2eef190d534e4383b |
| BLAKE2b-256 | e952ddbdc1eb7925d870d82a9c4e7ff1161e7690b387ae2f7fafa36d5ab6cf06 |