
Runtime semantic discovery for agent tools.


Semantic Tool Router


Dynamic runtime tool discovery and retrieval-augmented routing for AI agents.

Semantic Tool Router is a dependency-light library for managing the "many-tool" problem in LLM and agentic workflows. Instead of exposing every available tool or Model Context Protocol (MCP) server schema to the model's context window, which increases cost and degrades accuracy, it embeds tools based on their descriptions and dynamically retrieves a focused candidate set (top-k) for the current task.


When to use this

| Use Semantic Tool Router when… | Skip it when… |
| --- | --- |
| You have 20+ tools or multiple MCP servers | You have fewer than ~10 tools; pass them all |
| Prompt token cost or context limits matter | You need guaranteed correctness without retrieval risk |
| You want measurable routing before trusting an agent | Every tool must always be visible to the model |
| You need permission-aware filtering (read, write, destructive) | Tool schemas are identical and interchangeable |

This is a preprocessing layer for LangChain, LlamaIndex, or custom agent loops — not another orchestration framework.


How It Works

graph LR
    Query[Task Query] --> Router(Tool Router)
    Registry[Tool Registry] --> Router
    Router --> Filters{Filters}
    Filters --> LLM[LLM Context]
  1. Tool Indexing: Tool descriptions, schemas, tags, examples, and permissions are compiled into search strings and vectorized.
  2. Semantic Matching: The user query is embedded and compared against the indexed tools using cosine similarity.
  3. Metadata Filtering: Results are filtered by permission layers (e.g. read-only vs destructive commands) or specific tags.
  4. Context Injection: Only the top-k most relevant tool schemas are injected into the LLM system prompt, which saves context tokens.
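
The indexing and matching steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the package's implementation: `hash_embed`, `top_k`, the 1024-bucket vector size, and the sample tool list are all assumptions made for the example. It shows the general idea of a token-hashing vectorizer combined with cosine similarity.

```python
import hashlib
import math
import re

DIM = 1024  # assumed vector size for this sketch


def hash_embed(text: str, dim: int = DIM) -> list[float]:
    """Count tokens into hash buckets, then L2-normalize the vector."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # A stable hash keeps bucket assignment identical across runs.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product of unit-length vectors equals their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


def top_k(query: str, tools: list[dict], k: int = 3) -> list[str]:
    """Return the names of the k tools most similar to the query."""
    q = hash_embed(query)
    scored = [
        (cosine(q, hash_embed(f"{t['name']} {t['description']}")), t["name"])
        for t in tools
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical registry entries, for illustration only.
TOOLS = [
    {"name": "read_text_file", "description": "read the contents of a text file"},
    {"name": "delete_file", "description": "permanently delete a file from disk"},
    {"name": "generate_image", "description": "generate an image such as a logo from a prompt"},
]

print(top_k("read the project README file", TOOLS, k=2))
```

A hashing vectorizer only rewards shared tokens, which is why the package also offers learned embedders for queries that are phrased differently from the tool descriptions.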

Features

  • Zero-Dependency Hashing Baseline: Comes with a local token-hashing vectorizer (HashingEmbeddingProvider) that runs instantly without external APIs or PyTorch downloads.
  • 🔌 First-Class MCP Client: Connects to live Stdio MCP servers, imports schemas automatically, and executes selected tools under expectation guards.
  • 🏷️ Metadata-Aware Filtering: Apply rigid tag filters or restrict tools based on security permissions (read, write, execute, destructive, network).
  • 📈 Evaluation Suite: Measure retrieval metrics (hit_rate@k, top_1_accuracy, MRR, context_tokens_saved) against reproducible benchmark files.
  • 🧠 Swappable Embedders: Easily swap the hashing provider for local Hugging Face SentenceTransformers or cloud APIs (OpenAI).
  • 🔀 Hybrid BM25 + embeddings: Fuses lexical and semantic scores (default 40% BM25) for tool names that do not overlap with the query.
  • 🛡️ Read-query safety penalties: Demotes destructive and write-only tools when the task looks read-only.
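
To make the hybrid scoring idea concrete, here is a small sketch of weighted score fusion. It assumes min-max normalization of each score list before mixing, which is one common choice and not necessarily what the library does; the function names and the sample scores are hypothetical. Only the 40% BM25 default weight comes from the feature list above.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1] so both signals are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {name: 0.0 for name in scores}
    return {name: (s - lo) / (hi - lo) for name, s in scores.items()}


def fuse(bm25: dict[str, float], semantic: dict[str, float],
         bm25_weight: float = 0.4) -> dict[str, float]:
    """Weighted sum of normalized lexical and semantic scores."""
    b, s = min_max(bm25), min_max(semantic)
    return {name: bm25_weight * b[name] + (1.0 - bm25_weight) * s[name]
            for name in b}


# Made-up scores: generate_image has no lexical overlap with the query,
# but its embedding similarity is fairly high.
bm25_scores = {"read_text_file": 4.2, "write_file": 3.9, "generate_image": 0.0}
semantic_scores = {"read_text_file": 0.71, "write_file": 0.30, "generate_image": 0.62}

fused = fuse(bm25_scores, semantic_scores)
ranking = sorted(fused, key=fused.get, reverse=True)
print(ranking)
```

In this example the tool with zero lexical overlap still outranks a lexically similar but semantically distant one, which is the behavior a fused score is meant to provide.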

Installation

Install the core package (includes the standard hashing retriever):

pip install semantic-tool-router

Optional extras for advanced embeddings:

# Local models via SentenceTransformers (quotes keep shells such as zsh from expanding the brackets)
pip install "semantic-tool-router[sentence-transformers]"

# OpenAI hosted embedding models
pip install "semantic-tool-router[openai]"

Quick Start

1. Basic Tool Discovery

Query a local JSON registry of tool specs:

python -m semantic_tool_router discover "read the project README file" --registry examples/tools.json

For production-quality routing, use the quality profile (MiniLM embeddings + cross-encoder reranking):

python -m semantic_tool_router discover "generate a mock logo" \
  --registry examples/tools.json \
  --profile quality

Or configure embedders manually:

python -m semantic_tool_router discover "generate a mock logo" \
  --registry examples/tools.json \
  --embedder sentence-transformers \
  --embedding-model all-MiniLM-L6-v2 \
  --reranker cross-encoder

| Profile | Stack | Best for |
| --- | --- | --- |
| fast (default) | Hashing + BM25 | CI, air-gapped, zero-deps |
| quality | MiniLM + cross-encoder | Balanced production routing |
| bge | BGE-small embeddings | Best live MCP accuracy (94.1% hit@3) |

2. Live MCP Routing

Connect to a live filesystem MCP server, dynamically retrieve the top-3 candidate tools matching your task, and execute the selected tool with safety parameters:

python -m semantic_tool_router mcp-discover \
  "read the first lines of the project README" \
  --top-k 3 \
  --profile quality \
  --allow-permission read \
  --expect-tool read_text_file \
  --call-argument "path=README.md" \
  --call-argument "head=8" \
  --server npx -y @modelcontextprotocol/server-filesystem .
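
The safety parameters in the command above (`--allow-permission`, `--expect-tool`, `--call-argument`) can be pictured with the following sketch. It is an assumption about how such guards might behave, not the package's code: the `Tool` class, `filter_by_permission`, `guard_selection`, `parse_call_arguments`, and the sample tools are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    permissions: frozenset


class ExpectationError(RuntimeError):
    """Raised when the expected tool is not among the retrieved candidates."""


def filter_by_permission(tools, allowed):
    """Keep only tools whose permissions are all in the allowed set."""
    allowed = frozenset(allowed)
    return [t for t in tools if t.permissions <= allowed]


def guard_selection(candidates, expected_name):
    """Return the expected tool, or refuse to call anything at all."""
    for tool in candidates:
        if tool.name == expected_name:
            return tool
    names = [t.name for t in candidates]
    raise ExpectationError(f"expected {expected_name!r}, got candidates {names}")


def parse_call_arguments(pairs):
    """Turn ["key=value", ...] into a dict; values stay strings in this sketch."""
    return dict(pair.split("=", 1) for pair in pairs)


# Hypothetical tools and permission labels, for illustration only.
TOOLS = [
    Tool("read_text_file", frozenset({"read"})),
    Tool("write_file", frozenset({"write"})),
    Tool("delete_file", frozenset({"write", "destructive"})),
]

candidates = filter_by_permission(TOOLS, {"read"})
selected = guard_selection(candidates, "read_text_file")
arguments = parse_call_arguments(["path=README.md", "head=8"])
print(selected.name, arguments)
```

Failing closed when the expected tool is missing means a retrieval miss results in no call rather than a call to the wrong tool.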

Integrations

Use the router as a preprocessing step inside standard orchestrator loops to save prompt tokens.
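
As a rough illustration of where such a preprocessing step sits in a custom agent loop, the sketch below builds a system prompt from only the retrieved tool schemas. Nothing here is the package's API: `keyword_retrieve` is a deliberately naive stand-in for a real retriever, and `build_system_prompt`, `approx_tokens`, and the tool entries are hypothetical. The four-characters-per-token estimate is a crude heuristic, not a tokenizer.

```python
import json


def keyword_retrieve(task, tools, k):
    """Stand-in retriever: rank tools by words shared with the task."""
    words = set(task.lower().split())

    def overlap(tool):
        return len(words & set(tool["description"].lower().split()))

    return sorted(tools, key=overlap, reverse=True)[:k]


def build_system_prompt(task, tools, retrieve=keyword_retrieve, k=2):
    """Serialize only the k retrieved tool schemas into the prompt."""
    selected = retrieve(task, tools, k)
    return "You may call only these tools:\n" + json.dumps(selected, indent=2)


def approx_tokens(text):
    """Very rough token estimate (about 4 characters per token)."""
    return len(text) // 4


# Hypothetical tool schemas, for illustration only.
TOOLS = [
    {"name": "read_text_file", "description": "read a text file from disk",
     "input_schema": {"path": "string"}},
    {"name": "write_file", "description": "write text to a file",
     "input_schema": {"path": "string", "content": "string"}},
    {"name": "generate_image", "description": "generate a logo image",
     "input_schema": {"prompt": "string"}},
    {"name": "search_web", "description": "search the web for pages",
     "input_schema": {"query": "string"}},
]

task = "read the project README file"
routed = build_system_prompt(task, TOOLS, k=2)
full = build_system_prompt(task, TOOLS, k=len(TOOLS))
print("approx tokens saved:", approx_tokens(full) - approx_tokens(routed))
```

The resulting prompt string can then be passed to whichever framework or model client drives the agent loop.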


Benchmarking & Evaluation

Evaluate your router configuration on fixture datasets:

python -m semantic_tool_router benchmark \
  --registry examples/tools.json \
  --tasks benchmarks/tasks.json \
  --top-k 3
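
For readers unfamiliar with the retrieval metrics named in the feature list, the following sketch computes hit rate at k, top-1 accuracy, and mean reciprocal rank from ranked results. It uses the standard textbook definitions and truncates MRR at k; whether the package's benchmark command computes them in exactly this way is an assumption. The `evaluate` function and the sample data are hypothetical.

```python
def evaluate(results, k=3):
    """results: list of (expected_tool, ranked_tool_names) pairs."""
    if not results:
        raise ValueError("results must not be empty")
    hits = 0
    top1 = 0
    reciprocal_rank_sum = 0.0
    for expected, ranked in results:
        top = ranked[:k]
        if expected in top:
            hits += 1
            # Rank is 1-based; a miss contributes 0 to the sum.
            reciprocal_rank_sum += 1.0 / (top.index(expected) + 1)
        if top and top[0] == expected:
            top1 += 1
    n = len(results)
    return {
        "hit_rate@k": hits / n,
        "top_1_accuracy": top1 / n,
        "mrr": reciprocal_rank_sum / n,
    }


# Made-up rankings: the expected tool is at rank 1, rank 2, missing, and rank 3.
sample = [
    ("read_text_file", ["read_text_file", "write_file", "list_dir"]),
    ("write_file", ["read_text_file", "write_file", "list_dir"]),
    ("generate_image", ["read_text_file", "write_file", "list_dir"]),
    ("list_dir", ["read_text_file", "write_file", "list_dir"]),
]

metrics = evaluate(sample, k=3)
print(metrics)
```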

Compare retrievers on the frozen fixture and live MCP suites:

python -m semantic_tool_router compare-retrievers \
  --registry examples/tools.json \
  --tasks benchmarks/tasks.json \
  --suite benchmarks/live_mcp_suite.json \
  --markdown-output benchmarks/results/comparison.md

Retrieval input ablation:

python -m semantic_tool_router ablation \
  --registry examples/tools.json \
  --tasks benchmarks/tasks.json \
  --markdown-output benchmarks/results/ablation.md

Downstream agent evaluation:

python -m semantic_tool_router agent-eval \
  --registry examples/tools.json \
  --tasks benchmarks/tasks.json \
  --profile quality \
  --selector rank1

# Live MCP suite (51 tasks)
python -m semantic_tool_router agent-eval \
  --live \
  --suite benchmarks/live_mcp_suite.json \
  --profile bge \
  --markdown-output benchmarks/results/agent_eval_live.md

Use --fixture-only for a fast CI-friendly run without MCP servers.

Latest results: benchmarks/results/comparison.md. 51 live MCP tasks with --profile bge + tool enrichment: 98.0% hit@3, 92.2% top-1.

Research artifacts:

To run the reproducible baseline benchmark suite across four official live MCP reference servers (Filesystem, Memory, Sequential Thinking, and Everything):

python -m semantic_tool_router mcp-benchmark \
  --suite benchmarks/live_mcp_suite.json \
  --workspace . \
  --markdown-output benchmarks/results/live_mcp_baseline.md

Testing

Run the unit tests locally against the mock registry and MCP environments:

python -m unittest discover -s tests

Contributing & Development

Contributions are welcome! See CONTRIBUTING.md, docs/benchmark-contributing.md, and docs/research-plan.md.

  1. Fork the repo and clone locally.
  2. Set up a development install: python -m pip install -e ".[sentence-transformers,openai]"
  3. Ensure CI checks pass: python -m unittest discover -s tests
  4. If you change retrieval behavior, run compare-retrievers and include before/after numbers in your PR.

License

This project is licensed under the MIT License - see the LICENSE file for details.
