Graph-structured tool retrieval for LLM agents — zero-dependency, ontology-aware hybrid search
Project description
graph-tool-call
LLM agents can't fit thousands of tool definitions into context.
Vector search finds similar tools, but misses the workflow they belong to.
graph-tool-call builds a tool graph and retrieves the right chain — not just one match.
| Without retrieval | graph-tool-call | |
|---|---|---|
| 248 tools (K8s API) | 12% accuracy | 82% accuracy |
| 1068 tools (GitHub full API) | context overflow | 78% Recall@5 |
| Token usage | 8,192 tok | 1,699 tok (79% ↓) |
Measured with qwen3:4b (4-bit) — full benchmark
Table of Contents
Why
LLM agents need tools. But as tool count grows, two things break:
- Context overflow — 248 Kubernetes API endpoints = 8,192 tokens of tool definitions. The LLM chokes and accuracy drops to 12%.
- Vector search misses workflows — Searching "cancel my order" finds
cancelOrder, but the actual flow islistOrders → getOrder → cancelOrder → processRefund. Vector search returns one tool; you need the chain.
graph-tool-call solves both. It models tool relationships as a graph, retrieves multi-step workflows via hybrid search (BM25 + graph traversal + embedding + MCP annotations), and cuts token usage by 64–91% while maintaining or improving accuracy.
| Scenario | Vector-only | graph-tool-call |
|---|---|---|
| "cancel my order" | Returns cancelOrder |
listOrders → getOrder → cancelOrder → processRefund |
| "read and save file" | Returns read_file |
read_file + write_file (COMPLEMENTARY relation) |
| "delete old records" | Returns any tool matching "delete" | Destructive tools ranked first via MCP annotations |
| "now cancel it" (after listing orders) | No context from history | Demotes used tools, boosts next-step tools |
| Multiple Swagger specs with overlapping tools | Duplicate tools in results | Cross-source auto-deduplication |
| 1,200 API endpoints | Slow, noisy results | Categorized + graph traversal for precise retrieval |
How it works
OpenAPI / MCP / Python functions → Ingest → Build tool graph → Hybrid retrieve → Agent
Example — User says "cancel my order and process a refund"
Vector search finds cancelOrder. But the actual workflow is:
┌──────────┐
PRECEDES │listOrders│ PRECEDES
┌─────────┤ ├──────────┐
▼ └──────────┘ ▼
┌──────────┐ ┌───────────┐
│ getOrder │ │cancelOrder│
└──────────┘ └─────┬─────┘
│ COMPLEMENTARY
▼
┌──────────────┐
│processRefund │
└──────────────┘
graph-tool-call returns the entire chain, not just one tool. Retrieval combines four signals via weighted Reciprocal Rank Fusion (wRRF):
- BM25 — keyword matching
- Graph traversal — relation-based expansion (PRECEDES, REQUIRES, COMPLEMENTARY)
- Embedding similarity — semantic search (optional, any provider)
- MCP annotations — read-only / destructive / idempotent hints
Installation
The core package has zero dependencies — just Python standard library. Install only what you need:
pip install graph-tool-call # core (BM25 + graph) — no dependencies
pip install graph-tool-call[embedding] # + embedding, cross-encoder reranker
pip install graph-tool-call[openapi] # + YAML support for OpenAPI specs
pip install graph-tool-call[mcp] # + MCP server / proxy mode
pip install graph-tool-call[all] # everything
All extras
| Extra | Installs | When to use |
|---|---|---|
openapi |
pyyaml | YAML OpenAPI specs |
embedding |
numpy | Semantic search (connect to Ollama/OpenAI/vLLM) |
embedding-local |
numpy, sentence-transformers | Local sentence-transformers models |
similarity |
rapidfuzz | Duplicate detection |
langchain |
langchain-core | LangChain integration |
visualization |
pyvis, networkx | HTML graph export, GraphML |
dashboard |
dash, dash-cytoscape | Interactive dashboard |
lint |
ai-api-lint | Auto-fix bad API specs |
mcp |
mcp | MCP server / proxy mode |
Quick Start
Try it in 30 seconds (no install)
uvx graph-tool-call search "user authentication" \
--source https://petstore.swagger.io/v2/swagger.json
Query: "user authentication"
Source: https://petstore.swagger.io/v2/swagger.json (19 tools)
Results (5):
1. getUserByName — Get user by user name
2. deleteUser — Delete user
3. createUser — Create user
4. loginUser — Logs user into the system
5. updateUser — Updated user
Python API
from graph_tool_call import ToolGraph
# Build a tool graph from the official Petstore API
tg = ToolGraph.from_url(
"https://petstore3.swagger.io/api/v3/openapi.json",
cache="petstore.json",
)
print(tg)
# → ToolGraph(tools=19, nodes=22, edges=100)
# Search for tools
tools = tg.retrieve("create a new pet", top_k=5)
for t in tools:
print(f"{t.name}: {t.description}")
# Search with workflow guidance
results = tg.retrieve_with_scores("process an order", top_k=5)
for r in results:
print(f"{r.tool.name} [{r.confidence}]")
for rel in r.relations:
print(f" → {rel.hint}")
# Execute an OpenAPI tool directly
result = tg.execute(
"addPet", {"name": "Buddy", "status": "available"},
base_url="https://petstore3.swagger.io/api/v3",
)
Workflow planning
plan_workflow() returns ordered execution chains with prerequisites — reducing agent round-trips from 3-4 to 1.
plan = tg.plan_workflow("process a refund")
for step in plan.steps:
print(f"{step.order}. {step.tool.name} — {step.reason}")
# 1. getOrder — prerequisite for requestRefund
# 2. requestRefund — primary action
plan.save("refund_workflow.json")
Edit, parameterize, and visualize workflows — see Direct API guide.
Other tool sources
# From an MCP server (HTTP JSON-RPC tools/list)
tg.ingest_mcp_server("https://mcp.example.com/mcp")
# From an MCP tool list (annotations preserved)
tg.ingest_mcp_tools(mcp_tools, server_name="filesystem")
# From Python callables (type hints + docstrings)
tg.ingest_functions([read_file, write_file])
MCP annotations (readOnlyHint, destructiveHint, idempotentHint, openWorldHint) are used as retrieval signals — query intent is automatically classified, and read queries prioritize read-only tools while delete queries prioritize destructive tools.
Choose your integration
graph-tool-call ships several integration patterns. Pick the one that matches your stack:
| You're using... | Pattern | Token win | Guide |
|---|---|---|---|
| Claude Code / Cursor / Windsurf | MCP Proxy (aggregate N MCP servers → 3 meta-tools) | ~1,200 tok/turn | docs/integrations/mcp-proxy.md |
| Any MCP-compatible client | MCP Server (single source as MCP) | varies | docs/integrations/mcp-server.md |
| LangChain / LangGraph (50+ tools) | Gateway tools (N tools → 2 meta-tools) | 92% | docs/integrations/langchain.md |
| OpenAI / Anthropic SDK (existing code) | Middleware (1-line monkey-patch) | 76–91% | docs/integrations/middleware.md |
| Direct control over retrieval | Python API (retrieve() + format adapter) |
varies | docs/integrations/direct-api.md |
MCP Proxy (most common)
When you have many MCP servers, their tool names pile up in every LLM turn. Bundle them behind one server: 172 tools → 3 meta-tools.
# 1. Create ~/backends.json listing your MCP servers
# 2. Register the proxy with Claude Code
claude mcp add -s user tool-proxy -- \
uvx "graph-tool-call[mcp]" proxy --config ~/backends.json
Full setup, passthrough mode, remote transport → MCP Proxy guide.
LangChain Gateway
from graph_tool_call.langchain import create_gateway_tools
# 62 tools from Slack, GitHub, Jira, MS365...
gateway = create_gateway_tools(all_tools, top_k=10)
# → [search_tools, call_tool] — only 2 tools in context
agent = create_react_agent(model=llm, tools=gateway)
92% token reduction vs binding all 62 tools. See LangChain guide for auto-filter and manual patterns.
SDK middleware
from graph_tool_call.middleware import patch_openai
patch_openai(client, graph=tg, top_k=5) # ← add this one line
# Existing code unchanged — 248 tools go in, only 5 relevant ones are sent
response = client.chat.completions.create(
model="gpt-4o",
tools=all_248_tools,
messages=messages,
)
Also works with Anthropic via patch_anthropic. See Middleware guide.
Benchmark
Two questions: (1) Does the LLM still pick the right tool when given only the retrieved subset? (2) Does the retriever itself rank correct tools in the top K?
| Dataset | Tools | Baseline acc | graph-tool-call | Token reduction |
|---|---|---|---|---|
| Petstore | 19 | 100% | 95% (k=5) | 64% |
| GitHub | 50 | 100% | 88% (k=5) | 88% |
| Mixed MCP | 38 | 97% | 90% (k=5) | 83% |
| Kubernetes core/v1 | 248 | 12% | 82% (k=5 + ontology) | 79% |
Key finding — at 248 tools, baseline collapses (context overflow) to 12% while graph-tool-call recovers to 82%. At smaller scales, baseline is already strong, so graph-tool-call's value is token savings without accuracy loss.
→ Full results (pipeline / retrieval-only / competitive / 1068-scale / 200-tool LangChain agent across GPT and Claude): docs/benchmarks.md
# Reproduce
python -m benchmarks.run_benchmark # retrieval only
python -m benchmarks.run_benchmark --mode pipeline -m qwen3:4b # full pipeline
Advanced Features
Embedding-based hybrid search
Add semantic search on top of BM25 + graph. No heavy dependencies needed — connect to any external embedding server.
tg.enable_embedding("ollama/qwen3-embedding:0.6b") # Ollama (recommended)
tg.enable_embedding("openai/text-embedding-3-large") # OpenAI
tg.enable_embedding("vllm/Qwen/Qwen3-Embedding-0.6B") # vLLM
tg.enable_embedding("sentence-transformers/all-MiniLM-L6-v2") # local
tg.enable_embedding(lambda texts: my_embed_fn(texts)) # custom callable
Weights are auto-rebalanced. See API reference for all provider forms.
Retrieval tuning
tg.enable_reranker() # cross-encoder rerank
tg.enable_diversity(lambda_=0.7) # MMR diversity
tg.set_weights(keyword=0.2, graph=0.5, embedding=0.3, annotation=0.2)
History-aware retrieval
Pass previously called tools to demote them and boost next-step candidates.
tools = tg.retrieve("now cancel it", history=["listOrders", "getOrder"])
# → [cancelOrder, processRefund, ...]
Save / load (preserves embeddings + weights)
tg.save("my_graph.json")
tg = ToolGraph.load("my_graph.json")
# Or use cache= in from_url() for automatic save/load
tg = ToolGraph.from_url(url, cache="my_graph.json")
LLM-enhanced ontology
tg.auto_organize(llm="ollama/qwen2.5:7b")
tg.auto_organize(llm="litellm/claude-sonnet-4-20250514")
tg.auto_organize(llm=openai.OpenAI())
Builds richer categories, relations, and search keywords. Supports Ollama, OpenAI clients, litellm, and any callable. See API reference.
Other features
| Feature | API | Docs |
|---|---|---|
| Duplicate detection across specs | find_duplicates / merge_duplicates |
API ref |
| Conflict detection | apply_conflicts |
API ref |
| Operational analysis | analyze |
API ref |
| Interactive dashboard | dashboard() |
API ref |
| HTML / GraphML / Cypher export | export_html / export_graphml / export_cypher |
API ref |
| Auto-fix bad OpenAPI specs | from_url(url, lint=True) |
ai-api-lint |
Documentation
| Doc | Description |
|---|---|
| CLI reference | All graph-tool-call CLI commands |
| Python API reference | ToolGraph methods, helpers, middleware, LangChain |
| Integrations | MCP server / proxy, LangChain, middleware, direct API |
| Benchmark results | Full pipeline / retrieval / competitive / scale tables |
| Architecture | System overview, pipeline layers, data model |
| Design notes | Algorithm design — normalization, dependency detection, ontology |
| Research | Competitive analysis, API scale data |
| Release checklist | Release process, changelog flow |
Contributing
Contributions are welcome.
git clone https://github.com/SonAIengine/graph-tool-call.git
cd graph-tool-call
pip install poetry pre-commit
poetry install --with dev --all-extras
pre-commit install # auto-runs ruff on every commit
# Test, lint, benchmark
poetry run pytest -v
poetry run ruff check . && poetry run ruff format --check .
python -m benchmarks.run_benchmark -v
License
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file graph_tool_call-0.20.0.tar.gz.
File metadata
- Download URL: graph_tool_call-0.20.0.tar.gz
- Upload date:
- Size: 203.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
e0f6d70acee98b2dc44671e2d1630d5fdc27a8429261c38555adeb24adf3c2b1
|
|
| MD5 |
8d7a88b714297c2142cec84fed014556
|
|
| BLAKE2b-256 |
93b8b9b7a1aa9a25af4b2f1fb5468656df80bab38d651a9272c4f967ad51e08c
|
Provenance
The following attestation bundles were made for graph_tool_call-0.20.0.tar.gz:
Publisher:
publish.yml on SonAIengine/graph-tool-call
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
graph_tool_call-0.20.0.tar.gz -
Subject digest:
e0f6d70acee98b2dc44671e2d1630d5fdc27a8429261c38555adeb24adf3c2b1 - Sigstore transparency entry: 1449201938
- Sigstore integration time:
-
Permalink:
SonAIengine/graph-tool-call@b4dd0dcec6764fb4402bc9733eab3758baa5b8bf -
Branch / Tag:
refs/tags/v0.20.0 - Owner: https://github.com/SonAIengine
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@b4dd0dcec6764fb4402bc9733eab3758baa5b8bf -
Trigger Event:
release
-
Statement type:
File details
Details for the file graph_tool_call-0.20.0-py3-none-any.whl.
File metadata
- Download URL: graph_tool_call-0.20.0-py3-none-any.whl
- Upload date:
- Size: 241.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
c0816c9b4e2e232bd61d6565e62ffcdfc5e85ba0dab291ac502e59ca7b191ea5
|
|
| MD5 |
be24b758a13b9a89463e4b67e4185b86
|
|
| BLAKE2b-256 |
c378b32a8d77442de10498b1f8b53c0be3c1553c0e78a379398bad541d0057d0
|
Provenance
The following attestation bundles were made for graph_tool_call-0.20.0-py3-none-any.whl:
Publisher:
publish.yml on SonAIengine/graph-tool-call
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
graph_tool_call-0.20.0-py3-none-any.whl -
Subject digest:
c0816c9b4e2e232bd61d6565e62ffcdfc5e85ba0dab291ac502e59ca7b191ea5 - Sigstore transparency entry: 1449201942
- Sigstore integration time:
-
Permalink:
SonAIengine/graph-tool-call@b4dd0dcec6764fb4402bc9733eab3758baa5b8bf -
Branch / Tag:
refs/tags/v0.20.0 - Owner: https://github.com/SonAIengine
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@b4dd0dcec6764fb4402bc9733eab3758baa5b8bf -
Trigger Event:
release
-
Statement type: