Graph-structured tool retrieval for LLM agents — zero-dependency, ontology-aware hybrid search

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

These details have not been verified by PyPI

Project description

graph-tool-call

LLM agents can't fit thousands of tool definitions into context.
Vector search finds similar tools, but misses the workflow they belong to.
graph-tool-call builds a tool graph and retrieves the right chain — not just one match.

	Without retrieval	graph-tool-call
248 tools (K8s API)	12% accuracy	82% accuracy
1068 tools (GitHub full API)	context overflow	78% Recall@5
Token usage	8,192 tok	1,699 tok (79% ↓)

_{Measured with qwen3:4b (4-bit) — full benchmark}

English · 한국어 · 中文 · 日本語

Table of Contents

Why
How it works
Installation
Quick Start
Choose your integration
Benchmark
Advanced Features
Documentation
Contributing

Why

LLM agents need tools. But as tool count grows, two things break:

Context overflow — 248 Kubernetes API endpoints = 8,192 tokens of tool definitions. The LLM chokes and accuracy drops to 12%.
Vector search misses workflows — Searching "cancel my order" finds cancelOrder, but the actual flow is listOrders → getOrder → cancelOrder → processRefund. Vector search returns one tool; you need the chain.

graph-tool-call solves both. It models tool relationships as a graph, retrieves multi-step workflows via hybrid search (BM25 + graph traversal + embedding + MCP annotations), and cuts token usage by 64–91% while maintaining or improving accuracy.

Scenario	Vector-only	graph-tool-call
"cancel my order"	Returns `cancelOrder`	`listOrders → getOrder → cancelOrder → processRefund`
"read and save file"	Returns `read_file`	`read_file` + `write_file` (COMPLEMENTARY relation)
"delete old records"	Returns any tool matching "delete"	Destructive tools ranked first via MCP annotations
"now cancel it" (after listing orders)	No context from history	Demotes used tools, boosts next-step tools
Multiple Swagger specs with overlapping tools	Duplicate tools in results	Cross-source auto-deduplication
1,200 API endpoints	Slow, noisy results	Categorized + graph traversal for precise retrieval

How it works

OpenAPI / MCP / Python functions → Ingest → Build tool graph → Hybrid retrieve → Agent

Example — User says "cancel my order and process a refund"

Vector search finds cancelOrder. But the actual workflow is:

                    ┌──────────┐
          PRECEDES  │listOrders│  PRECEDES
         ┌─────────┤          ├──────────┐
         ▼         └──────────┘          ▼
   ┌──────────┐                    ┌───────────┐
   │ getOrder │                    │cancelOrder│
   └──────────┘                    └─────┬─────┘
                                        │ COMPLEMENTARY
                                        ▼
                                 ┌──────────────┐
                                 │processRefund │
                                 └──────────────┘

graph-tool-call returns the entire chain, not just one tool. Retrieval combines four signals via weighted Reciprocal Rank Fusion (wRRF):

BM25 — keyword matching
Graph traversal — relation-based expansion (PRECEDES, REQUIRES, COMPLEMENTARY)
Embedding similarity — semantic search (optional, any provider)
MCP annotations — read-only / destructive / idempotent hints

Installation

The core package has zero dependencies — just Python standard library. Install only what you need:

pip install graph-tool-call                # core (BM25 + graph) — no dependencies
pip install graph-tool-call[embedding]     # + embedding, cross-encoder reranker
pip install graph-tool-call[openapi]       # + YAML support for OpenAPI specs
pip install graph-tool-call[mcp]           # + MCP server / proxy mode
pip install graph-tool-call[all]           # everything

All extras

Extra	Installs	When to use
`openapi`	pyyaml	YAML OpenAPI specs
`embedding`	numpy	Semantic search (connect to Ollama/OpenAI/vLLM)
`embedding-local`	numpy, sentence-transformers	Local sentence-transformers models
`similarity`	rapidfuzz	Duplicate detection
`langchain`	langchain-core	LangChain integration
`visualization`	pyvis, networkx	HTML graph export, GraphML
`dashboard`	dash, dash-cytoscape	Interactive dashboard
`lint`	ai-api-lint	Auto-fix bad API specs
`mcp`	mcp	MCP server / proxy mode

Quick Start

Try it in 30 seconds (no install)

uvx graph-tool-call search "user authentication" \
  --source https://petstore.swagger.io/v2/swagger.json

Query: "user authentication"
Source: https://petstore.swagger.io/v2/swagger.json (19 tools)
Results (5):

  1. getUserByName  — Get user by user name
  2. deleteUser     — Delete user
  3. createUser     — Create user
  4. loginUser      — Logs user into the system
  5. updateUser     — Updated user

Python API

from graph_tool_call import ToolGraph

# Build a tool graph from the official Petstore API
tg = ToolGraph.from_url(
    "https://petstore3.swagger.io/api/v3/openapi.json",
    cache="petstore.json",
)
print(tg)
# → ToolGraph(tools=19, nodes=22, edges=100)

# Search for tools
tools = tg.retrieve("create a new pet", top_k=5)
for t in tools:
    print(f"{t.name}: {t.description}")

# Search with workflow guidance
results = tg.retrieve_with_scores("process an order", top_k=5)
for r in results:
    print(f"{r.tool.name} [{r.confidence}]")
    for rel in r.relations:
        print(f"  → {rel.hint}")

# Execute an OpenAPI tool directly
result = tg.execute(
    "addPet", {"name": "Buddy", "status": "available"},
    base_url="https://petstore3.swagger.io/api/v3",
)

Workflow planning

plan_workflow() returns ordered execution chains with prerequisites — reducing agent round-trips from 3-4 to 1.

plan = tg.plan_workflow("process a refund")
for step in plan.steps:
    print(f"{step.order}. {step.tool.name} — {step.reason}")
# 1. getOrder      — prerequisite for requestRefund
# 2. requestRefund — primary action

plan.save("refund_workflow.json")

Edit, parameterize, and visualize workflows — see Direct API guide.

Other tool sources

# From an MCP server (HTTP JSON-RPC tools/list)
tg.ingest_mcp_server("https://mcp.example.com/mcp")

# From an MCP tool list (annotations preserved)
tg.ingest_mcp_tools(mcp_tools, server_name="filesystem")

# From Python callables (type hints + docstrings)
tg.ingest_functions([read_file, write_file])

MCP annotations (readOnlyHint, destructiveHint, idempotentHint, openWorldHint) are used as retrieval signals — query intent is automatically classified, and read queries prioritize read-only tools while delete queries prioritize destructive tools.

Choose your integration

graph-tool-call ships several integration patterns. Pick the one that matches your stack:

You're using...	Pattern	Token win	Guide
Claude Code / Cursor / Windsurf	MCP Proxy (aggregate N MCP servers → 3 meta-tools)	~1,200 tok/turn	docs/integrations/mcp-proxy.md
Any MCP-compatible client	MCP Server (single source as MCP)	varies	docs/integrations/mcp-server.md
LangChain / LangGraph (50+ tools)	Gateway tools (N tools → 2 meta-tools)	92%	docs/integrations/langchain.md
OpenAI / Anthropic SDK (existing code)	Middleware (1-line monkey-patch)	76–91%	docs/integrations/middleware.md
Direct control over retrieval	Python API (`retrieve()` + format adapter)	varies	docs/integrations/direct-api.md

MCP Proxy (most common)

When you have many MCP servers, their tool names pile up in every LLM turn. Bundle them behind one server: 172 tools → 3 meta-tools.

# 1. Create ~/backends.json listing your MCP servers
# 2. Register the proxy with Claude Code
claude mcp add -s user tool-proxy -- \
  uvx "graph-tool-call[mcp]" proxy --config ~/backends.json

Full setup, passthrough mode, remote transport → MCP Proxy guide.

LangChain Gateway

from graph_tool_call.langchain import create_gateway_tools

# 62 tools from Slack, GitHub, Jira, MS365...
gateway = create_gateway_tools(all_tools, top_k=10)
# → [search_tools, call_tool] — only 2 tools in context

agent = create_react_agent(model=llm, tools=gateway)

92% token reduction vs binding all 62 tools. See LangChain guide for auto-filter and manual patterns.

SDK middleware

from graph_tool_call.middleware import patch_openai

patch_openai(client, graph=tg, top_k=5)  # ← add this one line

# Existing code unchanged — 248 tools go in, only 5 relevant ones are sent
response = client.chat.completions.create(
    model="gpt-4o",
    tools=all_248_tools,
    messages=messages,
)

Also works with Anthropic via patch_anthropic. See Middleware guide.

Benchmark

Two questions: (1) Does the LLM still pick the right tool when given only the retrieved subset? (2) Does the retriever itself rank correct tools in the top K?

Dataset	Tools	Baseline acc	graph-tool-call	Token reduction
Petstore	19	100%	95% (k=5)	64%
GitHub	50	100%	88% (k=5)	88%
Mixed MCP	38	97%	90% (k=5)	83%
Kubernetes core/v1	248	12%	82% (k=5 + ontology)	79%

Key finding — at 248 tools, baseline collapses (context overflow) to 12% while graph-tool-call recovers to 82%. At smaller scales, baseline is already strong, so graph-tool-call's value is token savings without accuracy loss.

→ Full results (pipeline / retrieval-only / competitive / 1068-scale / 200-tool LangChain agent across GPT and Claude): docs/benchmarks.md

# Reproduce
python -m benchmarks.run_benchmark                                # retrieval only
python -m benchmarks.run_benchmark --mode pipeline -m qwen3:4b    # full pipeline

Advanced Features

Embedding-based hybrid search

Add semantic search on top of BM25 + graph. No heavy dependencies needed — connect to any external embedding server.

tg.enable_embedding("ollama/qwen3-embedding:0.6b")        # Ollama (recommended)
tg.enable_embedding("openai/text-embedding-3-large")      # OpenAI
tg.enable_embedding("vllm/Qwen/Qwen3-Embedding-0.6B")     # vLLM
tg.enable_embedding("sentence-transformers/all-MiniLM-L6-v2")  # local
tg.enable_embedding(lambda texts: my_embed_fn(texts))     # custom callable

Weights are auto-rebalanced. See API reference for all provider forms.

Retrieval tuning

tg.enable_reranker()                                      # cross-encoder rerank
tg.enable_diversity(lambda_=0.7)                          # MMR diversity
tg.set_weights(keyword=0.2, graph=0.5, embedding=0.3, annotation=0.2)

History-aware retrieval

Pass previously called tools to demote them and boost next-step candidates.

tools = tg.retrieve("now cancel it", history=["listOrders", "getOrder"])
# → [cancelOrder, processRefund, ...]

Save / load (preserves embeddings + weights)

tg.save("my_graph.json")
tg = ToolGraph.load("my_graph.json")
# Or use cache= in from_url() for automatic save/load
tg = ToolGraph.from_url(url, cache="my_graph.json")

LLM-enhanced ontology

tg.auto_organize(llm="ollama/qwen2.5:7b")
tg.auto_organize(llm="litellm/claude-sonnet-4-20250514")
tg.auto_organize(llm=openai.OpenAI())

Builds richer categories, relations, and search keywords. Supports Ollama, OpenAI clients, litellm, and any callable. See API reference.

Other features

Feature	API	Docs
Duplicate detection across specs	`find_duplicates` / `merge_duplicates`	API ref
Conflict detection	`apply_conflicts`	API ref
Operational analysis	`analyze`	API ref
Interactive dashboard	`dashboard()`	API ref
HTML / GraphML / Cypher export	`export_html` / `export_graphml` / `export_cypher`	API ref
Auto-fix bad OpenAPI specs	`from_url(url, lint=True)`	ai-api-lint

Documentation

Doc	Description
CLI reference	All `graph-tool-call` CLI commands
Python API reference	`ToolGraph` methods, helpers, middleware, LangChain
Integrations	MCP server / proxy, LangChain, middleware, direct API
Benchmark results	Full pipeline / retrieval / competitive / scale tables
Architecture	System overview, pipeline layers, data model
Design notes	Algorithm design — normalization, dependency detection, ontology
Research	Competitive analysis, API scale data
Release checklist	Release process, changelog flow

Contributing

Contributions are welcome.

git clone https://github.com/SonAIengine/graph-tool-call.git
cd graph-tool-call
pip install poetry pre-commit
poetry install --with dev --all-extras
pre-commit install   # auto-runs ruff on every commit

# Test, lint, benchmark
poetry run pytest -v
poetry run ruff check . && poetry run ruff format --check .
python -m benchmarks.run_benchmark -v

License

MIT

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

SonSeongJun

These details have not been verified by PyPI

Release history Release notifications | RSS feed

0.20.1

Jun 2, 2026

This version

0.20.0

May 6, 2026

0.19.1

Mar 26, 2026

0.19.0

Mar 24, 2026

0.18.0

Mar 23, 2026

0.17.0

Mar 23, 2026

0.16.0

Mar 22, 2026

0.15.0

Mar 22, 2026

0.14.1

Mar 22, 2026

0.14.0

Mar 22, 2026

0.13.1

Mar 15, 2026

0.13.0

Mar 14, 2026

0.12.1

Mar 14, 2026

0.12.0

Mar 14, 2026

0.11.1

Mar 14, 2026

0.11.0

Mar 14, 2026

0.10.1

Mar 14, 2026

0.10.0

Mar 14, 2026

0.9.0

Mar 13, 2026

0.8.0

Mar 12, 2026

0.7.2

Mar 10, 2026

0.7.1

Mar 10, 2026

0.7.0

Mar 9, 2026

0.6.1

Mar 7, 2026

0.6.0

Mar 7, 2026

0.4.0

Mar 2, 2026

0.3.0

Mar 2, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

graph_tool_call-0.20.0.tar.gz (203.5 kB view details)

Uploaded May 6, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

graph_tool_call-0.20.0-py3-none-any.whl (241.4 kB view details)

Uploaded May 6, 2026 Python 3

File details

Details for the file graph_tool_call-0.20.0.tar.gz.

File metadata

Download URL: graph_tool_call-0.20.0.tar.gz
Upload date: May 6, 2026
Size: 203.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for graph_tool_call-0.20.0.tar.gz
Algorithm	Hash digest
SHA256	`e0f6d70acee98b2dc44671e2d1630d5fdc27a8429261c38555adeb24adf3c2b1`
MD5	`8d7a88b714297c2142cec84fed014556`
BLAKE2b-256	`93b8b9b7a1aa9a25af4b2f1fb5468656df80bab38d651a9272c4f967ad51e08c`

See more details on using hashes here.

Provenance

The following attestation bundles were made for graph_tool_call-0.20.0.tar.gz:

Publisher: publish.yml on SonAIengine/graph-tool-call

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: graph_tool_call-0.20.0.tar.gz
- Subject digest: e0f6d70acee98b2dc44671e2d1630d5fdc27a8429261c38555adeb24adf3c2b1
- Sigstore transparency entry: 1449201938
- Sigstore integration time: May 6, 2026
Source repository:
- Permalink: SonAIengine/graph-tool-call@b4dd0dcec6764fb4402bc9733eab3758baa5b8bf
- Branch / Tag: refs/tags/v0.20.0
- Owner: https://github.com/SonAIengine
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@b4dd0dcec6764fb4402bc9733eab3758baa5b8bf
- Trigger Event: release

File details

Details for the file graph_tool_call-0.20.0-py3-none-any.whl.

File metadata

Download URL: graph_tool_call-0.20.0-py3-none-any.whl
Upload date: May 6, 2026
Size: 241.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for graph_tool_call-0.20.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c0816c9b4e2e232bd61d6565e62ffcdfc5e85ba0dab291ac502e59ca7b191ea5`
MD5	`be24b758a13b9a89463e4b67e4185b86`
BLAKE2b-256	`c378b32a8d77442de10498b1f8b53c0be3c1553c0e78a379398bad541d0057d0`

See more details on using hashes here.

Provenance

The following attestation bundles were made for graph_tool_call-0.20.0-py3-none-any.whl:

Publisher: publish.yml on SonAIengine/graph-tool-call

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: graph_tool_call-0.20.0-py3-none-any.whl
- Subject digest: c0816c9b4e2e232bd61d6565e62ffcdfc5e85ba0dab291ac502e59ca7b191ea5
- Sigstore transparency entry: 1449201942
- Sigstore integration time: May 6, 2026
Source repository:
- Permalink: SonAIengine/graph-tool-call@b4dd0dcec6764fb4402bc9733eab3758baa5b8bf
- Branch / Tag: refs/tags/v0.20.0
- Owner: https://github.com/SonAIengine
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@b4dd0dcec6764fb4402bc9733eab3758baa5b8bf
- Trigger Event: release

graph-tool-call 0.20.0

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

graph-tool-call

Why

How it works

Installation

Quick Start

Try it in 30 seconds (no install)

Python API

Workflow planning

Other tool sources

Choose your integration

MCP Proxy (most common)

LangChain Gateway

SDK middleware

Benchmark

Advanced Features

Embedding-based hybrid search

Retrieval tuning

History-aware retrieval

Save / load (preserves embeddings + weights)

LLM-enhanced ontology

Other features

Documentation

Contributing

License

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance