Graph-structured tool retrieval for LLM agents — ontology-aware hybrid search

graph-tool-call

Tool Lifecycle Management for LLM Agents

Ingest, Analyze, Organize, Retrieve.

License: MIT · Python 3.10+


The Problem

LLM agents are getting access to more and more tools. A commerce platform might expose 1,200+ API endpoints. A company's internal toolset might have 500+ functions across multiple services.

But there's a hard limit: you can't put them all in the context window.

The common solution? Vector search — embed tool descriptions, find the closest matches. It works, but it misses something important:

Tools don't exist in isolation. They have relationships.

When a user says "cancel my order and process a refund", vector search might find cancelOrder. But it won't know that you need to call listOrders first (to get the order ID), and that processRefund should follow. These aren't just similar tools — they form a workflow.

The Solution

graph-tool-call models tool relationships as a graph:

                    ┌──────────┐
          PRECEDES  │listOrders│  PRECEDES
         ┌─────────┤          ├──────────┐
         ▼         └──────────┘          ▼
   ┌──────────┐                    ┌───────────┐
   │ getOrder │                    │cancelOrder│
   └──────────┘                    └─────┬─────┘
                                         │ COMPLEMENTARY
                                         ▼
                                  ┌──────────────┐
                                  │processRefund │
                                  └──────────────┘

Instead of treating each tool as an independent vector, graph-tool-call understands:

  • REQUIRES: getOrder needs an ID from listOrders
  • PRECEDES: you must list orders before you can cancel one
  • COMPLEMENTARY: cancellation and refund often go together
  • SIMILAR_TO: getOrder and listOrders serve related purposes
  • CONFLICTS_WITH: updateOrder and deleteOrder shouldn't run together

This means when you search for "cancel order", you don't just get cancelOrder — you get the complete workflow: list → get → cancel → refund.
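One way to picture the expansion step is a typed edge list walked outward from the matched tool. The sketch below is not graph-tool-call's implementation: the `RELATIONS` list, the `expand_workflow` helper, and the traversal rule (PRECEDES edges are followed backwards, COMPLEMENTARY edges forwards) are assumptions made for illustration, using only the three edges drawn in the diagram above.

```python
from collections import defaultdict

# Hypothetical typed edges (source, relation, target), copied from the diagram.
RELATIONS = [
    ("listOrders", "PRECEDES", "getOrder"),
    ("listOrders", "PRECEDES", "cancelOrder"),
    ("cancelOrder", "COMPLEMENTARY", "processRefund"),
]


def expand_workflow(seed, relations=RELATIONS):
    """Return the seed's prerequisites, the seed itself, then its follow-ups."""
    prerequisites = defaultdict(list)  # tool -> tools that must run before it
    follow_ups = defaultdict(list)     # tool -> tools that usually run after it
    for src, rel, dst in relations:
        if rel == "PRECEDES":
            prerequisites[dst].append(src)
        elif rel == "COMPLEMENTARY":
            follow_ups[src].append(dst)

    ordered, seen = [], set()

    def visit(tool):
        # Depth-first walk: emit everything a tool depends on before the tool.
        if tool in seen:  # also guards against cycles in the edge list
            return
        seen.add(tool)
        for earlier in prerequisites[tool]:
            visit(earlier)
        ordered.append(tool)

    visit(seed)
    for later in follow_ups[seed]:
        visit(later)
    return ordered


print(expand_workflow("cancelOrder"))
# ['listOrders', 'cancelOrder', 'processRefund']
```

getOrder is absent from this output because the diagram draws no edge between getOrder and cancelOrder; a fuller traversal that also follows REQUIRES or sibling edges would be needed to pull it in.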

How It Works

OpenAPI/MCP/Code → [Ingest] → [Analyze] → [Organize] → [Retrieve] → Agent
                    (convert)  (relations)  (graph)     (hybrid)

1. Ingest — Point it at a Swagger spec, MCP server, or Python functions. Tools are auto-converted into a unified schema.

2. Analyze — Relationships are automatically detected: path hierarchies, CRUD patterns, shared schemas, response-parameter chains, state machines.

3. Organize — Tools are grouped into an ontology graph. Two modes:

  • Auto — purely algorithmic (tags, paths, CRUD patterns). No LLM needed.
  • LLM-Auto — enhanced with LLM reasoning (Ollama, vLLM, OpenAI). Better categories, richer relations.

4. Retrieve — Hybrid search that combines keyword matching, graph traversal, and (optionally) embeddings. Works great without any LLM. Works even better with one.

Quick Start

pip install graph-tool-call

from graph_tool_call import ToolGraph

tg = ToolGraph()

# Register tools (OpenAI / Anthropic / LangChain format auto-detected)
tg.add_tools(your_tools_list)

# Define relationships
tg.add_relation("read_file", "write_file", "complementary")

# Retrieve — graph expansion finds related tools automatically
tools = tg.retrieve("read a file and save changes", top_k=5)
# → [read_file, write_file, list_dir, ...]
#    write_file found via COMPLEMENTARY relation, not just vector similarity
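Format auto-detection can be approximated by checking which keys a tool definition carries. The sketch below assumes the publicly documented OpenAI function-tool shape (`{"type": "function", "function": {...}}`) and Anthropic tool shape (`{"name", "description", "input_schema"}`); the `UnifiedTool` dataclass and `normalize` helper are hypothetical names, not part of graph-tool-call's API.

```python
from dataclasses import dataclass, field


@dataclass
class UnifiedTool:
    """Hypothetical unified schema: one shape regardless of the source format."""
    name: str
    description: str = ""
    parameters: dict = field(default_factory=dict)


def normalize(tool: dict) -> UnifiedTool:
    """Detect the source format from its keys and convert to UnifiedTool."""
    # OpenAI style nests the definition under a "function" key.
    if tool.get("type") == "function" and "function" in tool:
        fn = tool["function"]
        return UnifiedTool(fn["name"], fn.get("description", ""), fn.get("parameters", {}))
    # Anthropic style is flat and names its JSON Schema "input_schema".
    if "name" in tool and "input_schema" in tool:
        return UnifiedTool(tool["name"], tool.get("description", ""), tool["input_schema"])
    raise ValueError(f"Unrecognized tool format with keys: {sorted(tool)}")


openai_tool = {
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Read contents of a file.",
        "parameters": {"type": "object", "properties": {"path": {"type": "string"}}},
    },
}
anthropic_tool = {
    "name": "write_file",
    "description": "Write contents to a file.",
    "input_schema": {"type": "object", "properties": {"path": {"type": "string"}}},
}

print(normalize(openai_tool).name, normalize(anthropic_tool).name)
```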

From Swagger / OpenAPI

from graph_tool_call import ToolGraph

tg = ToolGraph()
tg.ingest_openapi("tests/fixtures/petstore_swagger2.json")
# Supports: Swagger 2.0, OpenAPI 3.0, OpenAPI 3.1
# Accepts: file path (JSON/YAML), URL, or raw dict

# Automatic: 5 endpoints → 5 tools → CRUD relations → categories
# Dependencies, call ordering, category groupings — all auto-detected.

tools = tg.retrieve("create a new pet", top_k=5)
# → [createPet, getPet, updatePet, listPets, deletePet]
#    Graph expansion brings the full CRUD workflow

From Swagger UI URL

from graph_tool_call import ToolGraph

# Auto-discovers all API groups from Swagger UI
tg = ToolGraph.from_url("https://api.example.com/swagger-ui/index.html")

# Also works with direct spec URLs
tg = ToolGraph.from_url("https://api.example.com/v3/api-docs")

tools = tg.retrieve("search products", top_k=5)

from_url() automatically detects Swagger UI pages, discovers all spec groups via swagger-config, and ingests them into a single unified graph. Operations without descriptions get auto-generated fallbacks from their HTTP method, path, and tags.

From Python Functions

from graph_tool_call import ToolGraph

def read_file(path: str) -> str:
    """Read contents of a file."""

def write_file(path: str, content: str) -> None:
    """Write contents to a file."""

tg = ToolGraph()
tg.ingest_functions([read_file, write_file])
# Parameters are extracted from type hints; the description comes from the docstring
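The standard library is enough to show how type hints and docstrings can become a tool schema. This is a minimal sketch of the idea, not the body of `ingest_functions`: the `function_to_tool` helper, the `JSON_TYPES` mapping, and the output shape are assumptions.

```python
import inspect
from typing import get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def function_to_tool(fn) -> dict:
    """Derive a tool schema from a function's signature and docstring."""
    hints = get_type_hints(fn)
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        # Unannotated or unmapped types fall back to "string" in this sketch.
        properties[name] = {"type": JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def write_file(path: str, content: str, overwrite: bool = True) -> None:
    """Write contents to a file."""


tool = function_to_tool(write_file)
print(tool["name"], tool["parameters"]["required"])
# write_file ['path', 'content']
```

Parameters with defaults are left out of `required`, which mirrors how JSON Schema distinguishes optional arguments.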

Why Not Just Vector Search?

| Scenario | Vector-only | graph-tool-call |
|---|---|---|
| "cancel my order" | Returns cancelOrder | Returns listOrders → getOrder → cancelOrder → processRefund (full workflow) |
| "read and save file" | Returns read_file | Returns read_file + write_file (via COMPLEMENTARY) |
| Multiple Swagger specs with overlapping tools | Duplicate tools in results | Auto-deduplication across sources |
| 1,200 API endpoints | Slow, noisy results | Organized into categories, precise graph traversal |
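Cross-source deduplication can be sketched as grouping tools by a normalized key. The key used here (lowercased alphanumeric name plus sorted parameter names) and the `tool_key` and `deduplicate` helpers are assumptions; the library may well use a different or fuzzier criterion.

```python
import re

# Hypothetical tools ingested from two overlapping specs.
TOOLS = [
    {"name": "get_order", "parameters": {"order_id": "string"}, "source": "orders-v1"},
    {"name": "getOrder", "parameters": {"order_id": "string"}, "source": "orders-v2"},
    {"name": "cancelOrder", "parameters": {"order_id": "string"}, "source": "orders-v2"},
]


def tool_key(tool: dict) -> tuple:
    """Normalize naming style so get_order and getOrder compare equal."""
    name = re.sub(r"[^a-z0-9]", "", tool["name"].lower())
    params = tuple(sorted(tool.get("parameters", {})))
    return (name, params)


def deduplicate(tools: list[dict]) -> list[dict]:
    """Keep the first tool seen for each normalized key, preserving order."""
    unique = {}
    for tool in tools:
        unique.setdefault(tool_key(tool), tool)
    return list(unique.values())


print([t["name"] for t in deduplicate(TOOLS)])
# ['get_order', 'cancelOrder']
```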

3-Tier Search: Use What You Have

graph-tool-call is designed to work without any LLM and get better with one:

| Tier | What you need | What it does | Improvement |
|---|---|---|---|
| 0 | Nothing | BM25 keywords + graph expansion + RRF fusion | Baseline |
| 1 | Small LLM (1.5B~3B) | + query expansion, synonyms, translation | Recall +15~25% |
| 2 | Full LLM (7B+) | + intent decomposition, iterative refinement | Recall +30~40% |

Even a tiny model running on Ollama (qwen2.5:1.5b) can meaningfully improve search quality. No GPU required for Tier 0.
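Tier 0 merges its keyword and graph results with Reciprocal Rank Fusion (RRF), which scores each item as the sum of 1 / (k + rank) over every ranking it appears in. The sketch below shows the standard formula with the commonly used constant k = 60; the sample rankings and the `rrf_fuse` name are hypothetical, and the library's own constant and tie-breaking are not stated in this document.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists: score(tool) = sum of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, tool in enumerate(ranking, start=1):
            scores[tool] += 1.0 / (k + rank)
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda tool: (-scores[tool], tool))


keyword_hits = ["cancelOrder", "getOrder", "listOrders"]     # e.g. from BM25
graph_hits = ["listOrders", "cancelOrder", "processRefund"]  # e.g. from graph expansion

fused = rrf_fuse([keyword_hits, graph_hits])
print(fused)
# ['cancelOrder', 'listOrders', 'getOrder', 'processRefund']
```

Because RRF uses only ranks, it needs no score normalization between BM25 and graph signals, which is one reason it suits a setup where embeddings are optional.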

Feature Comparison

| Feature | Vector-only solutions | graph-tool-call |
|---|---|---|
| Tool source | Manual registration | Auto-ingest from Swagger/OpenAPI/MCP |
| Search method | Flat vector similarity | Graph + vector hybrid (RRF), 3-Tier |
| Tool relations | None | 6 relation types, auto-detected |
| Call ordering | None | State machine + CRUD workflow detection |
| Deduplication | None | Cross-source duplicate detection |
| Ontology | None | Auto / LLM-Auto modes |
| Visualization | None | Graph dashboard with manual editing |
| LLM dependency | Required | Optional (better with, works without) |

Roadmap

| Phase | What | Status |
|---|---|---|
| 0 | Core graph engine + hybrid retrieval | ✅ Done (39 tests) |
| 1 | OpenAPI ingest, BM25+RRF retrieval, dependency detection | ✅ Done (88 tests) |
| 2 | Deduplication, embeddings, ontology modes (Auto/LLM-Auto), search tiers, from_url() | ✅ Done (181 tests) |
| 3 | MCP ingest, Pyvis visualization, Neo4j export, CLI, PyPI publish | Planned |
| 4 | Interactive dashboard (Dash Cytoscape), manual editing, community | Planned |

Documentation

| Doc | Description |
|---|---|
| Architecture | System overview, pipeline layers, data model |
| WBS | Work Breakdown Structure (Phase 0~4 progress) |
| Design | Algorithm design: spec normalization, dependency detection, search modes, call ordering, ontology modes |
| Research | Competitive analysis, API scale data, commerce patterns |
| OpenAPI Guide | How to write API specs that produce better tool graphs |

Contributing

Contributions are welcome!

# Development setup
git clone https://github.com/SonAIengine/graph-tool-call.git
cd graph-tool-call
pip install poetry
poetry install --with dev

# Run tests
poetry run pytest -v

# Lint
poetry run ruff check .

License

MIT
