Intelligent middleware for AI agent tool orchestration

ToolFusion

Stop wasting tokens. Start fusing results.

ToolFusion is async-first middleware that sits between your AI agent framework and your tools. It eliminates redundant tool calls, deduplicates overlapping results, resolves conflicting outputs across tools, and fuses multi-source data, so agents make fewer API calls and spend fewer tokens on overlapping or contradictory context.

The Problem

Every production AI agent system hits the same wall:

Agent: "What's the weather in NYC?"
  → Tool A: calls OpenWeatherMap API          ← $0.001, 200ms
  → Tool B: calls WeatherAPI                  ← $0.001, 300ms
  → Tool A again (retry/loop): same API call  ← $0.001, 200ms   ← WASTED
  → Agent context now has 3 overlapping results with conflicting temperatures
  → LLM processes all 3 → extra tokens → higher cost → possible hallucination

Problem	Impact	How ToolFusion Fixes It
Duplicate tool calls — LLMs call the same tool repeatedly with identical/similar params	Wasted API calls, latency, cost	Exact + semantic caching, request coalescing
Redundant results in context — multiple tools return overlapping info	Token waste, context pollution	Two-stage deduplication (SimHash → semantic)
Conflicting tool outputs — two tools report different values for the same fact	LLM hallucinates or picks arbitrarily	Source-weighted conflict resolution
Thundering herds — cache expiry causes simultaneous re-execution storms	Backend overload, load spikes	TTL jitter + single-flight coalescing
No visibility — developers can't see what tools are doing	Silent failures, impossible debugging	Structured telemetry + result envelopes
Framework lock-in — caching techniques are framework-specific	Can't reuse across LangChain, CrewAI, OpenAI, etc.	Framework-agnostic middleware with adapters

Before vs After ToolFusion

# ❌ WITHOUT ToolFusion — every call hits the API, duplicates pile up
results = []
for query in ["python async", "python asynchronous", "async python"]:
    result = await web_search(query)   # 3 API calls for ~same query
    results.append(result)             # 3 overlapping results in context
# Agent processes all 3 → wasted tokens, possible conflicts

# ✅ WITH ToolFusion — duplicates caught, results fused
tf = ToolFusion(preset="balanced")

@tf.tool(cache_mode="semantic_ok", ttl=600)
async def web_search(query: str) -> dict:
    return await call_search_api(query)

r1 = await tf.call("web_search", {"query": "python async"})        # executes
r2 = await tf.call("web_search", {"query": "python async"})        # L1 cache hit (sub-ms)
r3 = await tf.call("web_search", {"query": "python asynchronous"}) # L2 semantic hit (~1ms)

fused = await tf.fuse([r1, r3])  # deduplicates + merges
# → 1 API call instead of 3, clean unified result, full provenance

Install

pip install toolfusion

Optional extras for production use:

# Redis cache backend + FAISS vector search
pip install "toolfusion[redis,faiss]"

# Full stack: Redis, FAISS, spaCy NER, sentence-transformers, OpenTelemetry, LLM fusion
pip install "toolfusion[all]"

Available extras

Extra	Packages	Use Case
`redis`	`redis[hiredis]`	Distributed cache backend + vector search
`faiss`	`faiss-cpu`	Fast similarity search for semantic cache
`spacy`	`spacy`	Entity extraction for fusion/conflict resolution
`transformers`	`sentence-transformers`, `torch`	Higher-accuracy embeddings (`accurate` preset)
`otel`	`opentelemetry-api`, `opentelemetry-sdk`	Distributed tracing and metrics
`llm`	`openai`	LLM-based fusion strategy
`all`	All of the above	Everything
`dev`	`pytest`, `pytest-asyncio`, `ruff`	Development/testing

Quick Start

Async (recommended)

import asyncio
from toolfusion import ToolFusion

async def main():
    async with ToolFusion(preset="balanced") as tf:

        @tf.tool(cache_mode="semantic_ok", ttl=300, freshness="daily")
        async def search(query: str) -> dict:
            # Your actual tool implementation
            return {"results": [f"Result for: {query}"]}

        # First call — executes the tool
        r1 = await tf.call("search", {"query": "python async patterns"})
        print(r1.cache_info.source)  # "miss"

        # Identical call — instant L1 cache hit
        r2 = await tf.call("search", {"query": "python async patterns"})
        print(r2.cache_info.source)  # "l1_cache"

        # Similar call — semantic L2 cache hit
        r3 = await tf.call("search", {"query": "python asynchronous patterns"})
        print(r3.cache_info.source)  # "l2_cache"

asyncio.run(main())

Sync

from toolfusion import ToolFusion

with ToolFusion(preset="fast") as tf:

    @tf.tool(cache_mode="exact_only", ttl=60)
    def calculate(x: int, y: int) -> int:
        return x + y

    r = calculate(2, 3)
    print(r.result)  # 5

Framework Adapters

ToolFusion is framework-agnostic and ships adapters for the following agent frameworks:

# LangChain
from toolfusion.adapters import langchain_adapter
wrapped_tools = langchain_adapter.wrap(your_langchain_tools, preset="balanced")

# OpenAI Agents SDK
from toolfusion.adapters import openai_adapter
wrapped_fn = openai_adapter.wrap(your_tool_function, preset="balanced")

# CrewAI
from toolfusion.adapters import crewai_adapter
wrapped_tools = crewai_adapter.wrap(your_crewai_tools, preset="balanced")

# AutoGen
from toolfusion.adapters import autogen_adapter
wrapped_fn = autogen_adapter.wrap(your_autogen_function, preset="balanced")

# MCP (Model Context Protocol)
from toolfusion.adapters import mcp_adapter
wrapped_server = mcp_adapter.wrap(your_mcp_server, preset="balanced")

# Haystack
from toolfusion.adapters import haystack_adapter
wrapped_component = haystack_adapter.wrap(your_component, preset="balanced")

How It Works

Agent Tool Call
      │
      ▼
┌─────────────────────────────┐
│  Request Interceptor        │  canonical key + secret redaction
│  ┌───────────────────────┐  │
│  │  L1 Cache (Exact)     │  │  sub-ms hash lookup
│  └───────────┬───────────┘  │
│         miss │              │
│  ┌───────────────────────┐  │
│  │  Single-Flight Gate   │  │  coalesce concurrent duplicate calls
│  └───────────┬───────────┘  │
│  ┌───────────────────────┐  │
│  │  L2 Cache (Semantic)  │  │  embedding similarity search
│  └───────────┬───────────┘  │
│         miss │              │
│  ┌───────────────────────┐  │
│  │  Tool Execution       │  │  actual tool call + circuit breaker
│  └───────────┬───────────┘  │
│  ┌───────────────────────┐  │
│  │  VAAC Admission       │  │  should we cache this result?
│  └───────────┬───────────┘  │
│  ┌───────────────────────┐  │
│  │  Dedup + Fusion       │  │  remove overlaps, resolve conflicts
│  └───────────┬───────────┘  │
│              ▼              │
│  Result Envelope            │  structured output with provenance
└─────────────────────────────┘
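
The first step in the diagram, canonical key construction with secret redaction, can be sketched as follows. This is a minimal illustration, not ToolFusion's actual key builder: the function names `canonical_key` and `redact`, the `SECRET_PARAMS` set, and the choice of SHA-256 over sorted JSON are all assumptions. The `volatile_fields` argument mirrors the per-tool policy option of the same name.

```python
from __future__ import annotations

import hashlib
import json

# Assumption: which parameter names count as secrets is a guess for this sketch.
SECRET_PARAMS = frozenset({"api_key", "token", "password"})


def redact(params: dict) -> dict:
    """Return a copy of params that is safe to log, with secret values masked."""
    return {k: "***" if k.lower() in SECRET_PARAMS else v for k, v in params.items()}


def canonical_key(tool_name: str, params: dict, volatile_fields: tuple[str, ...] = ()) -> str:
    """Hash the tool name plus its stable parameters into a deterministic key."""
    # Drop fields that change per call but not per query (e.g. request_id).
    stable = {k: v for k, v in params.items() if k not in volatile_fields}
    # sort_keys makes the serialization independent of argument order.
    payload = json.dumps(
        {"tool": tool_name, "params": stable},
        sort_keys=True,
        separators=(",", ":"),
        default=str,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

With a key like this, two calls that differ only in argument order or in a volatile field map to the same L1 entry, which is what makes the exact-match lookup a plain hash lookup.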

Configuration

Presets

Choose a preset that controls the speed/accuracy tradeoff:

Preset	Embedder	Semantic Threshold	Dedup	Fusion	Best For
`fast`	Model2Vec	0.88	SimHash only	Heuristic	High-throughput, cost-sensitive
`balanced`	Model2Vec	0.92	Hybrid (SimHash→semantic)	Heuristic	General production (default)
`accurate`	sentence-transformers	0.95	Hybrid, conservative	Heuristic + optional LLM	Medical, financial, research
`exact_only`	None	N/A	Exact hash only	Union	Maximum safety, latency-critical

tf = ToolFusion(preset="balanced")  # or "fast", "accurate", "exact_only"
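
To make the "Semantic Threshold" column concrete, here is a rough sketch of how a threshold could gate a semantic cache hit. It assumes cosine similarity over embedding vectors and a linear scan; the helper names `cosine_similarity` and `semantic_lookup` are hypothetical, and ToolFusion's real L2 lookup (which can use FAISS or Redis vector indexes) may differ.

```python
from __future__ import annotations

import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_lookup(query_vec, entries, threshold: float = 0.92):
    """Return the cached value whose embedding is most similar to query_vec,
    or None if no entry reaches the threshold.

    entries is a list of (embedding, cached_value) pairs.
    """
    best_score, best_value = threshold, None
    for vec, value in entries:
        score = cosine_similarity(query_vec, vec)
        if score >= best_score:
            best_score, best_value = score, value
    return best_value
```

A higher threshold (as in the `accurate` preset) trades hit rate for a lower chance of returning a cached answer to a question that only looks similar.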

Per-Tool Policy

Every tool can override the global preset:

@tf.tool(
    cache_mode="semantic_ok",     # off | exact_only | semantic_ok | semantic_verify
    risk="medium",                # low | medium | high (high → no semantic cache)
    freshness="daily",            # static | daily | realtime | evented
    ttl=600,                      # cache TTL in seconds
    reliability_weight=0.8,       # 0.0–1.0, used in conflict resolution
    cacheable=True,               # False → never cache (for write/mutation tools)
    max_result_size=524288,       # max bytes to cache
    dedup_strategy="hybrid",      # exact | simhash | minhash | semantic | hybrid | none
    volatile_fields=["request_id", "timestamp"],  # stripped before cache key
    depends_on=["other_tool"],    # invalidation cascade
)
async def my_tool(query: str) -> dict:
    ...

YAML Configuration

Generate a config file:

toolfusion init --preset balanced

This creates toolfusion.yaml with all settings. See USAGE.md for the full config reference.

tf = ToolFusion(config="toolfusion.yaml")

Result Envelope

Every call returns a ToolFusionResult — never a raw value:

result = await tf.call("my_tool", {"query": "test"})

result.result              # The actual tool output
result.cache_info.source   # "miss" | "l1_cache" | "l1_cache_stale" | "l2_cache"
result.cache_info.hit      # True/False
result.latency.total_ms    # End-to-end latency
result.latency.execute_ms  # Time spent in actual tool execution
result.sources             # Provenance: which tools contributed
result.conflicts           # Detected conflicts with resolution details
result.dedup_stats         # How many duplicates were removed
result.errors              # Per-tool errors with retryable flag
result.degraded            # True if result is partial due to failures
result.tokens_saved        # Estimated tokens saved by caching/dedup
result.metadata            # Your custom metadata

Multi-Tool Fusion

When multiple tools return data for the same query, fuse them:

r1 = await tf.call("weather_api_a", {"city": "NYC"})
r2 = await tf.call("weather_api_b", {"city": "NYC"})

fused = await tf.fuse([r1, r2])
print(fused.result)                          # Merged, deduplicated result
print(fused.conflicts)                       # Any conflicting values with resolution
print(fused.dedup_stats.duplicates_removed)  # Overlapping data removed
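
One plausible reading of "source-weighted conflict resolution" is a weighted vote in which each tool's `reliability_weight` counts toward the value it reported. The sketch below illustrates that idea only; `resolve_conflict` is a hypothetical helper, and the README does not specify the algorithm `tf.fuse` actually uses.

```python
from __future__ import annotations

from collections import defaultdict


def resolve_conflict(observations):
    """Pick the value with the highest total reliability weight.

    observations is a list of (source, value, reliability_weight) tuples.
    Returns (winning_value, sources_that_disagreed).
    """
    if not observations:
        raise ValueError("need at least one observation")
    totals = defaultdict(float)
    for _source, value, weight in observations:
        totals[value] += weight
    winner = max(totals, key=totals.get)
    dissenters = [source for source, value, _ in observations if value != winner]
    return winner, dissenters


# Two lower-weight sources that agree can outvote one higher-weight source.
value, dissenters = resolve_conflict([
    ("weather_api_a", 21, 0.8),
    ("weather_api_b", 23, 0.6),
    ("weather_api_c", 23, 0.5),
])
```

Returning the dissenting sources alongside the winner keeps the provenance that a field like `fused.conflicts` would need to report.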

CLI

toolfusion doctor              # Check dependencies and runtime config
toolfusion init                # Generate toolfusion.yaml
toolfusion config --interactive  # Interactive config wizard
toolfusion stats               # Runtime statistics
toolfusion stats --live        # Live dashboard
toolfusion bench               # Run benchmark suite
toolfusion bench --compare     # Compare all presets
toolfusion cache inspect       # Inspect cache entries
toolfusion cache clear         # Clear all caches
toolfusion explain <key>       # Explain a specific cache entry

Key Features

Single-Flight Request Coalescing

When 10 concurrent calls request the same tool with the same parameters, only 1 actually executes. The other 9 wait and share the result:

# 10 concurrent identical calls → only 1 execution
results = await asyncio.gather(*[
    tf.call("slow_api", {"q": "test"}) for _ in range(10)
])
# All 10 get the same result, but the API was called only once

Uses asyncio.shield() to prevent follower cancellation from killing the leader
Configurable leader timeout prevents stuck requests
Background sweeper cleans up orphaned entries
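
The coalescing pattern itself fits in a few lines of plain asyncio. This is a minimal sketch of the general single-flight technique, not ToolFusion's engine: the `SingleFlight` class is hypothetical, and it omits the leader timeout and background sweeper mentioned above.

```python
import asyncio


class SingleFlight:
    """Share one in-flight execution among concurrent callers with the same key."""

    def __init__(self):
        self._inflight = {}  # key -> asyncio.Task running the leader's call

    async def do(self, key, fn):
        task = self._inflight.get(key)
        if task is None:
            # First caller becomes the leader and starts the real work.
            task = asyncio.create_task(fn())
            self._inflight[key] = task
            task.add_done_callback(lambda _t: self._inflight.pop(key, None))
        # shield() means cancelling one waiter does not cancel the shared task.
        return await asyncio.shield(task)


async def demo():
    calls = 0

    async def slow_api():
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)
        return "result"

    sf = SingleFlight()
    results = await asyncio.gather(*[sf.do("slow_api:test", slow_api) for _ in range(10)])
    return calls, results
```

Running `asyncio.run(demo())` executes `slow_api` once while all ten callers receive the same result.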

Two-Stage Deduplication

When multiple results overlap, ToolFusion removes redundancy in two stages:

Stage 1 (Fast): SimHash/MinHash fingerprinting flags near-duplicate candidates with a cheap, constant-time comparison per pair of fingerprints
Stage 2 (Precise): Embedding-based semantic similarity confirms or rejects those candidates

From each cluster of duplicates, ToolFusion keeps the most informative representative, preferring results that are longer, contain more entities, and are more recent.
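
For readers unfamiliar with SimHash, the following sketch shows the Stage 1 idea: similar texts produce fingerprints that differ in few bits, so a Hamming distance check is a cheap first filter. Whitespace tokenization, BLAKE2b token hashes, and the `max_distance` default are assumptions made for this example, not ToolFusion's settings.

```python
import hashlib


def simhash(text: str, bits: int = 64) -> int:
    """Compute a SimHash fingerprint from whitespace-separated tokens."""
    counts = [0] * bits
    for token in text.lower().split():
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=bits // 8).digest()
        h = int.from_bytes(digest, "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1
    # A bit is set in the fingerprint when the positive votes win.
    return sum(1 << i for i, c in enumerate(counts) if c > 0)


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(text_a: str, text_b: str, max_distance: int = 3) -> bool:
    """Stage 1 check: candidates whose fingerprints differ in few bits."""
    return hamming_distance(simhash(text_a), simhash(text_b)) <= max_distance
```

Pairs that pass this filter would then go to the embedding-based check, which is slower but can tell paraphrases from texts that merely share many tokens.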

VAAC Cache Admission

Not every result is worth caching. VAAC (Value-Aware Admission Control) uses a multi-armed bandit to decide:

High-latency tools → more valuable to cache
Stable results → more cacheable
Frequently called tools → benefit more from caching
Write/mutation tools → never cached (enforced)
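
As a rough illustration of bandit-based admission, the sketch below uses a two-armed epsilon-greedy bandit ("admit" vs "skip") per tool. It is a deliberately simplified stand-in, not the VAAC algorithm: the `AdmissionBandit` class, the reward definition (for example, milliseconds saved by later cache hits minus a storage cost), and the epsilon default are all assumptions.

```python
from __future__ import annotations

import random


class AdmissionBandit:
    """Epsilon-greedy choice between caching a result and skipping it."""

    ARMS = ("admit", "skip")

    def __init__(self, epsilon: float = 0.1, rng: random.Random | None = None):
        self.epsilon = epsilon
        self.rng = rng or random.Random()
        self.counts = {arm: 0 for arm in self.ARMS}
        self.values = {arm: 0.0 for arm in self.ARMS}  # running mean reward

    def choose(self, cacheable: bool = True) -> str:
        if not cacheable:
            return "skip"  # write/mutation tools are never cached
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.ARMS)  # explore
        return max(self.ARMS, key=lambda arm: self.values[arm])  # exploit

    def update(self, arm: str, reward: float) -> None:
        """Fold an observed reward into the arm's running mean."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

Under this framing, a slow, stable, frequently called tool earns high rewards for "admit" and ends up cached most of the time, which matches the signals listed above.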

Stale-While-Revalidate

For eligible tools, expired cache entries are served immediately while a background refresh happens:

@tf.tool(freshness="daily", cache_mode="semantic_ok")
async def news_feed(topic: str) -> dict:
    ...
# After TTL expires: serves stale result instantly, refreshes in background
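
The stale-while-revalidate behavior can be sketched with a small in-memory cache. This is an illustration of the pattern under stated assumptions, not ToolFusion's cache: `SWRCache`, its `get` signature, and the "miss"/"fresh"/"stale" labels are hypothetical, and the clock is injectable so the example can be tested without waiting.

```python
import asyncio
import time


class SWRCache:
    """Serve expired entries immediately and refresh them in the background."""

    def __init__(self, ttl: float, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store = {}       # key -> (value, stored_at)
        self._refreshing = {}  # key -> background refresh task

    async def get(self, key, fetch):
        entry = self._store.get(key)
        if entry is None:
            return await self._refresh(key, fetch), "miss"
        value, stored_at = entry
        if self.clock() - stored_at <= self.ttl:
            return value, "fresh"
        # Expired: start at most one background refresh, return the old value now.
        if key not in self._refreshing:
            task = asyncio.create_task(self._refresh(key, fetch))
            self._refreshing[key] = task
            task.add_done_callback(lambda _t: self._refreshing.pop(key, None))
        return value, "stale"

    async def _refresh(self, key, fetch):
        value = await fetch()
        self._store[key] = (value, self.clock())
        return value


async def demo():
    now = [0.0]
    versions = iter(["v1", "v2"])

    async def fetch():
        return next(versions)

    cache = SWRCache(ttl=10, clock=lambda: now[0])
    first = await cache.get("news", fetch)
    second = await cache.get("news", fetch)
    now[0] = 11.0                      # move past the TTL
    third = await cache.get("news", fetch)
    await asyncio.sleep(0.01)          # give the background refresh time to finish
    fourth = await cache.get("news", fetch)
    return first, second, third, fourth
```

The caller that arrives just after expiry still gets an immediate answer, and only later callers see the refreshed value.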

Circuit Breaker

Tools that fail repeatedly are temporarily disabled:

After 5 consecutive failures → circuit opens (tool calls rejected immediately)
After 30s recovery timeout → circuit half-opens (one test call allowed)
On success → circuit closes (normal operation resumes)
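
The three transitions above form a small state machine. The sketch below is a synchronous illustration that uses the thresholds quoted in the list (5 failures, 30s); the `CircuitBreaker` and `CircuitOpenError` names are hypothetical, and ToolFusion's own breaker lives in its async execution layer and may behave differently in details.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.state == "open":
            if self.clock() - self.opened_at < self.recovery_timeout:
                raise CircuitOpenError("circuit open; call rejected")
            self.state = "half_open"  # allow one test call through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed test call, or too many consecutive failures, opens the circuit.
            if self.state == "half_open" or self.failures >= self.failure_threshold:
                self.state = "open"
                self.opened_at = self.clock()
            raise
        # Any success resets the failure count and closes the circuit.
        self.failures = 0
        self.state = "closed"
        return result
```

Injecting the clock keeps the recovery timeout testable without sleeping for 30 seconds.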

Project Layout

toolfusion/
├── core.py               # Main ToolFusion class
├── cli.py                # CLI commands (typer)
├── bench.py              # Benchmark suite
├── adapters/             # Framework adapters (LangChain, OpenAI, CrewAI, etc.)
├── cache/                # L1/L2 cache backends (memory, SQLite, Redis)
│   ├── backends/         # Cache storage implementations
│   └── vector/           # Vector index implementations (numpy, FAISS, Redis)
├── components/           # Core algorithms
│   ├── admission.py      # VAAC cache admission policy
│   ├── dedup.py          # Two-stage deduplication
│   ├── embedder.py       # Embedding providers (Model2Vec, sentence-transformers)
│   ├── fusion.py         # Cross-tool fusion + conflict resolution
│   └── key_builder.py    # Canonical cache key construction
├── config/               # Configuration system + presets
├── engine/               # Runtime, single-flight, telemetry
├── execution/            # Tool executor + circuit breaker
├── infra/                # Factories, utilities
├── orchestration/        # Pipeline orchestration logic
├── schema/               # Data models, protocols, errors
└── security/             # HMAC signing for cache integrity

Docs

Document	Description
API Reference	Complete API documentation with all parameters
Usage Guide	Practical how-to guide with examples
Spec	Full technical specification
Changelog	Version history
Contributing	How to contribute

Research

ToolFusion's design is informed by published research and engineering reports:

Paper / Source	Key Finding	ToolFusion Application
ToolCaching (arXiv:2601.15335)	VAAC: 11% higher hit ratio, 34% lower latency vs LRU/LFU	Cache admission engine
GPT Semantic Cache (arXiv:2411.05276)	68.8% API call reduction, 97%+ accuracy	L2 semantic cache
Model2Vec (MinishLab)	500x faster embeddings, ~30MB, numpy-only	Default embedder
Discord singleflight	7.6x RPS improvement	Request coalescing

License

MIT

This version

0.1.0

Feb 12, 2026

Download files

Source Distribution

toolfusion-0.1.0.tar.gz (86.2 kB view details)

Uploaded Feb 12, 2026 Source

Built Distribution

toolfusion-0.1.0-py3-none-any.whl (81.8 kB view details)

Uploaded Feb 12, 2026 Python 3

File details

Details for the file toolfusion-0.1.0.tar.gz.

File metadata

Download URL: toolfusion-0.1.0.tar.gz
Upload date: Feb 12, 2026
Size: 86.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for toolfusion-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`ba5c98ddb86b4fe7165b49c9638ef077b250ca292bc1408ec4d974f0267446e6`
MD5	`b14f15ef0d2c220c09494d2164f5ad98`
BLAKE2b-256	`4300a6ebd2f2908cd9f7ea85c052b3900c20103b2b3dc8ae895e80f76152d5e1`

File details

Details for the file toolfusion-0.1.0-py3-none-any.whl.

File metadata

Download URL: toolfusion-0.1.0-py3-none-any.whl
Upload date: Feb 12, 2026
Size: 81.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for toolfusion-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d9f3acb9b7aaeae9e2fe8937112f094a99a63dc6df3a6c43a242e926e5f30424`
MD5	`6871586c393277fd0e0bc5a9df435e16`
BLAKE2b-256	`b4617d8fb04d2973f29e04053703d7afbf59e567ab1800a9de7c2fb39e3e19b8`
