Client-side BM25 tool search for LLM APIs — Anthropic, OpenAI-compatible, and MCP
Dehydrator
Client-side BM25 tool search for LLM APIs. Use thousands of tools without bloating the context window.
Works with Anthropic, OpenAI, and any OpenAI-compatible provider (Groq, OpenRouter, Chutes, etc.). Accepts tools from MCP servers natively.
The problem
LLM APIs require you to send all tool definitions in every request. With 100+ tools, this wastes tokens and degrades tool selection. Anthropic offers a server-side tool_search_tool_bm25, but it is not available on all platforms (e.g. Bedrock) and does not work with zero data retention (ZDR). Dehydrator gives you the same capability client-side, so it works with any provider.
How it works
Dehydrator wraps your LLM client and replaces the full tool list with a single tool_search tool. When the model needs a tool, it searches by description. Dehydrator intercepts the call, runs BM25 locally, and re-calls the API with only the matched tools injected.
User request
              │
              ▼
┌─────────────────────────────┐
│ API call #1                 │
│ tools = [tool_search]       │
│                             │
│ Model responds:             │
│ tool_search("send email")   │
└─────────────┬───────────────┘
              │ intercepted by Dehydrator
              ▼
┌─────────────────────────────┐
│ BM25 search (local)         │
│ → matches: send_email,      │
│   send_slack_message        │
└─────────────┬───────────────┘
              │
              ▼
┌─────────────────────────────┐
│ API call #2                 │
│ tools = [tool_search,       │
│          send_email,        │
│          send_slack_message]│
│                             │
│ Model responds:             │
│ send_email({...})           │
└─────────────────────────────┘
              │
              ▼
Returned to you
Only the tools the model actually needs are ever sent. Discovered tools persist across turns within a conversation.
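The loop above can be sketched in a few lines. This is a minimal illustration, not Dehydrator's actual implementation: CATALOG, local_search (a word-overlap stand-in for BM25), fake_model, and create_with_search are all hypothetical names, and the model is a stub so the example runs without an API key.

```python
from typing import Callable

# Hypothetical catalog: tool name -> description.
CATALOG = {
    "send_email": "Send an email message to a recipient",
    "send_slack_message": "Send a message to a Slack channel",
    "get_weather": "Get the weather forecast for a city",
}


def local_search(query: str, top_k: int) -> list[str]:
    """Stand-in for BM25: rank tools by words shared with the query."""
    words = set(query.lower().split())
    scored = []
    for name, description in CATALOG.items():
        overlap = len(words & set(description.lower().split()))
        if overlap:
            scored.append((overlap, name))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_k]]


def fake_model(visible_tools: list[str]) -> dict:
    """Stub model: searches first, then calls send_email once it is visible."""
    if "send_email" in visible_tools:
        return {"tool": "send_email", "input": {"to": "a@example.com"}}
    return {"tool": "tool_search", "input": {"query": "send email"}}


def create_with_search(
    model: Callable[[list[str]], dict],
    top_k: int = 5,
    max_search_rounds: int = 3,
) -> tuple[dict, list[str]]:
    """Call the model, intercept tool_search calls, and retry with the matches."""
    discovered: list[str] = []
    rounds = 0
    while True:
        visible = ["tool_search", *discovered]
        reply = model(visible)
        # Hand back anything that is not a search, or stop when the budget is spent.
        if reply["tool"] != "tool_search" or rounds >= max_search_rounds:
            return reply, visible
        rounds += 1
        for name in local_search(reply["input"]["query"], top_k):
            if name not in discovered:
                discovered.append(name)


reply, visible = create_with_search(fake_model)
print(reply["tool"], visible)
```

In this sketch the second model call sees tool_search plus the two matched tools, which mirrors API call #2 in the diagram.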
Installation
pip install dehydrator
Quick start
Anthropic
import anthropic
from dehydrator import DehydratedClient
client = DehydratedClient(
    anthropic.Anthropic(),
    tools=tools,
    top_k=5,
)
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
)
The response is a standard anthropic.types.Message.
OpenAI-compatible (OpenAI, Groq, OpenRouter, Chutes, etc.)
from openai import OpenAI
from dehydrator import OpenAIDehydratedClient
client = OpenAIDehydratedClient(
    OpenAI(),
    tools=tools,
    top_k=5,
)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
)
Works with any client that implements client.chat.completions.create(). No openai import required — fully duck-typed.
MCP tools
Tools from MCP servers use inputSchema (camelCase) instead of input_schema. Dehydrator accepts both formats automatically:
# MCP-format tool dicts work directly
tools = [
    {"name": "get_weather", "description": "...", "inputSchema": {...}},
]
client = DehydratedClient(anthropic.Anthropic(), tools=tools)
# Or use mcp.types.Tool objects with ToolIndex.from_mcp()
# (run inside an async function with an initialized mcp.ClientSession named session)
from dehydrator import ToolIndex
result = await session.list_tools()  # ListToolsResult; result.tools is list[mcp.types.Tool]
index = ToolIndex.from_mcp(result.tools, top_k=5)
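To make the two accepted key spellings concrete, here is a minimal sketch of how an MCP-style dict could be normalized to the snake_case form. The helper name normalize_tool is hypothetical and this is not Dehydrator's internal code; it only illustrates the inputSchema to input_schema mapping described above.

```python
from typing import Any


def normalize_tool(tool: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the tool dict that uses the `input_schema` key."""
    normalized = dict(tool)  # shallow copy so the caller's dict is untouched
    if "inputSchema" in normalized:
        schema = normalized.pop("inputSchema")
        # Keep an existing input_schema if both spellings are present.
        normalized.setdefault("input_schema", schema)
    return normalized


mcp_tool = {
    "name": "get_weather",
    "description": "Get the weather forecast for a city",
    "inputSchema": {"type": "object", "properties": {"city": {"type": "string"}}},
}
print(normalize_tool(mcp_tool))
```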
API
DehydratedClient(client, tools, *, top_k=5, always_available=None, max_search_rounds=3)
Wraps an anthropic.Anthropic client.
| Parameter | Type | Description |
|---|---|---|
| client | anthropic.Anthropic | An Anthropic SDK client instance |
| tools | list[dict] | Tool definitions (Anthropic or MCP format) |
| top_k | int | Max tools returned per search (default: 5) |
| always_available | list[str] | Tool names to include in every request, bypassing search |
| max_search_rounds | int | Max search iterations per create() call (default: 3) |
Methods
- client.messages.create(**kwargs): Same signature as the Anthropic SDK. The tools kwarg is ignored (Dehydrator manages tools). Returns anthropic.types.Message.
- client.reset_discoveries(): Clears discovered tools. Call this when starting a new conversation.
- client.inner: Access the underlying anthropic.Anthropic client.
AsyncDehydratedClient
Same API as DehydratedClient, but wraps anthropic.AsyncAnthropic and create() is async.
OpenAIDehydratedClient(client, tools, *, top_k=5, always_available=None, max_search_rounds=3)
Wraps any OpenAI-compatible client.
| Parameter | Type | Description |
|---|---|---|
| client | any | Any client with client.chat.completions.create() |
| tools | list[dict] | Tool definitions (Anthropic or MCP format, converted to OpenAI format automatically) |
| top_k | int | Max tools returned per search (default: 5) |
| always_available | list[str] | Tool names to include in every request, bypassing search |
| max_search_rounds | int | Max search iterations per create() call (default: 3) |
Methods
- client.chat.completions.create(**kwargs): Same signature as the OpenAI SDK. The tools kwarg is ignored. Returns the provider's response object.
- client.reset_discoveries(): Clears discovered tools.
- client.inner: Access the underlying client.
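The automatic conversion mentioned in the parameter table can be pictured as follows. This is a sketch of the general mapping from an Anthropic or MCP style definition to the Chat Completions function-tool shape, not Dehydrator's own converter; the helper name to_openai_tool and the empty-schema fallback are assumptions.

```python
from typing import Any


def to_openai_tool(tool: dict[str, Any]) -> dict[str, Any]:
    """Map an Anthropic/MCP style tool dict to an OpenAI function tool."""
    schema = (
        tool.get("input_schema")
        or tool.get("inputSchema")
        or {"type": "object", "properties": {}}  # assumed fallback for schema-less tools
    )
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "parameters": schema,
        },
    }


anthropic_tool = {
    "name": "get_weather",
    "description": "Get the weather forecast for a city",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}
print(to_openai_tool(anthropic_tool))
```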
AsyncOpenAIDehydratedClient
Same API as OpenAIDehydratedClient, but create() is async.
ToolIndex
The BM25 index is also available standalone if you want to use it directly.
from dehydrator import ToolIndex
index = ToolIndex(tools, top_k=5)
matched_names = index.search("weather forecast")
matched_tools = index.get_tools(matched_names)
# From MCP Tool objects
index = ToolIndex.from_mcp(mcp_tools, top_k=5)
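For intuition about what a BM25 index computes, here is a minimal pure-Python Okapi BM25 ranker. It is a sketch under assumptions, not the ToolIndex implementation: the tokenizer, the k1 and b defaults, and the choice to index each tool's name together with its description are illustrative, and bm25_rank is a hypothetical name.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumerics, so get_weather -> get, weather."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query: str, docs: dict[str, str], top_k: int = 5,
              k1: float = 1.5, b: float = 0.75) -> list[str]:
    """Return up to top_k document names ordered by Okapi BM25 score."""
    if not docs:
        return []
    tokenized = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized.values() for term in set(tokens))

    scores: dict[str, float] = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        score = 0.0
        for term in set(tokenize(query)):
            tf = counts[term]
            if tf == 0:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        if score > 0:
            scores[name] = score
    return sorted(scores, key=lambda name: scores[name], reverse=True)[:top_k]


docs = {
    "get_weather": "get_weather Get the weather forecast for a city",
    "send_email": "send_email Send an email message to a recipient",
    "list_files": "list_files List files in a directory",
}
print(bm25_rank("weather forecast", docs))
```

Tools that share no term with the query score zero and are dropped, so a search can return fewer than top_k names.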
Always-available tools
Some tools should always be in context (e.g. a help tool). Pass their names to always_available:
client = DehydratedClient(
    anthropic.Anthropic(),
    tools=tools,
    always_available=["help", "get_current_user"],
)
These tools are sent in every request without requiring a search.
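One way to picture the tool list sent with each request is as the search tool, then the always-available names, then whatever has been discovered so far, with duplicates removed. The helper below is a hypothetical sketch of that assembly; the ordering is an assumption, not documented behavior.

```python
def visible_tool_names(always_available: list[str], discovered: list[str]) -> list[str]:
    """Combine the search tool, pinned tools, and discovered tools without duplicates."""
    names = ["tool_search"]
    for name in [*always_available, *discovered]:
        if name not in names:
            names.append(name)
    return names


print(visible_tool_names(["help", "get_current_user"], ["send_email", "help"]))
```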
Multi-turn conversations
Discovered tools persist across calls to create(). If the model found send_email in turn 1, it's still available in turn 2 without re-searching.
Call client.reset_discoveries() when starting a new conversation:
# Turn 1: model discovers send_email
response = client.messages.create(...)
# Turn 2: send_email is still available
response = client.messages.create(...)
# New conversation
client.reset_discoveries()
Benchmarks
Benchmarked against 139 real tool definitions from 6 popular MCP servers (Chrome DevTools, GitHub, Playwright, Filesystem, Git, Notion).
Token savings
Sending all tools in every request is expensive. Dehydrator replaces them with a single tool_search tool and only injects the tools the model actually needs:
| Tools | top_k=3 | top_k=5 | top_k=10 | Baseline |
|---|---|---|---|---|
| 50 | 274 tokens (94%) | 349 tokens (93%) | 678 tokens (86%) | 4,864 |
| 100 | 274 tokens (97%) | 349 tokens (96%) | 678 tokens (92%) | 8,954 |
| 200 | 274 tokens (98%) | 349 tokens (98%) | 678 tokens (96%) | 18,159 |
With 200 tools and top_k=5, you go from 18,159 → 349 tokens per request — a 98% reduction.
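The percentages in the table follow directly from the token counts. The short script below recomputes them from the figures quoted above; reduction_percent is just an illustrative helper name.

```python
def reduction_percent(baseline_tokens: int, dehydrated_tokens: int) -> float:
    """Percentage of tool-definition tokens saved relative to the baseline."""
    return 100 * (1 - dehydrated_tokens / baseline_tokens)


# Token counts copied from the table above.
BASELINES = {50: 4864, 100: 8954, 200: 18159}
PER_TOP_K = {3: 274, 5: 349, 10: 678}

for n_tools, baseline in BASELINES.items():
    row = ", ".join(
        f"top_k={k}: {reduction_percent(baseline, tokens):.0f}%"
        for k, tokens in PER_TOP_K.items()
    )
    print(f"{n_tools} tools -> {row}")
```

The dehydrated cost depends only on top_k, while the baseline grows with the number of tools, which is why the savings increase with catalog size.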
Search quality
BM25 finds the right tools reliably across all 6 MCP servers:
| Metric | k=3 | k=5 | k=10 |
|---|---|---|---|
| Precision@k | 51.1% | 32.7% | 17.3% |
| Recall@k | 88.6% | 95.3% | 98.3% |
| MRR | 95.8% |
30/30 test queries found at least one correct tool in the top 10. The right tool is ranked #1 or #2 in almost every case.
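For readers unfamiliar with the metrics, these are the standard per-query definitions, averaged over queries to get the table values. The sketch below assumes the benchmark uses these textbook forms; the actual benchmark script may differ in details.

```python
def precision_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    return sum(1 for name in ranked[:k] if name in relevant) / k


def recall_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant tools that appear in the top-k results."""
    return sum(1 for name in ranked[:k] if name in relevant) / len(relevant)


def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """1 / position of the first relevant result, or 0.0 if none is found."""
    for position, name in enumerate(ranked, start=1):
        if name in relevant:
            return 1 / position
    return 0.0


ranked = ["send_email", "send_slack_message", "get_weather"]
relevant = {"send_email"}
print(precision_at_k(ranked, relevant, 3))
print(recall_at_k(ranked, relevant, 3))
print(reciprocal_rank(ranked, relevant))
```

Falling precision at larger k is expected: a query with only one or two correct tools cannot exceed 10% or 20% precision at k=10, so recall and MRR are the more informative numbers here.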
Run the benchmarks
uv run python benchmarks/search_quality.py # local, no API key
uv run python benchmarks/token_savings_openai.py # local, uses tiktoken
Limitations
- No streaming: stream=True raises NotImplementedError. Planned for a future release.
- Reserved tool name: You cannot have a tool named tool_search. Dehydrator will raise ValueError if you do.
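If your tool definitions come from third-party MCP servers, it can help to check for the reserved name before constructing a client. The pre-flight check below is a hypothetical helper, not part of Dehydrator; it simply mirrors the ValueError behavior described above.

```python
RESERVED_NAME = "tool_search"


def check_tool_names(tools: list[dict]) -> None:
    """Raise ValueError if any tool uses the name reserved for the search tool."""
    for tool in tools:
        if tool.get("name") == RESERVED_NAME:
            raise ValueError(f"tool name {RESERVED_NAME!r} is reserved")


check_tool_names([{"name": "get_weather"}, {"name": "send_email"}])  # passes silently
```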
Development
git clone https://github.com/Arrmlet/dehydrator.git
cd dehydrator
uv sync
uv run pytest # tests
uv run ruff check src/ # lint
uv run mypy src/ # type check
License
MIT