Anthropic's Advisor Strategy as a drop-in DeepAgents middleware — pair a powerful advisor with a fast executor

These details have not been verified by PyPI

Project links

Project description

`advisor-middleware`

Anthropic's Advisor Strategy as a drop-in DeepAgents middleware.

Problem • How it works • Quick start • Configuration • Benchmark

Open-source implementation of Anthropic's Advisor Strategy — a pattern that pairs a fast, cheap executor model with a powerful advisor model. The executor runs end-to-end; the advisor is consulted only on critical decisions. Result: better performance at lower cost.

The Advisor Strategy

advisor-middleware makes this a single import for DeepAgents. It handles provider detection, native API routing, fallback invocation, cost guardrails, and context curation — so you just plug it in and your agents get smarter.

The Problem

Traditional sub-agent pattern	Advisor Strategy
Large orchestrator decomposes work into tasks	Small executor drives end-to-end
Expensive model runs every turn	Expensive model consulted only when needed
Worker pools + orchestration overhead	Zero orchestration — just a tool call
Hard to predict costs	`max_uses` guardrail caps advisor spend

"It makes better architectural decisions on complex tasks while adding no overhead on simple ones. The plans and trajectories are night and day different." — Eric Simmons, CEO and Founder

How It Works

flowchart TD
    A["Executor (Sonnet/Haiku)"] -->|"Runs every turn"| B{"Stuck on a\nhard decision?"}
    B -->|No| C["Continue executing\n(read, write, search, execute)"]
    C --> A
    B -->|Yes| D["Call advisor tool"]
    D --> E["Advisor (Opus)\nReviews shared context"]
    E -->|"Returns plan/correction/stop"| F["Executor resumes\nwith guidance"]
    F --> A

    style A fill:#fff,stroke:#333,color:#333
    style E fill:#c0392b,color:#fff
    style C fill:#2d6a4f,color:#fff
    style F fill:#2d6a4f,color:#fff

The middleware operates in two modes depending on the executor's provider:

Native mode (Anthropic executor): Injects the advisor_20260301 server-side tool spec. The API handles everything internally — zero extra round-trips, zero overhead on simple turns.
Fallback mode (any provider): Exposes an advisor tool backed by a direct LLM call to the advisor model. Works with any executor/advisor combination.

Quick Start

Install

pip install advisor-middleware

# Or from source
pip install git+https://github.com/emanueleielo/advisor-middleware.git

Minimal — zero config

from deepagents import create_deep_agent
from advisor_middleware import AdvisorMiddleware

mw = AdvisorMiddleware(advisor_model="claude-opus-4-6")

agent = create_deep_agent(
    model="anthropic:claude-sonnet-4-6",
    system_prompt="You are a senior software engineer.",
    backend=backend,
    middleware=[mw],
)

That's it. Sonnet executes, Opus advises. The middleware auto-detects Anthropic and uses the native API tool — no extra configuration needed.

Cross-provider

from advisor_middleware import AdvisorMiddleware, AdvisorConfig

mw = AdvisorMiddleware(
    config=AdvisorConfig(
        advisor_model="anthropic:claude-opus-4-6",
        prefer_native=False,  # force fallback mode
        max_uses_per_turn=2,
    ),
)

agent = create_deep_agent(
    model="openai:gpt-4o",  # any provider as executor
    middleware=[mw],
)

With compact-middleware

from advisor_middleware import AdvisorMiddleware
from compact_middleware import CompactionMiddleware, CompactionToolMiddleware

advisor_mw = AdvisorMiddleware(advisor_model="claude-opus-4-6")
compact_mw = CompactionMiddleware(model="anthropic:claude-sonnet-4-6", backend=backend)
compact_tool_mw = CompactionToolMiddleware(compact_mw)

agent = create_deep_agent(
    model="anthropic:claude-sonnet-4-6",
    backend=backend,
    middleware=[advisor_mw, compact_mw, compact_tool_mw],
)

Configuration

`AdvisorConfig`

Parameter	Type	Default	Description
`advisor_model`	`str \| BaseChatModel`	`"claude-opus-4-6"`	Advisor model ID or resolved instance
`max_uses_per_turn`	`int`	`3`	Max advisor calls per agent turn
`max_uses_per_session`	`int \| None`	`None`	Lifetime cap (None = unlimited)
`prefer_native`	`bool`	`True`	Use native `advisor_20260301` when possible
`max_tokens`	`int`	`1024`	Max tokens the advisor can generate per consultation
`temperature`	`float`	`1.0`	Advisor temperature (fallback mode only)
`advisor_system_prompt`	`str \| None`	`None`	Override advisor prompt (fallback only)
`context`	`ContextCurationConfig`	(see below)	Controls context forwarded to advisor

`ContextCurationConfig`

Parameter	Type	Default	Description
`include_system_prompt`	`bool`	`True`	Forward executor's system prompt
`include_tool_results`	`bool`	`True`	Include tool results in context
`max_context_messages`	`int \| None`	`None`	Limit messages sent to advisor
`max_context_chars`	`int \| None`	`None`	Hard character budget for context

Cost control example

config = AdvisorConfig(
    max_uses_per_turn=2,        # max 2 consultations per turn
    max_uses_per_session=10,    # max 10 total in the session
    context=ContextCurationConfig(
        max_context_messages=10,  # only last 10 messages
        include_tool_results=False,  # skip bulky tool outputs
    ),
)

Native vs Fallback

	Native (`advisor_20260301`)	Fallback (LLM call)
When	Anthropic executor + `prefer_native=True`	Any other executor, or `prefer_native=False`
How	Server-side tool spec injected into API call	Direct LLM call to advisor model
Round-trips	0 extra (handled by API)	1 per consultation
Overhead	Zero on simple turns	Minimal (only when called)
Model freedom	Anthropic advisor only	Any model as advisor
Context curation	Handled by API	Configurable via `ContextCurationConfig`

The middleware auto-detects the executor's provider and routes accordingly. You can force fallback mode with prefer_native=False for full control over context curation.

Benchmark

We tested with a real debugging task: a 4-file async task queue system (connection pool + circuit breaker + rate limiter + retry logic) with interacting bugs that cause tasks to be silently dropped under load. The agent must read all files, diagnose cross-component interactions, and fix every bug.

python examples/benchmark.py

Results: Haiku solo vs Haiku + Opus Advisor

	Haiku Solo	Haiku + Opus Advisor
Tests passing	11/12	12/12
Turns	11	6
File writes	7 (3 rewrites)	3 (all correct first try)
Advisor calls	0	1
Duration	210.7s	90.3s

What happened: Haiku solo rewrote connection.py four times, going in circles trying to fix the semaphore leak. It never solved the circuit breaker recovery issue.

Haiku + Advisor consulted Opus once after reading all files. Opus confirmed the bug diagnosis, corrected a proposed fix, and flagged an issue Haiku missed. Haiku then wrote all three fixes correctly on the first attempt.

Why it works

The advisor doesn't help on simple tasks — Haiku handles routine reads, writes, and obvious fixes alone. The value shows on cross-file reasoning where Haiku gets stuck in trial-and-error loops:

Opus identified that a semaphore release was needed for each discarded connection, not just one
Opus correctly noted the circuit breaker race condition is a non-issue under asyncio's GIL (avoiding an unnecessary lock)
Opus flagged that rate-limit timeouts should requeue tasks without consuming retry attempts

One well-timed consultation eliminated multiple cycles of incorrect rewrites.

Anthropic's benchmarks

From the original blog post:

Config	SWE-bench Multilingual	Cost per task
Sonnet + Opus Advisor	+2.7pp vs Sonnet solo	-11.9%
Haiku + Opus Advisor (BrowseComp)	41.2% vs 19.7% solo	85% cheaper than Sonnet

Introspection

Track advisor usage programmatically:

mw = AdvisorMiddleware(advisor_model="claude-opus-4-6")

# ... after agent execution ...

print(f"Total consultations: {mw.get_total_uses()}")
print(f"Total advisor tokens: {mw.get_total_advisor_tokens()}")

for event in mw.get_events():
    print(f"  Turn {event['turn']}: {event['strategy']} — {event['advisor_tokens']} tokens")
    print(f"    Q: {event['question'][:80]}...")
    print(f"    A: {event['advice'][:80]}...")

Architecture

advisor_middleware/
├── __init__.py        # Public API: AdvisorMiddleware, AdvisorConfig, ...
├── middleware.py       # Core middleware — dual-mode wrap_model_call
├── config.py          # AdvisorConfig + ContextCurationConfig dataclasses
├── state.py           # AdvisorState + AdvisorEvent TypedDicts
├── prompts.py         # Executor + advisor system prompts
├── providers.py       # Provider detection, native spec, fallback invocation
└── py.typed           # PEP 561 type marker

Development

# Install with dev dependencies
pip install -e ".[dev,deepagents,anthropic]"

# Run tests
pytest

# Lint
ruff check advisor_middleware/

# Type check
mypy advisor_middleware/

License

MIT — Emanuele Ielo

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.1.1

Apr 10, 2026

This version

0.1.0

Apr 10, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

advisor_middleware-0.1.0.tar.gz (187.6 kB view details)

Uploaded Apr 10, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

advisor_middleware-0.1.0-py3-none-any.whl (18.8 kB view details)

Uploaded Apr 10, 2026 Python 3

File details

Details for the file advisor_middleware-0.1.0.tar.gz.

File metadata

Download URL: advisor_middleware-0.1.0.tar.gz
Upload date: Apr 10, 2026
Size: 187.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for advisor_middleware-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`ec77cfb122b01aa977ed3560e53d919d83b5b4570aa7e90aa21ba16b8bfeafd2`
MD5	`0757fe12193e21bea78400a3898cd0a7`
BLAKE2b-256	`d724e0e36282fb58ee11a70078959d9a2a309f5f423b414840121d50527ac323`

See more details on using hashes here.

File details

Details for the file advisor_middleware-0.1.0-py3-none-any.whl.

File metadata

Download URL: advisor_middleware-0.1.0-py3-none-any.whl
Upload date: Apr 10, 2026
Size: 18.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for advisor_middleware-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c3866d06717c66e85269c07a80e39f2e331f59be37c1acba98935b0902ac5475`
MD5	`5da9dc53cbc659639e93e6780e740e56`
BLAKE2b-256	`539cc2ae5f3f64f754de260b049d5a2fa91704be04c6cb4e53bec1db44689cdb`

See more details on using hashes here.

advisor-middleware 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

advisor-middleware

Anthropic's Advisor Strategy as a drop-in DeepAgents middleware.

The Problem

How It Works

Quick Start

Install

Minimal — zero config

Cross-provider

With compact-middleware

Configuration

AdvisorConfig

ContextCurationConfig

Cost control example

Native vs Fallback

Benchmark

Results: Haiku solo vs Haiku + Opus Advisor

Why it works

Anthropic's benchmarks

Introspection

Architecture

Development

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

`advisor-middleware`

`AdvisorConfig`

`ContextCurationConfig`