Cost-aware model routing for LangChain agents based on task phase

langchain-router

Your agent doesn't need the expensive model for every call.

Most calls are just the model picking which file to read next or which pattern to search for. A smaller model does that fine. This middleware detects when the agent is doing that kind of work and routes to a fast model automatically.

Phase based routing

Quick Install

pip install langchain-router

🤔 What is this?

Agent sessions have a pattern. The user says something, the agent thinks about it (planning). Then it reads files, searches code, runs commands (execution). Sometimes something breaks (recovery). Then the user says something again.

Planning and recovery need the primary model. Execution doesn't. RouterMiddleware detects which phase the agent is in and routes accordingly.

| What just happened | Phase     | Model   |
|--------------------|-----------|---------|
| User spoke         | planning  | primary |
| Tool call succeeded| execution | fast    |
| Tool call failed   | recovery  | primary |
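
The table above can be read as a small decision function. The following is a minimal sketch of that mapping, not the package's actual source: the `Event` type and the `detect_phase` and `pick_model` names are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    """The most recent thing that happened before a model call (hypothetical type)."""
    kind: str                   # "user" for a user message, "tool" for a tool result
    ok: Optional[bool] = None   # tool outcome; stays None for user messages


def detect_phase(last: Event) -> str:
    """Map the last event to a phase, following the table."""
    if last.kind == "user":
        return "planning"
    if last.kind == "tool" and last.ok:
        return "execution"
    # Failed tool calls, and anything unrecognized, fall through to recovery,
    # so uncertainty never lands on the smaller model.
    return "recovery"


def pick_model(phase: str, primary: str, fast: str) -> str:
    """Only the execution phase is routed to the fast model."""
    return fast if phase == "execution" else primary
```

Under these assumptions, `pick_model(detect_phase(Event("tool", ok=True)), "primary", "fast")` selects the fast model, while a user message or a failed tool call selects the primary one.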

On a simulated 18-call session, 83% of calls (15 of 18) route to the fast model.
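
To see how a share like that arises, here is a rough tally over a made-up session. The phase sequence below is an assumption chosen to total 18 calls; it is not the benchmark's actual trace.

```python
from collections import Counter

# Hypothetical session: two user turns and one failed tool call,
# with runs of successful tool calls in between.
phases = (
    ["planning"]
    + ["execution"] * 6
    + ["recovery"]
    + ["execution"] * 4
    + ["planning"]
    + ["execution"] * 5
)

counts = Counter(phases)
# Only execution-phase calls go to the fast model.
fast_share = counts["execution"] / len(phases)

print(f"{counts['execution']} of {len(phases)} calls -> fast model ({fast_share:.0%})")
# prints "15 of 18 calls -> fast model (83%)"
```

The share depends entirely on how many tool calls succeed between user turns, so sessions with frequent failures or short tool runs would route fewer calls to the fast model.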

from langchain.agents import create_agent
from langchain_router import RouterMiddleware

agent = create_agent(
    model="anthropic:claude-sonnet-4-6",
    tools=[...],
    middleware=[RouterMiddleware(fast="anthropic:claude-haiku-4-5-20251001")],
)

With CollapseMiddleware

from langchain_collapse import CollapseMiddleware

middleware = [
    CollapseMiddleware(),
    RouterMiddleware(fast="anthropic:claude-haiku-4-5-20251001"),
]

flowchart TB
    A["📥 37 messages"] --> B["CollapseMiddleware"]
    B --> C["📥 9 messages"]
    C --> D["RouterMiddleware"]
    D --> E{"phase?"}
    E --> |"execution  ·  83%"| F["⚡ Haiku"]
    E --> |"planning"| G["🧠 Sonnet"]
    E --> |"recovery"| G

    style A fill:#ff6b6b,stroke:#e03131,color:#fff
    style B fill:#339af0,stroke:#1c7ed6,color:#fff
    style C fill:#339af0,stroke:#1c7ed6,color:#fff
    style D fill:#51cf66,stroke:#2f9e44,color:#fff
    style E fill:#fff3bf,stroke:#f59f00,color:#333
    style F fill:#20c997,stroke:#099268,color:#fff
    style G fill:#845ef7,stroke:#7048e8,color:#fff

On false positives

The error heuristic checks tool output for the words "error", "traceback", "exception", and "failed". Tool output that merely contains those words, such as source code with def handle_error, is treated as a failure and routes to the primary model. That's the safe direction: more capability than needed, never less.
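
A keyword check of this kind might look like the sketch below. It is an illustration under assumptions, not the package's code: the `looks_like_failure` name is hypothetical, and case-insensitive matching is assumed rather than confirmed by the source.

```python
# The four marker words named above.
ERROR_MARKERS = ("error", "traceback", "exception", "failed")


def looks_like_failure(tool_output: str) -> bool:
    """Return True if any marker word appears anywhere in the tool output."""
    text = tool_output.lower()  # assumption: matching ignores case
    return any(marker in text for marker in ERROR_MARKERS)


# A real failure is caught.
print(looks_like_failure("Traceback (most recent call last): ..."))  # prints True

# Ordinary output passes through as a success.
print(looks_like_failure("3 files matched"))  # prints False

# False positive: the tool succeeded, but the file it read mentions "error".
print(looks_like_failure("def handle_error(exc): ..."))  # prints True
```

Because this is plain substring matching, the third case is flagged even though nothing went wrong, and the call simply goes to the primary model.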

📖 Documentation

  • Source (single file, ~170 lines)
  • Benchmark (simulated session with cost breakdown)
  • Tests (unit tests + property-based invariant tests)

💁 Contributing

git clone https://github.com/johanity/langchain-router.git
cd langchain-router
pip install -e ".[test]"
pytest

📕 License

MIT
