Embedding-based scaffold router for Claude API. Routes tasks to the right scaffold using centroid matching. By Hermes Labs.

These details have not been verified by PyPI

Project links

Project description

claude-router

Route Claude API calls to the cheapest model that works. 5 validated scaffolds, embedding-based task classification in ~10ms. Validated on 300+ blind-judged API calls.

Results

Task	Best Setup	Cost	Quality vs. Baseline
Eval/scoring	Haiku + scaffold	$0.06	MAE 1.0 (vs Sonnet raw: 1.2)
Research	Sonnet + scaffold	$0.28	8.49/10 (vs Opus raw: 7.45)
Content	Haiku + scaffold	$0.06	4/5 blind wins vs Sonnet
Code review	Sonnet (raw)	$0.28	8.7/10 (vs Opus: 8.1)

Anti-findings

These are the blocker issues. The router handles them automatically:

Scaffolds break operational tasks (0/9 success). Haiku treats constraints as meta-instructions instead of executing tasks.
Scaffolds hurt coding (4.9 vs 6.4 raw). Don't scaffold code review, design, or debugging.
Opus doesn't scaffold. Safety-critical evals need Opus raw (MAE 0.0), not scaffolded.

The routing table avoids these entirely: no scaffolds on operational, coding, safety-critical, or conversation tasks.

Install

Requires: Python 3.10+, requests, numpy, and Ollama running locally with nomic-embed-text.

pip install claude-router
ollama pull nomic-embed-text

Quick start

from claude_router import ClaudeRouter

router = ClaudeRouter()
result = router.route("Evaluate this research paper for methodological rigor")

print(result["model"])           # claude-haiku-4-5
print(result["scaffold_key"])    # calibrated-scoring
print(result["cost_per_1k"])     # 0.0008

# Build prompt with scaffold prepended
prompt = router.build_prompt("Evaluate this research paper...")
# → Pass prompt as system message to Anthropic API

Or CLI:

python router.py "Write a blog post about Q2 results"

How it works

Embed your prompt using nomic-embed-text (~5ms)
Compare against pre-computed task-category centroids
Look up routing table: category → model + scaffold
Return model ID and scaffold text

No LLM calls for routing. All locally in ~10ms. Low confidence (router accuracy 74% on 26-prompt benchmark) defaults to Opus.

The 5 scaffolds

Each scaffold is validated through blind evaluation. They work by constraining the model's output space to the task structure.

See scaffolds.json for full text and evidence:

calibrated-scoring: Integer 1-10, cite evidence, not generous/critical
insight-first: Lead non-obvious, concrete recs, 3-4 sentences
plan-first: g:goal;c:constraints;s:steps;r:risks prefix
substance-check: Real gaps not surface, name issue and location
bug-hunt: Specific bugs, line numbers, severity, one-line fix

Routing table

eval              → Haiku   + calibrated-scoring
research          → Sonnet  + insight-first
content           → Haiku   + insight-first
analytical_review → Haiku   + substance-check
search            → Haiku   + plan-first

coding            → Sonnet  (raw)
operational       → Sonnet  (raw)
status_check      → Haiku   (raw)
conversation      → Opus    (raw)
safety_critical   → Opus    (raw)

Low confidence → Opus (safe default).

Cost math

For 10,000 Claude API calls/month:

Strategy	Cost	Quality
All Opus	$6,800	Baseline
All Sonnet	$2,800	Lower on eval, equal on code
claude-router	~$620	Equal or better on eval/research/content

Customization

Swap scaffolds, centroids, or routing table:

router = ClaudeRouter(
    centroids_path="my_centroids.json",
    routing_table_path="my_routing.json",
    scaffolds_path="my_scaffolds.json"
)

Limitations

Requires Ollama locally (for embeddings)
Centroids trained on one task distribution — test on your workload
Router misclassifies 26% of tasks — low confidence defaults to Opus
Anti-findings are real: scaffolds on coding/operational make things worse
Lite mode (Haiku-first routing for max savings) planned for v1.1

Evidence

Benchmarks: benchmarks/ | Raw citations: scaffolds.json | License: MIT

Key experiments: 4-condition code/research crossover, scaffolds-vs-operational stress test, scaffolded Sonnet beats Opus 75% on research (6/8 blind wins, 140 API calls).

Hermes Labs Ecosystem

claude-router is part of the Hermes Labs open-source suite:

lintlang — Static linter for AI agent tool descriptions and system prompts
little-canary — Prompt injection detection
zer0dex — Dual-layer memory for AI agents
zer0lint — mem0 extraction diagnostics
suy-sideguy — Autonomous agent watchdog
quickthink — Planning scaffolding for local LLMs

Need this calibrated to your pipeline? Open an issue or reach out to Hermes Labs for custom scaffolds and production integration.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

1.0.0

Apr 16, 2026

0.1.0

Apr 16, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

claude_router-1.0.0.tar.gz (208.8 kB view details)

Uploaded Apr 16, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

claude_router-1.0.0-py3-none-any.whl (102.0 kB view details)

Uploaded Apr 16, 2026 Python 3

File details

Details for the file claude_router-1.0.0.tar.gz.

File metadata

Download URL: claude_router-1.0.0.tar.gz
Upload date: Apr 16, 2026
Size: 208.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for claude_router-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`db71dca924209adaed4730caeee5b9052c283ffd08c40521643fb2b018ebcb0c`
MD5	`d9d45103c25833497fe548b9f3f3e8d8`
BLAKE2b-256	`4403ff87a3b08bd973690368aa84e0c75447719f753ab1f0d3c2a4a674785d70`

See more details on using hashes here.

File details

Details for the file claude_router-1.0.0-py3-none-any.whl.

File metadata

Download URL: claude_router-1.0.0-py3-none-any.whl
Upload date: Apr 16, 2026
Size: 102.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for claude_router-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5a9c4ebed00468d4c039d2d17e1e77d9d6a38f964cdc40891c6d90cc1b8f476c`
MD5	`cc68addf7b05df118c6ba3d7c7bf54e0`
BLAKE2b-256	`05fb61f8576e79c93560fd26f08bb4e818c8ff5ebdedd48a63a5b279665ab3fa`

See more details on using hashes here.

claude-router 1.0.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

claude-router

Results

Anti-findings

Install

Quick start

How it works

The 5 scaffolds

Routing table

Cost math

Customization

Limitations

Evidence

Hermes Labs Ecosystem

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes