
smart-llm-router

Provider-agnostic LLM router. Pick the cheapest capable model per prompt with rule-based scoring. Wraps LiteLLM for format conversion, streaming, tool calls, and 100+ provider integrations.

Why

Most LLM proxies route based on a model name you pick up front. This one picks the model for you — locally, in <1ms, with zero ML — by scoring the prompt across 14 dimensions (code presence, reasoning markers, multi-step patterns, multilingual keywords, etc.) and mapping the score to one of four tiers (SIMPLE / MEDIUM / COMPLEX / REASONING).

You bring an upstream (OpenRouter, Together, Fireworks, Groq, Anthropic direct, vLLM, Ollama — anything OpenAI-compatible). It does the rest.

Install

pip install smart-llm-router

Two console scripts ship with the package: smart-llm-router (full name) and slr (short alias).

Quick start with OpenRouter (default upstream)

# 1. Get an OpenRouter key at https://openrouter.ai/keys
export OPENROUTER_API_KEY=sk-or-v1-...
export LITELLM_MASTER_KEY=sk-anything    # gates the proxy itself

# 2. Start the proxy on :4000 (uses bundled OpenRouter config by default)
smart-llm-router start

In another terminal — any OpenAI-compatible client works:

from openai import OpenAI
client = OpenAI(base_url="http://127.0.0.1:4000/v1", api_key="sk-anything")

# Smart routing — rule-based scorer picks the cheapest capable model
resp = client.chat.completions.create(
    model="smart/auto",
    messages=[{"role": "user", "content": "prove that sqrt(2) is irrational step by step"}],
)
# → routed to REASONING tier (e.g. deepseek/deepseek-r1)

Or curl:

curl http://127.0.0.1:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-anything" \
  -H "Content-Type: application/json" \
  -d '{"model":"smart/auto","messages":[{"role":"user","content":"hi"}]}'

Inspect routing without dispatching

slr test "what is the capital of france"
# → SIMPLE / google/gemini-2.5-flash-lite / 100% savings vs claude-sonnet-4.6

slr test "Prove that sqrt(2) is irrational step by step"
# → REASONING / deepseek/deepseek-r1 / 90% savings

slr test "design a high-availability microservices architecture" --profile premium
# → COMPLEX / anthropic/claude-opus-4.7

slr models --profile auto    # show the tier→model table

Pointing at a different upstream

The bundled config targets OpenRouter, but anything OpenAI-compatible works (Together, Fireworks, Groq, DeepInfra, vLLM, Ollama, OpenAI direct). Copy the bundled YAML and edit api_base / api_key:

# Copy the bundled config to your working directory
python -c "from importlib.resources import files; import shutil; shutil.copy(files('smart_llm_router') / 'default_config.yaml', './smart-llm-router.yaml')"

# Edit smart-llm-router.yaml — swap api_base / api_key per model_list entry
# Then start with --config
smart-llm-router start --config smart-llm-router.yaml
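
For orientation, a model_list entry in LiteLLM's config format looks roughly like the following; the Together endpoint and environment variable are illustrative stand-ins, not what ships in the bundled file:

model_list:
  - model_name: deepseek/deepseek-chat            # the ID clients request
    litellm_params:
      model: openai/deepseek-chat                 # openai/ prefix = any OpenAI-compatible upstream
      api_base: https://api.together.xyz/v1       # swap per provider (illustrative)
      api_key: os.environ/TOGETHER_API_KEY        # LiteLLM reads this from the environment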

Available routing profiles

Model value          Behavior
smart/auto           Rule-based scoring → cheapest capable model
smart/eco            Rule-based scoring → cheapest tier table (free + lite models)
smart/premium        Rule-based scoring → quality-first tier table (Claude Sonnet/Opus, GPT-4o, o1)
smart/agentic        Rule-based scoring → tool-use-friendly tier table (auto-engaged when tools[] present)
smart/free           Forces only free/local models
<provider>/<model>   Bypasses routing, dispatches directly
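
Profiles are chosen per request through the model field. With the client from the quick start (the prompts below are just illustrative):

# Eco: same scorer, tiers resolve to the cheapest (free + lite) models
client.chat.completions.create(
    model="smart/eco",
    messages=[{"role": "user", "content": "summarize this changelog in one line"}],
)

# Premium: same scorer, quality-first tier table
client.chat.completions.create(
    model="smart/premium",
    messages=[{"role": "user", "content": "design a sharding strategy for a multi-region database"}],
)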

Pin a specific model (no routing)

Pass a concrete model ID and the router leaves it alone:

client.chat.completions.create(
    model="anthropic/claude-sonnet-4.6",   # always Sonnet
    messages=[...]
)

client.chat.completions.create(
    model="anthropic/claude-opus-4.7",     # always Opus
    messages=[...]
)

client.chat.completions.create(
    model="openai/gpt-4o",                 # always GPT-4o
    messages=[...]
)

Models pre-wired in the bundled config: anthropic/claude-haiku-4.5, anthropic/claude-sonnet-4.6, anthropic/claude-opus-4.6, anthropic/claude-opus-4.7, openai/gpt-4o, openai/gpt-4o-mini, openai/o1, openai/o3, openai/o3-mini, openai/o4-mini, google/gemini-2.5-flash-lite, google/gemini-2.5-flash, google/gemini-2.5-pro, google/gemini-2.0-flash-lite-001, deepseek/deepseek-chat, deepseek/deepseek-r1, meta-llama/llama-3.3-70b-instruct. Add more by editing the model_list in your config YAML.

Use with Claude Code

Claude Code respects ANTHROPIC_BASE_URL. Point it at the proxy:

export ANTHROPIC_BASE_URL=http://127.0.0.1:4000
export ANTHROPIC_AUTH_TOKEN=sk-anything   # the proxy's master key
claude

Then inside Claude Code: /model anthropic/claude-opus-4.7 to pin Opus, or /model smart/premium to let the router pick the best Claude per request.

How it works

  1. Client sends OpenAI/Anthropic/Gemini-format request to localhost:4000.
  2. LiteLLM Proxy parses; SmartRouterHook.async_pre_call_hook intercepts.
  3. If model is a smart/* profile, the rule-based router scores the prompt and picks a concrete upstream model ID.
  4. LiteLLM dispatches to the configured upstream — handling format conversion, streaming, tool calls, retries, etc.
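
A minimal sketch of what steps 2 and 3 look like in code, assuming LiteLLM's CustomLogger pre-call interface; the class body and tier table below are illustrative, not the package's actual implementation:

from litellm.integrations.custom_logger import CustomLogger

TIER_TABLE = {                                   # illustrative tier→model map
    "SIMPLE": "google/gemini-2.5-flash-lite",
    "REASONING": "deepseek/deepseek-r1",
}

class SmartRouterHookSketch(CustomLogger):
    async def async_pre_call_hook(self, user_api_key_dict, cache, data, call_type):
        model = data.get("model", "")
        if not model.startswith("smart/"):
            return data                          # concrete model ID: pass through untouched
        text = " ".join(
            m["content"] for m in data.get("messages", [])
            if isinstance(m.get("content"), str)
        )
        # Stand-in for the 14-dimension scorer described under "Routing internals"
        tier = "REASONING" if "prove" in text.lower() else "SIMPLE"
        data["model"] = TIER_TABLE[tier]         # rewrite the model before dispatch
        return data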

Routing internals

The classifier (smart_llm_router/router/rules.py) scores each prompt across 14 weighted dimensions:

Dimension            Weight  Detects
reasoningMarkers     0.18    prove, theorem, step by step, 证明, теорема, ...
codePresence         0.15    ```, function, class, SELECT, 异步, ...
multiStepPatterns    0.12    "first ... then", "step 1", "1. "
technicalTerms       0.10    algorithm, architecture, kubernetes, ...
tokenCount           0.08    <50 tok ⇒ -1, >500 ⇒ +1
creativeMarkers      0.05    "write a story/poem"
questionComplexity   0.05    count of ?
constraintCount      0.04    "must", "exactly", "at most"
agenticTask          0.04    "edit file", "deploy", "install", "verify"
imperativeVerbs      0.03    "implement", "build", "fix"
outputFormat         0.03    json, yaml, table, schema
referenceComplexity  0.02    "above", "below", "the docs"
domainSpecificity    0.02    quantum, fpga, homomorphic, ...
simpleIndicators     0.02    "what is", "hello" → negative
negationComplexity   0.01    "not", "without", "except"

Keyword sets are multilingual — EN + ZH + JA + RU + DE + ES + PT + KO + AR — so the same scorer works across 9 languages without translation.
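
For example, a Russian prompt should trip the same reasoningMarkers dimension as its English counterpart; the expected result below is illustrative, shown in the format of the slr test examples above:

slr test "докажите теорему: корень из 2 иррационален"
# expected → REASONING (теорема hits the same keyword set as "theorem")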

The score maps to a tier through three boundaries:

< 0.0     → SIMPLE
0.0-0.3   → MEDIUM
0.3-0.5   → COMPLEX
> 0.5     → REASONING

Plus three hard overrides: 2+ reasoning keywords ⇒ force REASONING; >100k tokens ⇒ force COMPLEX; system prompt mentioning json/schema ⇒ floor at MEDIUM.
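
Putting the table, the boundaries, and the overrides together, here is a condensed illustrative sketch of the classifier's shape; the weights come from the table above, the keyword lists are heavily truncated, and the boundary inclusivity and function name are guesses rather than the real rules.py API:

import re

WEIGHTS = {                     # weights from the dimension table (top few only)
    "reasoningMarkers": 0.18,
    "codePresence": 0.15,
    "multiStepPatterns": 0.12,
    "simpleIndicators": 0.02,   # fires negative
}
REASONING_KW = ("prove", "theorem", "step by step", "证明", "теорема")

def classify(prompt: str) -> str:
    p = prompt.lower()
    reasoning_hits = sum(kw in p for kw in REASONING_KW)
    signals = {
        "reasoningMarkers": int(reasoning_hits > 0),
        "codePresence": int("```" in prompt or bool(re.search(r"\b(function|class|select)\b", p))),
        "multiStepPatterns": int(bool(re.search(r"step\s*\d|first\b.*\bthen\b", p))),
        "simpleIndicators": -int(p.startswith("what is") or "hello" in p),
    }
    score = sum(WEIGHTS[d] * signals[d] for d in WEIGHTS)
    if reasoning_hits >= 2:     # hard override from the text above
        return "REASONING"
    if score < 0.0:
        return "SIMPLE"
    if score <= 0.3:
        return "MEDIUM"
    if score <= 0.5:
        return "COMPLEX"
    return "REASONING"

# classify("what is the capital of france")                 → "SIMPLE"
# classify("prove that sqrt(2) is irrational step by step") → "REASONING"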

Attribution

The 14-dimension rule-based router in smart_llm_router/router/ is ported from ClawRouter (MIT). Format conversion and streaming come from LiteLLM (MIT).

License

MIT
