smart-llm-router

Provider-agnostic LLM router. Pick the cheapest capable model per prompt with rule-based scoring. Wraps LiteLLM for format conversion, streaming, tool calls, and 100+ provider integrations.

Why

Most LLM proxies route to whatever model name you pick. This one picks the model for you: locally, in under 1 ms, with zero ML, by scoring the prompt across 14 dimensions (code presence, reasoning markers, multi-step patterns, multilingual keywords, etc.) and mapping it to one of four tiers (SIMPLE / MEDIUM / COMPLEX / REASONING).

You bring an upstream (OpenRouter, Together, Fireworks, Groq, Anthropic direct, vLLM, Ollama — anything OpenAI-compatible). It does the rest.
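
For intuition, here is a toy version of the scoring idea (illustrative only: the real router checks 14 dimensions, and the signals and thresholds below are made up):

import re

TIERS = ["SIMPLE", "MEDIUM", "COMPLEX", "REASONING"]

# Hypothetical signals -- the actual router scores 14 dimensions
SIGNALS = {
    "code":      re.compile(r"\b(def|class|import|function|python)\b", re.I),
    "reasoning": re.compile(r"\b(prove|derive|step by step)\b", re.I),
    "multistep": re.compile(r"\b(first|then|finally|plan)\b", re.I),
}

def route(prompt: str) -> str:
    if SIGNALS["reasoning"].search(prompt):
        return "REASONING"
    hits = sum(1 for rx in SIGNALS.values() if rx.search(prompt))
    return TIERS[min(hits, 2)]  # 0 hits -> SIMPLE, 1 -> MEDIUM, 2+ -> COMPLEX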

Install

pip install smart-llm-router

Two console scripts ship with the package: smart-llm-router (full name) and slr (short alias).

Quick start with OpenRouter (default upstream)

# 1. Get an OpenRouter key at https://openrouter.ai/keys
export OPENROUTER_API_KEY=sk-or-v1-...
export LITELLM_MASTER_KEY=sk-anything    # gates the proxy itself

# 2. Start the proxy on :4000 (uses bundled OpenRouter config by default)
smart-llm-router start

In another terminal — any OpenAI-compatible client works:

from openai import OpenAI
client = OpenAI(base_url="http://127.0.0.1:4000/v1", api_key="sk-anything")

# Smart routing — rule-based scorer picks the cheapest capable model
resp = client.chat.completions.create(
    model="smart/auto",
    messages=[{"role": "user", "content": "prove that sqrt(2) is irrational step by step"}],
)
# → routed to REASONING tier (e.g. deepseek/deepseek-r1)
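
Streaming goes through the same routing; any standard OpenAI-client streaming loop works:

stream = client.chat.completions.create(
    model="smart/auto",
    messages=[{"role": "user", "content": "summarize Hamlet in two sentences"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)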

Or curl:

curl http://127.0.0.1:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-anything" \
  -H "Content-Type: application/json" \
  -d '{"model":"smart/auto","messages":[{"role":"user","content":"hi"}]}'

Inspect routing without dispatching

slr test "what is the capital of france"
# → SIMPLE / google/gemini-2.5-flash-lite / 100% savings vs claude-sonnet-4.6

slr test "Prove that sqrt(2) is irrational step by step"
# → REASONING / deepseek/deepseek-r1 / 90% savings

slr test "design a high-availability microservices architecture" --profile premium
# → COMPLEX / anthropic/claude-opus-4.7

slr models --profile auto    # show the tier→model table

Pointing at a different upstream

The bundled config targets OpenRouter, but anything OpenAI-compatible works (Together, Fireworks, Groq, DeepInfra, vLLM, Ollama, OpenAI direct). Copy the bundled YAML and edit api_base / api_key:

# Copy the bundled config to your working directory
python -c "from importlib.resources import files; import shutil; shutil.copy(files('smart_llm_router') / 'default_config.yaml', './smart-llm-router.yaml')"

# Edit smart-llm-router.yaml — swap api_base / api_key per model_list entry
# Then start with --config
smart-llm-router start --config smart-llm-router.yaml
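
Entries follow LiteLLM's standard proxy config format; a model_list entry pointed at Together, for example, would look roughly like this (model name and endpoint are illustrative):

model_list:
  - model_name: deepseek/deepseek-chat
    litellm_params:
      model: openai/deepseek-ai/DeepSeek-V3   # openai/ prefix = generic OpenAI-compatible upstream
      api_base: https://api.together.xyz/v1
      api_key: os.environ/TOGETHER_API_KEY    # read from the environment at startup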

Available routing profiles

model value           Behavior
smart/auto            Rule-based scoring → cheapest capable model
smart/eco             Rule-based scoring → cheapest tier table (free + lite models)
smart/premium         Rule-based scoring → quality-first tier table
smart/free            Forces free/local models only
<provider>/<model>    Bypasses routing, dispatches directly
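
A profile is selected per request via the model field, using the same client as the quick start:

# Same proxy as above; only the model string changes
resp = client.chat.completions.create(
    model="smart/eco",
    messages=[{"role": "user", "content": "translate 'good morning' to German"}],
)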

Routing diagnostics

smart-llm-router test "write a python function to compute fibonacci"
# tier: MEDIUM | model: deepseek/deepseek-chat | confidence: 0.82
# signals: code (function, python), imperative (write)

How it works

  1. Client sends an OpenAI-, Anthropic-, or Gemini-format request to localhost:4000.
  2. LiteLLM Proxy parses the request; SmartRouterHook.async_pre_call_hook intercepts it (sketched below).
  3. If model is a smart/* profile, the rule-based router scores the prompt and picks a concrete upstream model ID.
  4. LiteLLM dispatches to the configured upstream, handling format conversion, streaming, tool calls, retries, etc.
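
The hook between steps 2 and 3 looks roughly like this. A minimal sketch: async_pre_call_hook is LiteLLM's actual hook entry point, but TIER_TABLES and score_prompt are hypothetical stand-ins for the real 14-dimension router:

from litellm.integrations.custom_logger import CustomLogger

# Hypothetical stand-ins -- the real router lives in smart_llm_router/router/
TIER_TABLES = {"auto": {"SIMPLE": "google/gemini-2.5-flash-lite",
                        "REASONING": "deepseek/deepseek-r1"}}

def score_prompt(messages):
    # Toy scorer; the real one checks 14 dimensions
    text = " ".join(m.get("content", "") for m in messages)
    return "REASONING" if "prove" in text.lower() else "SIMPLE"

class SmartRouterHook(CustomLogger):
    async def async_pre_call_hook(self, user_api_key_dict, cache, data, call_type):
        model = data.get("model", "")
        if model.startswith("smart/"):
            profile = model.split("/", 1)[1]  # auto / eco / premium / free
            data["model"] = TIER_TABLES[profile][score_prompt(data["messages"])]
        return data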

Attribution

The 14-dimension rule-based router in smart_llm_router/router/ is ported from ClawRouter (MIT). Format conversion and streaming come from LiteLLM (MIT).

License

MIT
