llm-router
Route prompts to the right LLM automatically. Supports Groq, OpenAI, and Anthropic.
Stop hardcoding a single model. llm-router scores each prompt's complexity and routes it to the cheapest registered model that can handle it, cutting cost without giving up quality on harder prompts.
Simple prompts → fast cheap models. Complex prompts → powerful models. Automatically.
Install
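# The PyPI distribution is llm-dispatch; the import name is llmrouter.
pip install llm-dispatch
# Install the provider SDKs you need (the quotes stop shells such as zsh from expanding the brackets):
pip install "llm-dispatch[groq]"       # Groq only
pip install "llm-dispatch[openai]"     # OpenAI only
pip install "llm-dispatch[anthropic]"  # Anthropic only
pip install "llm-dispatch[all]"        # All providers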
Quick Start
from llmrouter import LLMRouter
router = (
    LLMRouter(verbose=True)
    .add("groq/llama-3.1-8b-instant")     # fast, cheap
    .add("groq/llama-3.3-70b-versatile")  # powerful
    .add("openai/gpt-4o")                 # best quality
)
# Simple prompt → routed to cheap fast model
result = router.complete("What is the capital of France?")
print(result.output) # Paris
print(result.model_used) # llama-3.1-8b-instant
print(result.estimated_cost_usd) # 0.000001
# Complex prompt → routed to powerful model
result = router.complete("""
Analyze the architectural trade-offs between microservices and
monolithic systems for a high-traffic fintech application.
Consider scalability, fault tolerance, and deployment complexity.
""")
print(result.model_used) # gpt-4o
Strategies
# Auto (default) — complexity-aware routing
router = LLMRouter(strategy="auto")
# Always cheapest model that fits the context window
router = LLMRouter(strategy="cheapest")
# Always fastest model
router = LLMRouter(strategy="fastest")
# Always highest quality model
router = LLMRouter(strategy="smartest")
# Balance speed and quality
router = LLMRouter(strategy="balanced")
# Override per call
result = router.complete(prompt, strategy="cheapest")
How auto-routing works
The router scores each prompt from 0.0 (trivial) to 1.0 (very complex):
| Complexity | Score | Routed to |
|---|---|---|
| Simple question | < 0.25 | Cheapest fast model |
| Medium task | 0.25–0.60 | Best quality/cost ratio |
| Complex analysis | > 0.60 | Highest quality model |
Scoring factors: prompt length, technical keywords, code blocks, number of questions, multi-part instructions.
from llmrouter import complexity_score
complexity_score("What is 2+2?") # 0.0
complexity_score("Summarize this 500-word article") # 0.35
complexity_score("Refactor this Python class and add type hints and unit tests") # 0.72
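The scoring factors and thresholds above can be illustrated with a small heuristic. This is a minimal sketch, not the package's implementation: the keyword list, the weights, the normalisation constants, and the names `sketch_complexity_score` and `tier_for` are all assumptions made for illustration, so the scores it produces will not match those of `complexity_score`.

```python
import re

# Hypothetical keyword list; the package's real list is not documented here.
TECHNICAL_KEYWORDS = (
    "refactor", "architectural", "algorithm", "scalability", "microservices",
    "unit tests", "type hints", "trade-offs", "deployment", "fault tolerance",
)
FENCE = "`" * 3  # Markdown code-fence marker


def sketch_complexity_score(prompt: str) -> float:
    """Score a prompt from 0.0 (trivial) to 1.0 (very complex)."""
    text = prompt.lower()

    # Each factor is normalised to the range 0.0-1.0 (constants are assumed).
    length = min(len(text.split()) / 200, 1.0)
    keywords = min(sum(kw in text for kw in TECHNICAL_KEYWORDS) / 4, 1.0)
    has_code = FENCE in prompt or re.search(
        r"^\s*(?:def|class)\s+\w+.*:\s*$", prompt, re.M
    ) is not None
    code = 1.0 if has_code else 0.0
    # A single question mark is normal; only extra questions add complexity.
    questions = min(max(prompt.count("?") - 1, 0) / 3, 1.0)
    parts = len(re.findall(r"\b(?:and|then|also)\b|^\s*\d+[.)]\s", text, re.M))
    multi_part = min(parts / 4, 1.0)

    # Assumed weights; they sum to 1.0 so the score stays within 0.0-1.0.
    score = (0.25 * length + 0.30 * keywords + 0.20 * code
             + 0.10 * questions + 0.15 * multi_part)
    return round(min(score, 1.0), 2)


def tier_for(score: float) -> str:
    """Map a score to the tiers in the table above."""
    if score < 0.25:
        return "cheapest fast model"
    if score <= 0.60:
        return "best quality/cost ratio"
    return "highest quality model"


simple = sketch_complexity_score("What is 2+2?")
harder = sketch_complexity_score(
    "Refactor this Python class and add type hints and unit tests"
)
print(simple, tier_for(simple))
print(harder, tier_for(harder))
```

Keeping the weights summed to 1.0 means each factor's contribution can be read directly as a share of the final score, which makes the heuristic easy to tune.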
Available Models
from llmrouter import PRESET_MODELS
print(list(PRESET_MODELS.keys()))
| Model ID | Provider | Speed | Quality | Cost/1k tokens |
|---|---|---|---|---|
| groq/llama-3.1-8b-instant | Groq | fast | 4/10 | $0.00005 |
| groq/llama-3.3-70b-versatile | Groq | medium | 8/10 | $0.00059 |
| groq/mixtral-8x7b-32768 | Groq | medium | 7/10 | $0.00024 |
| openai/gpt-4o-mini | OpenAI | fast | 6/10 | $0.00015 |
| openai/gpt-4o | OpenAI | medium | 9/10 | $0.005 |
| anthropic/claude-haiku-3 | Anthropic | fast | 6/10 | $0.00025 |
| anthropic/claude-sonnet-4 | Anthropic | medium | 9/10 | $0.003 |
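To see how the non-auto strategies could choose among these presets, here is a rough sketch that copies the table's figures into a plain list. The `Preset` class, the `pick` function, the tie-breaking rules, and the "balanced" formula (quality minus three points per speed step) are assumptions for illustration only; the package may rank models differently, and the context-window check used by `cheapest` is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    model_id: str
    speed: str          # "fast" or "medium", as listed in the table
    quality: int        # out of 10
    cost_per_1k: float  # USD per 1k tokens


PRESETS = [
    Preset("groq/llama-3.1-8b-instant", "fast", 4, 0.00005),
    Preset("groq/llama-3.3-70b-versatile", "medium", 8, 0.00059),
    Preset("groq/mixtral-8x7b-32768", "medium", 7, 0.00024),
    Preset("openai/gpt-4o-mini", "fast", 6, 0.00015),
    Preset("openai/gpt-4o", "medium", 9, 0.005),
    Preset("anthropic/claude-haiku-3", "fast", 6, 0.00025),
    Preset("anthropic/claude-sonnet-4", "medium", 9, 0.003),
]

SPEED_RANK = {"fast": 0, "medium": 1, "slow": 2}


def pick(models, strategy):
    """Return one model for a strategy name (assumed ranking rules)."""
    if strategy == "cheapest":
        return min(models, key=lambda m: m.cost_per_1k)
    if strategy == "fastest":
        # Ties on speed go to the cheaper model.
        return min(models, key=lambda m: (SPEED_RANK[m.speed], m.cost_per_1k))
    if strategy == "smartest":
        # Ties on quality go to the cheaper model.
        return max(models, key=lambda m: (m.quality, -m.cost_per_1k))
    if strategy == "balanced":
        # Assumed trade-off: each slower speed step costs three quality points.
        return max(
            models,
            key=lambda m: (m.quality - 3 * SPEED_RANK[m.speed], -m.cost_per_1k),
        )
    raise ValueError(f"unknown strategy: {strategy!r}")


for name in ("cheapest", "fastest", "smartest", "balanced"):
    print(name, "->", pick(PRESETS, name).model_id)
```

Expressing every strategy as a sort key keeps the selection logic in one place, so adding a new strategy is a matter of adding one more key function.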
Custom Models
from llmrouter import LLMRouter, ModelConfig
router = LLMRouter()
router.add_custom(ModelConfig(
    model_id="my-provider/my-model",
    provider="groq",  # uses Groq's SDK
    name="my-model-name",
    cost_per_1k=0.0001,
    context_window=16384,
    speed="fast",
    quality=5,
))
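When choosing `cost_per_1k` and `context_window` for a custom model, it can help to work through the arithmetic they imply. The sketch below assumes cost scales linearly with total tokens and uses the common rule of thumb of about four characters per token; the helper names are hypothetical and the package may estimate tokens and cost differently.

```python
def estimate_tokens(text: str) -> int:
    """Rough token count: about four characters per token for English text."""
    return max(1, len(text) // 4)


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int,
                      cost_per_1k: float) -> float:
    """Assumed linear pricing: total tokens / 1000 * price per 1k tokens."""
    return (prompt_tokens + completion_tokens) / 1000 * cost_per_1k


def fits_context(prompt_tokens: int, max_output_tokens: int,
                 context_window: int) -> bool:
    """A prompt fits if it leaves room for the requested output."""
    return prompt_tokens + max_output_tokens <= context_window


prompt = "Summarize the following report. " * 100
tokens = estimate_tokens(prompt)
print(tokens)
print(estimate_cost_usd(tokens, 500, cost_per_1k=0.0001))
print(fits_context(tokens, 500, context_window=16384))
```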
RouteResult
Every .complete() call returns a RouteResult:
result.output # str — model response
result.model_used # str — model name selected
result.provider # str — 'groq' | 'openai' | 'anthropic'
result.complexity_score # float 0.0–1.0
result.strategy # str — strategy used
result.estimated_cost_usd # float — estimated cost in USD
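Because each result carries the model name and an estimated cost, results can be aggregated to see where spend goes. The sketch below uses a stand-in dataclass, `RouteResultStub`, that mirrors the documented fields so it runs without the package or any API keys; the helper `cost_by_model` and the sample values are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RouteResultStub:
    """Stand-in with the same field names as the documented RouteResult."""
    output: str
    model_used: str
    provider: str
    complexity_score: float
    strategy: str
    estimated_cost_usd: float


def cost_by_model(results):
    """Sum the estimated cost per selected model."""
    totals = defaultdict(float)
    for r in results:
        totals[r.model_used] += r.estimated_cost_usd
    return dict(totals)


# Illustrative values only, not measured output.
results = [
    RouteResultStub("Paris", "llama-3.1-8b-instant", "groq", 0.0, "auto", 0.000001),
    RouteResultStub("...", "gpt-4o", "openai", 0.72, "auto", 0.004),
    RouteResultStub("...", "gpt-4o", "openai", 0.65, "auto", 0.002),
]
for model, total in cost_by_model(results).items():
    print(f"{model}: ${total:.6f}")
```

In real use, the list would hold the objects returned by `router.complete(...)` rather than stubs.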
Environment Variables
GROQ_API_KEY=gsk_...
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
Or pass per model:
router.add("groq/llama-3.3-70b-versatile", api_key="gsk_...")
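Before adding models, you may want to check which provider keys are actually set. The sketch below reads the three environment variables named above using only the standard library; `configured_providers` and `usable_model_ids` are hypothetical helpers, not part of the package, and they assume the provider is the part of the model ID before the slash.

```python
import os

PROVIDER_ENV_VARS = {
    "groq": "GROQ_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def configured_providers(env=None):
    """Return providers whose API key variable is set and non-empty."""
    env = os.environ if env is None else env
    return [p for p, var in PROVIDER_ENV_VARS.items() if env.get(var)]


def usable_model_ids(model_ids, env=None):
    """Keep only model IDs whose provider prefix has a key configured."""
    ready = set(configured_providers(env))
    return [m for m in model_ids if m.split("/", 1)[0] in ready]


wanted = [
    "groq/llama-3.1-8b-instant",
    "openai/gpt-4o",
    "anthropic/claude-sonnet-4",
]
print(usable_model_ids(wanted))
```

Each ID that survives the filter can then be passed to `router.add(...)`, which avoids registering a model whose provider has no key.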
License
MIT © 2025 M Adhitya
Built at Rewrite Labs