llm-router

Route prompts to the right LLM automatically. Supports Groq, OpenAI, and Anthropic.

Stop hardcoding a single model. llm-router analyses each prompt's complexity and routes it to the cheapest model that can handle it, cutting cost without sacrificing quality.

Simple prompts → fast cheap models. Complex prompts → powerful models. Automatically.


Install

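# Published on PyPI as llm-dispatch; imported in Python as llmrouter
pip install llm-dispatch

# Install the provider SDKs you need (quotes stop shells such as zsh from expanding the brackets):
pip install "llm-dispatch[groq]"       # Groq only
pip install "llm-dispatch[openai]"     # OpenAI only
pip install "llm-dispatch[anthropic]"  # Anthropic only
pip install "llm-dispatch[all]"        # All providers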

Quick Start

from llmrouter import LLMRouter

router = (
    LLMRouter(verbose=True)
    .add("groq/llama-3.1-8b-instant")    # fast, cheap
    .add("groq/llama-3.3-70b-versatile") # powerful
    .add("openai/gpt-4o")                # best quality
)

# Simple prompt → routed to cheap fast model
result = router.complete("What is the capital of France?")
print(result.output)      # Paris
print(result.model_used)  # llama-3.1-8b-instant
print(result.estimated_cost_usd)  # 0.000001

# Complex prompt → routed to powerful model
result = router.complete("""
    Analyze the architectural trade-offs between microservices and
    monolithic systems for a high-traffic fintech application.
    Consider scalability, fault tolerance, and deployment complexity.
""")
print(result.model_used)  # gpt-4o

Strategies

# Auto (default) — complexity-aware routing
router = LLMRouter(strategy="auto")

# Always cheapest model that fits the context window
router = LLMRouter(strategy="cheapest")

# Always fastest model
router = LLMRouter(strategy="fastest")

# Always highest quality model
router = LLMRouter(strategy="smartest")

# Balance speed and quality
router = LLMRouter(strategy="balanced")

# Override per call
result = router.complete(prompt, strategy="cheapest")
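
To make the non-auto strategies concrete, here is a minimal standalone sketch of how such a selection could work over a list of models. It is not the library's implementation: `Model`, `SPEED_RANK`, and `pick` are hypothetical names, the speed, quality, and cost figures are copied from the Available Models table, and the context-window check and the "balanced" weighting are left out because this README does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """Hypothetical stand-in for a registered model (not the library's class)."""
    model_id: str
    speed: str          # "fast" or "medium", as in the Available Models table
    quality: int        # 1-10 rating from the same table
    cost_per_1k: float  # USD per 1k tokens


MODELS = [
    Model("groq/llama-3.1-8b-instant", "fast", 4, 0.00005),
    Model("groq/llama-3.3-70b-versatile", "medium", 8, 0.00059),
    Model("openai/gpt-4o", "medium", 9, 0.005),
]

# Assumed ordering: lower rank means faster.
SPEED_RANK = {"fast": 0, "medium": 1, "slow": 2}


def pick(models, strategy):
    """Return one model according to a simple, assumed reading of each strategy."""
    if strategy == "cheapest":
        return min(models, key=lambda m: m.cost_per_1k)
    if strategy == "fastest":
        # Break speed ties by price so the result is deterministic.
        return min(models, key=lambda m: (SPEED_RANK[m.speed], m.cost_per_1k))
    if strategy == "smartest":
        return max(models, key=lambda m: m.quality)
    raise ValueError(f"strategy not covered by this sketch: {strategy!r}")


for name in ("cheapest", "fastest", "smartest"):
    print(name, "->", pick(MODELS, name).model_id)
```

With only these three models registered, "cheapest" and "fastest" both land on the 8B Llama model, while "smartest" selects gpt-4o.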

How auto-routing works

The router scores each prompt from 0.0 (trivial) to 1.0 (very complex):

| Complexity | Score | Routed to |
|---|---|---|
| Simple question | < 0.25 | Cheapest fast model |
| Medium task | 0.25–0.60 | Best quality/cost ratio |
| Complex analysis | > 0.60 | Highest quality model |

Scoring factors: prompt length, technical keywords, code blocks, number of questions, multi-part instructions.
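
As an illustration of how the factors listed above might combine into a single score, the following sketch computes a weighted sum of five capped signals. Everything here is an assumption made for illustration: the keyword list, the caps, the weights, and the name `sketch_complexity` are invented, and the values it produces will not match the library's `complexity_score`.

```python
import re

# Hypothetical keyword list; the library's actual vocabulary is not documented here.
TECH_KEYWORDS = (
    "refactor", "architecture", "algorithm", "scalability",
    "fault tolerance", "type hints", "unit tests", "microservices",
)

FENCE = "`" * 3  # marker for a fenced code block


def sketch_complexity(prompt: str) -> float:
    """Combine the five listed factors into a score in [0.0, 1.0]."""
    text = prompt.lower()

    length = min(len(text.split()) / 200, 1.0)                     # prompt length
    keywords = min(sum(k in text for k in TECH_KEYWORDS) / 3, 1.0)  # technical keywords
    code = 1.0 if FENCE in prompt else 0.0                          # code blocks
    questions = min(prompt.count("?") / 3, 1.0)                     # number of questions
    connectors = re.findall(r"\b(?:and|then|also)\b", text)
    parts = min(len(connectors) / 4, 1.0)                           # multi-part instructions

    # Assumed weights; they sum to 1.0 so the result stays in range.
    score = (0.30 * length + 0.30 * keywords + 0.15 * code
             + 0.10 * questions + 0.15 * parts)
    return round(min(score, 1.0), 2)


simple = sketch_complexity("What is 2+2?")
harder = sketch_complexity(
    "Refactor this Python class and add type hints and unit tests"
)
print(simple, harder)
```

The point of the sketch is the shape of the computation (cap each signal, then weight and sum), which keeps any one factor from dominating the score.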

from llmrouter import complexity_score

complexity_score("What is 2+2?")                         # 0.0
complexity_score("Summarize this 500-word article")      # 0.35
complexity_score("Refactor this Python class and add type hints and unit tests") # 0.72
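
The score thresholds from the routing table can be written as a small lookup. This is a sketch of the documented boundaries only; `tier_for` is a hypothetical helper name, and treating exactly 0.25 and 0.60 as "medium" follows the table's 0.25–0.60 range.

```python
def tier_for(score: float) -> str:
    """Map a complexity score to the routing tier described in the table."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0.0 and 1.0, got {score}")
    if score < 0.25:
        return "cheapest fast model"
    if score <= 0.60:
        return "best quality/cost ratio"
    return "highest quality model"


# The three example scores shown above fall into three different tiers.
for score in (0.0, 0.35, 0.72):
    print(score, "->", tier_for(score))
```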

Available Models

from llmrouter import PRESET_MODELS
print(list(PRESET_MODELS.keys()))

| Model ID | Provider | Speed | Quality | Cost/1k tokens |
|---|---|---|---|---|
| groq/llama-3.1-8b-instant | Groq | fast | 4/10 | $0.00005 |
| groq/llama-3.3-70b-versatile | Groq | medium | 8/10 | $0.00059 |
| groq/mixtral-8x7b-32768 | Groq | medium | 7/10 | $0.00024 |
| openai/gpt-4o-mini | OpenAI | fast | 6/10 | $0.00015 |
| openai/gpt-4o | OpenAI | medium | 9/10 | $0.005 |
| anthropic/claude-haiku-3 | Anthropic | fast | 6/10 | $0.00025 |
| anthropic/claude-sonnet-4 | Anthropic | medium | 9/10 | $0.003 |
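
The per-1k-token prices above translate into a per-call estimate by simple proportion. The sketch below assumes a single blended rate applied to the total token count, since this README does not say whether input and output tokens are priced separately; `COST_PER_1K` and `estimate_cost_usd` are hypothetical names, not part of the package.

```python
# Prices copied from the table above (USD per 1k tokens).
COST_PER_1K = {
    "groq/llama-3.1-8b-instant": 0.00005,
    "groq/llama-3.3-70b-versatile": 0.00059,
    "groq/mixtral-8x7b-32768": 0.00024,
    "openai/gpt-4o-mini": 0.00015,
    "openai/gpt-4o": 0.005,
    "anthropic/claude-haiku-3": 0.00025,
    "anthropic/claude-sonnet-4": 0.003,
}


def estimate_cost_usd(model_id: str, total_tokens: int) -> float:
    """Estimate cost as tokens / 1000 * price, under the blended-rate assumption."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1000 * COST_PER_1K[model_id]


# The same 2,000-token exchange on the cheapest and the priciest listed model.
cheap = estimate_cost_usd("groq/llama-3.1-8b-instant", 2000)
pricey = estimate_cost_usd("openai/gpt-4o", 2000)
print(f"{cheap:.6f} vs {pricey:.6f} USD")
```

Under this assumption, a roughly 20-token exchange on groq/llama-3.1-8b-instant works out to about $0.000001, which is consistent with the figure shown in the Quick Start.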

Custom Models

from llmrouter import LLMRouter, ModelConfig

router = LLMRouter()
router.add_custom(ModelConfig(
    model_id="my-provider/my-model",
    provider="groq",           # uses Groq's SDK
    name="my-model-name",
    cost_per_1k=0.0001,
    context_window=16384,
    speed="fast",
    quality=5,
))

RouteResult

Every .complete() call returns a RouteResult:

result.output               # str — model response
result.model_used           # str — model name selected
result.provider             # str — 'groq' | 'openai' | 'anthropic'
result.complexity_score     # float 0.0–1.0
result.strategy             # str — strategy used
result.estimated_cost_usd   # float — estimated cost in USD
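
For code that consumes results, such as logging or tests, the fields above can be mirrored in a small typed stand-in. `RouteResultSketch` and `summarize` are hypothetical names for illustration; the real `RouteResult` class may carry extra fields or behave differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteResultSketch:
    """Illustrative mirror of the documented RouteResult fields."""
    output: str
    model_used: str
    provider: str
    complexity_score: float
    strategy: str
    estimated_cost_usd: float


def summarize(result) -> str:
    """Build a one-line log entry from any object exposing the documented fields."""
    return (
        f"{result.provider}/{result.model_used} "
        f"strategy={result.strategy} "
        f"score={result.complexity_score:.2f} "
        f"cost=${result.estimated_cost_usd:.6f}"
    )


example = RouteResultSketch(
    output="Paris",
    model_used="llama-3.1-8b-instant",
    provider="groq",
    complexity_score=0.0,
    strategy="auto",
    estimated_cost_usd=0.000001,
)
print(summarize(example))
```

Because `summarize` only reads attributes, it should also accept a real result object, assuming that object exposes the same six fields.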

Environment Variables

GROQ_API_KEY=gsk_...
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

Or pass per model:

router.add("groq/llama-3.3-70b-versatile", api_key="gsk_...")
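
Before building a router, it can help to fail fast when a key is missing. The check below is a standalone sketch using only the standard library and the variable names listed above; `PROVIDER_ENV_VARS` and `missing_api_keys` are hypothetical helpers, not part of the package.

```python
import os

# Environment variable expected for each provider, as listed above.
PROVIDER_ENV_VARS = {
    "groq": "GROQ_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def missing_api_keys(providers, environ=None):
    """Return the variable names that are unset or empty for the given providers."""
    env = os.environ if environ is None else environ
    return [
        PROVIDER_ENV_VARS[p]
        for p in providers
        if not env.get(PROVIDER_ENV_VARS[p])
    ]


# Check against the real environment for the providers you plan to register.
missing = missing_api_keys(["groq", "openai"])
if missing:
    print("Set these before calling the router:", ", ".join(missing))
```

Passing a plain dict as `environ` makes the helper easy to test without touching the real environment.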

License

MIT © 2025 M Adhitya

Built at Rewrite Labs
