# routeforge

Intelligent LLM router with task-aware routing, cost tracking, and observability.

Intelligent LLM routing for Python. Stop hardcoding a single model. routeforge automatically sends each prompt to the right model based on task type, complexity, and cost, with full observability logging.
```bash
pip install routeforge
```

```python
from routeforge import LLMRouter

router = LLMRouter.from_yaml("config.yaml")
response = router.route("Write a FastAPI endpoint that streams SSE events")

print(response.content)
print(response.meta.model)               # gpt-4o
print(response.meta.routing_layer)       # task_classifier
print(response.meta.estimated_cost_usd)  # 0.000312
```
## Why routeforge?

Different prompts need different models. A simple translation doesn't need GPT-4o. A multi-step reasoning problem shouldn't go to a 7B model. routeforge makes that decision automatically, saving cost without sacrificing quality.
| Without routeforge | With routeforge |
|---|---|
| Every prompt hits your most expensive model | Simple prompts go to cheap models automatically |
| No visibility into cost or latency | Every run logged with tokens, cost, latency |
| Locked to one provider | OpenAI, Anthropic, OpenRouter, HuggingFace, Ollama |
| Manual load balancing across API keys | Round-robin built in |
## How routing works

Every prompt passes through four layers in order:

```text
Prompt
  │
  ▼
Layer 1 — Task classifier
  Detects task type via keyword/regex: code, reasoning, creative,
  translation, summarisation, factual. Routes to models tagged
  with that task. Picks cheap vs strong based on complexity.
  │
  ▼ (no task match)
Layer 2 — Complexity gate
  Scores prompt 0.0–1.0 using length, sentence depth, and
  pattern signals. Unambiguous scores route directly to
  cheap (<0.35) or strong (>0.65) tagged models.
  │
  ▼ (ambiguous score 0.35–0.65)
Layer 3 — Meta-router
  Sends the prompt to a cheap LLM (e.g. gpt-4o-mini) with a
  structured prompt asking it to pick the best model from your
  config. Returns JSON: model, task_type, reason.
  │
  ▼ (fallback)
Layer 4 — Default model
  Uses default_model from config. Always succeeds.
```
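The cascade above can be sketched in plain Python. This is an illustrative toy, not routeforge's internal code: the helpers `classify_task` and `score_complexity` are deliberately simplistic stand-ins, and the meta-router layer is only marked by a comment since it requires an LLM call.

```python
import re

# Toy keyword patterns standing in for the real task classifier.
TASK_PATTERNS = {
    "code": r"\b(function|endpoint|debug|refactor|sql|python)\b",
    "reasoning": r"\b(prove|derive|calculate|logic)\b",
}

def classify_task(prompt):
    for task, pattern in TASK_PATTERNS.items():
        if re.search(pattern, prompt, re.IGNORECASE):
            return task
    return None

def score_complexity(prompt):
    # Toy proxy: longer prompts score higher, capped at 1.0.
    return min(len(prompt.split()) / 100, 1.0)

def pick_model(prompt, models, default_model, low=0.35, high=0.65):
    # Layer 1: task classifier routes to a model tagged with the task.
    task = classify_task(prompt)
    if task:
        for m in models:
            if task in m["tags"]:
                return m["name"], "task_classifier"
    # Layer 2: complexity gate for unambiguous scores.
    score = score_complexity(prompt)
    tag = "cheap" if score < low else "strong" if score > high else None
    if tag:
        for m in models:
            if tag in m["tags"]:
                return m["name"], "complexity"
    # Layer 3 (meta-router) would ask a cheap LLM here; omitted in this sketch.
    # Layer 4: default model always succeeds.
    return default_model, "default"

models = [
    {"name": "gpt-4o-mini", "tags": ["cheap", "general"]},
    {"name": "gpt-4o", "tags": ["strong", "reasoning", "code"]},
]
print(pick_model("Prove that sqrt(2) is irrational", models, "gpt-4o-mini"))
# -> ('gpt-4o', 'task_classifier')
```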
## Installation

```bash
pip install routeforge
```

Requires Python 3.10+.
## Quick start

### 1. Create a config file

```yaml
# config.yaml
default_model: gpt-4o-mini
log_path: runs.json
complexity_threshold: 0.45

models:
  - name: gpt-4o-mini
    model_id: gpt-4o-mini
    provider: openai
    api_keys:
      - sk-your-key
    cost_per_1k_input: 0.00015
    cost_per_1k_output: 0.0006
    tags: [cheap, general, summarisation, translation]

  - name: gpt-4o
    model_id: gpt-4o
    provider: openai
    api_keys:
      - sk-your-key
    cost_per_1k_input: 0.0025
    cost_per_1k_output: 0.01
    tags: [strong, reasoning, code, creative]
```
### 2. Route prompts

```python
from routeforge import LLMRouter

router = LLMRouter.from_yaml("config.yaml")

response = router.route("Summarise this paragraph in one sentence: ...")
print(response.meta.model)          # gpt-4o-mini (cheap, summarisation tag)
print(response.meta.routing_layer)  # task_classifier

response = router.route("Prove that sqrt(2) is irrational")
print(response.meta.model)          # gpt-4o (strong, reasoning tag)
print(response.meta.routing_layer)  # task_classifier
```
### 3. Or configure inline

```python
from routeforge import LLMRouter

router = LLMRouter.from_dict({
    "default_model": "mini",
    "log_path": "runs.json",
    "models": [
        {
            "name": "mini",
            "model_id": "gpt-4o-mini",
            "provider": "openai",
            "api_keys": ["sk-your-key"],
            "cost_per_1k_input": 0.00015,
            "cost_per_1k_output": 0.0006,
            "tags": ["cheap", "general"],
        }
    ],
})
```
## Supported providers

| Provider | Value | Notes |
|---|---|---|
| OpenAI | `openai` | GPT-4o, GPT-4o-mini, o1, etc. |
| Anthropic | `anthropic` | Claude Sonnet, Haiku, Opus |
| OpenRouter | `openrouter` | 100+ models via one API key |
| HuggingFace | `huggingface` | Inference API, `/v1/chat/completions` |
| Ollama | `ollama` | Local models, no API key needed |
### OpenRouter example

```yaml
- name: deepseek-r1
  model_id: deepseek/deepseek-r1
  provider: openrouter
  api_keys:
    - sk-or-your-openrouter-key
  cost_per_1k_input: 0.0008
  cost_per_1k_output: 0.0032
  tags: [strong, reasoning]
```
### Ollama (local) example

```yaml
- name: llama3
  model_id: llama3.2
  provider: ollama
  base_url: http://localhost:11434
  api_keys: []
  cost_per_1k_input: 0.0
  cost_per_1k_output: 0.0
  tags: [cheap, general]
```
## Load balancing across API keys

Add multiple keys to any model and routeforge round-robins across them automatically:

```yaml
- name: gpt-4o-mini
  model_id: gpt-4o-mini
  provider: openai
  api_keys:
    - sk-key-one
    - sk-key-two
    - sk-key-three
  tags: [cheap]
```
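Round-robin here means each request uses the next key in rotation, wrapping back to the first. A minimal illustration of that behaviour (not routeforge's own implementation):

```python
from itertools import cycle

# cycle() yields keys in order and wraps around indefinitely,
# which is exactly the round-robin rotation described above.
keys = cycle(["sk-key-one", "sk-key-two", "sk-key-three"])

picked = [next(keys) for _ in range(4)]
print(picked)  # ['sk-key-one', 'sk-key-two', 'sk-key-three', 'sk-key-one']
```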
## Observability

Every `router.route()` call is logged to a JSON file with full metadata:

```python
for run in router.logs(last_n=5):
    print(run)
```

```json
{
  "timestamp": "2026-03-28T10:42:01.123Z",
  "prompt_preview": "Write a FastAPI endpoint that streams SSE...",
  "model": "gpt-4o",
  "provider": "openai",
  "routing_layer": "task_classifier",
  "routing_reason": "Task detected as 'code'; selected by complexity (0.61)",
  "task_type": "code",
  "complexity_score": 0.61,
  "input_tokens": 48,
  "output_tokens": 312,
  "latency_ms": 1842.5,
  "estimated_cost_usd": 0.003240
}
```
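Because runs are plain JSON records, you can aggregate them yourself. Here is a sketch that totals estimated cost per model; the `runs` list is built inline from records shaped like the example above, whereas in practice you would take them from `router.logs()` (the exact on-disk layout may differ):

```python
from collections import defaultdict

# Sample records in the shape shown above; in practice, read from router.logs().
runs = [
    {"model": "gpt-4o", "estimated_cost_usd": 0.003240},
    {"model": "gpt-4o-mini", "estimated_cost_usd": 0.000312},
    {"model": "gpt-4o", "estimated_cost_usd": 0.001100},
]

# Sum cost per model alias.
cost_by_model = defaultdict(float)
for run in runs:
    cost_by_model[run["model"]] += run["estimated_cost_usd"]

for model, cost in sorted(cost_by_model.items()):
    print(f"{model}: ${cost:.6f}")
```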
## RouteResponse reference

```python
response = router.route("your prompt")

response.content                  # str — model's reply
response.meta.model               # str — model alias used
response.meta.provider            # str — provider name
response.meta.routing_layer       # "task_classifier" | "complexity" | "meta_router" | "default"
response.meta.routing_reason      # str — human-readable explanation
response.meta.task_type           # "code" | "reasoning" | "creative" | "translation" | "summarisation" | "factual" | "general"
response.meta.complexity_score    # float, 0.0–1.0
response.meta.input_tokens        # int
response.meta.output_tokens       # int
response.meta.latency_ms          # float
response.meta.estimated_cost_usd  # float
```
## Config reference

| Field | Type | Default | Description |
|---|---|---|---|
| `default_model` | str | — | Alias of the fallback model |
| `log_path` | str | `runs.json` | Path to JSON log file |
| `complexity_threshold` | float | 0.5 | Below = cheap model, above = strong |
| `meta_router_model` | str | cheapest tagged model | Model used for meta-routing |
| `models` | list | — | List of model entries |
Model entry fields:

| Field | Type | Required | Description |
|---|---|---|---|
| `name` | str | Yes | Alias used in routing and logs |
| `model_id` | str | Yes | Provider's model string |
| `provider` | str | Yes | `openai`, `anthropic`, `openrouter`, `huggingface`, `ollama` |
| `api_keys` | list[str] | Yes | One or more API keys |
| `base_url` | str | No | Override endpoint (OpenRouter, Ollama, custom) |
| `cost_per_1k_input` | float | No | USD per 1000 input tokens |
| `cost_per_1k_output` | float | No | USD per 1000 output tokens |
| `context_window` | int | No | Model context window size |
| `tags` | list[str] | No | Used for routing: `cheap`, `strong`, `code`, `reasoning`, etc. |
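The two `cost_per_1k_*` fields drive the `estimated_cost_usd` values shown in logs. Assuming the straightforward formula (token counts divided by 1000, scaled by the per-1k rates; inferred from the numbers in this README, not verified against routeforge's internals), the observability example above checks out:

```python
def estimate_cost(input_tokens, output_tokens, cost_per_1k_input, cost_per_1k_output):
    # Presumed formula: per-1k USD rates scaled by actual token counts.
    return (input_tokens / 1000) * cost_per_1k_input + (output_tokens / 1000) * cost_per_1k_output

# The log example above: 48 input + 312 output tokens on gpt-4o
# (0.0025 / 0.01 USD per 1k) gives 0.003240.
cost = estimate_cost(48, 312, 0.0025, 0.01)
print(f"{cost:.6f}")  # 0.003240
```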
## Task tags

Use these tags on your models to enable task-aware routing:

| Tag | Triggers on |
|---|---|
| `code` | Python, functions, scripts, debug, refactor, FastAPI, SQL |
| `reasoning` | Prove, derive, calculate, logic, math, equations |
| `creative` | Stories, poems, blog posts, marketing copy |
| `translation` | Translate, French, Spanish, German, Hindi, etc. |
| `summarisation` | Summarise, TL;DR, shorten, key points |
| `factual` | What is, who is, define, explain, how does |
| `cheap` | Fallback for low-complexity prompts |
| `strong` | Fallback for high-complexity prompts |
## License

MIT — built by NorthCommits