
Intelligent LLM router with task-aware routing, cost tracking, and observability


routeforge

Intelligent LLM routing for Python. Stop hardcoding a single model. routeforge automatically sends each prompt to the right model based on task type, complexity, and cost — with full observability logging.

pip install routeforge

from routeforge import LLMRouter

router = LLMRouter.from_yaml("config.yaml")
response = router.route("Write a FastAPI endpoint that streams SSE events")

print(response.content)
print(response.meta.model)               # gpt-4o
print(response.meta.routing_layer)       # task_classifier
print(response.meta.estimated_cost_usd)  # 0.000312

Why routeforge?

Different prompts need different models. A simple translation doesn't need GPT-4o. A multi-step reasoning problem shouldn't go to a 7B model. routeforge makes that decision automatically — saving cost without sacrificing quality.

| Without routeforge | With routeforge |
|---|---|
| Every prompt hits your most expensive model | Simple prompts go to cheap models automatically |
| No visibility into cost or latency | Every run logged with tokens, cost, latency |
| Locked to one provider | OpenAI, Anthropic, OpenRouter, HuggingFace, Ollama |
| Manual load balancing across API keys | Round-robin built in |

How routing works

Every prompt passes through up to four layers, in order; the first layer that produces a decision wins:

Prompt
  │
  ▼
Layer 1 — Task classifier
  Detects task type via keyword/regex: code, reasoning, creative,
  translation, summarisation, factual. Routes to models tagged
  with that task. Picks cheap vs strong based on complexity.
  │
  ▼ (no task match)
Layer 2 — Complexity gate
  Scores prompt 0.0–1.0 using length, sentence depth, and
  pattern signals. Unambiguous scores route directly to
  cheap (<0.35) or strong (>0.65) tagged models.
  │
  ▼ (ambiguous score 0.35–0.65)
Layer 3 — Meta-router
  Sends the prompt to a cheap LLM (e.g. gpt-4o-mini) with a
  structured prompt asking it to pick the best model from your
  config. Returns JSON: model, task_type, reason.
  │
  ▼ (fallback)
Layer 4 — Default model
  Uses default_model from config. Always succeeds.

Installation

pip install routeforge

Requires Python 3.10+.


Quick start

1. Create a config file

# config.yaml
default_model: gpt-4o-mini
log_path: runs.json
complexity_threshold: 0.45

models:
  - name: gpt-4o-mini
    model_id: gpt-4o-mini
    provider: openai
    api_keys:
      - sk-your-key
    cost_per_1k_input: 0.00015
    cost_per_1k_output: 0.0006
    tags: [cheap, general, summarisation, translation]

  - name: gpt-4o
    model_id: gpt-4o
    provider: openai
    api_keys:
      - sk-your-key
    cost_per_1k_input: 0.0025
    cost_per_1k_output: 0.01
    tags: [strong, reasoning, code, creative]

2. Route prompts

from routeforge import LLMRouter

router = LLMRouter.from_yaml("config.yaml")

response = router.route("Summarise this paragraph in one sentence: ...")
print(response.meta.model)           # gpt-4o-mini  (cheap, summarisation tag)
print(response.meta.routing_layer)   # task_classifier

response = router.route("Prove that sqrt(2) is irrational")
print(response.meta.model)           # gpt-4o  (strong, reasoning tag)
print(response.meta.routing_layer)   # task_classifier

3. Or configure inline

from routeforge import LLMRouter

router = LLMRouter.from_dict({
    "default_model": "mini",
    "log_path": "runs.json",
    "models": [
        {
            "name": "mini",
            "model_id": "gpt-4o-mini",
            "provider": "openai",
            "api_keys": ["sk-your-key"],
            "cost_per_1k_input": 0.00015,
            "cost_per_1k_output": 0.0006,
            "tags": ["cheap", "general"],
        }
    ],
})

Supported providers

| Provider | Value | Notes |
|---|---|---|
| OpenAI | openai | GPT-4o, GPT-4o-mini, o1, etc. |
| Anthropic | anthropic | Claude Sonnet, Haiku, Opus |
| OpenRouter | openrouter | 100+ models via one API key |
| HuggingFace | huggingface | Inference API, /v1/chat/completions |
| Ollama | ollama | Local models, no API key needed |

OpenRouter example

- name: deepseek-r1
  model_id: deepseek/deepseek-r1
  provider: openrouter
  api_keys:
    - sk-or-your-openrouter-key
  cost_per_1k_input: 0.0008
  cost_per_1k_output: 0.0032
  tags: [strong, reasoning]

Ollama (local) example

- name: llama3
  model_id: llama3.2
  provider: ollama
  base_url: http://localhost:11434
  api_keys: []
  cost_per_1k_input: 0.0
  cost_per_1k_output: 0.0
  tags: [cheap, general]

Load balancing across API keys

Add multiple keys to any model — routeforge round-robins across them automatically:

- name: gpt-4o-mini
  model_id: gpt-4o-mini
  provider: openai
  api_keys:
    - sk-key-one
    - sk-key-two
    - sk-key-three
  tags: [cheap]
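The rotation itself is just round-robin iteration. A minimal sketch of the idea using itertools.cycle (KeyPool is a hypothetical helper, not routeforge API):

```python
from itertools import cycle

class KeyPool:
    """Hand out a model's API keys in round-robin order, one per request."""

    def __init__(self, api_keys: list[str]):
        if not api_keys:
            raise ValueError("at least one API key is required")
        self._keys = cycle(api_keys)  # infinite iterator over the key list

    def next_key(self) -> str:
        return next(self._keys)

pool = KeyPool(["sk-key-one", "sk-key-two", "sk-key-three"])
# Successive next_key() calls cycle: one, two, three, one, ...
```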

Observability

Every router.route() call is logged to a JSON file with full metadata:

for run in router.logs(last_n=5):
    print(run)
Each record looks like:

{
  "timestamp": "2026-03-28T10:42:01.123Z",
  "prompt_preview": "Write a FastAPI endpoint that streams SSE...",
  "model": "gpt-4o",
  "provider": "openai",
  "routing_layer": "task_classifier",
  "routing_reason": "Task detected as 'code'; selected by complexity (0.61)",
  "task_type": "code",
  "complexity_score": 0.61,
  "input_tokens": 48,
  "output_tokens": 312,
  "latency_ms": 1842.5,
  "estimated_cost_usd": 0.003240
}
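Because each record is plain JSON, post-hoc analysis needs no special tooling. A hedged sketch that totals cost and counts calls per model over a list of run records (field names as in the sample record above; summarise_runs is illustrative, not part of routeforge):

```python
import json

def summarise_runs(runs: list[dict]) -> dict:
    """Aggregate total cost, mean latency, and per-model call counts."""
    total_cost = sum(r["estimated_cost_usd"] for r in runs)
    mean_latency = sum(r["latency_ms"] for r in runs) / len(runs) if runs else 0.0
    by_model: dict[str, int] = {}
    for r in runs:
        by_model[r["model"]] = by_model.get(r["model"], 0) + 1
    return {
        "total_cost_usd": round(total_cost, 6),
        "mean_latency_ms": round(mean_latency, 1),
        "calls_by_model": by_model,
    }

# Two records trimmed to the fields this sketch reads:
runs = [
    {"model": "gpt-4o", "latency_ms": 1842.5, "estimated_cost_usd": 0.003240},
    {"model": "gpt-4o-mini", "latency_ms": 512.0, "estimated_cost_usd": 0.000312},
]
print(json.dumps(summarise_runs(runs), indent=2))
```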

RouteResponse reference

response = router.route("your prompt")

response.content                      # str — model's reply
response.meta.model                   # str — model alias used
response.meta.provider                # str — provider name
response.meta.routing_layer           # "task_classifier" | "complexity" | "meta_router" | "default"
response.meta.routing_reason          # str — human-readable explanation
response.meta.task_type               # "code" | "reasoning" | "creative" | "translation" | "summarisation" | "factual" | "general"
response.meta.complexity_score        # float 0.0–1.0
response.meta.input_tokens            # int
response.meta.output_tokens           # int
response.meta.latency_ms              # float
response.meta.estimated_cost_usd      # float

Config reference

| Field | Type | Default | Description |
|---|---|---|---|
| default_model | str | (required) | Alias of fallback model |
| log_path | str | runs.json | Path to JSON log file |
| complexity_threshold | float | 0.5 | Below = cheap model, above = strong |
| meta_router_model | str | cheapest tagged model | Model used for meta-routing |
| models | list | (required) | List of model entries |

Model entry fields:

| Field | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | Alias used in routing and logs |
| model_id | str | Yes | Provider's model string |
| provider | str | Yes | openai, anthropic, openrouter, huggingface, ollama |
| api_keys | list[str] | Yes | One or more API keys |
| base_url | str | No | Override endpoint (OpenRouter, Ollama, custom) |
| cost_per_1k_input | float | No | USD per 1000 input tokens |
| cost_per_1k_output | float | No | USD per 1000 output tokens |
| context_window | int | No | Model context window size |
| tags | list[str] | No | Used for routing: cheap, strong, code, reasoning, etc. |
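The per-1k cost fields combine with token counts in the obvious way; a sketch of the arithmetic (estimate_cost_usd is an illustrative name, not routeforge API). Plugging in the gpt-4o rates from the quick-start config and the token counts from the observability example reproduces the logged 0.003240:

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cost_per_1k_input: float, cost_per_1k_output: float) -> float:
    """Cost in USD: tokens / 1000 times the per-1k rate, summed for both directions."""
    return (input_tokens / 1000) * cost_per_1k_input \
         + (output_tokens / 1000) * cost_per_1k_output

# 48 input and 312 output tokens at gpt-4o's 0.0025 / 0.01 per-1k rates:
cost = estimate_cost_usd(48, 312, 0.0025, 0.01)  # 0.00324
```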

Task tags

Use these tags on your models to enable task-aware routing:

| Tag | Triggers on |
|---|---|
| code | Python, functions, scripts, debug, refactor, FastAPI, SQL |
| reasoning | Prove, derive, calculate, logic, math, equations |
| creative | Stories, poems, blog posts, marketing copy |
| translation | Translate, French, Spanish, German, Hindi, etc. |
| summarisation | Summarise, TL;DR, shorten, key points |
| factual | What is, who is, define, explain, how does |
| cheap | Fallback for low-complexity prompts |
| strong | Fallback for high-complexity prompts |

License

MIT — built by NorthCommits

Download files

Source Distribution

routeforge-0.1.1.tar.gz (13.0 kB)

Built Distribution

routeforge-0.1.1-py3-none-any.whl (13.9 kB)
File details

Details for the file routeforge-0.1.1.tar.gz.

File metadata

  • Download URL: routeforge-0.1.1.tar.gz
  • Size: 13.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.11

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | c286d00ebe437d1ec48f1cf2a1b1d4dcd7843ca6b8960585135afc9eebbac105 |
| MD5 | d8727cb28aa46caa3d656d5e48d19464 |
| BLAKE2b-256 | 90fb2cba366eee80f31985a7ec0a3b47afdbf6fc802b18e53d35818ff300634f |

File details

Details for the file routeforge-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: routeforge-0.1.1-py3-none-any.whl
  • Size: 13.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.11

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 0558a9ae58dbda23e3c2d1d19414319ef18ea528be0dca60e4026da531d65e70 |
| MD5 | e5088c26d909a721cc214517fa386ad9 |
| BLAKE2b-256 | 9a8603432ba65a9342e8745caef9c02e2d5ea404cd8c56540ceef0637420e282 |
