
Intelligent LLM router with task-aware routing, cost tracking, and observability


routeforge

Intelligent LLM routing for Python. Stop hardcoding a single model. routeforge automatically sends each prompt to the right model based on task type, complexity, and cost — with full observability logging.

pip install routeforge

from routeforge import LLMRouter

router = LLMRouter.from_yaml("config.yaml")
response = router.route("Write a FastAPI endpoint that streams SSE events")

print(response.content)
print(response.meta.model)               # gpt-4o
print(response.meta.routing_layer)       # task_classifier
print(response.meta.estimated_cost_usd)  # 0.000312

Why routeforge?

Different prompts need different models. A simple translation doesn't need GPT-4o. A multi-step reasoning problem shouldn't go to a 7B model. routeforge makes that decision automatically — saving cost without sacrificing quality.

| Without routeforge | With routeforge |
|---|---|
| Every prompt hits your most expensive model | Simple prompts go to cheap models automatically |
| No visibility into cost or latency | Every run logged with tokens, cost, latency |
| Locked to one provider | OpenAI, Anthropic, OpenRouter, HuggingFace, Ollama |
| Manual load balancing across API keys | Round-robin built in |

How routing works

Every prompt passes through up to four layers, in order; the first layer that produces a decision wins:

Prompt
  │
  ▼
Layer 1 — Task classifier
  Detects task type via keyword/regex: code, reasoning, creative,
  translation, summarisation, factual. Routes to models tagged
  with that task. Picks cheap vs strong based on complexity.
  │
  ▼ (no task match)
Layer 2 — Complexity gate
  Scores prompt 0.0–1.0 using length, sentence depth, and
  pattern signals. Unambiguous scores route directly to
  cheap (<0.35) or strong (>0.65) tagged models.
  │
  ▼ (ambiguous score 0.35–0.65)
Layer 3 — Meta-router
  Sends the prompt to a cheap LLM (e.g. gpt-4o-mini) with a
  structured prompt asking it to pick the best model from your
  config. Returns JSON: model, task_type, reason.
  │
  ▼ (fallback)
Layer 4 — Default model
  Uses default_model from config. Always succeeds.

Installation

pip install routeforge

Requires Python 3.10+.


Quick start

1. Create a config file

# config.yaml
default_model: gpt-4o-mini
log_path: runs.json
complexity_threshold: 0.45

models:
  - name: gpt-4o-mini
    model_id: gpt-4o-mini
    provider: openai
    api_keys:
      - sk-your-key
    cost_per_1k_input: 0.00015
    cost_per_1k_output: 0.0006
    tags: [cheap, general, summarisation, translation]

  - name: gpt-4o
    model_id: gpt-4o
    provider: openai
    api_keys:
      - sk-your-key
    cost_per_1k_input: 0.0025
    cost_per_1k_output: 0.01
    tags: [strong, reasoning, code, creative]

2. Route prompts

from routeforge import LLMRouter

router = LLMRouter.from_yaml("config.yaml")

response = router.route("Summarise this paragraph in one sentence: ...")
print(response.meta.model)           # gpt-4o-mini  (cheap, summarisation tag)
print(response.meta.routing_layer)   # task_classifier

response = router.route("Prove that sqrt(2) is irrational")
print(response.meta.model)           # gpt-4o  (strong, reasoning tag)
print(response.meta.routing_layer)   # task_classifier

3. Or configure inline

from routeforge import LLMRouter

router = LLMRouter.from_dict({
    "default_model": "mini",
    "log_path": "runs.json",
    "models": [
        {
            "name": "mini",
            "model_id": "gpt-4o-mini",
            "provider": "openai",
            "api_keys": ["sk-your-key"],
            "cost_per_1k_input": 0.00015,
            "cost_per_1k_output": 0.0006,
            "tags": ["cheap", "general"],
        }
    ],
})

Supported providers

| Provider | Value | Notes |
|---|---|---|
| OpenAI | openai | GPT-4o, GPT-4o-mini, o1, etc. |
| Anthropic | anthropic | Claude Sonnet, Haiku, Opus |
| OpenRouter | openrouter | 100+ models via one API key |
| HuggingFace | huggingface | Inference API, /v1/chat/completions |
| Ollama | ollama | Local models, no API key needed |

OpenRouter example

- name: deepseek-r1
  model_id: deepseek/deepseek-r1
  provider: openrouter
  api_keys:
    - sk-or-your-openrouter-key
  cost_per_1k_input: 0.0008
  cost_per_1k_output: 0.0032
  tags: [strong, reasoning]

Ollama (local) example

- name: llama3
  model_id: llama3.2
  provider: ollama
  base_url: http://localhost:11434
  api_keys: []
  cost_per_1k_input: 0.0
  cost_per_1k_output: 0.0
  tags: [cheap, general]

Load balancing across API keys

Add multiple keys to any model — routeforge round-robins across them automatically:

- name: gpt-4o-mini
  model_id: gpt-4o-mini
  provider: openai
  api_keys:
    - sk-key-one
    - sk-key-two
    - sk-key-three
  tags: [cheap]
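The rotation itself is just round-robin iteration. A minimal sketch of the idea using itertools.cycle (KeyPool is a hypothetical helper, not routeforge API):

```python
from itertools import cycle

class KeyPool:
    """Hand out a model's API keys in round-robin order, one per request."""

    def __init__(self, api_keys: list[str]):
        if not api_keys:
            raise ValueError("at least one API key is required")
        self._keys = cycle(api_keys)  # infinite iterator over the key list

    def next_key(self) -> str:
        return next(self._keys)

pool = KeyPool(["sk-key-one", "sk-key-two", "sk-key-three"])
# Successive next_key() calls cycle: one, two, three, one, ...
```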

Observability

Every router.route() call is logged to a JSON file with full metadata:

for run in router.logs(last_n=5):
    print(run)
Each record looks like:

{
  "timestamp": "2026-03-28T10:42:01.123Z",
  "prompt_preview": "Write a FastAPI endpoint that streams SSE...",
  "model": "gpt-4o",
  "provider": "openai",
  "routing_layer": "task_classifier",
  "routing_reason": "Task detected as 'code'; selected by complexity (0.61)",
  "task_type": "code",
  "complexity_score": 0.61,
  "input_tokens": 48,
  "output_tokens": 312,
  "latency_ms": 1842.5,
  "estimated_cost_usd": 0.003240
}
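Because each record is plain JSON, post-hoc analysis needs no special tooling. A hedged sketch that totals cost and counts calls per model over a list of run records (field names as in the sample record above; summarise_runs is illustrative, not part of routeforge):

```python
import json

def summarise_runs(runs: list[dict]) -> dict:
    """Aggregate total cost, mean latency, and per-model call counts."""
    total_cost = sum(r["estimated_cost_usd"] for r in runs)
    mean_latency = sum(r["latency_ms"] for r in runs) / len(runs) if runs else 0.0
    by_model: dict[str, int] = {}
    for r in runs:
        by_model[r["model"]] = by_model.get(r["model"], 0) + 1
    return {
        "total_cost_usd": round(total_cost, 6),
        "mean_latency_ms": round(mean_latency, 1),
        "calls_by_model": by_model,
    }

# Two records trimmed to the fields this sketch reads:
runs = [
    {"model": "gpt-4o", "latency_ms": 1842.5, "estimated_cost_usd": 0.003240},
    {"model": "gpt-4o-mini", "latency_ms": 512.0, "estimated_cost_usd": 0.000312},
]
print(json.dumps(summarise_runs(runs), indent=2))
```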

RouteResponse reference

response = router.route("your prompt")

response.content                      # str — model's reply
response.meta.model                   # str — model alias used
response.meta.provider                # str — provider name
response.meta.routing_layer           # "task_classifier" | "complexity" | "meta_router" | "default"
response.meta.routing_reason          # str — human-readable explanation
response.meta.task_type               # "code" | "reasoning" | "creative" | "translation" | "summarisation" | "factual" | "general"
response.meta.complexity_score        # float 0.0–1.0
response.meta.input_tokens            # int
response.meta.output_tokens           # int
response.meta.latency_ms              # float
response.meta.estimated_cost_usd      # float

Config reference

| Field | Type | Default | Description |
|---|---|---|---|
| default_model | str | (required) | Alias of fallback model |
| log_path | str | runs.json | Path to JSON log file |
| complexity_threshold | float | 0.5 | Below = cheap model, above = strong |
| meta_router_model | str | cheapest tagged model | Model used for meta-routing |
| models | list | (required) | List of model entries |

Model entry fields:

| Field | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | Alias used in routing and logs |
| model_id | str | Yes | Provider's model string |
| provider | str | Yes | openai, anthropic, openrouter, huggingface, ollama |
| api_keys | list[str] | Yes | One or more API keys |
| base_url | str | No | Override endpoint (OpenRouter, Ollama, custom) |
| cost_per_1k_input | float | No | USD per 1000 input tokens |
| cost_per_1k_output | float | No | USD per 1000 output tokens |
| context_window | int | No | Model context window size |
| tags | list[str] | No | Used for routing: cheap, strong, code, reasoning, etc. |
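The per-1k cost fields combine with token counts in the obvious way; a sketch of the arithmetic (estimate_cost_usd is an illustrative name, not routeforge API). Plugging in the gpt-4o rates from the quick-start config and the token counts from the observability example reproduces the logged 0.003240:

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cost_per_1k_input: float, cost_per_1k_output: float) -> float:
    """Cost in USD: tokens / 1000 times the per-1k rate, summed for both directions."""
    return (input_tokens / 1000) * cost_per_1k_input \
         + (output_tokens / 1000) * cost_per_1k_output

# 48 input and 312 output tokens at gpt-4o's 0.0025 / 0.01 per-1k rates:
cost = estimate_cost_usd(48, 312, 0.0025, 0.01)  # 0.00324
```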

Task tags

Use these tags on your models to enable task-aware routing:

| Tag | Triggers on |
|---|---|
| code | Python, functions, scripts, debug, refactor, FastAPI, SQL |
| reasoning | Prove, derive, calculate, logic, math, equations |
| creative | Stories, poems, blog posts, marketing copy |
| translation | Translate, French, Spanish, German, Hindi, etc. |
| summarisation | Summarise, TL;DR, shorten, key points |
| factual | What is, who is, define, explain, how does |
| cheap | Fallback for low-complexity prompts |
| strong | Fallback for high-complexity prompts |

License

MIT — built by NorthCommits

Download files

Source Distribution

routeforge-0.1.1.tar.gz (13.0 kB)

Built Distribution

routeforge-0.1.1-py3-none-any.whl (13.9 kB)
File details

Details for the file routeforge-0.1.1.tar.gz.

File metadata

  • Download URL: routeforge-0.1.1.tar.gz
  • Size: 13.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.11

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | c286d00ebe437d1ec48f1cf2a1b1d4dcd7843ca6b8960585135afc9eebbac105 |
| MD5 | d8727cb28aa46caa3d656d5e48d19464 |
| BLAKE2b-256 | 90fb2cba366eee80f31985a7ec0a3b47afdbf6fc802b18e53d35818ff300634f |

File details

Details for the file routeforge-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: routeforge-0.1.1-py3-none-any.whl
  • Size: 13.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.11

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 0558a9ae58dbda23e3c2d1d19414319ef18ea528be0dca60e4026da531d65e70 |
| MD5 | e5088c26d909a721cc214517fa386ad9 |
| BLAKE2b-256 | 9a8603432ba65a9342e8745caef9c02e2d5ea404cd8c56540ceef0637420e282 |
