TokenRouter
Lightweight AI model intelligent routing SDK — BYOK (Bring Your Own Key), auto-select the optimal model based on prompt content. Local deployment, zero data leakage.
TokenRouter analyzes your prompt, classifies the task type (coding, math, translation, etc.), estimates complexity, and routes to the best model using a capability matrix and your preferred strategy.
Features
- Smart Routing — Automatically selects the optimal model based on task type, complexity, and strategy
- BYOK — Use your own API keys. No middleman, no data leakage
- Multi-Provider — OpenAI, Anthropic, Google, DeepSeek, Moonshot, Qwen, Zhipu
- 3 Strategies — cheapest, best, balanced (quality/cost ratio)
- Fallback Chain — Automatic retry with fallback models on failure (429, 500, 502, 503, 504)
- Two-Tier Classifier — L1 (regex heuristic, instant) + L2 (cheap model, when L1 confidence < 0.7)
- OpenAI-Compatible Proxy — Drop-in replacement for any OpenAI SDK
- Async-First — Built on httpx's async client, with sync wrappers for convenience
- Zero Core Dependencies — Core routing uses only stdlib; httpx for API calls; FastAPI optional for proxy
- 100% Type Hints — Full type annotations throughout
Installation
# Core (routing + providers)
pip install tokenrouter
# With proxy server
pip install tokenrouter[proxy]
# With YAML config support
pip install tokenrouter[yaml]
# Everything
pip install tokenrouter[all]
Quick Start
1. Python SDK
from tokenrouter import TokenRouter

router = TokenRouter(
    keys={
        "openai": "sk-...",
        "anthropic": "sk-ant-...",
        "deepseek": "sk-...",
    },
    strategy="balanced",  # cheapest | best | balanced
)

# Auto-route — TokenRouter picks the best model
response = router.chat([
    {"role": "user", "content": "Write a Python quicksort function"}
])
print(response.model_used)  # e.g. "deepseek-chat"
print(response.choices[0].message["content"])

# Streaming
for chunk in router.chat_stream([
    {"role": "user", "content": "Explain quantum computing"}
]):
    delta = chunk.choices[0].get("delta", {})
    print(delta.get("content", ""), end="")

# Classify only (no API call)
result = router.classify([
    {"role": "user", "content": "Translate to French: Hello"}
])
print(result.task_type)       # "translation"
print(result.selected_model)  # ModelConfig for the optimal model
2. Async API
import asyncio
from tokenrouter import TokenRouter

router = TokenRouter(keys={"openai": "sk-..."})

async def main():
    response = await router.achat([
        {"role": "user", "content": "What is 2+2?"}
    ])
    print(response.model_used)

    async for chunk in router.achat_stream([
        {"role": "user", "content": "Write a haiku"}
    ]):
        delta = chunk.choices[0].get("delta", {})
        print(delta.get("content", ""), end="")

asyncio.run(main())
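
Because the core is async-first, independent prompts can also be dispatched concurrently with asyncio.gather; a minimal sketch using the achat call shown above (the prompts are illustrative):

import asyncio
from tokenrouter import TokenRouter

router = TokenRouter(keys={"openai": "sk-...", "deepseek": "sk-..."})

async def main():
    prompts = [
        "Summarize the plot of Hamlet in one sentence",
        "Write a Python one-liner to reverse a string",
        "What is the derivative of x**2?",
    ]
    # Each prompt is classified and routed independently,
    # so the requests may land on different models.
    responses = await asyncio.gather(
        *(router.achat([{"role": "user", "content": p}]) for p in prompts)
    )
    for response in responses:
        print(response.model_used)

asyncio.run(main())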
3. YAML Config
# config.yaml
keys:
  openai: ${OPENAI_API_KEY}
  anthropic: ${ANTHROPIC_API_KEY}
  deepseek: ${DEEPSEEK_API_KEY}
strategy: balanced
rules:
  - task: coding
    model: deepseek-chat
  - task: chinese_language
    model: qwen-max
exclude_models:
  - claude-opus-4
Load it in Python:

router = TokenRouter.from_config("config.yaml")
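
The ${VAR} placeholders are presumably expanded from environment variables at load time. If you want the same behavior in your own config tooling, a minimal sketch of such expansion (expand_env is a hypothetical helper, not part of the TokenRouter API):

import os
import re

_ENV_VAR = re.compile(r"\$\{(\w+)\}")

def expand_env(value: str) -> str:
    # Replace each ${VAR} with the value of the environment variable VAR,
    # falling back to an empty string when it is unset.
    return _ENV_VAR.sub(lambda m: os.environ.get(m.group(1), ""), value)

print(expand_env("${OPENAI_API_KEY}"))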
4. OpenAI-Compatible Proxy
Start the proxy server:
tokenrouter serve --config config.yaml --port 8000
Then use with any OpenAI SDK:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="any")

response = client.chat.completions.create(
    model="auto",  # TokenRouter routes automatically
    messages=[{"role": "user", "content": "Write a sort function"}],
)
print(response.model)  # actual model used
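
Streaming passes through as well, assuming the proxy forwards the standard server-sent events like the upstream APIs do; a sketch with the same client:

stream = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="")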
5. CLI
# Classify a prompt
tokenrouter classify "Write a Python function to parse JSON"
# Output: {"task_type": "coding", "complexity": "low", "model": "deepseek-chat", ...}
tokenrouter classify "请帮我翻译这段话"  # "Please help me translate this passage"
# Output: {"task_type": "chinese_language", ...}
Supported Models
| Provider | Models | Best For |
|---|---|---|
| OpenAI | GPT-5.2, GPT-5 Mini | General, coding, reasoning |
| Anthropic | Claude Opus 4/4.5, Sonnet 4, Haiku 4.5 | Coding, creative writing, reasoning |
| Google | Gemini 3 Flash, 2.5 Flash/Pro | Quick tasks, summarization |
| DeepSeek | DeepSeek V3.2, R1 | Coding, math, reasoning (cheapest) |
| Moonshot | Kimi K2.5 | Chinese, coding |
| Qwen | Turbo, Plus, Max | Chinese language tasks |
| Zhipu | GLM-4 Plus | Chinese, general QA |
How Routing Works
- L1 Classifier — Regex-based heuristic analyzes the prompt for task patterns (coding keywords, math symbols, translation phrases, etc.) and scores 8 task categories
- L2 Classifier — If L1 confidence < 0.7, a cheap model (GPT-5 Mini / Gemini Flash) refines the classification (a toy sketch of this two-tier gate follows this list)
- Routing Table — Maps task_type:complexity to candidate models (24 combinations)
- Strategy Selection — Picks from candidates using the capability matrix:
  - cheapest: Lowest cost per token
  - best: Highest capability score, then cheapest
  - balanced: Best quality/cost ratio
- Fallback Chain — If the primary model fails (429/5xx), automatically retries with the next candidate
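
To make the two-tier gate concrete, here is a toy L1-style classifier with a confidence threshold. The patterns, categories, and scoring are illustrative stand-ins, not TokenRouter's actual tables:

import re

# Illustrative keyword patterns per task category (not the real matrix).
PATTERNS = {
    "coding": re.compile(r"\b(function|class|bug|compile|python|regex)\b", re.I),
    "math": re.compile(r"\b(solve|integral|derivative|equation)\b|[∑∫√]", re.I),
    "translation": re.compile(r"\btranslate\b", re.I),
}

def l1_classify(prompt: str) -> tuple[str, float]:
    # Score each category by pattern hits; confidence is the winner's
    # share of all hits (1.0 when only one category matches).
    scores = {task: len(p.findall(prompt)) for task, p in PATTERNS.items()}
    best = max(scores, key=scores.get)
    total = sum(scores.values())
    return best, (scores[best] / total) if total else 0.0

task, confidence = l1_classify("Write a Python function to parse JSON")
if confidence < 0.7:
    # L2 would hand the prompt to a cheap model to refine the label.
    pass
print(task, confidence)  # coding 1.0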
Custom Rules
Override automatic routing for specific task types:
from tokenrouter import TokenRouter
from tokenrouter.types import CustomRule

router = TokenRouter(
    keys={"openai": "sk-...", "deepseek": "sk-..."},
    strategy="balanced",
    rules=[
        CustomRule(task="coding", model="deepseek-chat"),
        CustomRule(task="chinese_language", model="qwen-max"),
    ],
)
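
Whether a rule takes effect can be checked without a chat request by using classify, as in the Quick Start:

result = router.classify([
    {"role": "user", "content": "Fix this off-by-one bug in my loop"}
])
print(result.task_type)       # "coding"
print(result.selected_model)  # should resolve to deepseek-chat via the rule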
Configuration Reference
TokenRouter Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| keys | dict[str, str] | {} | Provider API keys |
| strategy | str | "balanced" | cheapest, best, or balanced |
| rules | list[CustomRule] | [] | Custom routing overrides |
| exclude_models | list[str] | [] | Models to never use |
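
Putting the parameters together, a router that pins coding tasks to DeepSeek and never selects Claude Opus might look like:

from tokenrouter import TokenRouter
from tokenrouter.types import CustomRule

router = TokenRouter(
    keys={"openai": "sk-...", "deepseek": "sk-..."},
    strategy="cheapest",
    rules=[CustomRule(task="coding", model="deepseek-chat")],
    exclude_models=["claude-opus-4"],
)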
Proxy Headers
| Header | Description |
|---|---|
| X-Routing-Strategy | Override strategy per request |
| X-TokenRouter-Model | Model actually used (response) |
| X-TokenRouter-Task | Detected task type (response) |
| X-TokenRouter-Complexity | Detected complexity (response) |
| X-TokenRouter-Cost | Estimated cost in USD (response) |
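
The response headers can be read through whatever HTTP client you use; with the official OpenAI SDK, with_raw_response exposes them, and extra_headers carries the per-request strategy override. A sketch:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="any")

raw = client.chat.completions.with_raw_response.create(
    model="auto",
    messages=[{"role": "user", "content": "Write a sort function"}],
    extra_headers={"X-Routing-Strategy": "cheapest"},  # per-request override
)
print(raw.headers.get("X-TokenRouter-Model"))  # model actually used
print(raw.headers.get("X-TokenRouter-Cost"))   # estimated cost in USD
response = raw.parse()  # the usual ChatCompletion object
print(response.choices[0].message.content)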
Development
git clone https://github.com/tokenrouter/tokenrouter.git
cd tokenrouter
pip install -e ".[dev]"
pytest
License
MIT — see LICENSE.
File details
Details for the file byok_router-0.1.0.tar.gz.
File metadata
- Download URL: byok_router-0.1.0.tar.gz
- Upload date:
- Size: 30.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 72abe5feca86d6792f8678dea7f14357a2ac99d38962a0823c42b0c42a188b8e |
| MD5 | 1dc5d7a1800d606d7b6757c6982fbbdd |
| BLAKE2b-256 | cf9af7f913cfdfaede193a7a0275516014fc824c52424b4fae188ea0462b034c |
File details
Details for the file byok_router-0.1.0-py3-none-any.whl.
File metadata
- Download URL: byok_router-0.1.0-py3-none-any.whl
- Upload date:
- Size: 34.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aeee790eb9e93d71b19d0592e61ecf13eabac7bfed4a6b2975a90fd1b2545826 |
| MD5 | 971c84720088bc9a0dfd2234fb761c27 |
| BLAKE2b-256 | 46d7a79e33c7bbdb148baa65d88fe2d5a447d43cc73ca33cf9bb3023ff2c39e3 |