TokenRouter Python SDK
Official Python SDK for TokenRouter — an intelligent LLM router that provides OpenAI‑compatible endpoints and a native routing endpoint.
This README focuses on the routing interfaces you’ll use today:
- client.create(...) → Native routing endpoint (/route)
- client.chat.completions.create(...) → OpenAI chat completions (/v1/chat/completions)
- client.completions.create(...) → OpenAI legacy text completions (/v1/completions)
All calls are BYOK (bring your own keys): authenticate with your TokenRouter API key and configure your provider keys in TokenRouter.
Installation
pip install tokenrouter
Quick Start (Native Route)
from tokenrouter import TokenRouter
client = TokenRouter(
    api_key="tr_...",
    base_url="http://localhost:8000"  # or https://api.tokenrouter.io
)

response = client.create(
    model="auto",
    mode="balanced",
    model_preferences=["gpt-4o", "gpt-4o-mini"],
    messages=[
        {"role": "developer", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
    # Optional key behavior: inline | stored | mixed | auto (default)
    key_mode="auto",
)
print(response.choices[0].message.content)
Endpoints
Native Route (/route)
OpenAI‑like request/response shape plus TokenRouter metadata: cost_usd, latency_ms, routed_model, routed_provider, service_tier, etc.
Non‑streaming
response = client.create(
    model="auto",
    mode="balanced",
    model_preferences=["gpt-4o", "gpt-4o-mini"],
    messages=[
        {"role": "developer", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
)
print(response.choices[0].message.content)
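The routing metadata can be read off the response alongside the usual choices. A minimal sketch, assuming the fields listed above are exposed as attributes on the response object (adjust to dict access if your SDK version returns plain dicts):

# TokenRouter routing metadata on the /route response (assumed attribute access)
print(response.routed_model)     # model the request was routed to, e.g. "gpt-4o-mini"
print(response.routed_provider)  # provider that served it, e.g. "openai"
print(response.cost_usd)         # estimated cost of the call in USD
print(response.latency_ms)       # end-to-end latency in milliseconds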
Streaming
for chunk in client.create(
    model="auto",
    stream=True,
    messages=[
        {"role": "developer", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Stream a short greeting."}
    ],
):
    delta = chunk.choices[0].get("delta", {}) if chunk.choices else {}
    if delta.get("content"):
        print(delta["content"], end="")
Chat Completions (/v1/chat/completions)
OpenAI‑compatible chat completions.
Non‑streaming
response = client.chat.completions.create(
    model="auto",
    mode="balanced",
    model_preferences=["gpt-4o", "gpt-4o-mini"],
    messages=[
        {"role": "developer", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
)
print(response.choices[0].message.content)
Streaming
for chunk in client.chat.completions.create(
    model="auto",
    stream=True,
    messages=[
        {"role": "developer", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ],
):
    delta = chunk.choices[0].get("delta", {}) if chunk.choices else {}
    if delta.get("content"):
        print(delta["content"], end="")
Legacy Completions (/v1/completions)
OpenAI legacy text completion format. The SDK returns the raw OpenAI‑style dict.
Non‑streaming
resp = client.completions.create(
    model="auto",
    prompt="Say this is a test",
    mode="balanced",
)
print(resp["choices"][0]["text"]) # text completion shape
Streaming
for chunk in client.completions.create(
    model="auto",
    prompt="Stream this as text",
    stream=True,
):
    if chunk.get("choices"):
        print(chunk["choices"][0].get("text", ""), end="")
Errors
from tokenrouter import AuthenticationError, RateLimitError, InvalidRequestError, APIConnectionError
try:
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": "Hello"}],
        model="auto",
    )
    print(response.choices[0].message.content)
except RateLimitError as e:
    print(f"Rate limited, retry after: {e.retry_after}s")
except AuthenticationError:
    print("Invalid API key")
except InvalidRequestError as e:
    print(f"Invalid request: {e}")
except APIConnectionError as e:
    print(f"Connection error: {e}")
Environment
export TOKENROUTER_API_KEY=tr_your-api-key
# Optional
export TOKENROUTER_BASE_URL=https://api.tokenrouter.io
# Optional provider keys (auto-detected for inline encryption)
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
export GEMINI_API_KEY=...
export MISTRAL_API_KEY=...
export DEEPSEEK_API_KEY=...
export META_API_KEY=...
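With these variables set, you can construct the client without passing credentials explicitly. A minimal sketch, assuming the SDK falls back to TOKENROUTER_API_KEY and TOKENROUTER_BASE_URL when the constructor arguments are omitted:

from tokenrouter import TokenRouter

# Assumes api_key / base_url default to TOKENROUTER_API_KEY / TOKENROUTER_BASE_URL
client = TokenRouter()
response = client.create(
    model="auto",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)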
When `key_mode` is `inline`, `mixed`, or `auto`, the SDK (see the sketch after this list):
- Auto-loads provider keys from your environment or local `.env` (dev/CI) with the names above
- Encrypts keys client-side using the API's published public key (fetched from `/.well-known/tr-public-key`)
- Sends the encrypted bundle in the `X-TR-Provider-Keys` header (not in JSON)
- Never persists or logs provider secrets
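For example, to force inline keys instead of keys stored in TokenRouter, a minimal sketch using only the documented `key_mode` parameter and the environment variable names above:

import os
from tokenrouter import TokenRouter

# Provider key is read from the environment, encrypted client-side,
# and sent in the X-TR-Provider-Keys header (never in the JSON body).
os.environ.setdefault("OPENAI_API_KEY", "sk-...")

client = TokenRouter(api_key="tr_...")
response = client.create(
    model="auto",
    key_mode="inline",  # use inline (encrypted) provider keys for this call
    messages=[{"role": "user", "content": "Hello!"}],
)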
Using the OpenAI SDK against TokenRouter
from openai import OpenAI

client = OpenAI(api_key="tr_...", base_url="https://api.tokenrouter.io/v1")

response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)