ai-switchboard
Intelligent LLM routing for LiveKit voice agents.
Route conversations between N named models based on topics, heuristic signals, and custom rules. Simple turns stay on the fast model for low latency; complex, friction-heavy, or topic-sensitive turns escalate to higher-tier models automatically.
The Switchboard is a drop-in `llm.LLM` replacement — no extra wiring needed.
Installation
```shell
pip install livekit-ai-switchboard
```
Requires Python 3.10+ and `livekit-agents >= 1.0`.
Quick Start
```python
from ai_switchboard import Switchboard, SwitchboardConfig
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import groq, openai

switchboard = Switchboard(
    models={
        "fast": groq.LLM(model="llama-3.3-70b-versatile"),
        "smart": openai.LLM(model="gpt-4o"),
    },
    config=SwitchboardConfig(
        model_topics={"smart": ["pricing", "billing"]},
    ),
)

agent = VoicePipelineAgent(llm=switchboard, ...)
```
That's it — "pricing" and "billing" messages route to the smart model, everything else stays fast.
Three Layers of Routing
Use as few or as many as you need. Each layer is additive.
1. Topics — config only, zero code
Map keywords to models. Any message containing a keyword routes to that model.
```python
Switchboard(
    models={"fast": fast_llm, "standard": std_llm, "premium": premium_llm},
    config=SwitchboardConfig(
        model_topics={
            "premium": ["pricing", "billing", "legal"],
            "standard": ["complaint", "support"],
        },
    ),
)
```
2. Heuristic Escalation — config only
Auto-escalate when the conversation gets complex, frustrated, or heated. The Switchboard scores each turn using built-in signal detectors and escalates when the score exceeds your threshold.
```python
SwitchboardConfig(
    escalation_model="premium",
    escalation_threshold=0.6,
)
```
3. Rules — opt-in, full control
Lambda conditions for custom logic — VIP routing, time-of-day, metadata checks, etc. Rules take highest priority and bypass cooldown.
```python
from ai_switchboard import Rule

rules = [
    Rule(
        name="vip_customer",
        condition=lambda ctx: "vip" in ctx.last_message.lower(),
        use="premium",
        priority=10,
    ),
]
```
How Routing Works
Every turn, the Switchboard evaluates in this order:
```
Rules (highest priority first, first match wins)
  ↓ no match
Topic keywords + heuristic score → pick the higher-tier model
  ↓ nothing triggered
Default model (first in your models dict)
  ↓
Cooldown guard (holds the current model for N turns after a switch)
  ↓
Forward chat() to the chosen model
```
Model tier is determined by insertion order — later in the dict means higher tier.
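That ordering rule can be sketched in plain Python, independently of the library: tier rank is simply a model name's position in the `models` dict, and when both topic routing and the heuristic nominate a model, the later (higher-tier) one wins.

```python
# Tier order follows dict insertion order (guaranteed in Python 3.7+).
models = {"fast": None, "standard": None, "premium": None}
tiers = list(models)  # ["fast", "standard", "premium"]

def tier(name: str) -> int:
    """Higher index = higher tier."""
    return tiers.index(name)

# Suppose topic keywords nominated "standard" and the heuristic
# nominated "premium"; the higher-tier candidate is chosen:
chosen = max(["standard", "premium"], key=tier)
```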
Heuristic Signals
| Signal | Category | Trigger | Weight |
|---|---|---|---|
| `long_input` | Complexity | Word count > 25 | 0.20 |
| `multi_question` | Complexity | 2+ question marks | 0.30 |
| `complexity_words` | Complexity | "explain", "compare", "why"... | 0.20 |
| `pushback` | Friction | "no", "that's wrong", "incorrect"... | 0.40 |
| `frustration` | Friction | "confused", "doesn't make sense"... | 0.35 |
| `repeat_request` | Friction | "say that again", "can you repeat"... | 0.25 |
| `interruption` | Conversation | User cut the agent off | 0.20 |
| `low_stt_confidence` | Voice | STT confidence below threshold | 0.30 |
| `long_audio_turn` | Voice | Audio duration above threshold | 0.15 |
| `topic_match` | Topic | Developer-defined keyword hit | 0.50 |
Weights are summed and capped at 1.0. Default escalation threshold: 0.60.
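As a worked example (illustrative only, using the weights above): a turn containing pushback plus two or more questions scores 0.40 + 0.30 = 0.70, which clears the default 0.60 threshold.

```python
# Illustrative scoring for one turn, e.g.:
# "No, that's wrong. Why is it more? And what about taxes?"
fired = {"pushback": 0.40, "multi_question": 0.30}
score = min(sum(fired.values()), 1.0)  # weights summed, capped at 1.0
escalate = score >= 0.6                # default escalation_threshold
```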
Full Configuration
```python
from ai_switchboard import Switchboard, SwitchboardConfig

sb = Switchboard(
    models={
        "fast": fast_llm,
        "standard": std_llm,
        "premium": premium_llm,
    },
    config=SwitchboardConfig(
        default_model="fast",    # fallback (default: first in dict)
        cooldown_turns=2,        # hold after switch before de-escalating

        # Topic routing
        model_topics={
            "premium": ["pricing", "billing"],
            "standard": ["complaint", "support"],
        },

        # Heuristic escalation
        escalation_model="premium",    # escalate to this when score is high
        escalation_threshold=0.6,      # score threshold (0.0–1.0)

        # Voice thresholds
        stt_confidence_threshold=0.7,  # below → low_stt_confidence signal
        long_audio_threshold=10.0,     # seconds above → long_audio_turn signal

        # Observability
        on_switch=my_switch_callback,      # fires only on model change
        on_decision=my_decision_callback,  # fires every turn
        log_decisions=True,                # log to "ai_switchboard" logger
    ),
)
```
Voice Signals
The Switchboard detects voice-specific signals from LiveKit's STT pipeline.
STT Confidence — low transcription confidence suggests ambiguous input that a smarter model may handle better:
```python
SwitchboardConfig(stt_confidence_threshold=0.7)  # default
```
Audio Duration — long audio turns often indicate complex input:
```python
SwitchboardConfig(long_audio_threshold=10.0)  # seconds, default
```
Feed audio duration from your pipeline:
```python
switchboard.record_audio_duration(seconds=12.5)
```
Observability
Two callbacks for different use cases:
```python
from ai_switchboard import SwitchEvent

def on_switch(event: SwitchEvent):
    """Fires only when the model changes."""
    print(f"Switched {event.from_model} → {event.to_model}")

def on_decision(event: SwitchEvent):
    """Fires every turn, whether or not the model changed."""
    print(f"Turn {event.turn}: model={event.to_model} score={event.heuristic_score:.2f}")
```
Every `SwitchEvent` includes: `from_model`, `to_model`, `turn`, `heuristic_score`, `signals_fired`, `triggered_by`, and `changed`.
Rule Context
Every rule condition receives a `Context` object:

| Field | Type | Description |
|---|---|---|
| `last_message` | `str` | Raw user message text |
| `last_message_word_count` | `int` | Word count |
| `turn_count` | `int` | Total turns so far |
| `interruption_count` | `int` | Interruptions this turn |
| `repeat_request_count` | `int` | Accumulated repeat requests |
| `current_model` | `str` | Name of the current model |
| `turns_on_current_model` | `int` | Turns since the last switch |
| `last_switch_turn` | `int` | Turn number of the last switch |
| `stt_confidence` | `float \| None` | STT transcript confidence |
| `audio_duration` | `float \| None` | Audio duration in seconds |
| `signals_fired` | `list[str]` | Signals detected this turn |
| `heuristic_score` | `float` | Weighted score (0.0–1.0) |
API Reference
Switchboard
```python
Switchboard(
    models: dict[str, llm.LLM],       # named models, insertion order = tier order
    config: SwitchboardConfig = ...,  # routing configuration
    rules: list[Rule] = [],           # custom routing rules
)
```
| Property / Method | Description |
|---|---|
| `switchboard.current_model` | Name of the active model (`str`) |
| `switchboard.model` | Underlying LLM's model identifier |
| `switchboard.default_model` | Name of the fallback model |
| `switchboard.record_interruption()` | Signal that the user interrupted |
| `switchboard.record_audio_duration(seconds)` | Feed audio turn length |
| `switchboard.reset()` | Clear all state for a new session |
Rule
```python
Rule(
    name: str,                             # identifier for logging
    condition: Callable[[Context], bool],  # evaluated each turn
    use: str,                              # target model name
    priority: int = 5,                     # higher = evaluated first
)
```
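The evaluation order described under How Routing Works (higher priority first, first match wins) can be sketched independently of the library; the rules here are plain dicts for illustration, not `Rule` instances:

```python
# Plain-Python sketch of rule evaluation: sort by priority descending,
# return the target of the first rule whose condition is true.
def evaluate(rules, msg):
    for rule in sorted(rules, key=lambda r: r["priority"], reverse=True):
        if rule["condition"](msg):
            return rule["use"]
    return None  # no rule matched; fall through to topics/heuristics

rules = [
    {"name": "vip", "priority": 10,
     "condition": lambda msg: "vip" in msg, "use": "premium"},
    {"name": "greeting", "priority": 5,
     "condition": lambda msg: msg.startswith("hi"), "use": "fast"},
]
```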
Examples
See the `examples/` directory:

| File | Description |
|---|---|
| `basic.py` | Minimal 2-model setup with topic routing |
| `three_tier.py` | 3-model setup with topics + heuristic escalation |
| `custom_rules.py` | Rules for VIP routing and custom logic |
| `openrouter_anthropic.py` | Groq + Anthropic via OpenRouter |
| `demo_agent_session.py` | LiveKit AgentSession testing pattern |
| `demo_events.py` | Observability with on_switch / on_decision |
License