
ai-switchboard

Intelligent LLM routing for LiveKit voice agents.



Route conversations between N named models based on topics, heuristic signals, and custom rules. Simple turns stay on the fast model for low latency; complex, friction-heavy, or topic-sensitive turns escalate to higher-tier models automatically.

The Switchboard is a drop-in llm.LLM replacement — no extra wiring needed.

Installation

pip install livekit-ai-switchboard

Requires Python 3.10+ and livekit-agents >= 1.0

Quick Start

from ai_switchboard import Switchboard, SwitchboardConfig
from livekit.agents.pipeline import VoicePipelineAgent
from livekit.plugins import groq, openai

switchboard = Switchboard(
    models={
        "fast": groq.LLM(model="llama-3.3-70b-versatile"),
        "smart": openai.LLM(model="gpt-4o"),
    },
    config=SwitchboardConfig(
        model_topics={"smart": ["pricing", "billing"]},
    ),
)

agent = VoicePipelineAgent(llm=switchboard, ...)

That's it — "pricing" and "billing" messages route to the smart model, everything else stays fast.

Three Layers of Routing

Use as few or as many as you need. Each layer is additive.

1. Topics — config only, zero code

Map keywords to models. Any message containing a keyword routes to that model.

Switchboard(
    models={"fast": fast_llm, "standard": std_llm, "premium": premium_llm},
    config=SwitchboardConfig(
        model_topics={
            "premium": ["pricing", "billing", "legal"],
            "standard": ["complaint", "support"],
        },
    ),
)

2. Heuristic Escalation — config only

Auto-escalate when the conversation gets complex, frustrated, or heated. The Switchboard scores each turn using built-in signal detectors and escalates when the score exceeds your threshold.

SwitchboardConfig(
    escalation_model="premium",
    escalation_threshold=0.6,
)

3. Rules — opt-in, full control

Lambda conditions for custom logic — VIP routing, time-of-day, metadata checks, etc. Rules take highest priority and bypass cooldown.

from ai_switchboard import Rule

rules = [
    Rule(
        name="vip_customer",
        condition=lambda ctx: "vip" in ctx.last_message.lower(),
        use="premium",
        priority=10,
    ),
]

How Routing Works

Every turn, the Switchboard evaluates in this order:

Rules (highest priority first, first match wins)
  ↓ no match
Topic keywords + Heuristic score → pick the higher-tier model
  ↓ nothing triggered
Default model (first in your models dict)
  ↓
Cooldown guard (holds current model for N turns after a switch)
  ↓
Forward chat() to the chosen model

Model tier is determined by insertion order — later in the dict means higher tier.
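The precedence above can be sketched as a small pure-Python helper. This is illustrative only, not the library's internals (the cooldown guard is omitted for brevity, and the Rule namedtuple here just stands in for the real Rule class):

```python
from collections import namedtuple

# Stand-in for the library's Rule class, for this sketch only.
Rule = namedtuple("Rule", "name condition use priority")

def pick_model(models, rules, ctx, topic_hits, score, threshold, escalation_model):
    """Sketch of the routing order: rules -> topics/heuristics -> default."""
    tier = {name: i for i, name in enumerate(models)}  # insertion order = tier

    # 1. Rules: highest priority first, first match wins.
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if rule.condition(ctx):
            return rule.use

    # 2. Topic keywords + heuristic score: pick the higher-tier candidate.
    candidates = list(topic_hits)
    if score >= threshold:
        candidates.append(escalation_model)
    if candidates:
        return max(candidates, key=tier.__getitem__)

    # 3. Nothing triggered: default to the first model in the dict.
    return next(iter(models))

models = {"fast": None, "standard": None, "premium": None}
```

For example, a turn that hits a "standard" topic keyword while also crossing the escalation threshold routes to "premium", because the higher tier wins.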

Heuristic Signals

| Signal | Category | Trigger | Weight |
|--------|----------|---------|--------|
| long_input | Complexity | Word count > 25 | 0.20 |
| multi_question | Complexity | 2+ question marks | 0.30 |
| complexity_words | Complexity | "explain", "compare", "why"... | 0.20 |
| pushback | Friction | "no", "that's wrong", "incorrect"... | 0.40 |
| frustration | Friction | "confused", "doesn't make sense"... | 0.35 |
| repeat_request | Friction | "say that again", "can you repeat"... | 0.25 |
| interruption | Conversation | User cut agent off | 0.20 |
| low_stt_confidence | Voice | STT confidence below threshold | 0.30 |
| long_audio_turn | Voice | Audio duration above threshold | 0.15 |
| topic_match | Topic | Developer-defined keyword hit | 0.50 |

Weights are summed and capped at 1.0. Default escalation threshold: 0.60.
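The scoring rule can be written out directly from the table above (a sketch, not the library's internals): sum the weights of the fired signals and cap at 1.0.

```python
# Weights from the signal table above.
WEIGHTS = {
    "long_input": 0.20, "multi_question": 0.30, "complexity_words": 0.20,
    "pushback": 0.40, "frustration": 0.35, "repeat_request": 0.25,
    "interruption": 0.20, "low_stt_confidence": 0.30,
    "long_audio_turn": 0.15, "topic_match": 0.50,
}

def heuristic_score(signals_fired: list[str]) -> float:
    # Sum the weights of fired signals, capped at 1.0.
    return min(1.0, sum(WEIGHTS[s] for s in signals_fired))
```

So a turn that fires both pushback (0.40) and frustration (0.35) scores 0.75, which already clears the default 0.60 threshold on its own.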

Full Configuration

from ai_switchboard import Switchboard, SwitchboardConfig

sb = Switchboard(
    models={
        "fast": fast_llm,
        "standard": std_llm,
        "premium": premium_llm,
    },
    config=SwitchboardConfig(
        default_model="fast",               # fallback (default: first in dict)
        cooldown_turns=2,                    # hold after switch before de-escalating

        # Topic routing
        model_topics={
            "premium": ["pricing", "billing"],
            "standard": ["complaint", "support"],
        },

        # Heuristic escalation
        escalation_model="premium",          # escalate to this when score is high
        escalation_threshold=0.6,            # score threshold (0.0–1.0)

        # Voice thresholds
        stt_confidence_threshold=0.7,        # below → low_stt_confidence signal
        long_audio_threshold=10.0,           # seconds above → long_audio_turn signal

        # Observability
        on_switch=my_switch_callback,        # fires only on model change
        on_decision=my_decision_callback,    # fires every turn
        log_decisions=True,                  # log to "ai_switchboard" logger
    ),
)

Voice Signals

The Switchboard detects voice-specific signals from LiveKit's STT pipeline.

STT Confidence — low transcription confidence suggests ambiguous input that a smarter model may handle better:

SwitchboardConfig(stt_confidence_threshold=0.7)  # default

Audio Duration — long audio turns often indicate complex input:

SwitchboardConfig(long_audio_threshold=10.0)  # seconds, default

Feed audio duration from your pipeline:

switchboard.record_audio_duration(seconds=12.5)

Observability

Two callbacks for different use cases:

from ai_switchboard import SwitchEvent

def on_switch(event: SwitchEvent):
    """Fires only when the model changes."""
    print(f"Switched {event.from_model} → {event.to_model}")

def on_decision(event: SwitchEvent):
    """Fires every turn, whether or not the model changed."""
    print(f"Turn {event.turn}: model={event.to_model} score={event.heuristic_score:.2f}")

Every SwitchEvent includes: from_model, to_model, turn, heuristic_score, signals_fired, triggered_by, and changed.
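One common pattern is serializing each decision to a JSON line for offline analysis. A sketch, using the SwitchEvent fields listed above (the SimpleNamespace below is only a stand-in for a real event):

```python
import json
from types import SimpleNamespace

def decision_record(event) -> str:
    # Flatten the SwitchEvent fields listed above into one JSON line.
    return json.dumps({
        "turn": event.turn,
        "model": event.to_model,
        "changed": event.changed,
        "score": round(event.heuristic_score, 2),
        "signals": event.signals_fired,
    })

def on_decision(event) -> None:
    # Append one line per turn; pass this as on_decision in SwitchboardConfig.
    with open("switchboard_decisions.jsonl", "a") as f:
        f.write(decision_record(event) + "\n")

# Stand-in event for illustration.
event = SimpleNamespace(turn=3, to_model="premium", changed=True,
                        heuristic_score=0.724, signals_fired=["pushback"])
```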

Rule Context

Every rule condition receives a Context object:

| Field | Type | Description |
|-------|------|-------------|
| last_message | str | Raw user message text |
| last_message_word_count | int | Word count |
| turn_count | int | Total turns so far |
| interruption_count | int | Interruptions this turn |
| repeat_request_count | int | Accumulated repeat requests |
| current_model | str | Name of current model |
| turns_on_current_model | int | Turns since last switch |
| last_switch_turn | int | Turn number of last switch |
| stt_confidence | float \| None | STT transcript confidence |
| audio_duration | float \| None | Audio duration in seconds |
| signals_fired | list[str] | Signals detected this turn |
| heuristic_score | float | Weighted score (0.0–1.0) |
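Conditions over these fields stay plain functions, so they are easy to unit-test in isolation. Two illustrative conditions (the SimpleNamespace below stands in for the library's Context object; field names match the table above, thresholds are made up for the example):

```python
from types import SimpleNamespace

def many_repeats(ctx) -> bool:
    # Escalate once the caller has asked the agent to repeat itself twice.
    return ctx.repeat_request_count >= 2

def stuck_on_fast(ctx) -> bool:
    # Escalate after 5+ turns on the fast model with a persistently
    # elevated heuristic score that never crossed the threshold.
    return (ctx.current_model == "fast"
            and ctx.turns_on_current_model >= 5
            and ctx.heuristic_score >= 0.4)

# Stand-in Context for illustration.
ctx = SimpleNamespace(repeat_request_count=2, current_model="fast",
                      turns_on_current_model=6, heuristic_score=0.5)
```

Each function would be passed as the condition of a Rule, e.g. `Rule(name="stuck_on_fast", condition=stuck_on_fast, use="premium")`.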

API Reference

Switchboard

Switchboard(
    models: dict[str, llm.LLM],     # named models, insertion order = tier order
    config: SwitchboardConfig = ..., # routing configuration
    rules: list[Rule] = [],          # custom routing rules
)
| Property / Method | Description |
|-------------------|-------------|
| switchboard.current_model | Name of the active model (str) |
| switchboard.model | Underlying LLM's model identifier |
| switchboard.default_model | Name of the fallback model |
| switchboard.record_interruption() | Signal that the user interrupted |
| switchboard.record_audio_duration(seconds) | Feed audio turn length |
| switchboard.reset() | Clear all state for a new session |

Rule

Rule(
    name: str,                               # identifier for logging
    condition: Callable[[Context], bool],     # evaluated each turn
    use: str,                                # target model name
    priority: int = 5,                       # higher = evaluated first
)

Examples

See the examples/ directory:

| File | Description |
|------|-------------|
| basic.py | Minimal 2-model setup with topic routing |
| three_tier.py | 3-model setup with topics + heuristic escalation |
| custom_rules.py | Rules for VIP routing and custom logic |
| openrouter_anthropic.py | Groq + Anthropic via OpenRouter |
| demo_agent_session.py | LiveKit AgentSession testing pattern |
| demo_events.py | Observability with on_switch / on_decision |

License

Apache 2.0
