SOTA LLM router — 98% accuracy, cascade classifier, <1ms local routing

UncommonRoute

SOTA LLM Router — 98% accuracy, <1ms local routing

Route every LLM request to the optimal model.
39-feature cascade classifier, session persistence, spend control.
Pure local — no external API calls for routing decisions.


Python 3.11+ · License: MIT


Quick Navigation

| Section | Description |
|---|---|
| Quick Start | Install in 30 seconds |
| Usage Modes | CLI, SDK, Proxy, OpenClaw |
| How It Works | Cascade classifier architecture |
| Routing Tiers | SIMPLE → MEDIUM → COMPLEX → REASONING |
| Session Management | Sticky routing, auto-escalation |
| Spend Control | Per-request, hourly, daily limits |
| Models & Pricing | Supported models and costs |
| Configuration | Environment variables |
| Benchmarks | Accuracy & latency results |

Quick Start

One-line install:

curl -fsSL https://anjieyang.github.io/uncommon-route/install | bash

Or with pip:

pip install uncommon-route

Or as an OpenClaw plugin:

openclaw plugins install @anjieyang/uncommon-route
openclaw gateway restart
# Done — smart routing is automatic

Usage Modes

1. CLI

uncommon-route route "what is 2+2"
# Model: moonshot/kimi-k2.5  Tier: SIMPLE  Savings: 97%

uncommon-route route --json "design a distributed database"
# Full JSON with model, tier, confidence, cost, fallback chain

uncommon-route debug "explain quicksort"
# Per-dimension scoring breakdown (structural + keyword + unicode)

2. Python SDK

from uncommon_route import route, classify

decision = route("explain the Byzantine Generals Problem")
print(decision.model)       # google/gemini-3.1-pro
print(decision.tier)        # COMPLEX
print(decision.confidence)  # 0.87
print(decision.savings)     # 0.76

# Classification only (no model selection)
result = classify("hello")
print(result.tier)          # SIMPLE
print(result.signals)       # ['short_prompt', 'greeting_pattern']

3. HTTP Proxy (OpenAI-compatible)

uncommon-route serve --port 8403

Works with any OpenAI SDK client:

from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8403/v1",
    api_key="your-upstream-key",
)

response = client.chat.completions.create(
    model="uncommon-route/auto",   # smart routing
    messages=[{"role": "user", "content": "hello"}],
)

| Endpoint | Method | Description |
|---|---|---|
| /v1/chat/completions | POST | Chat with smart routing |
| /v1/models | GET | Available models |
| /v1/spend | GET/POST | Spend control |
| /v1/sessions | GET | Active sessions |
| /health | GET | Health + status |

4. OpenClaw Plugin

openclaw plugins install @anjieyang/uncommon-route

The plugin auto-installs Python dependencies, starts the proxy, and registers everything. Available commands in OpenClaw:

| Command | Description |
|---|---|
| /route <prompt> | Preview routing decision |
| /spend status | View spending limits |
| /spend set hourly 5.00 | Set hourly limit |
| /sessions | View active sessions |

How It Works

UncommonRoute uses a cascade classifier with three levels:

Input Prompt
     │
     ▼
┌─────────────────────┐
│ 1. Trivial Override  │  greeting / empty / very long → instant decision
└─────────┬───────────┘
          │
          ▼
┌─────────────────────┐
│ 2. Learned Model    │  Averaged Perceptron on 39 features
│    (356µs avg)      │  12 structural + 15 unicode + 12 keyword
└─────────┬───────────┘
          │
          ▼
┌─────────────────────┐
│ 3. Rule Fallback    │  hand-tuned weights when model unavailable
└─────────┬───────────┘
          │
          ▼
    Tier + Model + Cost
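
The cascade in the diagram above can be pictured with a minimal sketch. Everything here is hypothetical: the function names, the greeting list, the length threshold, the tier assigned to very long prompts, and the fallback rules are assumptions for illustration, not the package's actual code.

```python
from typing import Callable, Optional

# Hypothetical values; the README does not state the real thresholds.
GREETINGS = {"hi", "hello", "hey", "thanks"}
VERY_LONG_CHARS = 20_000


def trivial_override(prompt: str) -> Optional[str]:
    """Level 1: decide instantly for empty, greeting, or very long prompts."""
    text = prompt.strip().lower()
    if not text or text in GREETINGS:
        return "SIMPLE"
    if len(text) > VERY_LONG_CHARS:
        return "COMPLEX"  # assumption: very long prompts go to a larger model
    return None


def rule_fallback(prompt: str) -> str:
    """Level 3: crude hand-written rules, used only when no model is loaded."""
    text = prompt.lower()
    if "prove" in text:
        return "REASONING"
    if "design" in text or "architecture" in text:
        return "COMPLEX"
    return "MEDIUM" if len(text.split()) > 5 else "SIMPLE"


def classify_cascade(
    prompt: str, learned: Optional[Callable[[str], str]] = None
) -> str:
    """Run the three levels in order and return the first decision."""
    tier = trivial_override(prompt)
    if tier is not None:
        return tier
    if learned is not None:
        return learned(prompt)  # Level 2: the learned model, when available
    return rule_fallback(prompt)
```

Passing `learned=None` exercises the rule fallback, which mirrors the "model unavailable" branch of the diagram.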

Feature Groups (39 total)

Structural (12): normalized_length, enumeration_density, sentence_count, code_markers, math_symbols, nesting_depth, vocabulary_diversity, avg_word_length, alphabetic_ratio, functional_intent, unique_concept_density, requirement_phrases

Unicode (15, of which 14 are named here): basic_latin, latin_ext, cyrillic, arabic, devanagari, thai, hangul_jamo, cjk_unified, hiragana, katakana, hangul_syllables, punctuation, digits, symbols_math

Keyword (12): code_presence, reasoning_markers, technical_terms, creative_markers, simple_indicators, imperative_verbs, constraint_count, output_format, domain_specificity, agentic_task, analytical_verbs, multi_step_patterns
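
As a rough illustration of what such features might look like, the sketch below computes a few of the named Unicode and structural features. The exact definitions used by UncommonRoute are not given in this README, so the formulas (share of non-whitespace characters per Unicode block, naive sentence splitting) and the function names are assumptions.

```python
import re

# Standard Unicode block ranges for a subset of the script features.
UNICODE_BLOCKS = {
    "basic_latin": (0x0000, 0x007F),
    "cyrillic": (0x0400, 0x04FF),
    "arabic": (0x0600, 0x06FF),
    "devanagari": (0x0900, 0x097F),
    "thai": (0x0E00, 0x0E7F),
    "hiragana": (0x3040, 0x309F),
    "katakana": (0x30A0, 0x30FF),
    "cjk_unified": (0x4E00, 0x9FFF),
    "hangul_syllables": (0xAC00, 0xD7AF),
}


def unicode_features(text: str) -> dict[str, float]:
    """Share of non-whitespace characters that fall in each block."""
    chars = [c for c in text if not c.isspace()]
    total = len(chars) or 1  # avoid division by zero on empty input
    return {
        name: sum(lo <= ord(c) <= hi for c in chars) / total
        for name, (lo, hi) in UNICODE_BLOCKS.items()
    }


def structural_features(text: str) -> dict[str, float]:
    """Three of the structural features, with assumed definitions."""
    words = text.split()
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n = len(words)
    return {
        "sentence_count": float(len(sentences)),
        "avg_word_length": sum(len(w) for w in words) / n if n else 0.0,
        "vocabulary_diversity": len({w.lower() for w in words}) / n if n else 0.0,
    }
```

Because the script features are ratios over code-point ranges rather than word lists, they need no language-specific tokenizer.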


Routing Tiers

| Tier | When | Default Model | Example |
|---|---|---|---|
| SIMPLE | Greetings, lookups, translations | moonshot/kimi-k2.5 | "what is 2+2" |
| MEDIUM | Code tasks, explanations, summaries | moonshot/kimi-k2.5 | "explain quicksort" |
| COMPLEX | Multi-requirement system design | google/gemini-3.1-pro | "design a distributed DB with these 5 requirements..." |
| REASONING | Formal proofs, mathematical derivations | xai/grok-4-1-fast-reasoning | "prove sqrt(2) is irrational" |
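
A tier-to-model lookup with a fallback chain might look like the sketch below. The primary and fallback models come from this table and the Models & Pricing section; the order of the fallbacks and the `select_model` helper are assumptions, and no REASONING fallback is listed in this README.

```python
# Primary model first, then fallbacks. The fallback order is an assumption.
FALLBACK_CHAINS = {
    "SIMPLE": [
        "moonshot/kimi-k2.5",
        "google/gemini-2.5-flash-lite",
        "nvidia/gpt-oss-120b",
    ],
    "MEDIUM": ["moonshot/kimi-k2.5", "deepseek/deepseek-chat"],
    "COMPLEX": [
        "google/gemini-3.1-pro",
        "openai/gpt-5.2",
        "anthropic/claude-sonnet-4.6",
    ],
    "REASONING": ["xai/grok-4-1-fast-reasoning"],
}


def select_model(tier: str, unavailable: frozenset = frozenset()) -> str:
    """Return the first model in the tier's chain that is not marked unavailable."""
    for model in FALLBACK_CHAINS[tier]:
        if model not in unavailable:
            return model
    raise LookupError(f"no available model for tier {tier}")
```

For example, `select_model("COMPLEX", frozenset({"google/gemini-3.1-pro"}))` would skip the primary and return the next entry in the chain.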

Session Management

Sessions prevent model switching mid-task. When enabled (default):

  • Sticky routing: same session ID → same model across requests
  • 30-minute timeout: sessions auto-expire after inactivity
  • Three-strike escalation: 3 identical failed requests → auto-upgrade to next tier

# Sessions work via header
headers = {"X-Session-ID": "my-task-123"}

# Or auto-derived from first user message
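
The three session behaviours can be sketched in a few lines. This is an illustration under assumptions, not the contents of session.py: the `SessionStore` class and its methods are hypothetical, the sketch pins a tier rather than a concrete model, and "identical failed requests" is taken to mean the same prompt failing three times in a row.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

TIERS = ["SIMPLE", "MEDIUM", "COMPLEX", "REASONING"]
SESSION_TIMEOUT_S = 30 * 60  # 30-minute inactivity timeout
MAX_STRIKES = 3              # three-strike escalation


@dataclass
class Session:
    tier: str
    last_seen: float
    last_failed: Optional[str] = None
    strikes: int = 0


class SessionStore:
    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._sessions: dict[str, Session] = {}

    def tier_for(self, session_id: str, proposed_tier: str) -> str:
        """Sticky routing: reuse the session's tier unless it has expired."""
        now = self._clock()
        session = self._sessions.get(session_id)
        if session is None or now - session.last_seen > SESSION_TIMEOUT_S:
            session = Session(tier=proposed_tier, last_seen=now)
            self._sessions[session_id] = session
        session.last_seen = now
        return session.tier

    def record_failure(self, session_id: str, prompt: str) -> str:
        """Count repeated failures of the same prompt; escalate on the third."""
        session = self._sessions[session_id]
        if prompt == session.last_failed:
            session.strikes += 1
        else:
            session.last_failed, session.strikes = prompt, 1
        if session.strikes >= MAX_STRIKES:
            index = TIERS.index(session.tier)
            session.tier = TIERS[min(index + 1, len(TIERS) - 1)]
            session.last_failed, session.strikes = None, 0
        return session.tier
```

Injecting the clock keeps the timeout logic testable without waiting 30 minutes.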

Spend Control

Set spending limits to prevent runaway costs:

uncommon-route spend set per_request 0.10   # max $0.10 per call
uncommon-route spend set hourly 5.00        # max $5/hour
uncommon-route spend set daily 20.00        # max $20/day
uncommon-route spend set session 3.00       # max $3/session
uncommon-route spend status                 # view current spending
uncommon-route spend history                # recent records

When a limit is hit, the proxy returns HTTP 429 with reset_in_seconds.

Data persists at ~/.uncommon-route/spending.json.
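
One way to picture time-windowed limits is a sliding window over recent spend records. The sketch below is an assumption about how such a check could work, not the code in spend_control.py: `SpendLimiter`, its methods, and the way `reset_in_seconds` is computed are all hypothetical, and the per-session limit and JSON persistence are left out.

```python
import math
import time
from typing import Callable

WINDOWS = {"hourly": 3600, "daily": 86400}  # window lengths in seconds


class SpendLimiter:
    def __init__(self, limits: dict, clock: Callable[[], float] = time.time):
        self.limits = limits            # e.g. {"per_request": 0.10, "hourly": 5.00}
        self._clock = clock
        self._records: list[tuple[float, float]] = []  # (timestamp, cost in USD)

    def check(self, cost: float) -> tuple[bool, int]:
        """Return (allowed, reset_in_seconds) for a request of the given cost."""
        now = self._clock()
        per_request = self.limits.get("per_request")
        if per_request is not None and cost > per_request:
            return False, 0
        for name, window in WINDOWS.items():
            limit = self.limits.get(name)
            if limit is None:
                continue
            recent = [(t, c) for t, c in self._records if now - t < window]
            if sum(c for _, c in recent) + cost > limit:
                # Budget frees up when the oldest record leaves the window.
                reset = math.ceil(recent[0][0] + window - now) if recent else 0
                return False, reset
        return True, 0

    def record(self, cost: float) -> None:
        """Store the cost of a completed request."""
        self._records.append((self._clock(), cost))
```

A proxy built on this would map a `False` result to the HTTP 429 response described above and pass along the computed reset time.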


Models & Pricing

| Model | Input ($/1M) | Output ($/1M) | Tier |
|---|---|---|---|
| nvidia/gpt-oss-120b | $0.00 | $0.00 | SIMPLE fallback |
| google/gemini-2.5-flash-lite | $0.10 | $0.40 | SIMPLE fallback |
| deepseek/deepseek-chat | $0.28 | $0.42 | MEDIUM fallback |
| xai/grok-4-1-fast-reasoning | $0.20 | $0.50 | REASONING primary |
| moonshot/kimi-k2.5 | $0.60 | $3.00 | SIMPLE/MEDIUM primary |
| google/gemini-3.1-pro | $2.00 | $12.00 | COMPLEX primary |
| openai/gpt-5.2 | $1.75 | $14.00 | COMPLEX fallback |
| anthropic/claude-sonnet-4.6 | $3.00 | $15.00 | COMPLEX fallback |

Baseline comparison: anthropic/claude-opus-4.6 at $5.00/$25.00 per 1M tokens.
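
To make the pricing concrete, the sketch below estimates the cost of one request and its savings relative to the baseline, using prices from the table above. The formula (one minus the ratio of routed cost to baseline cost for the same token counts) is an assumption; this README does not state how the reported savings figure is computed, so results may differ from what the CLI prints.

```python
# USD per 1M tokens as (input, output), copied from the pricing table.
PRICES = {
    "moonshot/kimi-k2.5": (0.60, 3.00),
    "google/gemini-3.1-pro": (2.00, 12.00),
    "anthropic/claude-opus-4.6": (5.00, 25.00),
}
BASELINE = "anthropic/claude-opus-4.6"


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the listed per-million-token prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def savings(model: str, input_tokens: int, output_tokens: int) -> float:
    """Fraction saved versus sending the same tokens to the baseline model."""
    baseline = request_cost(BASELINE, input_tokens, output_tokens)
    if baseline == 0:
        return 0.0
    return 1.0 - request_cost(model, input_tokens, output_tokens) / baseline
```

With 1,000 input and 500 output tokens, moonshot/kimi-k2.5 costs $0.0021 against $0.0175 for the baseline under this formula.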


Configuration

Environment Variables

| Variable | Default | Description |
|---|---|---|
| COMMONSTACK_API_KEY | (none) | Commonstack API key (default upstream) |
| UNCOMMON_ROUTE_UPSTREAM | https://api.commonstack.ai/v1 | Upstream API URL |
| UNCOMMON_ROUTE_PORT | 8403 | Proxy port |
| UNCOMMON_ROUTE_DISABLED | false | Disable routing (passthrough) |
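
For reference, reading these variables with the documented defaults could look like the sketch below. `ProxyConfig` and `load_config` are hypothetical names, and the set of strings treated as "true" for UNCOMMON_ROUTE_DISABLED is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ProxyConfig:
    api_key: Optional[str]
    upstream: str
    port: int
    disabled: bool


def load_config(env: Optional[Mapping[str, str]] = None) -> ProxyConfig:
    """Build a config from environment variables, applying the documented defaults."""
    env = os.environ if env is None else env
    disabled_raw = env.get("UNCOMMON_ROUTE_DISABLED", "false")
    return ProxyConfig(
        api_key=env.get("COMMONSTACK_API_KEY"),
        upstream=env.get("UNCOMMON_ROUTE_UPSTREAM", "https://api.commonstack.ai/v1"),
        port=int(env.get("UNCOMMON_ROUTE_PORT", "8403")),
        # Assumption: these spellings count as "true"; anything else is false.
        disabled=disabled_raw.strip().lower() in {"1", "true", "yes"},
    )
```

Accepting a mapping argument makes it possible to test the parsing without touching the real process environment.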

OpenClaw Plugin Config

plugins:
  entries:
    "@anjieyang/uncommon-route":
      port: 8403
      upstream: "https://api.commonstack.ai/v1"
      spendLimits:
        hourly: 5.00
        daily: 20.00

Benchmarks

Evaluated on 2000+ multilingual prompts across 10 languages (EN, ZH, KO, JA, ES, PT, AR, RU, DE, HI):

| Metric | Value |
|---|---|
| Overall Accuracy | 98.4% |
| Average Latency | 356µs |
| Features | 39 (structural + unicode + keyword) |
| Learning | Averaged Perceptron |
| External API Calls | None (pure local) |
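
For readers unfamiliar with the learner named in the table, here is a minimal multi-class averaged perceptron over sparse feature dictionaries. It is a textbook-style sketch, not the ScriptAgnosticClassifier from learned.py; the class and method names are hypothetical, and the averaging is done the simple way (summing the full weight table after every step).

```python
from collections import defaultdict


class AveragedPerceptron:
    def __init__(self, classes):
        self.classes = list(classes)
        self.weights = {c: defaultdict(float) for c in self.classes}
        self._totals = {c: defaultdict(float) for c in self.classes}
        self._steps = 0

    def _argmax(self, weights, features):
        scores = {
            c: sum(weights[c].get(name, 0.0) * value for name, value in features.items())
            for c in self.classes
        }
        # Ties resolve to the class listed first.
        return max(self.classes, key=lambda c: scores[c])

    def update(self, features, label):
        """One training step: adjust weights on a mistake, then accumulate."""
        guess = self._argmax(self.weights, features)
        if guess != label:
            for name, value in features.items():
                self.weights[label][name] += value
                self.weights[guess][name] -= value
        self._steps += 1
        for c in self.classes:
            for name, weight in self.weights[c].items():
                self._totals[c][name] += weight

    def predict(self, features):
        """Predict with weights averaged over all training steps."""
        if self._steps == 0:
            return self._argmax(self.weights, features)
        averaged = {
            c: {name: total / self._steps for name, total in self._totals[c].items()}
            for c in self.classes
        }
        return self._argmax(averaged, features)
```

Averaging the weights over all steps tends to make predictions less sensitive to the order of the last few training examples than using the final weights alone.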

Per-Tier F1 Scores

| Tier | F1 |
|---|---|
| SIMPLE | 0.988 |
| MEDIUM | 0.968 |
| COMPLEX | 0.987 |
| REASONING | 1.000 |
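
Per-tier F1 is the harmonic mean of precision and recall for each tier, treated one-versus-rest. The helper below shows the standard calculation from true and predicted labels; `per_tier_f1` is a hypothetical name, and the benchmark suite may compute the scores differently.

```python
def per_tier_f1(y_true, y_pred, tiers):
    """Return {tier: F1} using F1 = 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for tier in tiers:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == tier and p == tier for t, p in pairs)
        fp = sum(t != tier and p == tier for t, p in pairs)
        fn = sum(t == tier and p != tier for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[tier] = 2 * tp / denom if denom else 0.0
    return scores
```

A tier that never appears in either list gets 0.0 here by convention, since its F1 is otherwise undefined.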

Run the benchmark suite yourself:

cd bench && python run.py

Project Structure

├── uncommon_route/           # Core package
│   ├── router/               # Cascade classifier + model selection
│   │   ├── classifier.py     # Three-level cascade
│   │   ├── learned.py        # Averaged Perceptron (ScriptAgnosticClassifier)
│   │   ├── structural.py     # 12 structural + 15 unicode features
│   │   ├── keywords.py       # 12 keyword features
│   │   ├── selector.py       # Tier → model + fallback chain
│   │   └── model.json        # Trained weights
│   ├── proxy.py              # OpenAI-compatible ASGI proxy
│   ├── session.py            # Session persistence + escalation
│   ├── spend_control.py      # Time-windowed spending limits
│   ├── providers.py          # BYOK provider management
│   ├── openclaw.py           # OpenClaw config integration
│   └── cli.py                # CLI entry point
├── openclaw-plugin/          # JS bridge for OpenClaw
│   ├── src/index.js          # Auto-install + lifecycle management
│   ├── package.json          # @anjieyang/uncommon-route
│   └── openclaw.plugin.json  # Plugin manifest
├── tests/                    # 98 tests (unit + integration + E2E)
├── bench/                    # Benchmarking suite + datasets
├── scripts/install.sh        # One-line installer
└── pyproject.toml            # PyPI-ready packaging

Development

git clone https://github.com/anjieyang/UncommonRoute.git
cd UncommonRoute
pip install -e ".[dev]"
python -m pytest tests/ -v

License

MIT — see LICENSE.


Built by Anjie Yang
