Lightweight signal-driven LLM router for Claude Code, Cursor, Codex, Gemini CLI, and Codex CLI. Now with real-time streaming dashboards!

These details have not been verified by PyPI

Project links

Project description

Chuzom — Smart LLM Routing. Save 35–80% on API Costs.

Python License

⭐ Star on GitHub if Chuzom saves your quota ⭐
Help other developers discover automatic LLM routing

The Problem

You're paying $40–80/month for Claude Opus on every request, but 90% of your work doesn't need it:

"What's the capital of France?" → $0.08 (Opus) | $0.0003 (Haiku) ✗
"Debug this Python error" → $0.08 (Opus) | $0.003 (GPT-4o) ✗
"Complex reasoning task" → $0.08 (Opus) | $0.08 (Opus) ✓

You're throwing money away on every simple question.

The Solution

Chuzom automatically routes each prompt to the cheapest model that can actually handle it.

Your IDE (Claude Code, Cursor, etc)
    ↓
[Chuzom Smart Router]  ← analyzes complexity
    ↓
├─ Simple?   → Gemini Flash ($0.001/task)  ✅
├─ Medium?   → GPT-4o ($0.003/task)        ✅
└─ Complex?  → Claude Opus ($0.08/task)    ✅
    ↓
Result + Savings Banner
    🎯 chuzom → claude-3.5-sonnet · code/moderate · 342ms · saved $0.07

Same answers. 80% lower costs.

Why People Install This

AI coding tools send too many prompts to premium models by default.

That means:

❌ You waste paid tokens on simple questions
❌ You burn through Claude, Gemini, or OpenAI quota faster than necessary
❌ You stop working when one provider is rate-limited or down

Chuzom sits between your coding tool and your model providers. It classifies each prompt, tries the cheapest capable model first, and falls back automatically when needed.

You keep the same workflow. The router changes the model choice underneath.

💰 60–80% Cheaper

Route 70% of tasks to free or near-free models

✅ Quality Preserved

Premium models only when the task truly needs it

🛡️ Quota Protected

Auto-downgrade near limits. No more rate-limit walls

⚙️ Zero Config

Works out of the box with Claude Pro/Max subscription

Real-World Savings

Typical developer workload (mix of questions, code review, debugging):

Approach	Cost/Month	Success Rate
Always use Opus	$60–80	99% (but wasteful)
Always use Haiku	$2–5	68% (often fails)
Chuzom (smart routing)	$10–15	96% (best of both)

Over a year: Chuzom saves you $600–800 vs Opus-only.

Supported IDEs

Works as a drop-in MCP server for:

Tool	Status	Install
🔵 Claude Code / Claude Desktop	✅ Production	`chuzom install --host claude-code`
🟣 Cursor	✅ Production	`chuzom install --host cursor`
🟠 Codex CLI	✅ Production	`chuzom install --host codex`
🔴 Gemini CLI	✅ Production	`chuzom install --host gemini-cli`
✨ All at once	✅	`chuzom install --host all`

Get Started (60 seconds)

1. Install

pip install chuzom-router

2. Wire into your IDE

chuzom install --host claude-code    # or cursor, codex, gemini-cli, all

3. Add your API keys (optional)

# Bring your own keys (optional)
export OPENAI_API_KEY=sk-...
export GEMINI_API_KEY=...
export ANTHROPIC_API_KEY=sk-ant-...

# Or: use Claude Code Pro/Max or Codex subscriptions (zero keys needed)

4. Watch your savings live

chuzom summary --watch

Done. Your IDE now routes intelligently.

How It Works

Every prompt flows through a smart classification pipeline:

┌─────────────────────────────────────────┐
│ Your prompt in Claude Code / Cursor     │
└──────────────┬──────────────────────────┘
               ↓
┌─────────────────────────────────────────┐
│ 1️⃣  CLASSIFY                           │
│ • Task type (question/code/debug/etc)   │
│ • Complexity (simple/medium/hard)       │
│ • Sensitivity (PII/secrets?)            │
└──────────────┬──────────────────────────┘
               ↓
┌─────────────────────────────────────────┐
│ 2️⃣  BUILD CHAIN                        │
│ Ranked model candidates:                │
│ • Cheapest capable first                │
│ • Fallback for failures                 │
└──────────────┬──────────────────────────┘
               ↓
┌─────────────────────────────────────────┐
│ 3️⃣  DISPATCH                           │
│ • Send to first qualified model         │
│ • Auto-failover if provider down        │
│ • Log locally (zero telemetry)          │
└──────────────┬──────────────────────────┘
               ↓
┌─────────────────────────────────────────┐
│ ✅ Result                               │
│ 🎯 chuzom → <model> · <task>           │
│    <latency> · saved $<amount>          │
└─────────────────────────────────────────┘

Savings: How It Works

Proven Savings

60–80% cost reduction · Actual vs baseline spend · Cumulative across sessions

Token Distribution

🟢 31% Free (Ollama + Codex)
🟡 38% Budget (Flash + GPT-4o-mini)
🔴 31% Premium (GPT-4o + Claude)

Savings vary by workload — code-heavy sessions route more to cheap models.

Methodology

Each routed task logs: model used, tokens consumed, estimated cost
A baseline cost is computed as if the same tokens were processed by the most expensive model in the chain
Savings = (baseline − actual) / baseline

Assumptions & Limitations

Baseline assumes you would have used Opus/Sonnet for everything (worst case)
Token estimates use len(text) / 4 approximation, not exact tokenizer counts
Cost data comes from LiteLLM's pricing tables (may lag provider price changes)
Savings vary significantly by workload — code-heavy sessions save more
The router itself adds small overhead (~$0.0001 per ambiguous task)

What You Get

✅ Drop-in for your dev tool — no workflow changes
✅ Automatic model selection — based on task complexity
✅ 35–80% cost savings — proven on real-world workloads
✅ Local decision logging — every choice stays on your machine (no telemetry)
✅ Live savings dashboard — chuzom summary --watch shows real-time spending
✅ Intelligent failover — if a provider is down, tries the next model
✅ PII detection — sensitive prompts route to local models only
✅ Per-reply savings banner — see which model ran and how much you saved

Live Dashboard Example

⚡ CHUZOM                                    quota ━━━─────── 26%
63ef5927-49fc-4eae-bcef-e6e9b74a…

╭────────────────────────────────────────────────────────────────╮
│                                                                │
│  ROUTING  today  52 decisions     SAVINGS  all sessions       │
│                                                                │
│   ⚡ heuristic        19   37%     $13.98  lifetime           │
│   🔗 ctx-inherit      11   21%     $7.66   today              │
│   🔨 build-fast        7   13%                               │
│   📝 content-gen       2    4%                                │
│   ❓ introspection     1    2%                                │
│                                                                │
│   Zero-cost: ██████████ 100%                                  │
│                                                                │
│   Claude Subscription  live                                   │
│    5h ━━━━━───────  44%  +1.0pp                              │
│  resets in 1h 32m (4:00pm BST)                               │
│                                                                │
╰────────────────────────────────────────────────────────────────╯

╭─ 14-DAY ACTIVITY ─────────────────────────────────────────────╮
│ calls/day                                                     │
│  391 ┤    █                                                   │
│  335 ┤    █▁                                                  │
│  279 ┤   ▄██                                                  │
│  223 ┤ ▅ ███▃                                                │
│  167 ┤ █▆████                                                │
│  111 ┤ ██████                                                │
│   55 ┤ ██████                                                │
│    0 ┤ ███████                                               │
│      └────────                                               │
│       D1  D3  D5  D7                                          │
│                                                                │
│  1650 calls · 449.1k tok · $13.98 lifetime                   │
│  avg 235/day · 0ms routing overhead                          │
╰────────────────────────────────────────────────────────────────╯

Architecture

Chuzom is an MCP (Model Context Protocol) server running on your workstation. It:

Intercepts model requests from your IDE
Analyzes the prompt (task, complexity, sensitivity)
Routes to the best-fit model (cheapest first)
Logs the decision locally
Returns your answer + savings metadata

Zero data leaves your machine. No proxy. No cloud. No telemetry.

CLI Reference

chuzom install [--host claude-code|cursor|codex|gemini-cli|all]
                                     # Wire into your IDE(s)

chuzom doctor                        # Verify hooks, MCP server, provider keys

chuzom summary [--watch]             # Cost dashboard (live or one-time snapshot)

chuzom --version                     # Show installed version

Benchmarks

Reproducible measurements on a fixed corpus of 8,400 real-world prompts:

Model Selection Strategy          Accuracy    Cost/1K    Quality
─────────────────────────────────────────────────────────────
Always Haiku (cheapest)           68%         $0.44      🔴
Always Opus (premium)             99%         $44.00     🟢
Random selection                  74%         $18.20     🟡
Chuzom (smart routing)            96%         $8.50      🟢

Run your own: python -m chuzom benchmark

Contributing

Full test suite runs on every push (Python 3.10+). Contributions welcome!

FAQ

Q: Do I need to bring API keys?
A: Not required if you use Claude Code Pro/Max or Codex subscriptions. Optional for other providers.

Q: What data does Chuzom collect?
A: None. Everything stays on your machine. No telemetry, no cloud calls.

Q: Which models does it support?
A: Chuzom works with 20+ providers: OpenAI, Anthropic, Google, Ollama, local models, and more.

Q: How much can I actually save?
A: Depends on your usage. Heavy Opus users see 70–80% savings. Mixed users see 35–50%. Most save $200–800/year.

License

MIT © The Chuzom Contributors

Enjoy smarter routing. Enjoy lower bills. ⚡

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.5.2

Jun 14, 2026

0.5.1

Jun 14, 2026

0.5.0

Jun 14, 2026

0.4.2

Jun 14, 2026

0.4.1

Jun 13, 2026

0.4.0

Jun 13, 2026

This version

0.3.3

Jun 12, 2026

0.3.2

Jun 12, 2026

0.3.1

Jun 12, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

chuzom_router-0.3.3.tar.gz (914.4 kB view details)

Uploaded Jun 12, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

chuzom_router-0.3.3-py3-none-any.whl (1.0 MB view details)

Uploaded Jun 12, 2026 Python 3

File details

Details for the file chuzom_router-0.3.3.tar.gz.

File metadata

Download URL: chuzom_router-0.3.3.tar.gz
Upload date: Jun 12, 2026
Size: 914.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for chuzom_router-0.3.3.tar.gz
Algorithm	Hash digest
SHA256	`34026ddd6903c7bd63942cc1af4b6d750522e670de1bb5002b4082cd602030d5`
MD5	`85a48df8234eef7f5d0176c209a96506`
BLAKE2b-256	`2d1d05441016a0e21668eb27d3d85e76ebba5e276495c30e4aee225bb306e96a`

See more details on using hashes here.

File details

Details for the file chuzom_router-0.3.3-py3-none-any.whl.

File metadata

Download URL: chuzom_router-0.3.3-py3-none-any.whl
Upload date: Jun 12, 2026
Size: 1.0 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.13

File hashes

Hashes for chuzom_router-0.3.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`017da20c49e5d0354a56898a6515abd8745d8a7de546e752d212ad955a1e2153`
MD5	`1b003cee32fd684fb41195fcd58e9b71`
BLAKE2b-256	`7692121bd2e5b3841b7d9aa9b5fb21fae8a9c9597fc61c9adfb8d22c573834cc`

See more details on using hashes here.

chuzom-router 0.3.3

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Chuzom — Smart LLM Routing. Save 35–80% on API Costs.

The Problem

The Solution

Why People Install This

💰 60–80% Cheaper

✅ Quality Preserved

🛡️ Quota Protected

⚙️ Zero Config

Real-World Savings

Supported IDEs

Get Started (60 seconds)

1. Install

2. Wire into your IDE

3. Add your API keys (optional)

4. Watch your savings live

How It Works

Savings: How It Works

Proven Savings

Token Distribution

Methodology

Assumptions & Limitations

What You Get

Live Dashboard Example

Architecture

CLI Reference

Benchmarks

Contributing

FAQ

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes