Cut your Claude API costs by 60% with intelligent model routing.
Nimer Optimizer
Cut your Claude API bill by 60% with intelligent model routing.
A drop-in replacement for the Anthropic Python SDK that routes each request to the cheapest Claude model that can handle it — without sacrificing quality.
The problem
Most developers default to Sonnet — or worse, Opus — for every Claude API call. The price difference is dramatic:
| Model | Input ($/1M tokens) | Output ($/1M tokens) | Relative cost |
|---|---|---|---|
| Haiku | $0.25 | $1.25 | 1× |
| Sonnet | $3.00 | $15.00 | 12× |
| Opus | $15.00 | $75.00 | 60× |
Yet most tasks (classification, lookups, simple Q&A, short summaries) work just as well on Haiku. Defaulting to a larger model is how a monthly bill that should be $80 turns into $400.
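To make the price gap concrete, here is a minimal sketch that computes the cost of a single request from the table above. `PRICES` and `request_cost` are illustrative names, not part of the SDK, and the token counts are made-up inputs.

```python
# Prices in USD per 1M tokens, copied from the table above.
PRICES = {
    "haiku": {"input": 0.25, "output": 1.25},
    "sonnet": {"input": 3.00, "output": 15.00},
    "opus": {"input": 15.00, "output": 75.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request (hypothetical helper, not the SDK's API)."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


if __name__ == "__main__":
    # Example workload: 1,000 input tokens and 200 output tokens per request.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 1_000, 200):.6f}")
```

For this example workload the per-request ratios match the table's "Relative cost" column: Sonnet costs 12 times as much as Haiku, and Opus 60 times as much.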
The fix
Nimer Optimizer analyzes each prompt — length, content type, code presence — and routes it to the cheapest model that can handle it. Same quality, fraction of the cost.
Quick start
```bash
pip install nimer
```

```python
from nimer import OptimizedClaude

client = OptimizedClaude(
    anthropic_api_key="sk-ant-...",
    nimer_api_key="nm_...",  # optional: enables dashboard logging
)

response = client.messages.create(
    max_tokens=512,
    messages=[{"role": "user", "content": "Translate 'good morning' to Arabic."}],
)
```
Three lines of config. Automatic routing. Real savings.
Migrating from anthropic
Change one import:
```diff
- from anthropic import Anthropic
+ from nimer import OptimizedClaude as Anthropic
```
Everything else — messages.create, system prompts, streaming, tool use, multi-modal content — works identically.
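One way a drop-in wrapper like this can work is plain delegation: choose a model, then hand every other argument to the underlying client untouched. The sketch below illustrates that pattern only and is not the SDK's actual code. `FakeMessages`, `RoutedMessages`, and `pick_model` are hypothetical names, and the fake client stands in for the real one so the example runs offline.

```python
from typing import Any, Callable


class FakeMessages:
    """Stand-in for the real client's `messages` resource, so the sketch runs offline."""

    def create(self, **kwargs: Any) -> dict:
        # Echo the arguments back instead of calling an API.
        return kwargs


class RoutedMessages:
    """Hypothetical wrapper: picks a model, forwards everything else unchanged."""

    def __init__(self, inner: Any, pick_model: Callable[[list], str]) -> None:
        self._inner = inner
        self._pick_model = pick_model

    def create(self, *, auto_route: bool = True, **kwargs: Any) -> Any:
        if auto_route:
            # Replace (or fill in) only the model; all other arguments pass through.
            kwargs["model"] = self._pick_model(kwargs.get("messages", []))
        return self._inner.create(**kwargs)

    def __getattr__(self, name: str) -> Any:
        # Anything not overridden here is delegated to the wrapped resource as-is.
        return getattr(self._inner, name)


messages = RoutedMessages(FakeMessages(), pick_model=lambda msgs: "cheap-model")
routed = messages.create(max_tokens=512, messages=[{"role": "user", "content": "hi"}])
manual = messages.create(model="big-model", auto_route=False, max_tokens=512, messages=[])
print(routed["model"], manual["model"])
```

Because `auto_route` is consumed by the wrapper and never forwarded, the underlying client only ever sees arguments it already understands.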
How routing works
Deterministic, explainable rules. No AI deciding which AI to use.
| Condition | Routed to |
|---|---|
| Total input > 5,000 chars | Sonnet |
| Code request + last message > 500 chars | Opus |
| Last message < 200 chars | Haiku |
| Everything else | Sonnet |
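The table can be read as a short chain of checks. The sketch below is one possible reading, not the SDK's implementation. It assumes the rules are evaluated top to bottom with the first match winning, that lengths are plain character counts, and that a "code request" is detected with a simple keyword heuristic. `Thresholds`, `looks_like_code`, `CODE_HINTS`, and `route` are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    long_input_chars: int = 5_000   # total input above this -> Sonnet
    long_code_chars: int = 500      # code request with last message above this -> Opus
    short_message_chars: int = 200  # last message below this -> Haiku


# Hypothetical heuristic; the SDK's real code detection is not documented here.
CODE_HINTS = ("```", "def ", "class ", "function ", "import ")


def looks_like_code(text: str) -> bool:
    return any(hint in text for hint in CODE_HINTS)


def route(messages: list[dict], t: Thresholds = Thresholds()) -> str:
    """Return a model tier, assuming the table's rules are checked top to bottom."""
    # Only plain-string content is measured; other content types are skipped here.
    texts = [m["content"] for m in messages if isinstance(m["content"], str)]
    last = texts[-1] if texts else ""
    if sum(len(x) for x in texts) > t.long_input_chars:
        return "sonnet"
    if looks_like_code(last) and len(last) > t.long_code_chars:
        return "opus"
    if len(last) < t.short_message_chars:
        return "haiku"
    return "sonnet"


print(route([{"role": "user", "content": "Translate 'good morning' to Arabic."}]))
```

Passing a different `Thresholds` instance, for example `Thresholds(short_message_chars=400)`, changes the cut-offs without touching the rule order.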
You can:
- Override per call: `client.messages.create(model="claude-opus-4-7", auto_route=False, ...)`
- Tune thresholds: pass a custom `Router` instance with different limits
- Inspect decisions: every routing call is logged to your dashboard with the chosen model and estimated savings
What we log (and what we don't)
If you set nimer_api_key, the SDK sends:
- Token counts (input + output)
- Which model was selected
- Estimated savings (USD)
- Timestamp
The SDK never sends:
- Your prompts
- Your responses
- Your Anthropic API key
This is enforced in code, not policy. Read nimer/logger.py — it's 80 lines.
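To illustrate what "enforced in code" can look like, here is a minimal sketch of a metadata payload restricted to the four kinds of fields listed above. It is an assumption about the shape of such a payload, not a copy of nimer/logger.py. `RoutingEvent`, `build_event`, and the field names are hypothetical, and the numbers are illustrative.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RoutingEvent:
    """Hypothetical payload: token counts, model, savings, timestamp, nothing else."""

    input_tokens: int
    output_tokens: int
    model: str
    estimated_savings_usd: float
    timestamp: str


def build_event(
    input_tokens: int, output_tokens: int, model: str, estimated_savings_usd: float
) -> RoutingEvent:
    # The timestamp is generated here, in UTC, in ISO 8601 format.
    return RoutingEvent(
        input_tokens=input_tokens,
        output_tokens=output_tokens,
        model=model,
        estimated_savings_usd=estimated_savings_usd,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )


event = build_event(1_000, 200, "haiku", 0.0055)
payload = json.dumps(asdict(event))
print(payload)
```

Since the dataclass has no field for prompts, responses, or API keys, there is no place for them to be serialized by accident.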
Pricing
| Plan | Price | Includes |
|---|---|---|
| Free | $0/mo | 1,000 requests/month, basic routing, 7-day analytics |
| Pro | $29/mo | 100K requests, advanced routing, 90-day analytics, budget alerts |
| Scale | $99/mo | Unlimited requests, custom routing rules, team features |
The SDK itself is open source and free forever. The paid tiers add the dashboard, analytics, and managed routing rules.
Status
🚧 Alpha — public launch in 6 weeks.
- Core SDK with rule-based routing
- Cost & savings calculations
- Async metadata logging
- Multi-modal content support
- Backend API
- Dashboard UI
- Closed beta (week 5)
- Public launch (week 6)
Follow @bynimer for weekly progress updates, or join the waitlist at nimer.dev.
Development
```bash
git clone https://github.com/nimer-dev/optimizer-sdk
cd optimizer-sdk
pip install -e ".[dev]"
pytest
```
The router is the most important piece — tests/test_router.py covers the routing rules.
FAQ
Is this actually a drop-in replacement?
Yes. The client.messages.create(...) signature matches the Anthropic SDK. The only addition is the auto_route parameter, which defaults to True. Set it to False and pass model= to keep the original behavior.
What if your routing picks the wrong model?
Override on a per-call basis with auto_route=False, model="...". The default is optimized for cost; you keep full control when you need it.
How is this different from Helicone or Langfuse?
Those are full observability platforms: they log everything and offer rich tracing. Nimer focuses on one thing: cost-aware routing for Claude. If you need full LLM observability, use those. If you want to cut your bill in half with three lines of code, use this.
Will this work with streaming, tool use, or vision?
Yes. The SDK forwards everything to the underlying Anthropic client unchanged.
Why Claude only?
Because narrow beats broad in v1. Once we're great at Claude, we'll consider OpenAI and Gemini.
Where are you based?
Saudi Arabia. Building globally.
Founder
Built by Majdi — a developer who turns problems into products. Started by watching his Claude API bill climb past $400/month and building a router to fix it. Named after his son Nimer, because the best things you build are for the people you love.
This is the first product. Not the last.
⭐ Star this repo if you've ever overspent on Claude API.
License
MIT — see LICENSE.