
Cut your Claude API costs by 60% with intelligent model routing.


Nimer Optimizer

Cut your Claude API bill by 60% with intelligent model routing.

A drop-in replacement for the Anthropic Python SDK that routes each request to the cheapest Claude model that can handle it — without sacrificing quality.


The problem

Most developers default to Sonnet — or worse, Opus — for every Claude API call. The price difference is dramatic:

Model    Input ($/1M tokens)   Output ($/1M tokens)   Relative cost
Haiku    $0.25                 $1.25                  1×
Sonnet   $3.00                 $15.00                 12×
Opus     $15.00                $75.00                 60×

Most tasks (classification, lookups, simple Q&A, short summaries) work just as well on Haiku. Defaulting to a larger model turns a monthly bill that should be $80 into $400.
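To make the price gap concrete, here is a minimal sketch of the arithmetic using the per-million-token prices from the table above. The helper name `request_cost` and the example token counts are illustrative assumptions, not part of the Nimer SDK.

```python
# Prices in USD per 1M tokens, copied from the table above.
PRICES = {
    "haiku": {"input": 0.25, "output": 1.25},
    "sonnet": {"input": 3.00, "output": 15.00},
    "opus": {"input": 15.00, "output": 75.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request (hypothetical helper)."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative request size: 1,000 input tokens and 500 output tokens.
    baseline = request_cost("haiku", 1_000, 500)
    for model in PRICES:
        cost = request_cost(model, 1_000, 500)
        print(f"{model:<7} ${cost:.6f}  {cost / baseline:.0f}x")
```

Because input and output prices scale by the same factor between tiers, the ratio to Haiku stays at 12x for Sonnet and 60x for Opus regardless of the token mix.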

The fix

Nimer Optimizer analyzes each prompt — length, content type, code presence — and routes it to the cheapest model that can handle it. Same quality, fraction of the cost.

Quick start

pip install nimer

from nimer import OptimizedClaude

client = OptimizedClaude(
    anthropic_api_key="sk-ant-...",
    nimer_api_key="nm_...",  # optional — enables dashboard logging
)

response = client.messages.create(
    max_tokens=512,
    messages=[{"role": "user", "content": "Translate 'good morning' to Arabic."}],
)

Three lines of config. Automatic routing. Real savings.

Migrating from anthropic

Change one import:

- from anthropic import Anthropic
+ from nimer import OptimizedClaude as Anthropic

Everything else — messages.create, system prompts, streaming, tool use, multi-modal content — works identically.

How routing works

Deterministic, explainable rules. No AI deciding which AI to use.

Condition                                  Routed to
Total input > 5,000 chars                  Sonnet
Code request + last message > 500 chars    Opus
Last message < 200 chars                   Haiku
Everything else                            Sonnet

You can:

  • Override per call: client.messages.create(model="claude-opus-4-7", auto_route=False, ...)
  • Tune thresholds: pass a custom Router instance with different limits
  • Inspect decisions: every routing call is logged to your dashboard with the chosen model and estimated savings
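The routing table can be sketched as a small pure function. This is an illustration under stated assumptions, not the SDK's actual router: it assumes the rules are evaluated top to bottom with the first match winning, that message content is plain text, and that `looks_like_code` is a crude stand-in for however the SDK detects a code request. The names `Thresholds`, `route`, and the tier labels are invented for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Character limits from the routing table (hypothetical names)."""
    long_input_chars: int = 5_000
    long_code_chars: int = 500
    short_message_chars: int = 200


# Crude, hypothetical heuristic; the SDK's real code detection may differ.
_CODE_HINT = re.compile(r"```|\b(def|class|function|import)\b")


def looks_like_code(text: str) -> bool:
    return bool(_CODE_HINT.search(text))


def route(messages: list[dict], limits: Thresholds = Thresholds()) -> str:
    """Pick a model tier, assuming first-match-wins in table order."""
    # Assumes every message's content is a plain string.
    total_chars = sum(len(m["content"]) for m in messages)
    last = messages[-1]["content"]

    if total_chars > limits.long_input_chars:
        return "sonnet"
    if looks_like_code(last) and len(last) > limits.long_code_chars:
        return "opus"
    if len(last) < limits.short_message_chars:
        return "haiku"
    return "sonnet"


if __name__ == "__main__":
    prompt = [{"role": "user", "content": "Translate 'good morning' to Arabic."}]
    print(route(prompt))  # short, non-code message: falls through to the Haiku rule
```

Keeping the limits in one frozen dataclass mirrors the idea of tuning thresholds: passing `Thresholds(short_message_chars=400)` would send more mid-length prompts to the cheapest tier without touching the rule logic.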

What we log (and what we don't)

If you set nimer_api_key, the SDK sends:

  • Token counts (input + output)
  • Which model was selected
  • Estimated savings (USD)
  • Timestamp

The SDK never sends:

  • Your prompts
  • Your responses
  • Your Anthropic API key

This is enforced in code, not policy. Read nimer/logger.py — it's 80 lines.
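As a rough illustration of metadata-only logging, the sketch below builds a payload that contains just the four kinds of fields listed above. The `UsageRecord` class, its field names, and `build_record` are assumptions made for this example; the real payload shape is whatever nimer/logger.py defines.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class UsageRecord:
    """Metadata-only log entry (hypothetical field names)."""
    input_tokens: int
    output_tokens: int
    model: str
    estimated_savings_usd: float
    timestamp: str


def build_record(input_tokens: int, output_tokens: int, model: str,
                 estimated_savings_usd: float) -> UsageRecord:
    # The function never receives prompt text, response text, or API keys,
    # so none of them can end up in the payload.
    return UsageRecord(
        input_tokens=input_tokens,
        output_tokens=output_tokens,
        model=model,
        estimated_savings_usd=estimated_savings_usd,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )


# Illustrative values only.
record = build_record(12, 40, "haiku", 0.0004)
payload = json.dumps(asdict(record))
```

Restricting the logger's inputs to counts and labels is one way to make "never sends your prompts" a property of the function signature rather than a promise.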

Pricing

Plan    Price     Includes
Free    $0/mo     1,000 requests/month, basic routing, 7-day analytics
Pro     $29/mo    100K requests, advanced routing, 90-day analytics, budget alerts
Scale   $99/mo    Unlimited requests, custom routing rules, team features

The SDK itself is open source and free forever. The paid tiers add the dashboard, analytics, and managed routing rules.

Status

🚧 Alpha — public launch in 6 weeks.

  • Core SDK with rule-based routing
  • Cost & savings calculations
  • Async metadata logging
  • Multi-modal content support
  • Backend API
  • Dashboard UI
  • Closed beta (week 5)
  • Public launch (week 6)

Follow @bynimer for weekly progress updates, or join the waitlist at nimer.dev.

Development

git clone https://github.com/nimer-dev/optimizer-sdk
cd optimizer-sdk
pip install -e ".[dev]"
pytest

The router is the most important piece — tests/test_router.py covers the routing rules.

FAQ

Is this actually a drop-in replacement? Yes. The client.messages.create(...) signature matches the Anthropic SDK. The only addition is the auto_route parameter, which defaults to True. Set it to False and pass model= to keep the original behavior.

What if your routing picks the wrong model? Override on a per-call basis with auto_route=False, model="...". The default is optimized for cost; you keep full control when you need it.

How is this different from Helicone or Langfuse? Those are full observability platforms — they log everything and offer rich tracing. Nimer focuses on one thing: cost-aware routing for Claude. If you need full LLM observability, use those. If you want to cut your bill in half with three lines of code, use this.

Will this work with streaming, tool use, or vision? Yes. The SDK forwards everything to the underlying Anthropic client unchanged.

Why Claude only? Because narrow beats broad in v1. Once we're great at Claude, we'll consider OpenAI and Gemini.

Where are you based? Saudi Arabia. Building globally.

Founder

Built by Majdi — a developer who turns problems into products. Started by watching his Claude API bill climb past $400/month and building a router to fix it. Named after his son Nimer, because the best things you build are for the people you love.

This is the first product. Not the last.

⭐ Star this repo if you've ever overspent on the Claude API.

License

MIT — see LICENSE.


