AI gateway with capability-based routing across providers + bring-your-own-key passthrough. OpenAI- and Anthropic-compatible. Drop-in for Claude Code, Cursor, Aider, Cline, Continue.

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

modelmeldAI

These details have not been verified by PyPI

Project description

ModelMeld

Per-request capability routing for Claude Code, Cursor, Aider, Cline, Continue. Speaks OpenAI and Anthropic natively. Bring your own key — never stored.

Quickstart

Run the gateway yourself and route real OSS models in a couple of minutes. You bring a provider key (OpenRouter is the easiest — one key, many open-weight models, pay-as-you-go); ModelMeld picks the cheapest model that clears the quality bar per request.

pip install 'modelmeld[anthropic,openai]'
modelmeld setup --self-host

The wizard prompts for whichever provider keys you have (OpenRouter / Fireworks / Together for cloud-OSS routing, an optional Anthropic / OpenAI key for frontier escalation, or a local vLLM endpoint), enables capability routing, pre-configures Claude Code's /model picker, and — before declaring success — boots the gateway and proves a real OSS model served a request (it fails loudly rather than leaving you on a silent no-op).

Then, in two shells:

# shell 1 — run the gateway
source ~/.modelmeld/modelmeld-gateway.env
uvicorn modelmeld.api.server:app --host 0.0.0.0 --port 8080

# shell 2 — point Claude Code at it
source ~/.modelmeld/setup-claude-code.sh
claude

In the Claude Code TUI, type /model → pick ModelMeld — Auto (or Saver / Quality). No provider key yet? modelmeld setup --self-host --demo points you at the cheapest one-key on-ramp. Diagnose anytime with modelmeld doctor.

Hosted (managed) gateway — point your tool at our URL and skip running anything yourself. Currently invite-only beta: request access. Once you have a key, modelmeld setup --tool claude-code configures the hosted path.

Anthropic prompt caching, end-to-end

Anthropic prompt caching survives end-to-end through ModelMeld:

cache_control round-trip — first call writes 4933 tokens to cache; second call reads them back at 90% discount

Same payload sent twice, seconds apart. The second call hits Anthropic's prompt cache for 4933 tokens at 10% of normal rate — a 90% input-token discount, preserved through the gateway. Many other gateways strip cache_control markers; ModelMeld passes them through verbatim.

Why ModelMeld

You're paying frontier-model prices on every request — including the ones where a coding-tuned 7B model would produce identical output. Most gateways force a global choice ("use Anthropic" / "use OpenAI" / "use local"). ModelMeld picks per request, driven by a benchmark-weighted capability scout that knows which model class is sufficient for which task category.

How it compares

Capability	ModelMeld	Typical gateway
Anthropic `cache_control` preserved end-to-end	✅	⚠️ Many strip it
Speaks `/v1/chat/completions` AND `/v1/messages` natively	✅	Usually OpenAI shape only
Audit headers expose the routing decision to the caller	✅ (`x-modelmeld-routed-to`, `x-modelmeld-routed-model`, etc.)	Usually opaque
BYOK — keys never stored at rest	✅	Varies
Honest non-coverage list (what doesn't work yet)	✅	Rare

The differentiator isn't that we route. It's that we don't break the features upstream providers built into their APIs, and we tell you exactly what the gateway did with your request.

Three policies, three behaviors

Pick the policy that matches your work mode. They show up in any tool's /model picker as three options:

anthropic/modelmeld-saver — OSS-only. Never escalates to frontier. Predictable cost ceiling — you pay OSS-tier rates regardless of request complexity.
anthropic/modelmeld-auto — OSS by default; escalates to frontier (Sonnet/Opus) when the prompt contains 2+ reasoning markers ("think step by step", "explain your reasoning", etc.).
anthropic/modelmeld-quality — Frontier-first. Downgrades to OSS only on detected trivial work (autocomplete-shape, background calls).

Frontier-tier routing uses BYOK. Your Anthropic / OpenAI key is sent as a per-request header (x-modelmeld-byok-anthropic: sk-ant-…), used to make the upstream call, then forgotten. Never stored at rest, never logged.

Works with

Drop-in for any tool that speaks OpenAI Chat Completions, OpenAI Responses, or Anthropic Messages.

Validated end-to-end: Claude Code · Codex CLI · opencode · Aider · AutoGen · CrewAI · LangGraph · OpenClaw · OpenAI SDK · anthropic-sdk-python · @anthropic-ai/sdk

Should work, not yet live-tested: Cursor · Cline · Continue

Frameworks can declare task category + agent role explicitly via x-modelmeld-task-category / x-modelmeld-agent-role headers — bypasses the classifier when your harness already knows what kind of work the request represents. See routing hints.

What doesn't work yet

Honest non-coverage list for the v1 OSS API surface:

Anthropic image content blocks (vision input) — deferred. Claude Code doesn't use vision; documented as a known gap rather than silently failing.

What's in the package

Three API surfaces, one routing pipeline. OpenAI-compatible at /v1/chat/completions (drop-in for any OpenAI-wire-format client), the OpenAI Responses API at /v1/responses (drop-in for Codex CLI), and Anthropic-compatible at /v1/messages (drop-in for Claude Code, anthropic-sdk-python, @anthropic-ai/sdk). All three stream via SSE, share the same router / memory / cache pipeline, and emit identical x-modelmeld-* audit headers.
Provider adapters — OpenAI, Anthropic (with full schema translation in both directions), vLLM, TensorRT-LLM. Each adapter retries transient errors (429 / 5xx / network blip) with exponential backoff.
Capability-based routing — CapabilityScout picks the cheapest model meeting a quality threshold for the prompt's task category, driven by the ModelRegistry.
Completion cache — exact-match (in-memory or Redis) + semantic (Qdrant); cache key pools across users routed to the same served model.
PII scrubbing — runs on every egress path before cloud upload.
Production-tuned defaults — full dev-tool detection catalog and a current default_registry.json snapshot ship as the defaults. All tunable via constructor args.

Self-host (manual config)

The Quickstart wizard is the recommended path — it writes the config below for you and verifies real routing. To wire it by hand:

The gateway reads MODELMELD_-prefixed env vars. Two settings are required for real routing — without them the gateway falls back to a no-op stub adapter that returns canned replies to every request:

pip install 'modelmeld[anthropic,openai]'

# 1. Capability routing (the scout picks a model per request).
export MODELMELD_ROUTING_POLICY=capability
# 2. At least one provider key (note the MODELMELD_ prefix):
export MODELMELD_OPENROUTER_API_KEY=sk-or-…   # cloud-OSS routing
# Optional frontier escalation for -auto / -quality:
export MODELMELD_ANTHROPIC_API_KEY=sk-ant-…
export MODELMELD_OPENAI_API_KEY=sk-…

uvicorn modelmeld.api.server:app --host 0.0.0.0 --port 8080

Then point your tool at http://localhost:8080. In self-host you supply upstream keys to the gateway via these env vars (not per-request BYOK headers — those are for the hosted multi-tenant path).

For routing across local vLLM + cloud providers, see docs/backends.md. For full self-host operational notes (TLS, scaling, observability), see docs/self-host.md.

Licensing — TL;DR

Code: AGPL-3.0-or-later. Use, modify, redistribute. Calling the gateway over HTTP from unmodified clients (Cursor, Aider, Claude Code, etc.) does NOT make those clients AGPL. Running a modified gateway as a service for third parties does require your modifications to also be AGPL.
Bundled snapshot data (scout/data/default_registry.json): CC-BY-4.0 with attribution. Use the snapshot scores anywhere.
Live curated registry feed (feed.modelmeld.ai): subscription product. Continuously updated, editorially weighted across multiple benchmark sources.

If you pip install modelmeld and never subscribe, everything works — you just route on a snapshot of benchmark data taken at OSS release date. Over ~6 months that snapshot stales relative to the current best-cost frontier; the feed is what keeps routing decisions sharp.

Full rationale, boundary contract, and commercial-licensing options: docs/license-rationale.md, docs/open-core-boundary.md, docs/registry-feed.md. Or email hello@modelmeld.ai.

Status

Production-ready for the routes documented here. Pre-1.0 on SemVer guarantees for the public Python API — see docs/api-stability.md for which symbols carry compatibility commitments and which are subject to change.

The HTTP surfaces (/v1/chat/completions, /v1/messages, the x-modelmeld-* audit headers) are stable in spirit; we won't break existing integrations without a major-version bump and a deprecation window.

Contributing

PRs welcome. See CONTRIBUTING.md for the dev workflow, code style (ruff format + ruff check + pyright), and DCO commit-signoff requirement. Issues labeled good first issue are intentionally scoped for first-time contributors.

We do not accept PRs that modify the bundled snapshot data files (scout/data/) — those are curated centrally for the live feed. File issues against bad routing decisions you observe and we'll evaluate adjustments for the next feed release.

Community

GitHub Issues — bugs + feature requests (after reading CONTRIBUTING.md)
GitHub Discussions — questions, ideas, integration help
Security — see SECURITY.md; report to security@modelmeld.ai (90-day disclosure window)

Enterprise tier

For production deployments needing API-key auth, RBAC, OIDC SSO, Postgres-backed SOC2-grade audit logs, encryption-at-rest, per-tenant rate limiting, FinOps dashboards, multi-tenant Qdrant cache, or the managed hosted tier — email hello@modelmeld.ai.

License

Code: AGPL-3.0-or-later (see LICENSE, NOTICE, docs/license-rationale.md)
Data files: CC-BY-4.0 (see scout/data/LICENSE.md)
Live feed: subscription terms (see NOTICE and docs/registry-feed.md)

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

modelmeldAI

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.18.0

Jun 23, 2026

0.17.0

Jun 13, 2026

0.16.1

Jun 12, 2026

0.16.0

Jun 12, 2026

0.15.0

Jun 10, 2026

0.14.0

Jun 10, 2026

0.13.0

Jun 9, 2026

0.12.0

Jun 8, 2026

0.11.0

Jun 7, 2026

0.10.3

Jun 7, 2026

0.10.2

Jun 7, 2026

0.10.1

Jun 7, 2026

0.10.0

Jun 7, 2026

0.9.0

Jun 6, 2026

0.8.2

Jun 6, 2026

0.8.1

Jun 6, 2026

0.8.0

Jun 6, 2026

0.7.5

Jun 6, 2026

0.7.4

Jun 4, 2026

0.7.3

Jun 3, 2026

0.7.2

Jun 2, 2026

0.7.1

Jun 2, 2026

0.7.0

Jun 1, 2026

0.6.3

Jun 1, 2026

0.6.1

May 31, 2026

0.6.0

May 31, 2026

0.5.0

May 31, 2026

0.4.0

May 31, 2026

0.3.1

May 31, 2026

0.3.0

May 31, 2026

0.2.0

May 31, 2026

0.1.3

May 26, 2026

0.1.2

May 26, 2026

0.1.1

May 26, 2026

0.1.0

May 26, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

modelmeld-0.18.0.tar.gz (219.7 kB view details)

Uploaded Jun 23, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

modelmeld-0.18.0-py3-none-any.whl (272.5 kB view details)

Uploaded Jun 23, 2026 Python 3

File details

Details for the file modelmeld-0.18.0.tar.gz.

File metadata

Download URL: modelmeld-0.18.0.tar.gz
Upload date: Jun 23, 2026
Size: 219.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for modelmeld-0.18.0.tar.gz
Algorithm	Hash digest
SHA256	`1ed66a6f97eadbb81878d38d08fe54d573614b059ea8f4d5d8025dd3d15b3338`
MD5	`1d1fda7aa4528e680500489a1b29e73f`
BLAKE2b-256	`9df1f683dcde5b146a9744660d1601617ef4d2595b4bddbf86030aa01c930c87`

See more details on using hashes here.

Provenance

The following attestation bundles were made for modelmeld-0.18.0.tar.gz:

Publisher: release.yml on modelmeld/modelmeld

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: modelmeld-0.18.0.tar.gz
- Subject digest: 1ed66a6f97eadbb81878d38d08fe54d573614b059ea8f4d5d8025dd3d15b3338
- Sigstore transparency entry: 1921437257
- Sigstore integration time: Jun 23, 2026
Source repository:
- Permalink: modelmeld/modelmeld@188fe9105d0d3f3ceeaa6a4684537a6379d29ab1
- Branch / Tag: refs/tags/v0.18.0
- Owner: https://github.com/modelmeld
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@188fe9105d0d3f3ceeaa6a4684537a6379d29ab1
- Trigger Event: push

File details

Details for the file modelmeld-0.18.0-py3-none-any.whl.

File metadata

Download URL: modelmeld-0.18.0-py3-none-any.whl
Upload date: Jun 23, 2026
Size: 272.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for modelmeld-0.18.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ab1336e770240cc8ce829480a2bdc59dd29e8ee0d8b932e9da459a490d615d93`
MD5	`9058bd27510dd58d0d3d9ac1359b5630`
BLAKE2b-256	`d0cb1fcdec1e75c9a87cdc361cd96903c6ee9af7dbb763288862b2532f34a15f`

See more details on using hashes here.

Provenance

The following attestation bundles were made for modelmeld-0.18.0-py3-none-any.whl:

Publisher: release.yml on modelmeld/modelmeld

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: modelmeld-0.18.0-py3-none-any.whl
- Subject digest: ab1336e770240cc8ce829480a2bdc59dd29e8ee0d8b932e9da459a490d615d93
- Sigstore transparency entry: 1921437441
- Sigstore integration time: Jun 23, 2026
Source repository:
- Permalink: modelmeld/modelmeld@188fe9105d0d3f3ceeaa6a4684537a6379d29ab1
- Branch / Tag: refs/tags/v0.18.0
- Owner: https://github.com/modelmeld
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@188fe9105d0d3f3ceeaa6a4684537a6379d29ab1
- Trigger Event: push

modelmeld 0.18.0

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

ModelMeld

Quickstart

Anthropic prompt caching, end-to-end

Why ModelMeld

How it compares

Three policies, three behaviors

Works with

What doesn't work yet

What's in the package

Self-host (manual config)

Licensing — TL;DR

Status

Contributing

Community

Enterprise tier

License

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance