# ModelMesh

Capability-driven AI model routing with automatic failover and free-tier aggregation.
One integration point for all your AI providers.
Automatic failover, free-tier aggregation, and capability-based routing.
Your application requests a capability (e.g. "chat completion"). ModelMesh picks the best available provider, rotates on failure, and chains free quotas across providers -- all behind a standard OpenAI SDK interface.
## Install

**Python:**

```bash
pip install modelmesh-lite          # core (zero dependencies)
pip install "modelmesh-lite[yaml]"  # + YAML config support (quotes needed in zsh)
```

**TypeScript / Node.js:**

```bash
npm install @nistrapa/modelmesh-core
```

**Docker proxy (any language):**

```bash
cp .env.example .env   # add your API keys
docker compose up --build
# Proxy at http://localhost:8080 -- speaks the OpenAI REST API
```
## Quick Start

Set an API key and go:

```bash
export OPENAI_API_KEY="sk-..."
```

**Python**

```python
import modelmesh

client = modelmesh.create("chat-completion")

response = client.chat.completions.create(
    model="chat-completion",  # virtual model name = capability pool
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

**TypeScript**

```typescript
import { create } from "@nistrapa/modelmesh-core";

const client = create("chat-completion");

const response = await client.chat.completions.create({
  model: "chat-completion",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);
```
## How It Works

```
client.chat.completions.create(model="chat-completion", ...)
     |
     v
+-----------+     +-----------+     +----------+
|  Router   | --> |   Pool    | --> |  Model   | --> Provider API
+-----------+     +-----------+     +----------+
Resolves the      Groups models     Selects best       Sends request,
capability to     that can do       active model       handles retry
a pool            the task          (rotation policy)  and failover
```
"chat-completion" resolves to a pool containing all models that support chat. The pool's rotation policy picks the best active model. If it fails, the router retries with backoff, then rotates to the next model. When a provider's free quota runs out, rotation automatically moves to the next provider.
## Multi-Provider Failover

Add more API keys -- ModelMesh chains them automatically:

```bash
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="AI..."
```

```python
client = modelmesh.create("chat-completion")

# Inspect the providers behind the virtual model
print(client.describe())
# Pool "chat-completion" (strategy: stick-until-failure)
#   capability: generation.text-generation.chat-completion
#   → openai.gpt-4o             [openai.llm.v1]       (active)
#     openai.gpt-4o-mini        [openai.llm.v1]       (active)
#     anthropic.claude-sonnet-4 [anthropic.claude.v1] (active)
#     google.gemini-2.0-flash   [google.gemini.v1]    (active)
```
Same client.chat.completions.create() call -- but now if OpenAI is down or its quota is exhausted, the request routes to Anthropic, then Gemini.
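Free-tier aggregation follows the same rotation logic: a provider whose quota hits zero is skipped, so requests flow to whoever still has credit. A toy sketch (the quota bookkeeping here is invented for illustration, not how ModelMesh tracks quotas):

```python
# Toy sketch of free-tier chaining: each provider has a request quota,
# and routing picks the first provider with quota remaining.
providers = [
    {"name": "openai", "quota": 2},
    {"name": "anthropic", "quota": 3},
    {"name": "google", "quota": 5},
]

def route(providers):
    """Return the first provider with remaining quota and charge it one request."""
    for p in providers:
        if p["quota"] > 0:
            p["quota"] -= 1
            return p["name"]
    raise RuntimeError("all free tiers exhausted")

served = [route(providers) for _ in range(6)]
print(served)
# ['openai', 'openai', 'anthropic', 'anthropic', 'anthropic', 'google']
```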
## YAML Configuration

For full control, use a configuration file:

```yaml
# modelmesh.yaml
providers:
  openai.llm.v1:
    connector: openai.llm.v1
    config:
      api_key: "${secrets:OPENAI_API_KEY}"
  anthropic.claude.v1:
    connector: anthropic.claude.v1
    config:
      api_key: "${secrets:ANTHROPIC_API_KEY}"

models:
  openai.gpt-4o:
    provider: openai.llm.v1
    capabilities:
      - generation.text-generation.chat-completion
  anthropic.claude-sonnet-4:
    provider: anthropic.claude.v1
    capabilities:
      - generation.text-generation.chat-completion

pools:
  chat:
    capability: generation.text-generation.chat-completion
    strategy: stick-until-failure
```

```python
client = modelmesh.create(config="modelmesh.yaml")
```
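The `${secrets:...}` placeholders keep API keys out of the config file by pulling them from the environment. A minimal sketch of how such substitution could work (this is not ModelMesh's internal implementation):

```python
import os
import re

# Minimal sketch: replace "${secrets:NAME}" with os.environ["NAME"].
# Not ModelMesh's internal code -- just the idea behind the placeholder.
_SECRET = re.compile(r"\$\{secrets:([A-Z0-9_]+)\}")

def resolve_secrets(value: str) -> str:
    """Substitute each ${secrets:NAME} placeholder from the environment."""
    return _SECRET.sub(lambda m: os.environ[m.group(1)], value)

os.environ["OPENAI_API_KEY"] = "sk-demo"
print(resolve_secrets("api_key: ${secrets:OPENAI_API_KEY}"))
# api_key: sk-demo
```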
## Key Features
| Feature | Description |
|---|---|
| OpenAI-compatible | Drop-in replacement for any OpenAI SDK client |
| Multi-provider routing | OpenAI, Anthropic, Gemini, Groq, and more |
| Automatic failover | Retry with backoff, then rotate to next model |
| Free-tier aggregation | Chain quotas across providers |
| Capability-based pools | Request tasks, not specific providers |
| 8 rotation strategies | Stick-until-failure, cost-first, latency-first, round-robin, and more |
| Pluggable connectors | Extend any integration point with the CDK |
| Zero dependencies | Core library has no external dependencies |
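Rotation strategies differ only in how they pick the next model from a pool. Two of the strategies named above, sketched as selection functions over hypothetical model records (field names are illustrative):

```python
import itertools

# Hypothetical model records; the field names are invented for illustration.
models = [
    {"name": "gpt-4o", "cost_per_1k": 5.0, "healthy": True},
    {"name": "gpt-4o-mini", "cost_per_1k": 0.15, "healthy": True},
    {"name": "claude-sonnet-4", "cost_per_1k": 3.0, "healthy": True},
]

def cost_first(models):
    """Pick the cheapest healthy model."""
    return min((m for m in models if m["healthy"]),
               key=lambda m: m["cost_per_1k"])

def round_robin(models):
    """Cycle through healthy models, one per call."""
    return itertools.cycle([m for m in models if m["healthy"]])

print(cost_first(models)["name"])  # gpt-4o-mini
rr = round_robin(models)
print([next(rr)["name"] for _ in range(4)])  # wraps back to gpt-4o
```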
## Documentation
| Document | Description |
|---|---|
| System Concept | Architecture, design, and full feature overview |
| Model Capabilities | Capability hierarchy tree and predefined pools |
| System Configuration | Full YAML configuration reference |
| Connector Catalogue | All pre-shipped connectors with config schemas |
| Connector Interfaces | Interface definitions for all connector types |
| System Services | Runtime objects: Router, Pool, Model, State |
| Proxy Guide | Deploy as OpenAI-compatible proxy: Docker, CLI, config, browser access |
| AI Agent Integration | Guide for AI coding agents (Claude Code, Cursor, etc.) to integrate ModelMesh |
## CDK (Connector Development Kit)
| Document | Description |
|---|---|
| CDK Overview | Architecture and class hierarchy |
| Base Classes | Reference for all CDK base classes |
| Developer Guide | Tutorials: build your own connectors |
| Convenience Layer | QuickProvider and zero-config setup |
| Mixins | Cache, metrics, rate limiter, HTTP client |
## Samples
| Collection | Description |
|---|---|
| Quickstart | 6 progressive examples in Python and TypeScript |
| System Integration | Multi-provider, streaming, embeddings, cost optimization |
| CDK Tutorials | Build providers, rotation policies, and more |
| Custom Connectors | Full custom connector examples for all 6 types |
| Proxy Test | Vanilla JS browser test page for the OpenAI proxy |
## Development

```bash
# Clone the repository
git clone https://github.com/ApartsinProjects/ModelMesh.git
cd ModelMesh

# Run Python tests (855 tests)
pip install pytest
cd src/python && python -m pytest ../../tests/ -v

# Run TypeScript tests (511 tests)
cd src/typescript && npm install && npm test

# Or use the automation script
./scripts/test-all.sh
```
## Docker

```bash
# Quick start with Docker Compose
cp .env.example .env   # then add your API keys
docker compose up --build

# Or use the automation script
./scripts/proxy-up.sh

# Test the running proxy
curl http://localhost:8080/v1/models
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"text-generation","messages":[{"role":"user","content":"Hello!"}]}'
```

See the Proxy Guide for full configuration, CLI reference, and browser access.
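Because the proxy speaks the OpenAI REST API, any HTTP client works. The equivalent of the curl call above using only Python's standard library (sending it assumes a proxy running at `http://localhost:8080`):

```python
import json
from urllib import request

# Build the same chat-completions request the curl example sends.
payload = {
    "model": "text-generation",
    "messages": [{"role": "user", "content": "Hello!"}],
}
req = request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment with a live proxy:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])

print(req.get_method(), req.full_url)
# POST http://localhost:8080/v1/chat/completions
```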
## Scripts

| Script | Description |
|---|---|
| `scripts/proxy-up.sh` | Build and start the Docker proxy |
| `scripts/proxy-down.sh` | Stop the Docker proxy |
| `scripts/proxy-test.sh` | Smoke-test a running proxy |
| `scripts/docker-build.sh` | Build the Docker image |
| `scripts/install-python.sh` | Install the Python package (dev or prod) |
| `scripts/install-typescript.sh` | Install the TypeScript package |
| `scripts/test-all.sh` | Run the full test suite (Python + TypeScript) |
## License

Created by Sasha Apartsin.
## File details

### modelmesh_lite-0.1.1.tar.gz

- Size: 188.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12

| Algorithm | Hash digest |
|---|---|
| SHA256 | `a4f6fb2454ff302aab3b09c502ddfc33df7acf2b249a05c7f76a9b8477fac98e` |
| MD5 | `b461af1f0dff2b7f72f4e68ab7a4735d` |
| BLAKE2b-256 | `3ab8fb3eddb80d9b19dfb4f709e504ef2bb05a53c20b7631399b2c7f3e174ca3` |

### modelmesh_lite-0.1.1-py3-none-any.whl

- Size: 183.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.12

| Algorithm | Hash digest |
|---|---|
| SHA256 | `bb8800afe9ff168cb4d5c9081673160c1061c3a175d6578717f50c7033064391` |
| MD5 | `76d6d5f4620556483e7676ea1a6e50ed` |
| BLAKE2b-256 | `d379cb436d226af12cd3c2cdcc71050350d44f967cd3f3ca2ee3c894165d983a` |