
ModelMesh

One integration point for all your AI providers.
Automatic failover, free-tier aggregation, and capability-based routing.

Python 3.11+ | TypeScript 5.0+ | Docker | MIT License


Your application requests a capability (e.g. "chat-completion"). ModelMesh picks the best available provider, rotates on failure, and chains free quotas across providers -- all behind a standard OpenAI SDK interface.

Install

Python:

pip install modelmesh-lite                # core (zero dependencies)
pip install modelmesh-lite[yaml]          # + YAML config support

TypeScript / Node.js:

npm install @modelmesh/core

Docker Proxy (any language):

cp .env.example .env   # add your API keys
docker compose up --build
# Proxy at http://localhost:8080 — speaks the OpenAI REST API

Quick Start

Set an API key and go:

export OPENAI_API_KEY="sk-..."

Python

import modelmesh

client = modelmesh.create("chat-completion")

response = client.chat.completions.create(
    model="chat-completion",          # virtual model name = capability pool
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

TypeScript

import { create } from "@modelmesh/core";

const client = create("chat-completion");

const response = await client.chat.completions.create({
    model: "chat-completion",
    messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);

How It Works

client.chat.completions.create(model="chat-completion", ...)
       |
       v
  +-----------+     +-----------+     +----------+
  |  Router   | --> |   Pool    | --> |  Model   | --> Provider API
  +-----------+     +-----------+     +----------+
  Resolves the       Groups models     Selects best     Sends request,
  capability to      that can do       active model     handles retry
  a pool             the task          (rotation policy) and failover

"chat-completion" resolves to a pool containing all models that support chat. The pool's rotation policy picks the best active model. If it fails, the router retries with backoff, then rotates to the next model. When a provider's free quota runs out, rotation automatically moves to the next provider.

Multi-Provider Failover

Add more API keys -- ModelMesh chains them automatically:

export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="AI..."

client = modelmesh.create("chat-completion")

# Inspect the providers behind the virtual model
print(client.describe())
# Pool "chat-completion" (strategy: stick-until-failure)
#   capability: generation.text-generation.chat-completion
#   → openai.gpt-4o [openai.llm.v1] (active)
#     openai.gpt-4o-mini [openai.llm.v1] (active)
#     anthropic.claude-sonnet-4 [anthropic.claude.v1] (active)
#     google.gemini-2.0-flash [google.gemini.v1] (active)

Same client.chat.completions.create() call -- but now if OpenAI is down or its quota is exhausted, the request routes to Anthropic, then Gemini.
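
Because the client exposes the standard OpenAI SDK surface, the usual SDK patterns carry over unchanged. For instance, streaming (a sketch; it assumes the pool's models support streaming, as the streaming sample under Samples suggests):

import modelmesh

client = modelmesh.create("chat-completion")

stream = client.chat.completions.create(
    model="chat-completion",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,                              # standard OpenAI SDK streaming flag
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)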

YAML Configuration

For full control, use a configuration file:

# modelmesh.yaml
providers:
  openai.llm.v1:
    connector: openai.llm.v1
    config:
      api_key: "${secrets:OPENAI_API_KEY}"

  anthropic.claude.v1:
    connector: anthropic.claude.v1
    config:
      api_key: "${secrets:ANTHROPIC_API_KEY}"

models:
  openai.gpt-4o:
    provider: openai.llm.v1
    capabilities:
      - generation.text-generation.chat-completion

  anthropic.claude-sonnet-4:
    provider: anthropic.claude.v1
    capabilities:
      - generation.text-generation.chat-completion

pools:
  chat:
    capability: generation.text-generation.chat-completion
    strategy: stick-until-failure

client = modelmesh.create(config="modelmesh.yaml")
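
The ${secrets:...} placeholders presumably resolve against environment variables at load time; the sketch below shows that interpolation under this assumption (the authoritative rules are in the System Configuration reference):

import os
import re

def resolve_secrets(value: str) -> str:
    """Replace ${secrets:NAME} placeholders with environment variables
    (an assumed resolution rule, for illustration only)."""
    return re.sub(
        r"\$\{secrets:([A-Za-z0-9_]+)\}",
        lambda m: os.environ[m.group(1)],
        value,
    )

print(resolve_secrets("${secrets:OPENAI_API_KEY}"))  # prints the key's value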

Key Features

Feature -- Description
OpenAI-compatible -- Drop-in replacement for any OpenAI SDK client
Multi-provider routing -- OpenAI, Anthropic, Gemini, Groq, and more
Automatic failover -- Retry with backoff, then rotate to the next model
Free-tier aggregation -- Chain free quotas across providers
Capability-based pools -- Request tasks, not specific providers
8 rotation strategies -- Stick-until-failure, cost-first, latency-first, round-robin, and more (see the example below)
Pluggable connectors -- Extend any integration point with the CDK
Zero dependencies -- Core library has no external dependencies
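
For example, switching the chat pool from the YAML configuration above to a cost-first strategy is a one-line change (the identifier is taken from the strategy list above; exact names are defined in the System Configuration reference):

pools:
  chat:
    capability: generation.text-generation.chat-completion
    strategy: cost-first   # assumed identifier; see the System Configuration reference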

Documentation

Document -- Description
System Concept -- Architecture, design, and full feature overview
Model Capabilities -- Capability hierarchy tree and predefined pools
System Configuration -- Full YAML configuration reference
Connector Catalogue -- All pre-shipped connectors with config schemas
Connector Interfaces -- Interface definitions for all connector types
System Services -- Runtime objects: Router, Pool, Model, State
Proxy Guide -- Deploy as OpenAI-compatible proxy: Docker, CLI, config, browser access
AI Agent Integration -- Guide for AI coding agents (Claude Code, Cursor, etc.) to integrate ModelMesh

CDK (Connector Development Kit)

Document -- Description
CDK Overview -- Architecture and class hierarchy
Base Classes -- Reference for all CDK base classes
Developer Guide -- Tutorials: build your own connectors
Convenience Layer -- QuickProvider and zero-config setup
Mixins -- Cache, metrics, rate limiter, HTTP client

Samples

Collection -- Description
Quickstart -- 6 progressive examples in Python and TypeScript
System Integration -- Multi-provider, streaming, embeddings, cost optimization
CDK Tutorials -- Build providers, rotation policies, and more
Custom Connectors -- Full custom connector examples for all 6 types
Proxy Test -- Vanilla JS browser test page for the OpenAI proxy

Development

# Clone the repository
git clone https://github.com/ApartsinProjects/ModelMesh.git
cd ModelMesh

# Run Python tests (855 tests)
pip install pytest
cd src/python && python -m pytest ../../tests/ -v

# Run TypeScript tests (511 tests)
cd src/typescript && npm install && npm test

# Or use the automation script
./scripts/test-all.sh

Docker

# Quick start with Docker Compose
cp .env.example .env    # then add your API keys
docker compose up --build

# Or use the automation script
./scripts/proxy-up.sh

# Test the running proxy
curl http://localhost:8080/v1/models
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"text-generation","messages":[{"role":"user","content":"Hello!"}]}'

See the Proxy Guide for full configuration, CLI reference, and browser access.

Scripts

Script -- Description
scripts/proxy-up.sh -- Build and start the Docker proxy
scripts/proxy-down.sh -- Stop the Docker proxy
scripts/proxy-test.sh -- Smoke-test a running proxy
scripts/docker-build.sh -- Build the Docker image
scripts/install-python.sh -- Install Python package (dev or prod)
scripts/install-typescript.sh -- Install TypeScript package
scripts/test-all.sh -- Run full test suite (Python + TypeScript)

License

MIT


Created by Sasha Apartsin
