
Capability-driven AI model routing with automatic failover and free-tier aggregation


ModelMesh

One integration point for all your AI providers.
Automatic failover, free-tier aggregation, and capability-based routing.

Python 3.11+ | TypeScript 5.0+ | Docker | MIT License | Tests | Documentation


Your application requests a capability (e.g. "chat completion"). ModelMesh picks the best available provider, rotates on failure, and chains free quotas across providers -- all behind a standard OpenAI SDK interface.

Install

Python:

pip install modelmesh-lite                # core (zero dependencies)
pip install modelmesh-lite[yaml]          # + YAML config support

TypeScript / Node.js:

npm install @nistrapa/modelmesh-core

Docker Proxy (any language):

cp .env.example .env   # add your API keys
docker compose up --build
# Proxy at http://localhost:8080 — speaks the OpenAI REST API

Quick Start

Set an API key and go:

export OPENAI_API_KEY="sk-..."

Python

import modelmesh

client = modelmesh.create("chat-completion")

response = client.chat.completions.create(
    model="chat-completion",          # virtual model name = capability pool
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

TypeScript

import { create } from "@nistrapa/modelmesh-core";

const client = create("chat-completion");

const response = await client.chat.completions.create({
    model: "chat-completion",
    messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);

How It Works

client.chat.completions.create(model="chat-completion", ...)
       |
       v
  +-----------+     +-----------+     +----------+
  |  Router   | --> |   Pool    | --> |  Model   | --> Provider API
  +-----------+     +-----------+     +----------+
  Resolves the       Groups models     Selects best     Sends request,
  capability to      that can do       active model     handles retry
  a pool             the task          (rotation policy) and failover

"chat-completion" resolves to a pool containing all models that support chat. The pool's rotation policy picks the best active model. If it fails, the router retries with backoff, then rotates to the next model. When a provider's free quota runs out, rotation automatically moves to the next provider.
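The retry-then-rotate loop described above can be sketched in plain Python. This is an illustrative model only, not ModelMesh's actual implementation; the `Model` class, `route` function, and the retry/backoff parameters are all hypothetical.

```python
import time

class Model:
    """Hypothetical stand-in for a pool member; not the real ModelMesh class."""
    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail

    def complete(self, prompt):
        if self.fail:
            raise RuntimeError(f"{self.name}: quota exhausted")
        return f"{self.name}: ok"

def route(pool, prompt, retries=1, backoff=0.0):
    """Try the active model with backoff retries, then rotate to the next."""
    for model in pool:                                # rotation order
        for attempt in range(retries + 1):
            try:
                return model.complete(prompt)
            except RuntimeError:
                time.sleep(backoff * (2 ** attempt))  # exponential backoff
    raise RuntimeError("all providers exhausted")

pool = [Model("openai.gpt-4o", fail=True), Model("anthropic.claude-sonnet-4")]
print(route(pool, "Hello!"))  # anthropic.claude-sonnet-4: ok
```

The key property is that retries happen per model before rotation, so a transient error does not immediately burn through the pool.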

Multi-Provider Failover

Add more API keys -- ModelMesh chains them automatically:

export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="AI..."

client = modelmesh.create("chat-completion")

# Inspect the providers behind the virtual model
print(client.describe())
# Pool "chat-completion" (strategy: stick-until-failure)
#   capability: generation.text-generation.chat-completion
#   → openai.gpt-4o [openai.llm.v1] (active)
#     openai.gpt-4o-mini [openai.llm.v1] (active)
#     anthropic.claude-sonnet-4 [anthropic.claude.v1] (active)
#     google.gemini-2.0-flash [google.gemini.v1] (active)

Same client.chat.completions.create() call -- but now if OpenAI is down or its quota is exhausted, the request routes to Anthropic, then Gemini.
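One way to picture the automatic chaining is that `create()` checks the environment for known provider keys and registers each configured provider in order. The mapping and helper below are illustrative assumptions, not ModelMesh internals.

```python
import os

# Hypothetical mapping of environment variables to provider connectors.
PROVIDER_KEYS = {
    "OPENAI_API_KEY": "openai.llm.v1",
    "ANTHROPIC_API_KEY": "anthropic.claude.v1",
    "GOOGLE_API_KEY": "google.gemini.v1",
}

def discover_providers(env=os.environ):
    """Return the connectors whose API key is set, in chaining order."""
    return [conn for var, conn in PROVIDER_KEYS.items() if env.get(var)]

env = {"OPENAI_API_KEY": "sk-...", "GOOGLE_API_KEY": "AI..."}
print(discover_providers(env))  # ['openai.llm.v1', 'google.gemini.v1']
```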

YAML Configuration

For full control, use a configuration file:

# modelmesh.yaml
providers:
  openai.llm.v1:
    connector: openai.llm.v1
    config:
      api_key: "${secrets:OPENAI_API_KEY}"

  anthropic.claude.v1:
    connector: anthropic.claude.v1
    config:
      api_key: "${secrets:ANTHROPIC_API_KEY}"

models:
  openai.gpt-4o:
    provider: openai.llm.v1
    capabilities:
      - generation.text-generation.chat-completion

  anthropic.claude-sonnet-4:
    provider: anthropic.claude.v1
    capabilities:
      - generation.text-generation.chat-completion

pools:
  chat:
    capability: generation.text-generation.chat-completion
    strategy: stick-until-failure

Load the file when creating the client:

client = modelmesh.create(config="modelmesh.yaml")
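The `${secrets:...}` placeholders resolve against environment variables when the config is loaded. A minimal interpolation sketch, where the regex and the empty-string fallback are assumptions rather than ModelMesh's documented rules:

```python
import os
import re

# Matches ${secrets:NAME} placeholders in config values (assumed syntax).
SECRET_RE = re.compile(r"\$\{secrets:([A-Z0-9_]+)\}")

def resolve_secrets(value, env=os.environ):
    """Replace ${secrets:NAME} placeholders with environment values."""
    return SECRET_RE.sub(lambda m: env.get(m.group(1), ""), value)

env = {"OPENAI_API_KEY": "sk-test"}
print(resolve_secrets("${secrets:OPENAI_API_KEY}", env))  # sk-test
```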

Key Features

| Feature | Description |
| --- | --- |
| OpenAI-compatible | Drop-in replacement for any OpenAI SDK client |
| Multi-provider routing | OpenAI, Anthropic, Gemini, Groq, and more |
| Automatic failover | Retry with backoff, then rotate to next model |
| Free-tier aggregation | Chain quotas across providers |
| Capability-based pools | Request tasks, not specific providers |
| 8 rotation strategies | Stick-until-failure, cost-first, latency-first, round-robin, and more |
| Pluggable connectors | Extend any integration point with the CDK |
| Zero dependencies | Core library has no external dependencies |
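To illustrate how different rotation strategies can rank the same pool differently, here is a toy comparator. The per-model cost and latency numbers are invented for the example, and `pick` is a sketch, not ModelMesh's strategy API.

```python
# Toy pool metadata; the numbers are made up for illustration.
pool = [
    {"name": "openai.gpt-4o",           "cost": 5.0, "latency_ms": 900},
    {"name": "openai.gpt-4o-mini",      "cost": 0.6, "latency_ms": 300},
    {"name": "google.gemini-2.0-flash", "cost": 0.3, "latency_ms": 350},
]

def pick(pool, strategy):
    """Select the preferred model under a given rotation strategy."""
    if strategy == "cost-first":
        return min(pool, key=lambda m: m["cost"])
    if strategy == "latency-first":
        return min(pool, key=lambda m: m["latency_ms"])
    return pool[0]  # stick-until-failure: keep the current head of the pool

print(pick(pool, "cost-first")["name"])     # google.gemini-2.0-flash
print(pick(pool, "latency-first")["name"])  # openai.gpt-4o-mini
```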

Documentation

| Document | Description |
| --- | --- |
| System Concept | Architecture, design, and full feature overview |
| Model Capabilities | Capability hierarchy tree and predefined pools |
| System Configuration | Full YAML configuration reference |
| Connector Catalogue | All pre-shipped connectors with config schemas |
| Connector Interfaces | Interface definitions for all connector types |
| System Services | Runtime objects: Router, Pool, Model, State |
| Proxy Guide | Deploy as OpenAI-compatible proxy: Docker, CLI, config, browser access |
| AI Agent Integration | Guide for AI coding agents (Claude Code, Cursor, etc.) to integrate ModelMesh |

CDK (Connector Development Kit)

| Document | Description |
| --- | --- |
| CDK Overview | Architecture and class hierarchy |
| Base Classes | Reference for all CDK base classes |
| Developer Guide | Tutorials: build your own connectors |
| Convenience Layer | QuickProvider and zero-config setup |
| Mixins | Cache, metrics, rate limiter, HTTP client |

Samples

| Collection | Description |
| --- | --- |
| Quickstart | 6 progressive examples in Python and TypeScript |
| System Integration | Multi-provider, streaming, embeddings, cost optimization |
| CDK Tutorials | Build providers, rotation policies, and more |
| Custom Connectors | Full custom connector examples for all 6 types |
| Proxy Test | Vanilla JS browser test page for the OpenAI proxy |

Development

# Clone the repository
git clone https://github.com/ApartsinProjects/ModelMesh.git
cd ModelMesh

# Run Python tests (855 tests)
pip install pytest
cd src/python && python -m pytest ../../tests/ -v

# Run TypeScript tests (511 tests)
cd src/typescript && npm install && npm test

# Or use the automation script
./scripts/test-all.sh

Docker

# Quick start with Docker Compose
cp .env.example .env    # then add your API keys
docker compose up --build

# Or use the automation script
./scripts/proxy-up.sh

# Test the running proxy
curl http://localhost:8080/v1/models
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"text-generation","messages":[{"role":"user","content":"Hello!"}]}'

See the Proxy Guide for full configuration, CLI reference, and browser access.
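Because the proxy speaks the OpenAI REST API, any HTTP client can call it. The sketch below builds the same request as the curl example using only the standard library; the actual network call is commented out since it needs a running proxy, and `build_request` is a helper written for this example.

```python
import json
import urllib.request

# Assumes the Docker proxy from the section above is running locally.
PROXY_URL = "http://localhost:8080/v1/chat/completions"

def build_request(model, prompt):
    """Build an OpenAI-style chat completion request for the proxy."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        PROXY_URL, data=body, headers={"Content-Type": "application/json"}
    )

req = build_request("text-generation", "Hello!")
# with urllib.request.urlopen(req) as resp:  # requires the proxy to be up
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(json.loads(req.data)["model"])  # text-generation
```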

Scripts

| Script | Description |
| --- | --- |
| scripts/proxy-up.sh | Build and start the Docker proxy |
| scripts/proxy-down.sh | Stop the Docker proxy |
| scripts/proxy-test.sh | Smoke-test a running proxy |
| scripts/docker-build.sh | Build the Docker image |
| scripts/install-python.sh | Install Python package (dev or prod) |
| scripts/install-typescript.sh | Install TypeScript package |
| scripts/test-all.sh | Run full test suite (Python + TypeScript) |

License

MIT


Created by Sasha Apartsin

Download files


Source Distribution

modelmesh_lite-0.1.1.tar.gz (188.8 kB)


Built Distribution


modelmesh_lite-0.1.1-py3-none-any.whl (183.6 kB)


File details

Details for the file modelmesh_lite-0.1.1.tar.gz.

File metadata

  • Download URL: modelmesh_lite-0.1.1.tar.gz
  • Upload date:
  • Size: 188.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for modelmesh_lite-0.1.1.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | a4f6fb2454ff302aab3b09c502ddfc33df7acf2b249a05c7f76a9b8477fac98e |
| MD5 | b461af1f0dff2b7f72f4e68ab7a4735d |
| BLAKE2b-256 | 3ab8fb3eddb80d9b19dfb4f709e504ef2bb05a53c20b7631399b2c7f3e174ca3 |


File details

Details for the file modelmesh_lite-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: modelmesh_lite-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 183.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for modelmesh_lite-0.1.1-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | bb8800afe9ff168cb4d5c9081673160c1061c3a175d6578717f50c7033064391 |
| MD5 | 76d6d5f4620556483e7676ea1a6e50ed |
| BLAKE2b-256 | d379cb436d226af12cd3c2cdcc71050350d44f967cd3f3ca2ee3c894165d983a |

