OpenRouter-compatible LLM router with unified batch support. Route requests across OpenAI, Anthropic, and Google with a single API.

Maintainers

chriswelker

Project description

anymodel

OpenRouter-compatible LLM router with unified batch support for Python. Self-hosted, zero fees.

Route requests across OpenAI, Anthropic, and Google with a single API. Add any OpenAI-compatible provider. Run as an SDK or standalone HTTP server.

Install

pip install anymodel-py

Quick Start

Set your API keys as environment variables:

export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
export GOOGLE_API_KEY=AIza...

SDK Usage

import asyncio
from anymodel import AnyModel

async def main():
    client = AnyModel()

    response = await client.chat.completions.create(
        model="anthropic/claude-sonnet-4-6",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response["choices"][0]["message"]["content"])

asyncio.run(main())

Streaming

stream = await client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Write a haiku"}],
    stream=True,
)

async for chunk in stream:
    # "content" can be missing or None on some chunks, so fall back to "".
    content = (chunk["choices"][0].get("delta") or {}).get("content") or ""
    print(content, end="", flush=True)
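If the full text is needed once streaming finishes, the deltas can be accumulated as they arrive. The sketch below is self-contained: `fake_stream` is a hypothetical stand-in for the stream returned by `client.chat.completions.create(..., stream=True)`, and `collect_text` is an illustrative helper, not part of anymodel. The chunk shape is assumed to match the example above.

```python
import asyncio

async def fake_stream():
    """Hypothetical stand-in for an anymodel stream; yields OpenAI-style chunks."""
    for piece in ["Old pond, ", "a frog ", "jumps in."]:
        yield {"choices": [{"delta": {"content": piece}}]}
    # A closing chunk may carry no content at all, only a finish reason.
    yield {"choices": [{"delta": {}, "finish_reason": "stop"}]}

async def collect_text(stream):
    """Concatenate the content deltas of a chunk stream into one string."""
    parts = []
    async for chunk in stream:
        delta = chunk["choices"][0].get("delta") or {}
        parts.append(delta.get("content") or "")
    return "".join(parts)

text = asyncio.run(collect_text(fake_stream()))
print(text)
```

With a real client, the same helper would be awaited on the stream object instead of `fake_stream()`.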

Supported Providers

Set the env var and go. Models are auto-discovered from each provider's API.

Provider	Env Var	Example Model
OpenAI	`OPENAI_API_KEY`	`openai/gpt-4o`
Anthropic	`ANTHROPIC_API_KEY`	`anthropic/claude-sonnet-4-6`
Google	`GOOGLE_API_KEY`	`google/gemini-2.5-pro`
Mistral	`MISTRAL_API_KEY`	`mistral/mistral-large-latest`
Groq	`GROQ_API_KEY`	`groq/llama-3.3-70b-versatile`
DeepSeek	`DEEPSEEK_API_KEY`	`deepseek/deepseek-chat`
xAI	`XAI_API_KEY`	`xai/grok-3`
Together	`TOGETHER_API_KEY`	`together/meta-llama/Llama-3.3-70B-Instruct-Turbo`
Fireworks	`FIREWORKS_API_KEY`	`fireworks/accounts/fireworks/models/llama-v3p3-70b-instruct`
Perplexity	`PERPLEXITY_API_KEY`	`perplexity/sonar-pro`
Ollama	`OLLAMA_BASE_URL`	`ollama/llama3.3`
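Model IDs in the table follow a `provider/model` pattern, and some model names contain slashes themselves (see the Together and Fireworks rows). A minimal sketch of how such an ID can be split, assuming the provider is everything before the first slash. `split_model_id` is a hypothetical helper for illustration, not part of anymodel's API.

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider/model' on the first slash only."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")
    return provider, model

print(split_model_id("openai/gpt-4o"))
print(split_model_id("together/meta-llama/Llama-3.3-70B-Instruct-Turbo"))
```

Splitting on the first slash only keeps the rest of the name intact, which matters for IDs such as `fireworks/accounts/fireworks/models/llama-v3p3-70b-instruct`.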

Flex Pricing (OpenAI)

Get 50% off OpenAI requests with flexible latency:

response = await client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    service_tier="flex",
)

Fallback Routing

Try multiple models in order. If one fails, the next is attempted:

response = await client.chat.completions.create(
    model="",
    models=[
        "anthropic/claude-sonnet-4-6",
        "openai/gpt-4o",
        "google/gemini-2.5-pro",
    ],
    route="fallback",
    messages=[{"role": "user", "content": "Hello"}],
)
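Conceptually, fallback routing is a loop over the candidate models that returns the first success. The sketch below illustrates that idea only; it is not anymodel's implementation. `call_model` is a hypothetical provider call that is rigged to fail for the first model so the fallback is visible.

```python
import asyncio

async def call_model(model: str, messages: list) -> dict:
    """Hypothetical provider call; the Anthropic model is made to fail for the demo."""
    if model.startswith("anthropic/"):
        raise RuntimeError(f"{model} is unavailable")
    return {"model": model, "choices": [{"message": {"content": "Hi!"}}]}

async def complete_with_fallback(models: list, messages: list) -> dict:
    """Try each model in order and return the first successful response."""
    errors = []
    for model in models:
        try:
            return await call_model(model, messages)
        except Exception as exc:  # a real router would likely filter error types
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))

response = asyncio.run(complete_with_fallback(
    ["anthropic/claude-sonnet-4-6", "openai/gpt-4o", "google/gemini-2.5-pro"],
    [{"role": "user", "content": "Hello"}],
))
print(response["model"])
```

Collecting the per-model errors makes the final exception useful when every candidate fails.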

Tool Calling

response = await client.chat.completions.create(
    model="anthropic/claude-sonnet-4-6",
    messages=[{"role": "user", "content": "What's the weather in NYC?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        },
    }],
    tool_choice="auto",
)

# "tool_calls" can be absent or None when the model answers directly.
for call in response["choices"][0]["message"].get("tool_calls") or []:
    print(call["function"]["name"], call["function"]["arguments"])
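After the model requests a tool call, the application usually runs the function itself and sends the result back as a `tool` message. A minimal dispatch sketch, assuming OpenAI-style tool calls in which `arguments` is a JSON-encoded string. `sample_message` is hand-written, and `get_weather`, `TOOLS`, and `run_tool_calls` are hypothetical names for illustration.

```python
import json

def get_weather(location: str) -> str:
    """Stub tool; a real implementation would call a weather service."""
    return f"Sunny in {location}"

TOOLS = {"get_weather": get_weather}

# Hand-written example of an assistant message containing one tool call.
sample_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"location": "NYC"}'},
    }],
}

def run_tool_calls(message: dict) -> list:
    """Execute each requested tool and build the 'tool' role reply messages."""
    replies = []
    for call in message.get("tool_calls") or []:
        func = TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        replies.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": func(**args),
        })
    return replies

tool_messages = run_tool_calls(sample_message)
print(tool_messages)
```

The replies would then be appended to `messages`, after the assistant message, for a follow-up `create` call.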

Batch Processing

Process many requests through native provider batch APIs or concurrent fallback. OpenAI, Anthropic, and Google batches are processed server-side: OpenAI at 50% cost, Anthropic asynchronously for up to 10K requests, and Google at 50% cost via batchGenerateContent. All other providers fall back to concurrent execution automatically.

Submit and wait

results = await client.batches.create_and_poll({
    "model": "openai/gpt-4o-mini",
    "requests": [
        {"custom_id": "req-1", "messages": [{"role": "user", "content": "Summarize AI"}]},
        {"custom_id": "req-2", "messages": [{"role": "user", "content": "Summarize ML"}]},
    ],
})

for result in results["results"]:
    print(result["custom_id"], result["response"]["choices"][0]["message"]["content"])
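For larger batches it can help to generate the request entries from a list of prompts and to look results up by `custom_id`. The source does not say whether results come back in request order, so the sketch below assumes they may not. `build_requests` and `index_by_custom_id` are hypothetical helpers, and `sample_results` is a hand-written stand-in for the return value of `create_and_poll`.

```python
def build_requests(prompts: list) -> list:
    """Turn a list of prompts into batch request entries with stable IDs."""
    return [
        {"custom_id": f"req-{i}", "messages": [{"role": "user", "content": p}]}
        for i, p in enumerate(prompts, start=1)
    ]

def index_by_custom_id(results: dict) -> dict:
    """Map custom_id -> response text so results can be looked up directly."""
    return {
        r["custom_id"]: r["response"]["choices"][0]["message"]["content"]
        for r in results["results"]
    }

batch_requests = build_requests(["Summarize AI", "Summarize ML"])

# Hand-written stand-in for a batch result, deliberately out of order.
sample_results = {"results": [
    {"custom_id": "req-2", "response": {"choices": [{"message": {"content": "ML is ..."}}]}},
    {"custom_id": "req-1", "response": {"choices": [{"message": {"content": "AI is ..."}}]}},
]}
answers = index_by_custom_id(sample_results)
print(answers["req-1"])
```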

Submit now, check later

# Submit and get the batch ID
batch = await client.batches.create({
    "model": "anthropic/claude-haiku-4-5",
    "requests": [
        {"custom_id": "req-1", "messages": [{"role": "user", "content": "Summarize AI"}]},
    ],
})
print(batch["id"])  # "batch-abc123"

# Check status any time
status = await client.batches.get("batch-abc123")
print(status["status"])  # "pending", "processing", "completed"

# Wait for results when ready
results = await client.batches.poll("batch-abc123")

# List all batches
all_batches = await client.batches.list()

# Cancel a batch
await client.batches.cancel("batch-abc123")
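Waiting on a batch amounts to checking its status at a fixed interval until it leaves the `pending` and `processing` states. The sketch below shows that loop in isolation; it is not how `client.batches.poll` is implemented. `poll_until_done` and `fake_get` are hypothetical, and `fake_get` stands in for `client.batches.get`.

```python
import asyncio

async def poll_until_done(get_status, batch_id, interval=10.0, timeout=3600.0):
    """Call get_status(batch_id) until the batch leaves pending/processing."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        status = await get_status(batch_id)
        if status["status"] not in ("pending", "processing"):
            return status
        if loop.time() >= deadline:
            raise TimeoutError(f"batch {batch_id} still {status['status']}")
        await asyncio.sleep(interval)

# Hypothetical status source that completes on the third check.
_states = iter(["pending", "processing", "completed"])

async def fake_get(batch_id):
    return {"id": batch_id, "status": next(_states)}

final = asyncio.run(poll_until_done(fake_get, "batch-abc123", interval=0.01))
print(final["status"])
```

The timeout is an addition of this sketch; whether anymodel enforces one is not stated in the source.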

Automatic max_tokens

When max_tokens isn't set on a batch request, anymodel automatically calculates a safe value per-request based on the estimated input size and the model's context window. This prevents truncated responses and context overflow errors without requiring you to hand-tune each request in a large batch.
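The exact calculation is not documented here. One plausible approach, sketched under stated assumptions: estimate input tokens at roughly four characters each, reserve a small safety margin, and cap the result at the model's output limit. The helper names, the 5% margin, and the 128,000 / 16,384 limits are all illustrative, not anymodel's values.

```python
def estimate_tokens(messages: list) -> int:
    """Rough heuristic: about 4 characters per token, plus per-message overhead."""
    chars = sum(len(m.get("content") or "") for m in messages)
    return chars // 4 + 4 * len(messages)

def safe_max_tokens(messages: list, context_window: int, output_cap: int) -> int:
    """Pick the largest output budget that still fits in the remaining context."""
    reserve = context_window // 20  # 5% slack for estimation error (assumed)
    available = context_window - estimate_tokens(messages) - reserve
    if available <= 0:
        raise ValueError("input is too large for this context window")
    return min(available, output_cap)

short_input = [{"role": "user", "content": "Summarize AI"}]
long_input = [{"role": "user", "content": "x" * 480_000}]

short_budget = safe_max_tokens(short_input, context_window=128_000, output_cap=16_384)
long_budget = safe_max_tokens(long_input, context_window=128_000, output_cap=16_384)
print(short_budget, long_budget)
```

A short prompt gets the full output cap, while a prompt that nearly fills the context window gets only what is left.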

Concurrent batch requests are streamed from disk: only N requests (default 5) are in flight at a time, so batches of 10K+ requests run without memory spikes.
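Capping the number of in-flight requests is commonly done with a semaphore. The sketch below shows the pattern with `asyncio.Semaphore`; `run_bounded` and `fake_request` are hypothetical, and unlike the behavior described above this version keeps all items in memory rather than streaming them from disk.

```python
import asyncio

async def run_bounded(items, worker, limit=5):
    """Run worker(item) for every item with at most `limit` running at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item):
        async with semaphore:
            return await worker(item)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(item) for item in items))

stats = {"in_flight": 0, "peak": 0}

async def fake_request(i):
    """Hypothetical request that records how many calls overlap."""
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)
    stats["in_flight"] -= 1
    return i * 2

results = asyncio.run(run_bounded(range(20), fake_request, limit=5))
print(stats["peak"], results[:3])
```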

Batch configuration

client = AnyModel({
    "batch": {
        "poll_interval": 10.0,          # default poll interval in seconds
        "concurrency_fallback": 10,      # concurrent request limit for non-native providers
    },
    "io": {
        "read_concurrency": 30,          # concurrent file reads (default: 20)
        "write_concurrency": 15,         # concurrent file writes (default: 10)
    },
})

Configuration

client = AnyModel({
    "anthropic": {"api_key": "sk-ant-..."},
    "openai": {"api_key": "sk-..."},
    "aliases": {
        "default": "anthropic/claude-sonnet-4-6",
        "fast": "anthropic/claude-haiku-4-5",
        "smart": "anthropic/claude-opus-4-6",
    },
    "defaults": {
        "temperature": 0.7,
        "max_tokens": 4096,
        "retries": 2,
        "timeout": 120,  # HTTP timeout in seconds (default: 120 = 2 min, flex: 600 = 10 min)
    },
})

# Use aliases as model names
response = await client.chat.completions.create(
    model="fast",
    messages=[{"role": "user", "content": "Quick answer"}],
)
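One way to picture how aliases and defaults combine: the alias is replaced by its model ID, and defaults fill in any parameter the call did not set. The sketch assumes explicit parameters take precedence over defaults, which the source does not state. `resolve_request` is a hypothetical helper, not anymodel's API.

```python
def resolve_request(config: dict, model: str, **params) -> dict:
    """Expand an alias to its model ID and fill unset params from defaults."""
    resolved = config.get("aliases", {}).get(model, model)
    merged = {**config.get("defaults", {}), **params}  # explicit params win (assumed)
    return {"model": resolved, **merged}

config = {
    "aliases": {"fast": "anthropic/claude-haiku-4-5"},
    "defaults": {"temperature": 0.7, "max_tokens": 4096},
}
request = resolve_request(config, "fast", max_tokens=256)
print(request)
```

A model name that is not an alias passes through unchanged.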

Config File

Create anymodel.config.json in your project root:

{
  "anthropic": {
    "api_key": "${ANTHROPIC_API_KEY}"
  },
  "aliases": {
    "default": "anthropic/claude-sonnet-4-6"
  },
  "defaults": {
    "temperature": 0.7,
    "max_tokens": 4096
  }
}

${ENV_VAR} references are interpolated from environment variables.
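The interpolation can be pictured as a recursive walk over the parsed JSON that substitutes each `${NAME}` it finds. The sketch below is an illustration, not anymodel's loader. `interpolate` is a hypothetical helper, and replacing an unset variable with an empty string is an assumption.

```python
import json
import os
import re

_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")

def interpolate(value, env=os.environ):
    """Recursively replace ${NAME} in strings with env[NAME]."""
    if isinstance(value, str):
        # Unset variables become "" here; a stricter loader might raise instead.
        return _ENV_REF.sub(lambda m: env.get(m.group(1), ""), value)
    if isinstance(value, dict):
        return {k: interpolate(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [interpolate(v, env) for v in value]
    return value

raw = json.loads(
    '{"anthropic": {"api_key": "${ANTHROPIC_API_KEY}"}, "defaults": {"max_tokens": 4096}}'
)
config = interpolate(raw, env={"ANTHROPIC_API_KEY": "sk-ant-example"})
print(config)
```

Passing `env` explicitly keeps the function easy to test; by default it reads from `os.environ`.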

Custom Providers

Add any OpenAI-compatible endpoint:

client = AnyModel({
    "custom": {
        "ollama": {
            "base_url": "http://localhost:11434/v1",
            "models": ["llama3.3", "mistral"],
        },
    },
})

response = await client.chat.completions.create(
    model="ollama/llama3.3",
    messages=[{"role": "user", "content": "Hello from Ollama"}],
)

Server Mode

Run as a standalone HTTP server compatible with the OpenAI SDK:

pip install "anymodel-py[server]"
anymodel serve --port 4141

Then point any OpenAI-compatible client at it:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:4141/api/v1", api_key="unused")
response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Hello via server"}],
)

Also Available

Node.js: @probeo/anymodel on npm
Go: anymodel-go

License

MIT

Release history

Version	Release date
0.6.1	Mar 31, 2026
0.6.0	Mar 31, 2026
0.5.1	Mar 26, 2026
0.5.0	Mar 24, 2026
0.4.0	Mar 19, 2026
0.3.3	Mar 17, 2026
0.3.2 (this version)	Mar 17, 2026
0.3.1	Mar 17, 2026
0.3.0	Mar 17, 2026
0.2.0	Mar 16, 2026
0.1.0	Mar 15, 2026

Download files

Source Distribution

anymodel_py-0.3.2.tar.gz (39.7 kB)

Uploaded Mar 17, 2026 Source

Built Distribution

anymodel_py-0.3.2-py3-none-any.whl (53.7 kB)

Uploaded Mar 17, 2026 Python 3

File details

Details for the file anymodel_py-0.3.2.tar.gz.

File metadata

Download URL: anymodel_py-0.3.2.tar.gz
Upload date: Mar 17, 2026
Size: 39.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for anymodel_py-0.3.2.tar.gz
Algorithm	Hash digest
SHA256	`7980abf33f0926da2dadd3da21f1d2bb17f3d30ad738fec4583fd4d4a043055e`
MD5	`2bd27a02a492be05dbec650b46d1ab10`
BLAKE2b-256	`b1152bde8a85704152c71e5afa2cd28a78cbc53fce82720f6e8bb2d2429ae850`

File details

Details for the file anymodel_py-0.3.2-py3-none-any.whl.

File metadata

Download URL: anymodel_py-0.3.2-py3-none-any.whl
Upload date: Mar 17, 2026
Size: 53.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for anymodel_py-0.3.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8dab4ef21ab93a89cdd49554bb8a6b6960794abcf96741ce2e8e0bf4b7522d81`
MD5	`f39f39362e215e3017b0def2ba069953`
BLAKE2b-256	`352dec8848cdc75d850b8aac78d86bec436990ff56ebd77af02cf9df6304c8c0`

anymodel-py 0.3.2
