The official Python SDK for the OpenModex AI Gateway API

OpenModex Python SDK

The SDK gives you access to 100+ LLMs from OpenAI, Anthropic, Google, DeepSeek, Mistral, and Qwen through a single unified API with intelligent routing, automatic fallbacks, and built-in cost tracking.

Features

  • Unified API -- One client for all major LLM providers
  • Smart Routing -- Automatic model selection optimized for cost, latency, or quality
  • Client-Side Fallbacks -- Automatic retry with backup models on failure
  • Streaming -- First-class SSE streaming with sync and async iterators
  • Async Support -- Full asyncio support via httpx
  • Lightweight -- Zero dependencies beyond httpx (no Pydantic required)
  • Type Safe -- Fully typed with dataclasses and type hints
  • OpenAI Compatible -- Drop-in replacement by changing base_url

Requirements

  • Python 3.8+

Installation

pip install openmodex-sdk

Quick Start

import os
from openmodex import OpenModex

client = OpenModex(api_key=os.environ["OPENMODEX_API_KEY"])

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "What is OpenModex?"},
    ],
)

print(response.choices[0].message.content)

Usage

Chat Completions

response = client.chat.completions.create(
    model="claude-3-5-sonnet",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."},
    ],
    temperature=0.7,
    max_tokens=1000,
)

print(response.choices[0].message.content)

Streaming

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Write a short story."},
    ],
    stream=True,
)

with stream:
    for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)
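
When you need the full reply as a single string rather than printing it piece by piece, the chunks can be accumulated. The following is a minimal sketch, assuming chunks expose choices[0].delta.content as in the loop above; fake_chunk is a hypothetical stand-in used only so the snippet runs without network access.

```python
from types import SimpleNamespace


def collect_text(chunks):
    """Join the non-empty delta.content pieces of a chunk stream into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # nothing to read from a chunk without choices
        piece = chunk.choices[0].delta.content
        if piece:  # content may be None or empty on some chunks
            parts.append(piece)
    return "".join(parts)


def fake_chunk(text):
    """Hypothetical stand-in for a streamed chunk (not an SDK type)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Once "), fake_chunk(None), fake_chunk("upon a time")]
print(collect_text(demo))
```

With a real stream, the same helper would be called inside the "with stream:" block so the connection is still released afterwards.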

Async Usage

import asyncio
from openmodex import AsyncOpenModex

async def main():
    client = AsyncOpenModex(api_key="omx_sk_...")
    try:
        # Non-streaming
        response = await client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": "Hello!"}],
        )
        print(response.choices[0].message.content)

        # Streaming
        stream = await client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": "Hello!"}],
            stream=True,
        )
        async with stream:
            async for chunk in stream:
                if chunk.choices[0].delta.content:
                    print(chunk.choices[0].delta.content, end="", flush=True)
    finally:
        # Close the client even if a request raises
        await client.close()

asyncio.run(main())
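
A common reason to use the async client is to send several requests concurrently. The sketch below shows one way to cap the number of requests in flight with asyncio.Semaphore. It uses only the standard library; gather_bounded and fake_request are hypothetical names, and fake_request stands in for an awaited client.chat.completions.create(...) call.

```python
import asyncio


async def gather_bounded(coro_fns, limit=5):
    """Run zero-argument coroutine functions with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run(fn):
        async with semaphore:
            return await fn()

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(run(fn) for fn in coro_fns))


async def fake_request(prompt):
    """Hypothetical stand-in for an awaited chat completion call."""
    await asyncio.sleep(0)
    return f"echo: {prompt}"


prompts = ["a", "b", "c"]
results = asyncio.run(
    gather_bounded([lambda p=p: fake_request(p) for p in prompts], limit=2)
)
print(results)
```

Passing coroutine functions rather than coroutine objects means no request starts until it holds a semaphore slot.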

Smart Routing (OpenModex Extension)

Let the gateway pick the best model or optimize for cost/latency:

from openmodex import OpenModex, MODEL_AUTO, MODEL_CHEAPEST

# Use model aliases
response = client.chat.completions.create(
    model=MODEL_AUTO,  # "@auto" -- balanced selection
    # model=MODEL_CHEAPEST,  # "@cheapest" -- lowest cost
    messages=[{"role": "user", "content": "Hello!"}],
)

# Or configure routing strategy per-request
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    routing={"strategy": "cost_optimized", "allow_upgrade": True},
)

OpenModex Metadata

Every response includes OpenModex-specific metadata:

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

if response.openmodex:
    print(f"Provider: {response.openmodex.provider}")
    print(f"Model used: {response.openmodex.model_used}")
    print(f"Cache hit: {response.openmodex.cache_hit}")
    print(f"Routing: {response.openmodex.routing_strategy}")
    print(f"Latency: {response.openmodex.latency_ms}ms")
    print(f"Request ID: {response.openmodex.request_id}")
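
These fields lend themselves to simple aggregate monitoring. As a rough sketch, assuming only the cache_hit and latency_ms attributes shown above, the helper below computes a cache hit rate and mean latency over a batch of metadata objects. summarize_metadata is a hypothetical helper, and the sample values are made up for illustration.

```python
from types import SimpleNamespace


def summarize_metadata(metas):
    """Return (cache_hit_rate, mean_latency_ms) for a list of metadata objects."""
    metas = [m for m in metas if m is not None]  # skip responses without metadata
    if not metas:
        return 0.0, 0.0
    hits = sum(1 for m in metas if m.cache_hit)
    mean_latency = sum(m.latency_ms for m in metas) / len(metas)
    return hits / len(metas), mean_latency


# Made-up values standing in for response.openmodex objects
sample = [
    SimpleNamespace(cache_hit=True, latency_ms=40),
    SimpleNamespace(cache_hit=False, latency_ms=360),
    None,
]
hit_rate, mean_latency = summarize_metadata(sample)
print(f"hit rate: {hit_rate:.0%}, mean latency: {mean_latency:.0f}ms")
```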

Cache Control

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is 2+2?"}],
    cache={"enabled": True, "ttl": 3600},
)

Client-Side Fallbacks

Automatically retry with backup models on failure:

client = OpenModex(
    api_key="omx_sk_...",
    fallback_models=["gpt-4o", "claude-3-5-sonnet", "gemini-1.5-pro"],
)

# If gpt-4o fails (5xx/timeout), automatically tries the next model
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
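
To make the idea of an ordered fallback chain concrete, here is a minimal standalone sketch of the pattern. It is not the SDK's implementation: call_with_fallbacks, flaky_call, and the model names are hypothetical, and the real client may differ in which errors it treats as retryable.

```python
def call_with_fallbacks(models, call, is_retryable=lambda exc: True):
    """Try call(model) for each model in order; return (model, result)."""
    errors = []
    for model in models:
        try:
            return model, call(model)
        except Exception as exc:
            if not is_retryable(exc):
                raise  # e.g. auth errors should not trigger a fallback
            errors.append((model, exc))
    failed = [m for m, _ in errors]
    raise RuntimeError(f"all {len(errors)} models failed: {failed}")


def flaky_call(model):
    """Hypothetical backend in which only 'model-c' succeeds."""
    if model != "model-c":
        raise TimeoutError(f"{model} timed out")
    return f"reply from {model}"


used, reply = call_with_fallbacks(["model-a", "model-b", "model-c"], flaky_call)
print(used, reply)
```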

Embeddings

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox jumps over the lazy dog.",
)

print(f"Dimensions: {len(response.data[0].embedding)}")
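
Embedding vectors are typically compared with cosine similarity. The sketch below is a dependency-free version that works on plain lists of floats, such as response.data[0].embedding; cosine_similarity is a helper written for this example, not part of the SDK.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; report 0
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

For large collections a vectorized library would be faster; the pure-Python version is enough for a handful of comparisons.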

Models

# List all available models
models = client.models.list()
for m in models.data:
    print(f"{m.id} ({m.provider})")

# Get a specific model
model = client.models.retrieve("openai/gpt-4o")
print(f"{model.name}: {model.description}")

# Compare models side by side
comparison = client.models.compare(["openai/gpt-4o", "anthropic/claude-3-5-sonnet"])
print(f"Cheapest: {comparison.highlights.cheapest}")
print(f"Best quality: {comparison.highlights.best_quality}")

Legacy Completions

response = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt="Once upon a time",
    max_tokens=100,
)

print(response.choices[0].text)

Error Handling

from openmodex import OpenModex, APIError, AllFallbacksFailedError

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
# Handled first so it stays reachable even if it subclasses APIError
except AllFallbacksFailedError:
    print("All fallback models failed")
except APIError as e:
    print(f"API error: {e.message} (status: {e.status_code}, code: {e.code})")
    if e.is_rate_limited:
        print("Rate limited -- back off and retry")
    if e.is_auth_error:
        print("Check your API key")
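
The "back off and retry" advice for rate limits can be sketched as exponential backoff with jitter. This is a generic standalone helper, not part of the SDK: retry_with_backoff, its parameters, and the flaky function are all hypothetical. With the real client, should_retry could check e.is_rate_limited on an APIError, and the helper would sit on top of the client's own max_retries handling.

```python
import random
import time


def retry_with_backoff(fn, should_retry, max_attempts=4, base_delay=0.5,
                       sleep=time.sleep):
    """Call fn(); on a retryable error wait up to base_delay * 2**attempt, then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not should_retry(exc):
                raise
            # Full jitter: a random delay in [0, base_delay * 2**attempt]
            sleep(random.uniform(0, base_delay * 2 ** attempt))


calls = {"count": 0}


def flaky():
    """Hypothetical operation that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporarily unavailable")
    return "ok"


delays = []  # record the delays instead of actually sleeping
result = retry_with_backoff(
    flaky,
    should_retry=lambda exc: isinstance(exc, ConnectionError),
    sleep=delays.append,
)
print(result, len(delays))
```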

Configuration

Parameter        Description                              Default
api_key          Your OpenModex API key                   OPENMODEX_API_KEY env var
base_url         API base URL                             https://api.openmodex.com/v1
timeout          Request timeout (seconds)                30.0
max_retries      Max retry attempts on transient errors   2
default_headers  Headers sent with every request          {}
default_model    Default model when none specified        None
fallback_models  Ordered fallback model chain             []
http_client      Custom httpx.Client / httpx.AsyncClient  Auto-created
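
Because api_key falls back to the OPENMODEX_API_KEY environment variable, it can help to check for a key at startup and fail with a clear message. The sketch below mirrors that precedence (explicit argument first, then environment). resolve_api_key is a hypothetical helper, and how the SDK itself behaves when no key is found is not specified here.

```python
import os


def resolve_api_key(explicit=None, env=None):
    """Return the explicit key if given, else OPENMODEX_API_KEY from env."""
    env = os.environ if env is None else env
    key = explicit or env.get("OPENMODEX_API_KEY")
    if not key:
        raise RuntimeError(
            "No API key found: pass api_key=... or set OPENMODEX_API_KEY"
        )
    return key


# Placeholder values for illustration only
print(resolve_api_key(env={"OPENMODEX_API_KEY": "key-from-env"}))
print(resolve_api_key("key-from-arg", env={"OPENMODEX_API_KEY": "key-from-env"}))
```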

OpenAI SDK Compatibility

The OpenModex Gateway also exposes an OpenAI-compatible endpoint. If you already use the OpenAI Python SDK, you can route requests through OpenModex by changing only the base_url and api_key:

from openai import OpenAI

# Just change the base URL and API key
client = OpenAI(
    base_url="https://api.openmodex.com/compat/openai/v1",
    api_key="omx_sk_live_...",
)

# Everything else works exactly the same
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

Examples

See the examples/ directory for runnable examples.

Version

import openmodex
print(openmodex.__version__)  # "0.1.0"

License

MIT
