Python SDK for CMDOP LLM Service - OpenAI-compatible API for 200+ AI models
CMDOP LLM Python SDK
Installation
pip install cmdop-llm
Quick Start
from cmdop_llm import CmdopLLM

client = CmdopLLM(api_key="your-api-key")
response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
Features
- Drop-in OpenAI replacement - Same API, different models
- 200+ Models - GPT-4, Claude, Llama, Mistral, and Gemini via a single endpoint
- Streaming - Real-time token streaming
- Tool Calling - Function calling support
- Structured Output - Parse responses to Pydantic models
- Embeddings - Text embeddings generation
- Vision & OCR - Image analysis and text extraction (auto model selection)
- Image Generation - FLUX, Gemini and other models (auto model selection)
- Web Search - AI-powered web search with citations
- URL Fetch - Fetch and analyze any URL content
- Models API - List, filter, and get pricing for all available models
- Async Support - Full async/await support
Environment Variables
export CMDOP_API_KEY="your-api-key"
export CMDOP_BASE_URL="https://llm.cmdop.com/v1" # Optional, default
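The API reference in this README says the client falls back to CMDOP_API_KEY when no api_key is passed. If you would rather fail early with a clear message, a minimal sketch of resolving both variables yourself could look like this. The helper name `load_cmdop_settings` is hypothetical and not part of the SDK.

```python
import os

# Default base URL as stated in this README.
DEFAULT_BASE_URL = "https://llm.cmdop.com/v1"


def load_cmdop_settings(env=None):
    """Return (api_key, base_url) from an environment mapping.

    Hypothetical helper for illustration; the SDK reads these variables itself.
    """
    env = os.environ if env is None else env
    api_key = env.get("CMDOP_API_KEY")
    if not api_key:
        raise RuntimeError("CMDOP_API_KEY is not set")
    # Fall back to the documented default when the override is absent or empty.
    base_url = env.get("CMDOP_BASE_URL") or DEFAULT_BASE_URL
    return api_key, base_url
```

The returned values can then be passed explicitly, for example `CmdopLLM(api_key=api_key, base_url=base_url)`.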
Usage Examples
Chat Completion
from cmdop_llm import CmdopLLM

client = CmdopLLM()
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing."},
    ],
    temperature=0.7,
    max_tokens=1000,
)
print(response.choices[0].message.content)
Streaming
stream = client.chat.completions.create(
    model="anthropic/claude-sonnet-4",
    messages=[{"role": "user", "content": "Write a poem."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
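When the full text is needed after streaming finishes, the deltas can be accumulated instead of printed. The sketch below assumes chunks have the `choices[0].delta.content` shape used in the loop above; `collect_stream` and `_fake_chunk` are hypothetical names, and the fake chunks exist only so the example runs offline.

```python
from types import SimpleNamespace


def collect_stream(stream):
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in stream:
        if not chunk.choices:  # tolerate chunks that carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # skip None or empty deltas
            parts.append(delta)
    return "".join(parts)


def _fake_chunk(text):
    """Offline stand-in for a streamed chunk (illustration only)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = collect_stream([_fake_chunk("Hel"), _fake_chunk(None), _fake_chunk("lo")])
print(demo)  # Hello
```

With a real stream, `text = collect_stream(stream)` would replace the printing loop.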
Tool Calling
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"}
            },
            "required": ["location"]
        }
    }
}]
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
    tool_choice="auto",
)
if response.choices[0].message.tool_calls:
    tool_call = response.choices[0].message.tool_calls[0]
    print(f"Function: {tool_call.function.name}")
    print(f"Arguments: {tool_call.function.arguments}")
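The example above stops at printing the requested call. A rough sketch of the next step, running the function locally and building the reply message, is shown below. It assumes the OpenAI conventions apply here: `arguments` is a JSON string, each tool call has an `id`, and results go back as a `"tool"` role message. None of that is confirmed by this README. `TOOL_REGISTRY`, `run_tool_call`, and the placeholder `get_weather` body are hypothetical.

```python
import json
from types import SimpleNamespace


def get_weather(location):
    """Placeholder; a real implementation would query a weather service."""
    return {"location": location, "forecast": "unknown"}


# Maps tool names declared in `tools` to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_call(tool_call):
    """Execute one tool call and return the follow-up 'tool' message."""
    func = TOOL_REGISTRY[tool_call.function.name]
    kwargs = json.loads(tool_call.function.arguments)  # assumed JSON string
    result = func(**kwargs)
    return {
        "role": "tool",
        "tool_call_id": tool_call.id,  # assumed field, as in the OpenAI API
        "content": json.dumps(result),
    }


# Offline stand-in for response.choices[0].message.tool_calls[0].
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="get_weather", arguments='{"location": "Tokyo"}'),
)
tool_message = run_tool_call(fake_call)
```

Under those assumptions, the assistant message and `tool_message` would be appended to `messages` and sent in a second `create` call to obtain the final answer.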
Web Search
from cmdop_llm import CmdopLLM, UserLocation

client = CmdopLLM()

# Basic web search
result = client.search.web("What is the capital of France?")
print(result.content)

# Print citations
for citation in result.citations:
    print(f"- {citation.title}: {citation.url}")

# Search with options
result = client.search.web(
    "Latest AI news",
    max_searches=5,
    allowed_domains=["bbc.com", "cnn.com", "reuters.com"],
    user_location=UserLocation(country="US", city="New York"),
)
print(result.content)
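Search results may cite the same page more than once. As an illustration, the sketch below renders citations as a numbered list with duplicate URLs removed. It relies only on the `title` and `url` fields listed under SearchCitation; `format_citations` is a hypothetical helper and the sample data is made up.

```python
from types import SimpleNamespace


def format_citations(citations):
    """Render citations as a numbered source list, skipping repeated URLs."""
    seen = set()
    lines = []
    for citation in citations:
        if citation.url in seen:
            continue
        seen.add(citation.url)
        lines.append(f"[{len(lines) + 1}] {citation.title} <{citation.url}>")
    return "\n".join(lines)


# Made-up citations so the example runs without a network call.
sample = [
    SimpleNamespace(title="Paris", url="https://example.com/paris"),
    SimpleNamespace(title="Paris (again)", url="https://example.com/paris"),
    SimpleNamespace(title="France", url="https://example.com/france"),
]
sources = format_citations(sample)
print(sources)
```

With a real response, `format_citations(result.citations)` would be used instead of the sample list.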
URL Fetch & Analysis
# Fetch and analyze a specific URL
result = client.search.fetch(
    url="https://en.wikipedia.org/wiki/Python_(programming_language)",
    prompt="What are the key features of Python? List top 5.",
)
print(result.content)
Vision Analysis
# With explicit model
result = client.vision.analyze(
    image_url="https://example.com/image.jpg",
    prompt="Describe this image",
    model="google/gemini-2.0-flash-001",
)
print(result.description)

# Auto model selection (cheapest vision model)
result = client.vision.analyze(
    image_url="https://example.com/image.jpg",
    prompt="What's in this image?",
)
print(result.extracted_text)
OCR Text Extraction
# Auto model selection (cheapest vision-capable model)
result = client.ocr.extract(
    image_url="https://example.com/document.png",
)
print(result.text)

# With explicit model
result = client.ocr.extract(
    image_url="https://example.com/document.png",
    model="openai/gpt-4o-mini",
)
print(result.text)
Image Generation
# Auto model selection (cheapest image generation model)
response = client.images.generate(
    prompt="A futuristic cityscape at sunset",
    size="1024x1024",
)
print(response.data[0].url)

# With explicit model
response = client.images.generate(
    model="google/gemini-2.0-flash-exp:free",
    prompt="A serene mountain landscape",
    size="1024x1024",
)
print(response.data[0].url)
Embeddings
response = client.embeddings.create(
    model="openai/text-embedding-3-small",
    input="Hello, world!",
)
print(response.data[0].embedding[:5])  # First 5 dimensions
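Embeddings are typically compared with cosine similarity. The sketch below is plain Python with no SDK dependency. It assumes each embedding is a list of floats, as the slice above suggests; `cosine_similarity` is a hypothetical helper name.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Two embeddings returned by `client.embeddings.create` could then be compared with `cosine_similarity(first.embedding, second.embedding)`. Values near 1 indicate similar texts.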
Structured Output with Pydantic
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int
    city: str

# Parse response directly into Pydantic model
response = client.beta.chat.completions.parse(
    model="openai/gpt-4o",
    messages=[
        {"role": "user", "content": "Extract: John is 30 years old and lives in Tokyo"}
    ],
    response_format=Person,
)
person = response.choices[0].message.parsed
print(f"{person.name}, {person.age}, {person.city}")  # John, 30, Tokyo
JSON Schema Response Format
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "List 3 colors"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "colors",
            "schema": {
                "type": "object",
                "properties": {
                    "colors": {"type": "array", "items": {"type": "string"}}
                },
                "required": ["colors"]
            }
        }
    }
)
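The request above does not show how the reply is consumed. Assuming the message content comes back as a JSON string matching the schema (the usual behaviour of OpenAI-compatible APIs, not confirmed here), a minimal sketch for decoding and checking it could be the following. `parse_colors` is a hypothetical helper.

```python
import json


def parse_colors(content):
    """Decode message content and check it against the 'colors' schema above."""
    data = json.loads(content)  # raises ValueError on malformed JSON
    colors = data.get("colors") if isinstance(data, dict) else None
    if not isinstance(colors, list) or not all(isinstance(c, str) for c in colors):
        raise ValueError("response does not match the expected schema")
    return colors
```

In use this would be `colors = parse_colors(response.choices[0].message.content)`.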
Async Usage
import asyncio
from cmdop_llm import AsyncCmdopLLM

async def main():
    client = AsyncCmdopLLM()

    # Chat
    response = await client.chat.completions.create(
        model="openai/gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

    # Web Search
    result = await client.search.web("Latest tech news")
    print(result.content)

    # Vision (auto model)
    result = await client.vision.analyze(
        image_url="https://example.com/photo.jpg",
    )
    print(result.description)

asyncio.run(main())
List Available Models
from cmdop_llm import CmdopLLM
client = CmdopLLM()
# List all available models
models = client.models.list()
for model in models.data:
    print(f"{model.id}: {model.name} ({model.context_length} tokens)")
# Filter by provider
anthropic_models = client.models.list(provider="anthropic")
# Filter by vision support
vision_models = client.models.list(supports_vision=True)
# Filter by context length (min 100k tokens)
long_context = client.models.list(min_context_length=100000)
# Filter by price (max $1 per 1M prompt tokens)
cheap_models = client.models.list(max_prompt_price=1.0)
# Search by name or ID
gpt_models = client.models.list(search="gpt-4")
# Combine filters
result = client.models.list(
    provider="openai",
    supports_vision=True,
    min_context_length=128000,
    max_prompt_price=10.0,
)
# Get specific model details
model = client.models.retrieve("anthropic/claude-sonnet-4")
print(f"Name: {model.name}")
print(f"Context: {model.context_length}")
print(f"Prompt price: ${model.pricing.prompt_cost_per_million()}/1M tokens")
print(f"Vision support: {model.supports_vision}")
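The pricing fields make it possible to estimate what a request will cost before sending it. The sketch below assumes that `pricing.prompt` and `pricing.completion` are decimal strings giving USD per token, as the Response Types section describes. `estimate_cost_usd` is a hypothetical helper, and the prices in the example are illustrative, not real rates.

```python
from decimal import Decimal


def estimate_cost_usd(prompt_price, completion_price, prompt_tokens, completion_tokens):
    """Estimate request cost from per-token price strings.

    Decimal avoids float rounding noise on very small per-token prices.
    """
    prompt_cost = Decimal(prompt_price) * prompt_tokens
    completion_cost = Decimal(completion_price) * completion_tokens
    return prompt_cost + completion_cost


# Illustrative prices only: 1,000 prompt tokens and 500 completion tokens.
example_cost = estimate_cost_usd("0.000003", "0.000015", 1000, 500)
```

With a retrieved model, the first two arguments would be `model.pricing.prompt` and `model.pricing.completion`.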
Available Models
Access 200+ models including:
- OpenAI: gpt-4o, gpt-4o-mini, gpt-4.1, o1, o3-mini
- Anthropic: claude-sonnet-4, claude-opus-4, claude-3.5-haiku
- Google: gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash
- Meta: llama-4-maverick, llama-3.3-70b
- Mistral: mistral-large, mistral-medium
- Image Gen: gemini-2.0-flash-exp (free), flux-pro, dall-e-3
Use model format: provider/model-name (e.g., openai/gpt-4o)
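A malformed model ID is easy to catch before making a request. The sketch below splits an ID on the first slash, so suffixes such as `:free` stay part of the model name. `split_model_id` is a hypothetical helper. The default model of the search methods in this README has no provider prefix, so this check only suits IDs in the provider/model-name format.

```python
def split_model_id(model_id):
    """Split 'provider/model-name' into (provider, model_name)."""
    provider, sep, name = model_id.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider/model-name', got {model_id!r}")
    return provider, name


print(split_model_id("openai/gpt-4o"))  # ('openai', 'gpt-4o')
```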
Auto Model Selection
For Vision, OCR, and Image Generation endpoints, the model parameter is optional. When omitted, the server automatically selects the most cost-effective model:
# Server picks cheapest vision model
result = client.vision.analyze(image_url="https://...")
# Server picks cheapest OCR-capable model
result = client.ocr.extract(image_url="https://...")
# Server picks cheapest image generation model
result = client.images.generate(prompt="A sunset")
API Reference
CmdopLLM
CmdopLLM(
    api_key: str = None,      # From CMDOP_API_KEY env if not set
    base_url: str = None,     # Default: https://llm.cmdop.com/v1
    timeout: float = None,    # Request timeout
    max_retries: int = 2,     # Retry count
)
Resources
| Resource | Description |
|---|---|
| client.chat.completions | Chat completions (OpenAI compatible) |
| client.beta.chat.completions.parse() | Structured output with Pydantic |
| client.embeddings | Text embeddings (OpenAI compatible) |
| client.images | Image generation (OpenAI compatible, auto model) |
| client.models | List and filter available models with pricing |
| client.vision | Vision analysis (CMDOP specific, auto model) |
| client.ocr | OCR extraction (CMDOP specific, auto model) |
| client.search | Web search and URL fetch (CMDOP specific) |
Models Methods
# List models with optional filters
client.models.list(
    provider: str = None,                # Filter by provider (e.g., "openai", "anthropic")
    supports_vision: bool = None,        # Filter for vision-capable models
    min_context_length: int = None,      # Minimum context length
    max_prompt_price: float = None,      # Max prompt price per 1M tokens
    max_completion_price: float = None,  # Max completion price per 1M tokens
    search: str = None,                  # Search in model ID and name
    refresh: bool = False,               # Force refresh from API
) -> ModelsResponse

# Get specific model by ID
client.models.retrieve(
    model_id: str,  # e.g., "anthropic/claude-sonnet-4"
) -> Model
Search Methods
# Web search with AI-summarized results
client.search.web(
    query: str,             # Search query
    model: str = "claude-3-5-haiku-20241022",
    max_tokens: int = 1024,
    max_searches: int = 5,  # Max web searches (1-10)
    allowed_domains: list[str] = None,
    blocked_domains: list[str] = None,
    user_location: UserLocation = None,
) -> WebSearchResponse

# Fetch and analyze URL content
client.search.fetch(
    url: str,  # URL to fetch
    prompt: str = "Summarize this page",
    model: str = "claude-3-5-haiku-20241022",
    max_tokens: int = 1024,
) -> WebSearchResponse
Response Types
# ModelsResponse
response.object # "list"
response.data # List of Model objects
# Model
model.id # Model ID (e.g., "openai/gpt-4o")
model.name # Display name
model.description # Model description (optional)
model.context_length # Max context length in tokens
model.pricing # ModelPricing object
model.architecture # ModelArchitecture (optional)
model.top_provider # TopProvider info (optional)
model.created # Unix timestamp (optional)
model.owned_by # Provider name (property)
model.supports_vision # Vision capability (property)
# ModelPricing
pricing.prompt # Cost per prompt token (string)
pricing.completion # Cost per completion token (string)
pricing.image # Cost per image (optional)
pricing.request # Cost per request (optional)
pricing.prompt_cost_per_million() # Returns float
pricing.completion_cost_per_million() # Returns float
# ModelArchitecture
architecture.tokenizer # Tokenizer type (e.g., "GPT")
architecture.instruct_type # Instruction format
architecture.modality # e.g., "text->text", "text+image->text"
# WebSearchResponse
response.id # Unique response ID
response.content # AI-generated response
response.citations # List of SearchCitation
response.model # Model used
response.usage # SearchUsage (input_tokens, output_tokens)
response.stop_reason # Why response stopped
# SearchCitation
citation.title # Source page title
citation.url # Source URL
citation.cited_text # Quoted text (optional)
# UserLocation
UserLocation(
    country="US",  # ISO 3166-1 alpha-2 code
    city="New York",
    region="NY",
    timezone="America/New_York",
)
# VisionAnalyzeResponse
response.description # Image description
response.extracted_text # Text found in image
response.model # Model used
response.cost_usd # Cost in USD
response.tokens_input # Input tokens used
response.tokens_output # Output tokens used
# OCRResponse
response.text # Extracted text
response.model # Model used
response.cost_usd # Cost in USD
response.tokens_input # Input tokens used
response.tokens_output # Output tokens used
Error Handling
from cmdop_llm import CmdopLLM, BadRequestError, RateLimitError, AuthenticationError

client = CmdopLLM()
try:
    response = client.chat.completions.create(
        model="invalid/model",
        messages=[{"role": "user", "content": "Hello"}],
    )
except BadRequestError as e:
    print(f"Invalid request: {e}")
except RateLimitError as e:
    print(f"Rate limited: {e}")
except AuthenticationError as e:
    print(f"Auth error: {e}")
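The client already retries requests (`max_retries` defaults to 2), though this README does not say which errors are covered. If you need more control, for instance longer waits after a rate limit, one possible approach is a small backoff wrapper. The sketch below is generic: `call_with_backoff` is a hypothetical helper and the exception type is passed in, so it works with `RateLimitError` or any other class.

```python
import time


def call_with_backoff(func, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on `retry_on` with exponentially growing delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
```

With the SDK, the call could look like `call_with_backoff(lambda: client.chat.completions.create(...), RateLimitError)`. The `sleep` parameter exists so tests can substitute a recorder for the real delay.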
License
MIT
File details
Details for the file cmdop_llm-0.1.8.tar.gz.
File metadata
- Download URL: cmdop_llm-0.1.8.tar.gz
- Upload date:
- Size: 15.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 94c0b0b9e64404e51e156c745cae046c2d5847cf7a89241915f6e5af622f20ae |
| MD5 | 50ed7bf3dd6dc70ee2f8ef0a99127565 |
| BLAKE2b-256 | 4b3a8da4f1b140b14c8eb0a9d805f1b527c06a754ee70a9c608398f23e79c649 |
File details
Details for the file cmdop_llm-0.1.8-py3-none-any.whl.
File metadata
- Download URL: cmdop_llm-0.1.8-py3-none-any.whl
- Upload date:
- Size: 22.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ac2a613d5b0fa989a0929e62a86197fc58744a3473763881874914e18584f8d0 |
| MD5 | 6044ed61cbc1241e59ac84064fac6e0a |
| BLAKE2b-256 | 8f8f07661585c5ad2e776ce64438f5cbc3f86fdf7e4decbfc5985eaea0b9ca26 |