Apertis Python SDK
Official Python SDK for the Apertis AI API.
Installation
pip install apertis
Quick Start
from apertis import Apertis
client = Apertis(api_key="your-api-key")
# Or set APERTIS_API_KEY environment variable
response = client.chat.completions.create(
model="gpt-5.4",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello!"}
]
)
print(response.choices[0].message.content)
Features
- Sync and Async Support: Both synchronous and asynchronous clients
- Streaming: Real-time streaming for chat completions
- Tool Calling: Function/tool calling support
- Embeddings: Text embedding generation
- Vision/Image: Analyze images with multimodal models
- Audio I/O: Audio input and output support
- Video: Video content analysis
- Web Search: Real-time web search integration
- Context Compression: Compress long conversation history to save tokens
- Reasoning Mode: Chain-of-thought reasoning support
- Extended Thinking: Deep thinking for complex problems
- Messages API: Anthropic-native message format
- Responses API: OpenAI Responses API format
- Rerank API: Document reranking for RAG
- Models API: List and retrieve available models
- Type Hints: Full type annotations for IDE support
- Automatic Retries: Built-in retry logic for transient errors
Usage
Chat Completions
from apertis import Apertis
client = Apertis()
response = client.chat.completions.create(
model="gpt-5.4",
messages=[{"role": "user", "content": "What is the capital of France?"}],
temperature=0.7,
max_tokens=100,
)
print(response.choices[0].message.content)
print(f"Tokens used: {response.usage.total_tokens}")
Streaming
stream = client.chat.completions.create(
model="gpt-5.4",
messages=[{"role": "user", "content": "Tell me a story"}],
stream=True,
)
for chunk in stream:
if chunk.choices[0].delta.content:
print(chunk.choices[0].delta.content, end="", flush=True)
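If you also need the complete message after streaming finishes, accumulate the deltas as they arrive. A minimal sketch of that pattern, using stand-in chunk objects in place of a live stream (the helper is illustrative, not part of the SDK):

```python
from types import SimpleNamespace

def collect_stream(stream):
    """Accumulate streamed delta fragments into the full message text."""
    parts = []
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
    return "".join(parts)

# Stand-in chunks mimicking the shape of real streaming responses
def fake_chunk(text):
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

full_text = collect_stream([fake_chunk("Once "), fake_chunk("upon "),
                            fake_chunk(None), fake_chunk("a time.")])
print(full_text)  # Once upon a time.
```

With a real stream you would pass the object returned by `create(..., stream=True)` straight into `collect_stream`.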
Vision / Image Analysis
Analyze images using multimodal models with the convenience method:
# Using image URL
response = client.chat.completions.create_with_image(
model="gpt-5.4",
prompt="What is in this image?",
image="https://example.com/photo.jpg",
)
print(response.choices[0].message.content)
# Using local file (automatically base64 encoded)
response = client.chat.completions.create_with_image(
model="gpt-5.4",
prompt="Describe this image",
image="/path/to/local/image.png",
)
# Multiple images
response = client.chat.completions.create_with_image(
model="gpt-5.4",
prompt="Compare these images",
image=["https://example.com/img1.jpg", "https://example.com/img2.jpg"],
system="Be detailed in your comparison.",
)
Or use the standard API format:
response = client.chat.completions.create(
model="gpt-5.4",
messages=[{
"role": "user",
"content": [
{"type": "text", "text": "What's in this image?"},
{"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}}
]
}]
)
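Passing a local file to `create_with_image` works because the image bytes are base64-encoded for you. If you prefer the standard messages format with a local file, you can build the data URL yourself with the standard library; the helper below is an illustrative sketch, not the SDK's internal utility:

```python
import base64
import mimetypes

def image_to_data_url(path: str) -> str:
    """Read a local image file and return a base64 data URL for image_url content."""
    mime, _ = mimetypes.guess_type(path)
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime or 'application/octet-stream'};base64,{encoded}"

# Demo with a throwaway file (real use: a path to an actual image)
import os, tempfile
tmp = os.path.join(tempfile.mkdtemp(), "pixel.png")
with open(tmp, "wb") as f:
    f.write(b"\x89PNG")
url = image_to_data_url(tmp)
print(url.startswith("data:image/png;base64,"))  # True
```

The resulting URL drops straight into the standard format: `{"type": "image_url", "image_url": {"url": image_to_data_url("photo.png")}}`.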
Audio Input/Output
# Audio input
response = client.chat.completions.create(
model="gpt-5.4",
messages=[{
"role": "user",
"content": [
{"type": "text", "text": "What does this audio say?"},
{"type": "input_audio", "input_audio": {"data": "base64_audio_data", "format": "wav"}}
]
}]
)
# Audio output
response = client.chat.completions.create(
model="gpt-5.4",
messages=[{"role": "user", "content": "Say hello in a friendly voice"}],
modalities=["text", "audio"],
audio={"voice": "alloy", "format": "wav"},
)
print(response.choices[0].message.audio.transcript)
Web Search
Enable real-time web search for up-to-date information:
# Using convenience method
response = client.chat.completions.create_with_web_search(
prompt="What are the latest news about AI?",
context_size="high",
country="US",
)
# Check citations
for annotation in response.choices[0].message.annotations or []:
print(f"Source: {annotation.url}")
# Using standard API
response = client.chat.completions.create(
model="gpt-5-search-api",
messages=[{"role": "user", "content": "Latest Python releases?"}],
web_search_options={
"search_context_size": "medium",
"filters": ["python.org", "github.com"],
},
)
Reasoning Mode
Enable chain-of-thought reasoning for complex problems:
response = client.chat.completions.create(
model="glm-5.1",
messages=[{"role": "user", "content": "How many r's are in strawberry?"}],
reasoning={"enabled": True, "effort": "high"},
)
# Or use shorthand
response = client.chat.completions.create(
model="glm-5.1",
messages=[{"role": "user", "content": "Solve this math problem..."}],
reasoning_effort="high",
)
Extended Thinking (Gemini)
Use extended thinking for deep analysis:
response = client.chat.completions.create(
model="gemini-3-pro-preview",
messages=[{"role": "user", "content": "Analyze this complex problem..."}],
thinking={"type": "enabled"},
)
# With custom thinking budget
response = client.chat.completions.create(
model="gemini-3-pro-preview",
messages=[{"role": "user", "content": "Deep analysis needed..."}],
extra_body={
"google": {"thinking_config": {"thinking_budget": 10240}}
},
)
Context Compression
Reduce token usage in long conversations by compressing older message history:
# Chat Completions with compression
response = client.chat.completions.create(
model="gpt-4.1-mini",
messages=[
{"role": "user", "content": "Explain distributed systems"},
{"role": "assistant", "content": "Distributed systems are..."},
{"role": "user", "content": "Summarize the key points"},
],
compression={
"enabled": True,
"strategy": "aggressive", # "on", "conservative", or "aggressive"
"threshold": 8000, # Token threshold to trigger compression
"keep_turns": 6, # Recent turns to always preserve
"model": "auto", # Compression model ("auto" or specific model)
},
)
# Minimal config — just enable it
response = client.chat.completions.create(
model="gpt-4.1-mini",
messages=[{"role": "user", "content": "Hello!"}],
compression={"enabled": True},
)
# Also works with Messages API and Responses API
message = client.messages.create(
model="claude-sonnet-4-6",
messages=[{"role": "user", "content": "Hello!"}],
max_tokens=1024,
compression={"enabled": True, "strategy": "conservative"},
)
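Compression itself happens server-side, but the `keep_turns` idea is easy to picture: older history is summarized while the most recent turns pass through verbatim. A client-side illustration of that split, treating each message as one turn for simplicity (this is not the SDK's implementation):

```python
def split_for_compression(messages, keep_turns=6):
    """Split history into (older messages eligible for compression, recent messages kept verbatim)."""
    if keep_turns <= 0 or len(messages) <= keep_turns:
        return [], list(messages)
    return list(messages[:-keep_turns]), list(messages[-keep_turns:])

history = [{"role": "user", "content": f"turn {i}"} for i in range(10)]
older, recent = split_for_compression(history, keep_turns=6)
print(len(older), len(recent))  # 4 6
```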
Async Usage
import asyncio
from apertis import AsyncApertis
async def main():
client = AsyncApertis()
response = await client.chat.completions.create(
model="gpt-5.4",
messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)
await client.close()
asyncio.run(main())
Async Streaming
async def stream_example():
client = AsyncApertis()
stream = await client.chat.completions.create(
model="gpt-5.4",
messages=[{"role": "user", "content": "Tell me a joke"}],
stream=True,
)
async for chunk in stream:
if chunk.choices[0].delta.content:
print(chunk.choices[0].delta.content, end="", flush=True)
await client.close()
Tool Calling
response = client.chat.completions.create(
model="gpt-5.4",
messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
tools=[{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get the current weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city name"
}
},
"required": ["location"]
}
}
}],
tool_choice="auto",
)
if response.choices[0].message.tool_calls:
for tool_call in response.choices[0].message.tool_calls:
print(f"Function: {tool_call.function.name}")
print(f"Arguments: {tool_call.function.arguments}")
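Note that `function.arguments` arrives as a JSON string: your code parses it, runs the matching function locally, and sends the result back as a `role: "tool"` message on the next request. A sketch of that dispatch step, with a local stub in place of a real weather lookup (the message shape follows the standard tool-result format):

```python
import json
from types import SimpleNamespace

def get_weather(location: str) -> dict:
    """Local stub standing in for a real weather lookup."""
    return {"location": location, "temp_c": 21, "conditions": "clear"}

AVAILABLE_TOOLS = {"get_weather": get_weather}

def run_tool_call(tool_call) -> dict:
    """Execute a requested tool and format the result as a tool message."""
    fn = AVAILABLE_TOOLS[tool_call.function.name]
    args = json.loads(tool_call.function.arguments)  # arguments is a JSON string
    result = fn(**args)
    return {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(result),
    }

# Stand-in tool call mimicking the shape of a real response
call = SimpleNamespace(id="call_1", function=SimpleNamespace(
    name="get_weather", arguments='{"location": "Tokyo"}'))
tool_message = run_tool_call(call)
```

Append the assistant message and each tool message to `messages`, then call `create` again so the model can incorporate the result.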
Embeddings
response = client.embeddings.create(
model="text-embedding-3-small",
input="Hello, world!",
)
embedding = response.data[0].embedding
print(f"Embedding dimension: {len(embedding)}")
Batch Embeddings
response = client.embeddings.create(
model="text-embedding-3-small",
input=["Hello", "World", "How are you?"],
)
for item in response.data:
print(f"Index {item.index}: {len(item.embedding)} dimensions")
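A common next step is comparing embeddings by cosine similarity, for example to rank texts against a query vector. A stdlib-only sketch (real pipelines typically use numpy for this):

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Parallel vectors score 1.0, orthogonal vectors score 0.0
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```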
Models API
List and retrieve available models:
# List all models
models = client.models.list()
for model in models.data:
print(f"{model.id} - owned by {model.owned_by}")
# Retrieve specific model
model = client.models.retrieve("gpt-5.4")
print(f"Model: {model.id}, Created: {model.created}")
Messages API (Anthropic Format)
Use the Anthropic-native message format:
message = client.messages.create(
model="claude-sonnet-4-6",
messages=[{"role": "user", "content": "Hello, Claude!"}],
max_tokens=1024,
system="You are a helpful assistant.",
)
print(message.content[0].text)
print(f"Stop reason: {message.stop_reason}")
With tool use:
message = client.messages.create(
model="claude-sonnet-4-6",
messages=[{"role": "user", "content": "What's the weather in Paris?"}],
max_tokens=1024,
tools=[{
"name": "get_weather",
"description": "Get weather for a location",
"input_schema": {
"type": "object",
"properties": {"location": {"type": "string"}},
"required": ["location"]
}
}],
)
if message.stop_reason == "tool_use":
for block in message.content:
if block.type == "tool_use":
print(f"Tool: {block.name}, Input: {block.input}")
Responses API
Use the OpenAI Responses API format for advanced use cases:
response = client.responses.create(
model="gpt-5.4",
input="Write a haiku about programming",
)
print(response.output[0].content[0].text)
# With reasoning
response = client.responses.create(
model="gpt-5.4",
input="Solve this complex problem...",
reasoning={"effort": "high"},
)
Rerank API
Rerank documents for RAG applications:
results = client.rerank.create(
model="BAAI/bge-reranker-v2-m3",
query="What is machine learning?",
documents=[
"Machine learning is a subset of AI.",
"The weather is nice today.",
"Deep learning uses neural networks.",
],
top_n=2,
)
for result in results.results:
print(f"Index {result.index}: Score {result.relevance_score}")
# With document text in results
results = client.rerank.create(
model="BAAI/bge-reranker-v2-m3",
query="Python programming",
documents=["Python is great", "Java is popular"],
return_documents=True,
)
for result in results.results:
print(f"{result.document}: {result.relevance_score}")
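Each rerank result references a document by `index`, so a typical RAG step is to reorder your original documents by score before building the prompt. A sketch using stand-in result objects in the same shape as the rerank response:

```python
from types import SimpleNamespace

def top_documents(documents, results):
    """Return documents ordered by descending relevance_score from rerank results."""
    ranked = sorted(results, key=lambda r: r.relevance_score, reverse=True)
    return [documents[r.index] for r in ranked]

docs = ["ML is a subset of AI.", "The weather is nice.", "Deep learning uses neural nets."]
# Stand-in results mimicking results.results from the Rerank API
results = [SimpleNamespace(index=2, relevance_score=0.78),
           SimpleNamespace(index=0, relevance_score=0.91)]
print(top_documents(docs, results))
```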
Error Handling
from apertis import (
Apertis,
ApertisError,
APIError,
AuthenticationError,
RateLimitError,
NotFoundError,
)
client = Apertis()
try:
response = client.chat.completions.create(
model="gpt-5.4",
messages=[{"role": "user", "content": "Hello!"}]
)
except AuthenticationError as e:
print(f"Invalid API key: {e.message}")
except RateLimitError as e:
print(f"Rate limited. Status: {e.status_code}")
except NotFoundError as e:
print(f"Resource not found: {e.message}")
except APIError as e:
print(f"API error {e.status_code}: {e.message}")
except ApertisError as e:
print(f"Error: {e.message}")
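The client already retries transient errors on its own (see `max_retries` under Configuration), but for rate limits that exhaust that budget you may want application-level backoff. A generic sketch, independent of the SDK:

```python
import time

def with_backoff(fn, retries=3, base_delay=1.0, retry_on=(Exception,)):
    """Call fn(); on a listed exception, sleep with exponential backoff and retry."""
    for attempt in range(retries):
        try:
            return fn()
        except retry_on:
            time.sleep(base_delay * (2 ** attempt))
    return fn()  # final attempt; let any error propagate

# Demo: a function that fails twice, then succeeds
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("transient")
    return "ok"

result = with_backoff(flaky, retries=3, base_delay=0.0, retry_on=(ValueError,))
print(result)  # ok
```

In practice you would wrap the API call, e.g. `with_backoff(lambda: client.chat.completions.create(...), retry_on=(RateLimitError,))`.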
Configuration
client = Apertis(
api_key="your-api-key", # Or use APERTIS_API_KEY env var
base_url="https://api.apertis.ai/v1", # Custom base URL
timeout=60.0, # Request timeout in seconds
max_retries=2, # Number of retries for failed requests
default_headers={"X-Custom": "value"}, # Additional headers
)
Context Manager
with Apertis() as client:
response = client.chat.completions.create(
model="gpt-5.4",
messages=[{"role": "user", "content": "Hello!"}]
)
# Client is automatically closed
# Async version
async with AsyncApertis() as client:
response = await client.chat.completions.create(...)
Available Models
Any model available on Apertis AI, including:
Chat Models
- OpenAI: gpt-5.4, gpt-5.1, gpt-5.3-codex
- Anthropic: claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4.5
- Google: gemini-3-pro-preview, gemini-3-flash-preview, gemini-2.5-flash-preview
- Other: glm-5.1, minimax-m2.7, and 500+ more models
Reasoning Models
glm-5.1 (with reasoning enabled)
Search Models
gpt-5-search-api
Embedding Models
text-embedding-3-small, text-embedding-3-large
Rerank Models
BAAI/bge-reranker-v2-m3, Qwen/Qwen3-Reranker-0.6B, Qwen/Qwen3-Reranker-4B, Qwen/Qwen3-Reranker-8B
Requirements
- Python 3.9+
- httpx
- pydantic
Changelog
v0.2.0
- Added Vision/Image support with create_with_image() convenience method
- Added Audio input/output support
- Added Video content support
- Added Web Search with create_with_web_search() convenience method
- Added Reasoning Mode support
- Added Extended Thinking (Gemini) support
- Added Models API (client.models.list(), client.models.retrieve())
- Added Responses API (client.responses.create())
- Added Messages API (client.messages.create()) for Anthropic format
- Added Rerank API (client.rerank.create())
- Added helper utilities for base64 encoding
v0.1.1
- Fixed API response compatibility issues
v0.1.0
- Initial release with Chat Completions, Streaming, Tool Calling, and Embeddings
License
Apache 2.0 - see LICENSE for details.