Relay AI SDK
Official Python SDK for the Relay AI Gateway. One key, every model.
pip install ai5labs-relay
With OpenTelemetry:
pip install "ai5labs-relay[otel]"
Quick start
from relay_ai import Relay

client = Relay(api_key="sk-relay-...")
response = client.chat("claude-sonnet-4.6", messages=[
    {"role": "user", "content": "Explain quantum computing in one sentence."}
])
print(response.text)
print(f"Tokens: {response.usage.total_tokens}")
Streaming
with client.chat("gemini-3.5-flash", messages=[
    {"role": "user", "content": "Write a haiku about code."}
], stream=True) as stream:
    for chunk in stream:
        print(chunk.text, end="", flush=True)
    final = stream.get_final_response()
    print(f"\nTokens: {final.usage.total_tokens}")
Async
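import asyncio
from relay_ai import AsyncRelay

async def main():
    # With no api_key argument, the key is read from the RELAY_API_KEY env var
    async with AsyncRelay() as client:
        response = await client.chat("claude-opus-4.8", messages=[
            {"role": "user", "content": "Hello!"}
        ])
        print(response.text)

asyncio.run(main())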
Tool calling
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
response = client.chat("claude-sonnet-4.6", messages=[
    {"role": "user", "content": "What's the weather in Tokyo?"}
], tools=tools)
for tc in response.tool_calls:
    print(f"{tc.function_name}({tc.function_arguments})")
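Once the model returns tool calls, each one usually has to be routed to a local function. The following is a minimal sketch of that dispatch step, not part of the SDK. The `ToolCall` dataclass is a stand-in that mirrors only the two attributes used above, and `HANDLERS`, `run_tool_call`, and the `get_weather` implementation are hypothetical names. Because the README does not say whether `function_arguments` is a JSON string or a parsed dict, the sketch accepts both.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class ToolCall:
    """Stand-in for the SDK's tool-call object (assumed shape)."""
    function_name: str
    function_arguments: Any  # JSON string or already-parsed dict (assumption)


def get_weather(city: str) -> str:
    # Hypothetical local implementation of the tool declared in `tools`.
    return f"Sunny in {city}"


# Map tool names (as declared to the model) to local callables.
HANDLERS: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def run_tool_call(tc: Any) -> Any:
    """Look up the handler for a tool call and invoke it with its arguments."""
    args = tc.function_arguments
    if isinstance(args, str):
        # Parse JSON-encoded arguments; treat an empty string as "no arguments".
        args = json.loads(args) if args else {}
    handler = HANDLERS.get(tc.function_name)
    if handler is None:
        raise KeyError(f"No handler registered for {tc.function_name!r}")
    return handler(**args)


print(run_tool_call(ToolCall("get_weather", '{"city": "Tokyo"}')))
```

With the real SDK, the same function could be applied to each item in `response.tool_calls`, provided those objects expose the attributes shown above.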
Image generation
result = client.images("flux-schnell", prompt="A cat astronaut on Mars")
print(result.images[0])
Audio
# Transcription
transcript = client.transcribe("whisper-1", "meeting.mp3")
print(transcript.text)

# Text-to-speech
audio = client.speech("tts-1", "Hello from Relay!")
with open("output.mp3", "wb") as f:
    f.write(audio.audio)
Semantic routing
decision = client.route(
    messages=[{"role": "user", "content": "Prove the Riemann hypothesis"}],
    candidates=["claude-opus-4.8", "claude-sonnet-4.6", "gemini-3.5-flash"],
)
print(f"Best model: {decision.alias} ({decision.confidence:.0%})")
print(f"Reasoning: {decision.reasoning}")
Batch processing
results = client.batch("claude-sonnet-4.6", [
    {"messages": [{"role": "user", "content": "What is 2+2?"}]},
    {"messages": [{"role": "user", "content": "What is 3+3?"}]},
    {"messages": [{"role": "user", "content": "What is 4+4?"}]},
], max_concurrent=5)
for r in results:
    if r.response:
        print(f"[{r.index}] {r.response.text}")
    else:
        print(f"[{r.index}] Error: {r.error}")
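For larger batches it can help to separate successes from failures before acting on them, for example to resubmit only the failed items. The sketch below assumes nothing beyond the `index`, `response`, and `error` attributes used in the loop above. `split_batch_results` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for real batch results.

```python
from types import SimpleNamespace


def split_batch_results(results):
    """Partition batch results into {index: text} and {index: error} dicts."""
    texts, errors = {}, {}
    for r in results:
        if r.response:
            texts[r.index] = r.response.text
        else:
            errors[r.index] = r.error
    return texts, errors


# Stand-in results shaped like the objects iterated over above.
demo = [
    SimpleNamespace(index=0, response=SimpleNamespace(text="4"), error=None),
    SimpleNamespace(index=1, response=None, error="timeout"),
    SimpleNamespace(index=2, response=SimpleNamespace(text="8"), error=None),
]

texts, errors = split_batch_results(demo)
print(f"{len(texts)} succeeded, {len(errors)} failed: {sorted(errors)}")
```

Keying both dicts by `index` keeps each outcome tied to its position in the original request list, so failed items can be looked up and retried.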
Credits
state = client.credits()
print(f"Balance: ${state.balance_cents / 100:.2f}")
Error handling
from relay_ai import (
    RelayError,
    AuthenticationError,
    RateLimitError,
    InsufficientCreditsError,
    ModelNotFoundError,
)

try:
    response = client.chat("gpt-5", messages=[...])
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except InsufficientCreditsError:
    print("Top up your credits at relay.ai5labs.com")
except ModelNotFoundError:
    print("Model not found")
except RelayError as e:
    print(f"Error: {e.message}")
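If a `RateLimitError` still surfaces after the SDK's own automatic retries, an application can wait for the advertised `retry_after` interval and try again. The sketch below shows one way to do that. The `RateLimitError` class defined here is a stand-in so the example runs without the SDK installed, and `call_with_retry`, `default_wait`, and the injectable `sleep` parameter are hypothetical names, not SDK features.

```python
import time


class RateLimitError(Exception):
    """Stand-in for relay_ai.RateLimitError (only retry_after is modeled)."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after


def call_with_retry(fn, max_attempts=3, default_wait=1.0, sleep=time.sleep):
    """Call fn(), waiting retry_after seconds between rate-limited attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RateLimitError as e:
            if attempt == max_attempts:
                raise  # Out of attempts: let the caller handle it.
            # Fall back to default_wait if the error carries no hint.
            wait = e.retry_after if e.retry_after is not None else default_wait
            sleep(wait)
```

In real use, `fn` could be a small lambda wrapping `client.chat(...)`, with the SDK's `RateLimitError` imported in place of the stand-in.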
CLI
export RELAY_API_KEY=sk-relay-...
relay models # List models
relay chat claude-sonnet-4.6 "Hello!" # Quick chat
relay chat gemini-3.5-flash "Hi" --stream # Stream tokens
relay credits # Check balance
relay version # SDK version
Configuration
import httpx
from relay_ai import Relay

client = Relay(
    api_key="sk-relay-...",      # or set RELAY_API_KEY env var
    base_url="https://...",      # custom gateway URL
    timeout=120.0,               # request timeout (seconds)
    max_retries=2,               # automatic retries on 429/5xx
    send_telemetry=True,         # usage analytics (metadata only)
    http_client=httpx.Client(),  # custom httpx client
)
Telemetry
The SDK sends anonymous usage metadata (model, token counts, latency) to improve the service. No message content, prompts, responses, or tool arguments are ever transmitted. This restriction is enforced by a client-side allowlist and backed by server-side stripping.
Disable with:
client = Relay(send_telemetry=False)
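To illustrate what a client-side allowlist means in practice, here is a minimal sketch of the idea. It is not the SDK's actual implementation: the field names in `ALLOWED_FIELDS` and the `filter_telemetry` helper are hypothetical, chosen only to match the metadata categories named above (model, token counts, latency).

```python
# Hypothetical allowlist: only these keys may leave the process.
ALLOWED_FIELDS = frozenset({
    "model",
    "prompt_tokens",
    "completion_tokens",
    "total_tokens",
    "latency_ms",
})


def filter_telemetry(event: dict) -> dict:
    """Keep allowlisted keys only; everything else is dropped by default."""
    return {k: v for k, v in event.items() if k in ALLOWED_FIELDS}


raw_event = {
    "model": "claude-sonnet-4.6",
    "total_tokens": 42,
    "latency_ms": 310,
    "messages": [{"role": "user", "content": "secret"}],  # must never be sent
}
print(filter_telemetry(raw_event))
```

The point of an allowlist over a blocklist is that a newly added field is excluded until someone explicitly approves it.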
OpenTelemetry
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

from relay_ai import Relay
from relay_ai._otel import instrument, RelaySpanExporter

provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(
        RelaySpanExporter(api_key="sk-relay-...", base_url="https://api.relay.ai5labs.com/v1")
    )
)
# Assumption: instrument() uses the global tracer provider, so register this one.
trace.set_tracer_provider(provider)

client = instrument(Relay())
response = client.chat(...)  # Automatically creates OTel spans
License
MIT
Download files
File details
Details for the file ai5labs_relay-2.0.0.tar.gz.
File metadata
- Download URL: ai5labs_relay-2.0.0.tar.gz
- Upload date:
- Size: 18.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 19eb9b60d69223d192163313c7c94e31b69f9528e002a0f65ada39cb5cfdc439 |
| MD5 | ec432547fc478b4108491d6a16bc5f0e |
| BLAKE2b-256 | 40e1dcb9993906193c81b4eb059d37648a7657c4859a31a42dce532d35ff69e9 |
File details
Details for the file ai5labs_relay-2.0.0-py3-none-any.whl.
File metadata
- Download URL: ai5labs_relay-2.0.0-py3-none-any.whl
- Upload date:
- Size: 20.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aef040bea6e95f45a1f3e1bffd3de0762f3ff56cfb073e806c36314d0dafb37a |
| MD5 | a84ad9969759ea18f4f70cdd1e381db2 |
| BLAKE2b-256 | b6aab7e312190fd455bffa51fd3786a04a793415c6a9ab6aa4af00365d30fd37 |
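The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using only the standard library. `sha256_of` and `matches_published_hash` are hypothetical helper names, and the local file path passed in is assumed to be wherever the download was saved.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("ai5labs_relay-2.0.0.tar.gz", "<SHA256 from the table>")` should return `True` for an unmodified copy of the source distribution.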