Skip to main content

Official Python SDK for the Relay AI Gateway. One key, every model.

Project description

Relay AI SDK

Official Python SDK for the Relay AI Gateway. One key, every model.

pip install relay-ai-sdk

With OpenTelemetry:

pip install relay-ai-sdk[otel]

Quick start

from relay_ai import Relay

client = Relay(api_key="sk-relay-...")

response = client.chat("claude-sonnet-4.6", messages=[
    {"role": "user", "content": "Explain quantum computing in one sentence."}
])
print(response.text)
print(f"Tokens: {response.usage.total_tokens}")

Streaming

with client.chat("gemini-3.5-flash", messages=[
    {"role": "user", "content": "Write a haiku about code."}
], stream=True) as stream:
    for chunk in stream:
        print(chunk.text, end="", flush=True)

    final = stream.get_final_response()
    print(f"\nTokens: {final.usage.total_tokens}")

Async

from relay_ai import AsyncRelay

async with AsyncRelay() as client:
    response = await client.chat("claude-opus-4.8", messages=[
        {"role": "user", "content": "Hello!"}
    ])
    print(response.text)

Tool calling

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat("claude-sonnet-4.6", messages=[
    {"role": "user", "content": "What's the weather in Tokyo?"}
], tools=tools)

for tc in response.tool_calls:
    print(f"{tc.function_name}({tc.function_arguments})")

Image generation

result = client.images("flux-schnell", prompt="A cat astronaut on Mars")
print(result.images[0])

Audio

# Transcription
transcript = client.transcribe("whisper-1", "meeting.mp3")
print(transcript.text)

# Text-to-speech
audio = client.speech("tts-1", "Hello from Relay!")
with open("output.mp3", "wb") as f:
    f.write(audio.audio)

Semantic routing

decision = client.route(
    messages=[{"role": "user", "content": "Prove the Riemann hypothesis"}],
    candidates=["claude-opus-4.8", "claude-sonnet-4.6", "gemini-3.5-flash"],
)
print(f"Best model: {decision.alias} ({decision.confidence:.0%})")
print(f"Reasoning: {decision.reasoning}")

Batch processing

results = client.batch("claude-sonnet-4.6", [
    {"messages": [{"role": "user", "content": "What is 2+2?"}]},
    {"messages": [{"role": "user", "content": "What is 3+3?"}]},
    {"messages": [{"role": "user", "content": "What is 4+4?"}]},
], max_concurrent=5)

for r in results:
    if r.response:
        print(f"[{r.index}] {r.response.text}")
    else:
        print(f"[{r.index}] Error: {r.error}")

Credits

state = client.credits()
print(f"Balance: ${state.balance_cents / 100:.2f}")

Error handling

from relay_ai import (
    RelayError,
    AuthenticationError,
    RateLimitError,
    InsufficientCreditsError,
    ModelNotFoundError,
)

try:
    response = client.chat("gpt-5", messages=[...])
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
except InsufficientCreditsError:
    print("Top up your credits at relay.ai5labs.com")
except ModelNotFoundError:
    print("Model not found")
except RelayError as e:
    print(f"Error: {e.message}")

CLI

export RELAY_API_KEY=sk-relay-...

relay models                                # List models
relay chat claude-sonnet-4.6 "Hello!"       # Quick chat
relay chat gemini-3.5-flash "Hi" --stream   # Stream tokens
relay credits                               # Check balance
relay version                               # SDK version

Configuration

client = Relay(
    api_key="sk-relay-...",       # or set RELAY_API_KEY env var
    base_url="https://...",       # custom gateway URL
    timeout=120.0,                # request timeout (seconds)
    max_retries=2,                # automatic retries on 429/5xx
    send_telemetry=True,          # usage analytics (metadata only)
    http_client=httpx.Client(),   # custom httpx client
)

Telemetry

The SDK sends anonymous usage metadata (model, token counts, latency) to improve the service. No message content, prompts, responses, or tool arguments are ever transmitted. This is enforced by a client-side allowlist and verified by server-side stripping.

Disable with:

client = Relay(send_telemetry=False)

OpenTelemetry

from relay_ai import Relay
from relay_ai._otel import instrument, RelaySpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(
        RelaySpanExporter(api_key="sk-relay-...", base_url="https://api.relay.ai5labs.com/v1")
    )
)

client = instrument(Relay())
response = client.chat(...)  # Automatically creates OTel spans

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

relay_ai_sdk-2.0.3.tar.gz (17.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

relay_ai_sdk-2.0.3-py3-none-any.whl (21.3 kB view details)

Uploaded Python 3

File details

Details for the file relay_ai_sdk-2.0.3.tar.gz.

File metadata

  • Download URL: relay_ai_sdk-2.0.3.tar.gz
  • Upload date:
  • Size: 17.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.13

File hashes

Hashes for relay_ai_sdk-2.0.3.tar.gz
Algorithm Hash digest
SHA256 e6d3368591d1b91203e96de166ef4e709327754d258edfbd183f39cc9961ea39
MD5 01d2bb94c3c6a9ea57f52fbcb730e8eb
BLAKE2b-256 524f5862c3486f9d834f83bcdeac0ba14d8f6bfa9e602a8f3d374ea93076ff21

See more details on using hashes here.

File details

Details for the file relay_ai_sdk-2.0.3-py3-none-any.whl.

File metadata

  • Download URL: relay_ai_sdk-2.0.3-py3-none-any.whl
  • Upload date:
  • Size: 21.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.13

File hashes

Hashes for relay_ai_sdk-2.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 2b0b2861096c78e7c24794dbc0174eda3b9219d3dc359ba935157e17dc5c8264
MD5 14249de74eb571be35329ff4be597cdf
BLAKE2b-256 19b23a434ac45ad425240f395b6bc1a0791e1e308f7fcc5354aad3e1bcd03076

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page