
summoned-ai

Python SDK for the Summoned AI Gateway — OpenAI-compatible client with multi-provider routing, caching, guardrails, and more.

Install

pip install summoned-ai

Quick Start

from summoned_ai import Summoned

client = Summoned(api_key="sk-smnd-...", base_url="http://localhost:4000")

response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)

print(response["choices"][0]["message"]["content"])
print(response["summoned"])  # provider, cost, latency_ms, ...
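The summoned metadata block is a plain dict, so its fields can be pulled out for per-request logging. A minimal sketch; only provider, cost, and latency_ms are named above, so any other keys would be assumptions:

```python
def summarize_request(response):
    # Build a one-line log entry from the gateway metadata attached
    # to each response (provider, cost, latency_ms per the example above).
    meta = response.get("summoned", {})
    return (
        f"provider={meta.get('provider')} "
        f"cost={meta.get('cost')} "
        f"latency_ms={meta.get('latency_ms')}"
    )
```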

Streaming

for chunk in client.chat.completions.create(
    model="anthropic/claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True,
):
    print(chunk["choices"][0]["delta"].get("content", ""), end="")
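If you want the full text after the stream finishes, the per-chunk deltas can be joined into one string. This helper assumes only the chunk shape used in the loop above:

```python
def collect_stream(chunks):
    # Concatenate the content deltas from an OpenAI-style stream,
    # skipping chunks (e.g. the final one) that carry no content.
    parts = []
    for chunk in chunks:
        parts.append(chunk["choices"][0]["delta"].get("content", ""))
    return "".join(parts)
```

Pass the iterator returned by create(..., stream=True) directly, e.g. full_text = collect_stream(stream).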

Config — Retries, Fallbacks, Caching, Guardrails

response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    config={
        "retry": {"attempts": 3, "backoff": "exponential"},
        "fallback": ["anthropic/claude-sonnet-4-20250514", "groq/llama-3.3-70b-versatile"],  # tried in order
        "timeout": 30000,  # milliseconds
        "cache": True,
        "guardrails": {
            "input": [{"type": "pii", "deny": True}],
            "output": [{"type": "contains", "params": {"operator": "none", "words": ["confidential"]}, "deny": True}],
        },
    },
)

Prompt Management — Versioned Templates

Store prompt templates server-side and reference them by slug. Variables interpolate at request time, every version is audit-logged, and you can roll back without touching app code.

# Create (needs admin_key)
admin = Summoned(api_key="sk-smnd-...", admin_key="your-admin-key")

admin.admin.prompts.create(
    slug="customer-support",
    tenant_id="default",
    template=[
        {"role": "system", "content": "You are a {{tone}} support agent."},
        {"role": "user",   "content": "{{user_question}}"},
    ],
    variables={"tone": "friendly"},
    default_model="openai/gpt-4o",
)
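Conceptually, interpolation swaps each {{name}} placeholder for the matching variable. The gateway does this server-side; the sketch below mirrors the described behavior, not the actual implementation, and how missing variables are handled is an assumption:

```python
import re

def render_template(messages, variables):
    # Replace {{name}} placeholders in each message's content.
    # Unknown placeholders are left untouched (an assumption).
    def fill(text):
        return re.sub(
            r"\{\{(\w+)\}\}",
            lambda m: str(variables.get(m.group(1), m.group(0))),
            text,
        )
    return [{**msg, "content": fill(msg["content"])} for msg in messages]
```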

# Use — prompt_id + prompt_variables are first-class kwargs
response = client.chat.completions.create(
    prompt_id="customer-support",                                 # or "customer-support@3"
    prompt_variables={"user_question": "Where's my order?"},
)

# Manage
admin.admin.prompts.list(tenant_id="default")
admin.admin.prompts.get("customer-support", tenant_id="default")  # or admin.admin.prompts.get("prm_abc123")
admin.admin.prompts.versions("customer-support", tenant_id="default")
admin.admin.prompts.delete("prm_abc123")

When a prompt has a default_model, the model= argument to chat.completions.create is optional; a caller-supplied model always takes precedence.

with_config — Reusable Client Configuration

cached_client = client.with_config({"cache": True, "cacheTtl": 3600})

# All requests through cached_client use caching
cached_client.chat.completions.create(model="openai/gpt-4o", messages=[...])

Use with OpenAI's SDK

from openai import OpenAI
from summoned_ai import create_headers

client = OpenAI(
    base_url="http://localhost:4000/v1",
    api_key="sk-smnd-...",
    default_headers=create_headers(config={"cache": True}),
)

# Routes through the Summoned gateway with all features
res = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)

Async Client

from summoned_ai import AsyncSummoned

async with AsyncSummoned(api_key="sk-smnd-...") as client:
    response = await client.chat.completions.create(
        model="openai/gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
    )

Embeddings

result = client.embeddings.create(
    model="openai/text-embedding-3-small",
    input="The quick brown fox",
)
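Assuming the response follows the OpenAI embeddings shape (each vector at result["data"][i]["embedding"]), a common next step is comparing two vectors by cosine similarity:

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of magnitudes;
    # both vectors must have the same length and be non-zero.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))
```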

Admin API

client = Summoned(api_key="sk-smnd-...", admin_key="your-admin-key")

# API keys
key = client.admin.keys.create(name="production", tenant_id="tenant_1")
keys = client.admin.keys.list("tenant_1")
client.admin.keys.revoke("key_abc")

# Virtual keys (encrypted provider credentials)
vk = client.admin.virtual_keys.create(
    name="my-openai-key",
    tenant_id="tenant_1",
    provider_id="openai",
    api_key="sk-real-openai-key-...",
)

# Logs & stats
logs = client.admin.logs.list(limit=50)
stats = client.admin.stats.get("24h")
providers = client.admin.providers.list()

Debug Mode

client = Summoned(api_key="sk-smnd-...", debug=True)
# Logs request/response details via Python's logging module

Response Headers

client.chat.completions.create(model="openai/gpt-4o", messages=[...])

print(client.last_response_headers)
# ResponseHeaders(provider='openai', cache='MISS', latency_ms='432', ...)

Error Handling

from summoned_ai import SummonedError

try:
    client.chat.completions.create(model="openai/gpt-4o", messages=[...])
except SummonedError as e:
    print(e.status_code)  # 429
    print(e.code)         # "RATE_LIMITED"
    print(e.headers)      # response headers
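The gateway already supports server-side retries via config, but e.status_code is enough to build client-side backoff when you want your own control. A sketch assuming only the SummonedError attributes shown above; the helper name and defaults are illustrative:

```python
import time

def call_with_backoff(call, attempts=3, base_delay=1.0, retryable=(429, 503)):
    # Invoke `call` and retry with exponential backoff when the raised
    # error exposes a retryable HTTP status_code; re-raise anything else.
    for attempt in range(attempts):
        try:
            return call()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            if status not in retryable or attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
```

Usage: result = call_with_backoff(lambda: client.chat.completions.create(model="openai/gpt-4o", messages=[...])).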
