Drop-in replacements for OpenAI, Anthropic, Google, xAI, and Meta SDKs with built-in AI governance. One import change adds permit-first policy enforcement, budget controls, audit trails, and usage reporting to every AI call.

keel-sdk

Python SDK for the Keel AI governance API.

Keel lets you issue permits before AI calls, enforce policies, track usage, and audit decisions — across any provider.

Install

pip install keel-sdk

Quick Start — One-Line Migration

Add Keel governance to your existing AI code with a single import change. No other code modifications needed.

OpenAI:

# BEFORE:
from openai import OpenAI

# AFTER:
from keel_sdk.providers.openai import OpenAI

client = OpenAI()  # Uses KEEL_BASE_URL, KEEL_API_KEY, KEEL_PROJECT_ID env vars
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=100,
)
print(response["choices"][0]["message"]["content"])

Anthropic:

# BEFORE:
from anthropic import Anthropic

# AFTER:
from keel_sdk.providers.anthropic import Anthropic

client = Anthropic()  # Uses KEEL_BASE_URL, KEEL_API_KEY, KEEL_PROJECT_ID env vars
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=1024,
)
print(response["content"][0]["text"])

Google (Gemini):

# BEFORE:
from google.generativeai import GenerativeModel

# AFTER:
from keel_sdk.providers.google import GenerativeModel

model = GenerativeModel("gemini-pro")
response = model.generate_content("Hello!")
print(response["candidates"][0]["content"]["parts"][0]["text"])

xAI (Grok):

# BEFORE:
from openai import OpenAI; client = OpenAI(base_url="https://api.x.ai/v1", api_key="xai-...")

# AFTER:
from keel_sdk.providers.xai import Grok

client = Grok()
response = client.chat.completions.create(
    model="grok-2",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response["choices"][0]["message"]["content"])

Meta (Llama):

# BEFORE:
from openai import OpenAI; client = OpenAI(base_url="https://api.llama-api.com", ...)

# AFTER:
from keel_sdk.providers.meta import Llama

client = Llama()
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response["choices"][0]["message"]["content"])

Every call transparently requests a permit (policy + budget check), executes via Keel's proxy, reports actual token usage, and records audit evidence.

Provider Wrappers

The provider wrappers in keel_sdk.providers give you drop-in replacements for the official OpenAI, Anthropic, Google (Gemini), xAI (Grok), and Meta (Llama) SDKs. They do not depend on or import the provider packages. Instead, they talk directly to Keel's proxy endpoints.

Configuration

Set environment variables or pass explicitly:

from keel_sdk.providers.openai import OpenAI

# Via env vars (recommended):
#   KEEL_BASE_URL=https://api.keelapi.com
#   KEEL_API_KEY=keel_sk_...
#   KEEL_PROJECT_ID=proj_...
client = OpenAI()

# Or pass explicitly:
client = OpenAI(
    keel_base_url="https://api.keelapi.com",
    keel_api_key="keel_sk_...",
    keel_project_id="proj_...",
    keel_subject={"type": "user", "id": "usr_42"},  # optional, defaults to service/default
)

The api_key parameter (for OpenAI compatibility) is accepted but ignored — Keel manages provider keys.

Governance Flow

On every create() call, the wrapper:

  1. Permit — POST /v1/permits with provider, model, and estimated tokens
  2. Deny check — if the permit decision is not "allow", raises KeelError(403, "permit_denied", reason)
  3. Proxy — POST /v1/proxy/openai (or /anthropic, /google, /xai, /meta) with the provider-shaped request body
  4. Usage report — POST /v1/permits/{permit_id}/usage with actual tokens from the response
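
The four steps above can be sketched manually with the low-level KeelClient surface documented later in this page (permits.create, proxy.<provider>, permits.report_usage). This is an illustrative approximation, not the wrapper's actual implementation; the project id, subject, and the permit's "reason" field are assumptions, and the wrapper raises KeelError rather than RuntimeError:

```python
import uuid


def governed_call(client, provider, model, messages,
                  est_input_tokens=200, est_output_tokens=500):
    """Sketch of the permit -> deny check -> proxy -> usage-report flow."""
    # 1. Permit: policy + budget check before any provider call
    permit = client.permits.create({
        "project_id": "proj_123",  # assumption: your project id
        "idempotency_key": str(uuid.uuid4()),
        "subject": {"type": "service", "id": "default"},
        "action": {"name": "ai.generate"},
        "resource": {
            "type": "request",
            "id": str(uuid.uuid4()),
            "attributes": {
                "provider": provider,
                "model": model,
                "estimated_input_tokens": est_input_tokens,
                "estimated_output_tokens": est_output_tokens,
            },
        },
    })
    # 2. Deny check (the wrapper raises KeelError here)
    if permit["decision"] != "allow":
        raise RuntimeError(f"permit_denied: {permit.get('reason')}")
    # 3. Proxy the provider-shaped request body through Keel
    response = getattr(client.proxy, provider)({"model": model, "messages": messages})
    # 4. Report actual token usage back against the permit
    usage = response.get("usage", {})
    client.permits.report_usage(permit["id"], {
        "input_tokens": usage.get("prompt_tokens", 0),
        "output_tokens": usage.get("completion_tokens", 0),
    })
    return response
```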

Async Support

from keel_sdk.providers.openai import AsyncOpenAI
from keel_sdk.providers.anthropic import AsyncAnthropic
from keel_sdk.providers.google import AsyncGenerativeModel
from keel_sdk.providers.xai import AsyncGrok
from keel_sdk.providers.meta import AsyncLlama

client = AsyncOpenAI()
response = await client.chat.completions.create(model="gpt-4o", messages=[...])

client = AsyncAnthropic()
response = await client.messages.create(model="claude-sonnet-4-20250514", messages=[...], max_tokens=1024)

model = AsyncGenerativeModel("gemini-pro")
response = await model.generate_content_async("Hello!")

client = AsyncGrok()
response = await client.chat.completions.create(model="grok-2", messages=[...])

client = AsyncLlama()
response = await client.chat.completions.create(model="llama-3.3-70b", messages=[...])

Streaming

Pass stream=True to get an iterator of chunks:

for chunk in client.chat.completions.create(model="gpt-4o", messages=[...], stream=True):
    print(chunk)

Error Handling

from keel_sdk import KeelError
from keel_sdk.providers.openai import OpenAI

client = OpenAI()
try:
    response = client.chat.completions.create(model="gpt-4o", messages=[...])
except KeelError as e:
    if e.code == "permit_denied":
        print(f"Blocked by policy: {e.message}")
    else:
        print(f"API error {e.status}: {e.message}")

Setup

from keel_sdk import KeelClient

client = KeelClient(
    base_url="https://api.keelapi.com",
    api_key="keel_sk_...",
)

Permits

Request a permit before making an AI call:

import uuid
from keel_sdk import KeelClient

client = KeelClient(base_url="https://api.keelapi.com", api_key="keel_sk_...")

permit = client.permits.create({
    "project_id": "proj_123",
    "idempotency_key": str(uuid.uuid4()),
    "subject": {"type": "user", "id": "usr_123"},
    "action": {"name": "ai.generate"},
    "resource": {
        "type": "request",
        "id": "req_123",
        "attributes": {
            "provider": "openai",
            "model": "gpt-4o-mini",
            "estimated_input_tokens": 200,
            "estimated_output_tokens": 500,
        },
    },
})

if permit["decision"] == "allow":
    # proceed with AI call
    pass

Async variant:

permit = await client.permits.create_async({...})

Dry run

result = client.permits.dry_run(permit_request)

List and get

permits = client.permits.list(project_id="proj_123", limit=50)
permit = client.permits.get("permit_id")

Report usage

client.permits.report_usage("permit_id", {
    "input_tokens": 180,
    "output_tokens": 420,
})

Attestation, evidence, lineage

client.permits.attest("permit_id", {"outcome": "success"})
client.permits.add_evidence("permit_id", {"label": "response_hash", "value": "abc123"})
evidence = client.permits.list_evidence("permit_id")
lineage = client.permits.lineage("permit_id")
bundle = client.permits.bundle("permit_id")

Executions

Run a model synchronously:

result = client.executions.create({
    "provider": "openai",
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Summarize this document."}],
    "permit_id": permit["id"],
})

Stream tokens as they arrive:

for event in client.executions.stream({
    "provider": "openai",
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Write a poem."}],
    "permit_id": permit["id"],
}):
    if event["event_type"] == "content_delta":
        print(event["data"], end="", flush=True)
    if event["event_type"] == "done":
        print()

Async streaming:

async for event in client.executions.stream_async({...}):
    ...

Execute (unified)

result = client.execute.run({
    "model": "gpt-4o-mini",
    "input": "Translate to Spanish: Hello world",
    "provider": "openai",
})

Proxy

Pass requests through to providers with Keel governance applied:

response = client.proxy.openai({
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello"}],
})

# Also: client.proxy.anthropic(), .google(), .xai(), .meta()

Jobs

Submit async jobs and poll for results:

job = client.jobs.create({
    "provider": "openai",
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Analyze this dataset."}],
})

status = client.jobs.get(job["job_id"])
# status["status"]: "pending" | "running" | "completed" | "failed"
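
A simple polling helper for the status values above; illustrative only (the poll cadence, timeout, and helper name are not part of the SDK):

```python
import time


def wait_for_job(get_status, job_id, poll_interval=2.0, timeout=300.0,
                 sleep=time.sleep):
    """Poll get_status(job_id) until the job reaches a terminal state."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_id)
        # "completed" and "failed" are the terminal states
        if status["status"] in ("completed", "failed"):
            return status
        if time.monotonic() + poll_interval > deadline:
            raise TimeoutError(f"job {job_id} still {status['status']} after {timeout}s")
        sleep(poll_interval)
```

Usage: result = wait_for_job(client.jobs.get, job["job_id"]).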

API Keys

key = client.api_keys.create()
keys = client.api_keys.list()
single = client.api_keys.get(key["id"])
client.api_keys.revoke(key["id"])

Request Timeline

timeline = client.requests.timeline("request_id")

Error Handling

from keel_sdk import KeelClient, KeelError

try:
    client.permits.create(request)
except KeelError as e:
    print(e.status)        # HTTP status code
    print(e.code)          # e.g. "permit_denied"
    print(e.message)       # human-readable message
    print(e.field)         # field that caused the error, if any
    print(e.is_retryable)  # True for 408, 429, 500, 502, 503, 504
    print(e.retry_after)   # seconds from Retry-After header, or None

Automatic Retries

The SDK can automatically retry failed requests that return transient HTTP errors (408, 429, 500, 502, 503, 504) using exponential backoff with jitter.

from keel_sdk import KeelClient, RetryConfig

# Use default retry settings (3 retries, 0.5s initial delay, 2x backoff)
client = KeelClient(
    base_url="https://api.keelapi.com",
    api_key="keel_sk_...",
    retry_config=RetryConfig(),
)

# Customize retry behavior
client = KeelClient(
    base_url="https://api.keelapi.com",
    api_key="keel_sk_...",
    retry_config=RetryConfig(
        max_retries=5,
        initial_delay=1.0,
        max_delay=60.0,
        backoff_multiplier=3.0,
    ),
)

Retries are disabled by default. To keep them off, omit retry_config or pass None:

client = KeelClient(
    base_url="https://api.keelapi.com",
    api_key="keel_sk_...",
    retry_config=None,  # no automatic retries
)

When a response includes a Retry-After header, the SDK respects that value instead of the computed backoff delay. Streaming calls are never retried.
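
The delay schedule can be pictured with a small standalone function. This is an illustrative model of the documented behavior (exponential growth from initial_delay, capped at max_delay, jitter applied, Retry-After taking precedence); the SDK's exact jitter scheme is internal and may differ:

```python
import random


def retry_delay(attempt, initial_delay=0.5, multiplier=2.0, max_delay=60.0,
                retry_after=None):
    """Illustrative backoff delay for the given 0-based retry attempt."""
    # A Retry-After header overrides the computed backoff entirely
    if retry_after is not None:
        return float(retry_after)
    # Exponential growth, capped at max_delay
    delay = min(initial_delay * (multiplier ** attempt), max_delay)
    # Full jitter keeps concurrent clients from retrying in lockstep
    return random.uniform(0, delay)
```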

Per-Request Timeout

Every method that makes an HTTP call accepts an optional timeout parameter that overrides the client-level timeout for that single request:

# Client-level timeout is 30 s (the default)
client = KeelClient(base_url="https://api.keelapi.com", api_key="keel_sk_...", timeout=30.0)

# This specific call uses a 5 s timeout instead
permit = client.request("POST", "/v1/permits", json=body, timeout=5.0)

# _get and streaming helpers also accept timeout
timeline = client._get("/v1/requests/req_1/timeline", timeout=10.0)

Pass timeout=None (or omit it) to use the client-level default.

Context Manager

with KeelClient(base_url="...", api_key="...") as client:
    permit = client.permits.create({...})

# Async
async with KeelClient(base_url="...", api_key="...") as client:
    permit = await client.permits.create_async({...})

Freshness Headers

For replay-protected endpoints:

client = KeelClient(
    base_url="https://api.keelapi.com",
    api_key="keel_sk_...",
    request_freshness=True,
)

This adds X-Keel-Timestamp and X-Keel-Nonce to every request.
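
The shape of those headers looks roughly like this; the exact timestamp format and nonce encoding are assumptions here, not documented by the SDK:

```python
import time
import uuid


def freshness_headers():
    """Illustrative replay-protection headers: a unix-seconds timestamp
    plus a single-use random nonce (formats assumed)."""
    return {
        "X-Keel-Timestamp": str(int(time.time())),
        "X-Keel-Nonce": uuid.uuid4().hex,
    }
```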

License

MIT
