
Drop-in replacements for OpenAI, Anthropic, Google, xAI, and Meta SDKs with built-in AI governance. One import change adds permit-first policy enforcement, budget controls, audit trails, and usage reporting to every AI call.


keel-sdk

Python SDK for the Keel AI governance API.

Keel lets you issue permits before AI calls, enforce policies, track usage, and audit decisions — across any provider.

⚠️ Keel is currently in private beta. You'll need a Keel account and API key to use this SDK. Sign up for early access →

Install

pip install keel-sdk

Quick Start — One-Line Migration

Add Keel governance to your existing AI code with a single import change. No other code modifications needed.

OpenAI:

# BEFORE:
from openai import OpenAI

# AFTER:
from keel_sdk.providers.openai import OpenAI

client = OpenAI()  # Uses KEEL_BASE_URL, KEEL_API_KEY, KEEL_PROJECT_ID env vars
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=100,
)
print(response["choices"][0]["message"]["content"])

Anthropic:

# BEFORE:
from anthropic import Anthropic

# AFTER:
from keel_sdk.providers.anthropic import Anthropic

client = Anthropic()  # Uses KEEL_BASE_URL, KEEL_API_KEY, KEEL_PROJECT_ID env vars
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=1024,
)
print(response["content"][0]["text"])

Google (Gemini):

# BEFORE:
from google.generativeai import GenerativeModel

# AFTER:
from keel_sdk.providers.google import GenerativeModel

model = GenerativeModel("gemini-2.0-flash")
response = model.generate_content("Hello!")
print(response["candidates"][0]["content"]["parts"][0]["text"])

xAI (Grok):

# BEFORE:
from openai import OpenAI; client = OpenAI(base_url="https://api.x.ai/v1", api_key="xai-...")

# AFTER:
from keel_sdk.providers.xai import Grok

client = Grok()
response = client.chat.completions.create(
    model="grok-2",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response["choices"][0]["message"]["content"])

Meta (Llama):

# BEFORE:
from openai import OpenAI; client = OpenAI(base_url="https://api.llama-api.com", ...)

# AFTER:
from keel_sdk.providers.meta import Llama

client = Llama()
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response["choices"][0]["message"]["content"])

Every call transparently requests a permit (policy and budget check), executes via Keel's proxy, reports actual token usage, and records audit evidence.

Provider Wrappers

The provider wrappers in keel_sdk.providers give you drop-in replacements for the official OpenAI, Anthropic, Google (Gemini), xAI (Grok), and Meta (Llama) SDKs. They do not depend on or import the provider packages. Instead, they talk directly to Keel's proxy endpoints.

Configuration

Set environment variables or pass explicitly:

from keel_sdk.providers.openai import OpenAI

# Via env vars (recommended):
#   KEEL_BASE_URL=https://api.keelapi.com
#   KEEL_API_KEY=keel_sk_...
#   KEEL_PROJECT_ID=proj_...
client = OpenAI()

# Or pass explicitly:
client = OpenAI(
    keel_base_url="https://api.keelapi.com",
    keel_api_key="keel_sk_...",
    keel_project_id="proj_...",
    keel_subject={"type": "user", "id": "usr_42"},  # optional, defaults to service/default
)

The api_key parameter (for OpenAI compatibility) is accepted but ignored — Keel manages provider keys.

Governance Flow

On every create() call, the wrapper:

  1. Permit: POST /v1/permits with provider, model, and estimated tokens
  2. Deny check: if the permit decision is not "allow", raises KeelError(403, "permit_denied", reason)
  3. Proxy: POST /v1/proxy/openai (or /anthropic, /google, /xai, /meta) with the provider-shaped request body
  4. Usage report: POST /v1/permits/{permit_id}/usage with the actual token counts from the response
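The same flow can be sketched by hand with the lower-level KeelClient described later in this document. This is an illustration, not the wrapper's actual implementation; the "usage", "prompt_tokens", and "completion_tokens" fields on the proxy response are assumed to follow the OpenAI response shape.

```python
import uuid


def governed_chat(client, project_id, messages, model="gpt-4o-mini"):
    """Sketch of the permit -> proxy -> usage-report flow using only
    the documented KeelClient surface (permits, proxy)."""
    # 1. Request a permit with estimated token counts.
    permit = client.permits.create({
        "project_id": project_id,
        "idempotency_key": str(uuid.uuid4()),
        "subject": {"type": "service", "id": "default"},
        "action": {"name": "ai.generate"},
        "resource": {
            "type": "request",
            "id": str(uuid.uuid4()),
            "attributes": {
                "provider": "openai",
                "model": model,
                "estimated_input_tokens": 200,
                "estimated_output_tokens": 500,
            },
        },
    })
    # 2. Deny check.
    if permit["decision"] != "allow":
        raise PermissionError(f"permit denied: {permit}")
    # 3. Execute via the governed proxy.
    response = client.proxy.openai({"model": model, "messages": messages})
    # 4. Report actual usage back against the permit
    #    (assumes an OpenAI-shaped "usage" object on the response).
    usage = response.get("usage", {})
    client.permits.report_usage(permit["permit_id"], {
        "actual_input_tokens": usage.get("prompt_tokens", 0),
        "actual_output_tokens": usage.get("completion_tokens", 0),
    })
    return response
```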

Async Support

from keel_sdk.providers.openai import AsyncOpenAI
from keel_sdk.providers.anthropic import AsyncAnthropic
from keel_sdk.providers.google import AsyncGenerativeModel
from keel_sdk.providers.xai import AsyncGrok
from keel_sdk.providers.meta import AsyncLlama

client = AsyncOpenAI()
response = await client.chat.completions.create(model="gpt-4o", messages=[...])

client = AsyncAnthropic()
response = await client.messages.create(model="claude-sonnet-4-20250514", messages=[...], max_tokens=1024)

model = AsyncGenerativeModel("gemini-2.0-flash")
response = await model.generate_content_async("Hello!")

client = AsyncGrok()
response = await client.chat.completions.create(model="grok-2", messages=[...])

client = AsyncLlama()
response = await client.chat.completions.create(model="llama-3.3-70b", messages=[...])

Streaming

Pass stream=True to get an iterator of chunks:

for chunk in client.chat.completions.create(model="gpt-4o", messages=[...], stream=True):
    print(chunk)

Error Handling

from keel_sdk import KeelError
from keel_sdk.providers.openai import OpenAI

client = OpenAI()
try:
    response = client.chat.completions.create(model="gpt-4o", messages=[...])
except KeelError as e:
    if e.code == "permit_denied":
        print(f"Blocked by policy: {e.message}")
    else:
        print(f"API error {e.status}: {e.message}")

Setup

from keel_sdk import KeelClient

client = KeelClient(
    base_url="https://api.keelapi.com",
    api_key="keel_sk_...",
)

Permits

Request a permit before making an AI call:

import uuid
from keel_sdk import KeelClient

client = KeelClient(base_url="https://api.keelapi.com", api_key="keel_sk_...")

permit = client.permits.create({
    "project_id": "proj_123",
    "idempotency_key": str(uuid.uuid4()),
    "subject": {"type": "user", "id": "usr_123"},
    "action": {"name": "ai.generate"},
    "resource": {
        "type": "request",
        "id": "req_123",
        "attributes": {
            "provider": "openai",
            "model": "gpt-4o-mini",
            "estimated_input_tokens": 200,
            "estimated_output_tokens": 500,
        },
    },
})

if permit["decision"] == "allow":
    # proceed with AI call
    pass
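If you prefer deny decisions to fail loudly rather than fall through an if-branch, a small guard works. This is a hypothetical convenience helper, not part of the SDK (the provider wrappers already raise KeelError on deny), and the "reason" field on the permit is an assumption.

```python
def ensure_allowed(permit: dict) -> dict:
    """Raise if a permit decision is anything other than "allow";
    otherwise return the permit unchanged for chaining."""
    if permit.get("decision") != "allow":
        reason = permit.get("reason", "no reason given")
        raise PermissionError(f"permit denied: {reason}")
    return permit
```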

Async variant:

permit = await client.permits.create_async({...})

Dry run

result = client.permits.dry_run(permit_request)
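A dry run is useful as a pre-flight check before committing to a real permit. A minimal sketch, assuming the dry-run result carries the same "decision" field as a real permit:

```python
def would_allow(client, permit_request) -> bool:
    """Check a permit request against policy without issuing a permit,
    via the documented permits.dry_run() method."""
    return client.permits.dry_run(permit_request)["decision"] == "allow"
```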

List and get

permits = client.permits.list(project_id="proj_123", limit=50)
permit = client.permits.get("permit_id")

Report usage

client.permits.report_usage("permit_id", {
    "actual_input_tokens": 180,
    "actual_output_tokens": 420,
})

Attestation, evidence, lineage

client.permits.attest("permit_id", {"outcome": "success"})
client.permits.add_evidence("permit_id", {"label": "response_hash", "value": "abc123"})
evidence = client.permits.list_evidence("permit_id")
lineage = client.permits.lineage("permit_id")
bundle = client.permits.bundle("permit_id")

Executions

Run a model synchronously:

result = client.executions.create({
    "provider": "openai",
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Summarize this document."}],
    "permit_id": permit["permit_id"],
})

Stream tokens as they arrive:

for event in client.executions.stream({
    "provider": "openai",
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Write a poem."}],
    "permit_id": permit["permit_id"],
}):
    if event["event_type"] == "content_delta":
        print(event["data"], end="", flush=True)
    if event["event_type"] == "done":
        print()
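When you want the full text rather than printing deltas as they arrive, the event loop above reduces to a small accumulator over the documented event shape ("content_delta" and "done" event types, delta text in "data"):

```python
def collect_text(events) -> str:
    """Accumulate content_delta events from executions.stream()
    into one string, stopping at the "done" event."""
    parts = []
    for event in events:
        if event["event_type"] == "content_delta":
            parts.append(event["data"])
        elif event["event_type"] == "done":
            break
    return "".join(parts)
```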

Async streaming:

async for event in client.executions.stream_async({...}):
    ...

Execute (unified)

result = client.execute.run({
    "model": "gpt-4o-mini",
    "input": "Translate to Spanish: Hello world",
    "provider": "openai",
})

Proxy

Pass requests through to providers with Keel governance applied:

response = client.proxy.openai({
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello"}],
})

# Also: client.proxy.anthropic(), .google(), .xai(), .meta()

Jobs

Submit async jobs and poll for results:

job = client.jobs.create({
    "provider": "openai",
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Analyze this dataset."}],
})

status = client.jobs.get(job["job_id"])
# status["status"]: "pending" | "running" | "completed" | "failed"
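A simple polling loop over jobs.get() might look like the following. The status values are documented above; the helper itself (and its interval/timeout parameters) is an illustration, not part of the SDK.

```python
import time


def wait_for_job(client, job_id, poll_interval=2.0, timeout=300.0):
    """Poll client.jobs.get() until the job reaches a terminal
    status ("completed" or "failed"), or raise on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = client.jobs.get(job_id)
        if status["status"] in ("completed", "failed"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} still running after {timeout}s")
```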

API Keys

key = client.api_keys.create()
keys = client.api_keys.list()
client.api_keys.revoke(key["id"])

Request Timeline

timeline = client.requests.timeline("request_id")

Error Handling

from keel_sdk import KeelClient, KeelError

try:
    client.permits.create(request)
except KeelError as e:
    print(e.status)        # HTTP status code
    print(e.code)          # e.g. "permit_denied"
    print(e.message)       # human-readable message
    print(e.field)         # field that caused the error, if any
    print(e.is_retryable)  # True for 408, 429, 500, 502, 503, 504
    print(e.retry_after)   # seconds from Retry-After header, or None

Automatic Retries

The SDK can automatically retry failed requests that return transient HTTP errors (408, 429, 500, 502, 503, 504) using exponential backoff with jitter.

from keel_sdk import KeelClient, RetryConfig

# Use default retry settings (3 retries, 0.5s initial delay, 2x backoff)
client = KeelClient(
    base_url="https://api.keelapi.com",
    api_key="keel_sk_...",
    retry_config=RetryConfig(),
)

# Customize retry behavior
client = KeelClient(
    base_url="https://api.keelapi.com",
    api_key="keel_sk_...",
    retry_config=RetryConfig(
        max_retries=5,
        initial_delay=1.0,
        max_delay=60.0,
        backoff_multiplier=3.0,
    ),
)

To disable retries (the default), omit retry_config or pass None:

client = KeelClient(
    base_url="https://api.keelapi.com",
    api_key="keel_sk_...",
    retry_config=None,  # no automatic retries
)

When a response includes a Retry-After header, the SDK respects that value instead of the computed backoff delay. Streaming calls are never retried.
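For intuition, the delay schedule can be sketched from the documented RetryConfig fields. The SDK's exact jitter scheme is internal, so this uses the common "full jitter" form and should be read as an approximation, not the implementation:

```python
import random


def backoff_delay(attempt, initial_delay=0.5, backoff_multiplier=2.0,
                  max_delay=60.0, retry_after=None):
    """Approximate retry delay: a Retry-After value wins outright;
    otherwise the exponential delay is capped at max_delay and
    full jitter draws uniformly from [0, delay]."""
    if retry_after is not None:
        return retry_after
    delay = min(initial_delay * (backoff_multiplier ** attempt), max_delay)
    return random.uniform(0, delay)
```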

Per-Request Timeout

Every method that makes an HTTP call accepts an optional timeout parameter that overrides the client-level timeout for that single request:

# Client-level timeout is 30 s (the default)
client = KeelClient(base_url="https://api.keelapi.com", api_key="keel_sk_...", timeout=30.0)

# This specific call uses a 5 s timeout instead
permit = client.request("POST", "/v1/permits", json=body, timeout=5.0)

# _get and streaming helpers also accept timeout
timeline = client._get("/v1/requests/req_1/timeline", timeout=10.0)

Pass timeout=None (or omit it) to use the client-level default.

Context Manager

with KeelClient(base_url="...", api_key="...") as client:
    permit = client.permits.create({...})

# Async
async with KeelClient(base_url="...", api_key="...") as client:
    permit = await client.permits.create_async({...})

Freshness Headers

For replay-protected endpoints:

client = KeelClient(
    base_url="https://api.keelapi.com",
    api_key="keel_sk_...",
    request_freshness=True,
)

This adds X-Keel-Timestamp and X-Keel-Nonce to every request.
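The SDK generates these values internally; to illustrate the kind of headers attached, here is a sketch in which the timestamp format (Unix seconds) and nonce scheme (random UUID hex) are assumptions:

```python
import time
import uuid


def freshness_headers() -> dict:
    """Illustrative freshness headers: a coarse request timestamp
    plus a per-request random nonce for replay protection."""
    return {
        "X-Keel-Timestamp": str(int(time.time())),
        "X-Keel-Nonce": uuid.uuid4().hex,
    }
```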

License

MIT
