Python client for the otari gateway

Maintainers

mzai njbrake

Otari Python Client SDK

Python 3.11+

Python client for otari, the open-source core that powers otari.ai. Communicate with any LLM provider through otari using a single, typed interface.

TypeScript SDK | Documentation | Platform (Beta)

New to otari? The otari repo explains what it is and why you’d use it.

Quickstart

pip install otari

Generate an API token at otari.ai/organization-settings/api-tokens, then add a provider key (e.g. OpenAI) at otari.ai/organization-settings/provider-keys so the gateway can route requests to that provider. Then use the client:

from otari import OtariClient

client = OtariClient(
    platform_token="tk_your_api_token",
)

response = client.completion(
    model="openai:gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)

With no api_base, the client defaults to the hosted gateway at https://api.otari.ai. Change the model string to switch between LLM providers through the gateway.
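
The model strings in these examples follow a provider:model pattern. As a minimal sketch of that convention, the snippet below splits such a string into its two parts. split_model is a hypothetical helper, not part of the SDK, and it assumes the provider is everything before the first colon.

```python
def split_model(model: str) -> tuple[str, str]:
    """Split 'provider:model' into (provider, model_name).

    Hypothetical helper for illustration only; assumes the provider is the
    text before the first colon.
    """
    provider, sep, name = model.partition(":")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider:model', got {model!r}")
    return provider, name


print(split_model("openai:gpt-4o-mini"))  # ('openai', 'gpt-4o-mini')
```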

Installation

Requirements

Python 3.11 or newer
A running otari instance (or the hosted gateway at otari.ai)

Install

pip install otari

Setting up credentials

For the hosted gateway, set your platform token (no api_base needed, it defaults to https://api.otari.ai):

export OTARI_AI_TOKEN="tk_your_api_token"

GATEWAY_PLATFORM_TOKEN is kept as a legacy alias for OTARI_AI_TOKEN; the canonical name takes precedence when both are set.

For a self-hosted gateway, set the base URL and an API key instead:

export GATEWAY_API_BASE="http://localhost:8000"
export GATEWAY_API_KEY="your-key-here"

Alternatively, pass credentials directly when creating the client (see Authentication).

Authentication

The client supports two authentication modes, matching the TypeScript SDK. When no explicit credentials are passed, the client auto-detects the mode from environment variables.

Platform mode (hosted)

Targets the hosted platform at otari.ai. The platform token is sent as a Bearer token in the standard Authorization header. Generate an API token at otari.ai/organization-settings/api-tokens and add a provider key (e.g. OpenAI) at otari.ai/organization-settings/provider-keys so the gateway can route requests to that provider. With no api_base, the client defaults to the hosted gateway at https://api.otari.ai:

from otari import OtariClient

client = OtariClient(
    platform_token="tk_your_api_token",
)

Set OTARI_AI_TOKEN (or the legacy alias GATEWAY_PLATFORM_TOKEN) and OtariClient() picks up the token automatically.

Self-hosted mode

Targets a gateway you run yourself. The API key is sent via the custom Otari-Key header, and an explicit api_base is required. Follow the setup in the otari repo, then point the SDK at your gateway:

from otari import OtariClient

client = OtariClient(
    api_base="http://localhost:8000",  # or wherever you host the gateway
    api_key="your-gateway-api-key",
)

Set GATEWAY_API_BASE and GATEWAY_API_KEY and OtariClient() picks them up automatically. Make sure your gateway has provider keys configured (e.g. OpenAI) so it can route requests upstream; see the otari repo for setup.

Environment variable quick reference

Variable	Mode	Purpose
`OTARI_AI_TOKEN`	Platform	Platform token, sent as `Authorization: Bearer …`.
`GATEWAY_PLATFORM_TOKEN`	Platform	Legacy alias for `OTARI_AI_TOKEN` (lower precedence).
`GATEWAY_API_BASE`	Self-hosted	Base URL of the gateway (required in self-hosted mode).
`GATEWAY_API_KEY`	Self-hosted	API key, sent via the `Otari-Key` header.
`GATEWAY_ADMIN_KEY`	Either	Admin/master key for the control-plane endpoints.

When no explicit credentials are provided, the client reads from these variables:

from otari import OtariClient

# Platform mode: OTARI_AI_TOKEN (or legacy GATEWAY_PLATFORM_TOKEN),
# defaulting to the hosted gateway.
# Self-hosted: GATEWAY_API_BASE + GATEWAY_API_KEY.
client = OtariClient()

Usage

Migrating from a previous version? OtariClient is now synchronous: call its methods directly (no await). For asynchronous code, switch to AsyncOtariClient, which keeps the previous await-based API. See Async usage.

Chat completions

response = client.completion(
    model="openai:gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)

Streaming

stream = client.completion(
    model="openai:gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True,
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
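
If you want the full text rather than incremental printing, the chunks can be accumulated. The helper below is a sketch, not part of the SDK: collect_text is a hypothetical name, and it only assumes that each chunk exposes choices[0].delta.content as in the loop above.

```python
from collections.abc import Iterable
from typing import Any


def collect_text(stream: Iterable[Any]) -> str:
    """Join the text deltas of a streamed completion into one string."""
    parts: list[str] = []
    for chunk in stream:
        if not chunk.choices:
            # Skipping chunks without choices is a defensive assumption.
            continue
        content = chunk.choices[0].delta.content
        if content:
            parts.append(content)
    return "".join(parts)
```

With the stream from the example above, collect_text(stream) would return the whole story as a single string.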

Responses API

response = client.response(
    model="openai:gpt-4o-mini",
    input="Summarize this in one sentence.",
)

print(response.output_text)

Messages API

The gateway's /messages endpoint (Anthropic message shape) is exposed via message(...). max_tokens is required. Set stream=True to iterate raw message-stream event dicts.

message = client.message(
    model="anthropic:claude-3-5-sonnet",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=256,
)

print(message.content)

Embeddings

result = client.embedding(
    model="openai:text-embedding-3-small",
    input="Hello world",
)

print(result.data[0].embedding)
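
Embedding vectors are typically compared with cosine similarity. The following is a minimal, dependency-free sketch that assumes each embedding is a plain sequence of floats, as printed above; cosine_similarity is an illustrative helper and not an SDK function.

```python
import math
from collections.abc import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```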

Listing models

models = client.list_models()
for model in models:
    print(model.id)

Moderation

result = client.moderation(
    model="openai:omni-moderation-latest",
    input="Some text to classify.",
)

print(result.results[0].flagged)

Reranking

result = client.rerank(
    model="cohere:rerank-v3.5",
    query="What is the capital of France?",
    documents=["Paris is the capital of France.", "Berlin is in Germany."],
)

for item in result.results:
    print(item.index, item.relevance_score)
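
To turn the scores back into an ordered list of documents, something like the sketch below may help. rank_documents is a hypothetical helper; it assumes item.index is the position of the document in the list that was sent, which is the usual convention for rerank APIs but is not spelled out here.

```python
from collections.abc import Iterable, Sequence
from typing import Any


def rank_documents(
    documents: Sequence[str],
    results: Iterable[Any],
) -> list[tuple[str, float]]:
    """Pair each document with its score, best match first."""
    ordered = sorted(results, key=lambda item: item.relevance_score, reverse=True)
    # item.index is assumed to point into the original documents list.
    return [(documents[item.index], item.relevance_score) for item in ordered]
```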

Image generation

result = client.image_generation(
    model="openai:dall-e-3",
    prompt="A watercolor fox in a misty forest",
)

print(result.data[0].url)

The gateway returns a typed OpenAI-compatible ImagesResponse.

Audio

Text to speech returns the raw audio bytes:

from pathlib import Path

audio = client.speech(
    model="openai:tts-1",
    input="Hello from otari.",
    voice="alloy",
)
# Write the returned bytes to disk.
Path("speech.mp3").write_bytes(audio)

Transcription uploads audio bytes and returns the parsed response:

from pathlib import Path

result = client.transcription(
    model="openai:whisper-1",
    file=Path("speech.mp3").read_bytes(),
)
print(result.json["text"])

Batch operations

Submit many requests as a single batch job, poll for status, then fetch results once the batch completes. Batch endpoints are scoped to a provider.

batch = client.create_batch(
    {
        "model": "openai:gpt-4o-mini",
        "requests": [
            {
                "custom_id": "req-1",
                "body": {
                    "model": "openai:gpt-4o-mini",
                    "messages": [{"role": "user", "content": "Hello!"}],
                },
            },
        ],
        "completion_window": "24h",
    }
)

# Poll for status.
status = client.retrieve_batch(batch.id, provider="openai")

# List batches for a provider.
batches = client.list_batches("openai", {"limit": 20})

# Fetch results once complete (raises BatchNotCompleteError on HTTP 409).
results = client.retrieve_batch_results(batch.id, provider="openai")
for item in results.results:
    print(item.custom_id, item.result)

# Cancel a running batch.
client.cancel_batch(batch.id, provider="openai")
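
The "poll, then fetch" step above can be wrapped in a small loop. The sketch below is deliberately generic so it runs without the SDK: poll_until_ready is a hypothetical helper, and the interval and attempt limit are arbitrary defaults rather than recommended values.

```python
import time
from collections.abc import Callable
from typing import TypeVar

T = TypeVar("T")


def poll_until_ready(
    fetch: Callable[[], T],
    not_ready: type[Exception],
    *,
    interval: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it stops raising not_ready, then return its result."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except not_ready:
            if attempt == max_attempts - 1:
                # Out of attempts: surface the last "not ready" error.
                raise
            sleep(interval)
    raise ValueError("max_attempts must be at least 1")
```

For batch results, fetch could be a lambda wrapping client.retrieve_batch_results(batch.id, provider="openai") with BatchNotCompleteError as not_ready, assuming that exception is importable from otari like the other error classes.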

Error handling

In platform mode, HTTP errors are mapped to typed exceptions:

from otari import OtariClient, AuthenticationError, RateLimitError

try:
    response = client.completion(
        model="openai:gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except AuthenticationError as e:
    print(f"Invalid credentials: {e.message}")
except RateLimitError as e:
    print(f"Rate limited, retry after: {e.retry_after}")

HTTP Status	Error Class	Description
400 (capability)	`UnsupportedCapabilityError`	Selected provider does not support the requested capability
401, 403	`AuthenticationError`	Invalid or missing credentials
402	`InsufficientFundsError`	Budget or credits exhausted
404	`ModelNotFoundError`	Model not found, or no provider key configured for the requested provider. The exception's `message` carries the gateway's detail.
409	`BatchNotCompleteError`	Batch results requested before the batch finished
429	`RateLimitError`	Rate limit exceeded (includes `retry_after`)
502	`UpstreamProviderError`	Upstream provider unreachable
504	`GatewayTimeoutError`	Gateway timed out waiting for provider

UnsupportedCapabilityError surfaces in both platform and non-platform modes; the other mappings are platform-mode only.
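
One way to act on retry_after is a small retry wrapper. This is a sketch under stated assumptions: call_with_retry is a hypothetical helper, retry_after is assumed to be a number of seconds (or None), and the exception class is passed in so the snippet stays independent of the SDK.

```python
import time
from collections.abc import Callable
from typing import TypeVar

T = TypeVar("T")


def call_with_retry(
    call: Callable[[], T],
    retryable: type[Exception],
    *,
    max_retries: int = 3,
    default_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on `retryable` up to max_retries times."""
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    for attempt in range(max_retries + 1):
        try:
            return call()
        except retryable as exc:
            if attempt == max_retries:
                raise
            # Assumption: retry_after, when present, is in seconds.
            delay = getattr(exc, "retry_after", None)
            sleep(float(delay) if delay is not None else default_delay)
    raise AssertionError("unreachable")
```

In practice, call would wrap client.completion(...) and retryable would be RateLimitError.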

Async usage

Every method on OtariClient has an asynchronous counterpart on AsyncOtariClient. It accepts the same constructor arguments and exposes the same methods, but they are coroutines you await (and streams are async iterables):

import asyncio

from otari import AsyncOtariClient


async def main() -> None:
    async with AsyncOtariClient(platform_token="tk_your_api_token") as client:
        response = await client.completion(
            model="openai:gpt-4o-mini",
            messages=[{"role": "user", "content": "Hello!"}],
        )
        print(response.choices[0].message.content)

        stream = await client.completion(
            model="openai:gpt-4o-mini",
            messages=[{"role": "user", "content": "Tell me a story."}],
            stream=True,
        )
        async for chunk in stream:
            content = chunk.choices[0].delta.content
            if content:
                print(content, end="", flush=True)


asyncio.run(main())
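
A common reason to reach for the async client is to issue several requests concurrently. The sketch below uses asyncio.gather for that; complete_all is a hypothetical helper, and it only assumes the client's completion coroutine behaves as in the example above.

```python
import asyncio
from typing import Any


async def complete_all(client: Any, model: str, prompts: list[str]) -> list[str]:
    """Send one completion per prompt concurrently; results keep prompt order."""

    async def one(prompt: str) -> str:
        response = await client.completion(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

    # gather() returns results in the order the awaitables were passed in.
    return list(await asyncio.gather(*(one(p) for p in prompts)))
```

Inside an async with AsyncOtariClient(...) block, this would be called as await complete_all(client, "openai:gpt-4o-mini", ["Hello!", "Tell me a story."]).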

Context manager

The client supports a context manager for automatic cleanup:

with OtariClient(api_base="http://localhost:8000") as client:
    response = client.completion(
        model="openai:gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )

AsyncOtariClient supports the async equivalent:

async with AsyncOtariClient(api_base="http://localhost:8000") as client:
    response = await client.completion(
        model="openai:gpt-4o-mini",
        messages=[{"role": "user", "content": "Hello!"}],
    )

Development

# Create a virtual environment
python -m venv .venv
source .venv/bin/activate

# Install with dev dependencies
pip install -e ".[dev]"

# Run unit tests
pytest tests/

# Lint
ruff check src/ tests/

# Type-check
mypy src/

Documentation

Full Documentation - Complete guides and API reference
Supported Providers - List of all supported LLM providers
Gateway Documentation - Gateway setup and deployment
TypeScript SDK - The TypeScript SDK for Node.js applications
otari Platform (Beta) - Hosted control plane for key management, usage tracking, and cost visibility

Contributing

We welcome contributions from developers of all skill levels! Please see the Contributing Guide or open an issue to discuss changes.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Release history

Version	Released
0.2.0 (this version)	Jun 16, 2026
0.1.1	Jun 12, 2026
0.1.0	Jun 10, 2026
0.0.3	Jun 2, 2026
0.0.2	Jun 2, 2026
0.0.1	Apr 28, 2026

Download files

Source Distribution

otari-0.2.0.tar.gz (130.7 kB), uploaded Jun 16, 2026, Source

Built Distribution

otari-0.2.0-py3-none-any.whl (365.6 kB), uploaded Jun 16, 2026, Python 3

File details

Details for the file otari-0.2.0.tar.gz.

File metadata

Download URL: otari-0.2.0.tar.gz
Upload date: Jun 16, 2026
Size: 130.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for otari-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`0e434232aeca64dc95c8658698363ab1facdd1304c189041bf3e1fc3b8d7c6c5`
MD5	`09782a4138e186909523e862dc04990a`
BLAKE2b-256	`04e4a4425f159b7c56b529c2a733e2aa59ef0de4f3e982b367236f7edfccbca1`
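
A downloaded archive can be checked against the SHA256 digest listed above using only the standard library. This is a minimal sketch: sha256_of is an illustrative helper, and the local file path is an assumption about where the download was saved.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


EXPECTED = "0e434232aeca64dc95c8658698363ab1facdd1304c189041bf3e1fc3b8d7c6c5"
archive = Path("otari-0.2.0.tar.gz")  # assumed local download path
if archive.exists():
    print("match" if sha256_of(archive) == EXPECTED else "MISMATCH")
```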

Provenance

The following attestation bundles were made for otari-0.2.0.tar.gz:

Publisher: release-please.yml on mozilla-ai/otari-sdk-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: otari-0.2.0.tar.gz
- Subject digest: 0e434232aeca64dc95c8658698363ab1facdd1304c189041bf3e1fc3b8d7c6c5
- Sigstore transparency entry: 1841171293
- Sigstore integration time: Jun 16, 2026
Source repository:
- Permalink: mozilla-ai/otari-sdk-python@05c045f0196b0a7f680ca8fc85e545b6c96e8675
- Branch / Tag: refs/heads/main
- Owner: https://github.com/mozilla-ai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-please.yml@05c045f0196b0a7f680ca8fc85e545b6c96e8675
- Trigger Event: push

File details

Details for the file otari-0.2.0-py3-none-any.whl.

File metadata

Download URL: otari-0.2.0-py3-none-any.whl
Upload date: Jun 16, 2026
Size: 365.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for otari-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`794e41207e4ef901ed81709a73aa500c9549372c99ae4a27aed7e45cc0c5dead`
MD5	`a3483d405929a52afa8265ea790800ce`
BLAKE2b-256	`1372499032f271375593906e263fbbe2605e27372bb6f81dbc7c11607040d522`

Provenance

The following attestation bundles were made for otari-0.2.0-py3-none-any.whl:

Publisher: release-please.yml on mozilla-ai/otari-sdk-python

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: otari-0.2.0-py3-none-any.whl
- Subject digest: 794e41207e4ef901ed81709a73aa500c9549372c99ae4a27aed7e45cc0c5dead
- Sigstore transparency entry: 1841171309
- Sigstore integration time: Jun 16, 2026
Source repository:
- Permalink: mozilla-ai/otari-sdk-python@05c045f0196b0a7f680ca8fc85e545b6c96e8675
- Branch / Tag: refs/heads/main
- Owner: https://github.com/mozilla-ai
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-please.yml@05c045f0196b0a7f680ca8fc85e545b6c96e8675
- Trigger Event: push
