
octoryn-llm

Octoryn LLM Python SDK — OpenAI-compatible client for Octoryn Gateway.

octoryn-llm ships Octoryn and AsyncOctoryn clients that talk to the Octoryn Gateway (default: https://api.octopusos.dev/v1). The chat / images / audio / embeddings / moderations surface is wire-compatible with the OpenAI API, plus first-class extensions for audit, usage, and BYOK.

Install

pip install octoryn-llm

Python 3.10+.

Quickstart

from octoryn import Octoryn

client = Octoryn(api_key="oct_live_...")  # or set OCTORYN_API_KEY
out = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello, Octoryn"}],
)
print(out.choices[0].message.content)

Authentication

API key (recommended)

Get a key in the web console at https://app.octopusos.dev, then set OCTORYN_API_KEY in your environment:

import os
from octoryn import Octoryn

client = Octoryn(api_key=os.environ["OCTORYN_API_KEY"])
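
If OCTORYN_API_KEY is set, the key argument can be omitted entirely (per the quickstart comment):

client = Octoryn()  # picks up OCTORYN_API_KEY from the environment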

OIDC (refresh-token)

For human / SSO flows, the SDK ships an OAuth2 refresh-token credential. It auto-refreshes the short-lived access token (single-flight; thread- and asyncio-safe) and lets you persist the refreshed token set via on_refresh:

import json, pathlib
from octoryn import Octoryn, OctorynOidcCredential

TOKEN_PATH = pathlib.Path.home() / ".octoryn" / "tokens.json"

def persist(token_set: dict) -> None:
    TOKEN_PATH.parent.mkdir(parents=True, exist_ok=True)
    TOKEN_PATH.write_text(json.dumps(token_set))

cached = json.loads(TOKEN_PATH.read_text()) if TOKEN_PATH.exists() else {}

cred = OctorynOidcCredential(
    identity_url="https://identity.octopusos.dev",
    client_id="cli",
    refresh_token=cached.get("refresh_token", "rt_..."),
    access_token=cached.get("access_token"),
    expires_at=cached.get("expires_at"),
    on_refresh=persist,
)
client = Octoryn(credential=cred)

Pass either api_key= or credential=, never both.

Use the official openai SDK (drop-in)

The Octoryn Gateway is wire-compatible with the OpenAI API for chat, images, audio, embeddings, and moderations, so you can point the official openai SDK at the Octoryn base URL and existing code keeps working:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.octopusos.dev/v1",
    api_key="oct_live_...",
)
client.chat.completions.create(model="gpt-4o-mini", messages=[...])

If you'd rather import from octoryn while keeping the OpenAI surface, use the bundled alias:

from octoryn.openai_compat import OpenAI  # subclass of Octoryn
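
Since openai_compat.OpenAI subclasses Octoryn, the quickstart call works unchanged:

client = OpenAI(api_key="oct_live_...")
client.chat.completions.create(model="gpt-4o-mini", messages=[...])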

Modality cookbook

Chat

out = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize the OSI model in one sentence."}],
)
print(out.choices[0].message.content)

# Stream
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Count 1..5"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)

Images

res = client.images.generate(
    model="dall-e-3",
    prompt="a small octopus drawing a vector logo, flat illustration",
    size="1024x1024",
)
print(res.data[0].url)
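
The response carries a URL by default; persisting the image needs only the standard library (a minimal sketch):

import urllib.request

urllib.request.urlretrieve(res.data[0].url, "logo.png")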

Speech (TTS)

speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="Octoryn is online.",
    response_format="mp3",
)
with open("out.mp3", "wb") as f:
    f.write(speech.content)

Transcription (STT)

tx = client.audio.transcriptions.create(
    file="meeting.mp3",      # path, bytes, or file-like
    model="whisper-1",
    language="en",
)
print(tx.text)

Video

job = client.videos.generate_and_wait(
    model="veo-3",
    prompt="a slow-motion octopus opening a jar underwater",
    duration_seconds=5,
    resolution="720p",
    aspect_ratio="16:9",
)
print(job.status, job.video_url)

Realtime

session = client.realtime.sessions.create(
    provider="openai",
    model="gpt-4o-realtime-preview",
    voice="verse",
)
print(session.ws_url, session.audio_sample_rate)

The SDK does not bundle a WebRTC/WebSocket client. Connect to ws_url with your transport of choice (e.g. websockets, aiortc) and exchange events following the underlying provider's realtime protocol. A minimal sketch with the third-party websockets package:

import asyncio, json
import websockets

async def relay() -> None:
    async with websockets.connect(session.ws_url) as ws:
        await ws.send(json.dumps({"type": "input_audio_buffer.append", "audio": "..."}))
        async for msg in ws:
            print(msg)

asyncio.run(relay())

Embeddings

emb = client.embeddings.create(
    model="text-embedding-3-small",
    input=["hello", "world"],
)
print(len(emb.data), len(emb.data[0].embedding))
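
Each embedding is a plain sequence of floats (len() works on it above), so similarity math needs no extra dependencies; for example, cosine similarity between the two inputs:

import math

a, b = emb.data[0].embedding, emb.data[1].embedding
dot = sum(x * y for x, y in zip(a, b))
cos = dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))
print(f"cosine(hello, world) = {cos:.3f}")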

Moderations

mod = client.moderations.create(input="I love clean code.")
print(mod.results[0].flagged)

Models

print(client.models.list())
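
Assuming the response follows the OpenAI list shape (a data array of model objects; not spelled out above), iterating model ids would look like:

for m in client.models.list().data:
    print(m.id)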

Async

import asyncio
from octoryn import AsyncOctoryn

async def main() -> None:
    async with AsyncOctoryn(api_key="oct_live_...") as client:
        out = await client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "ping"}],
        )
        print(out.choices[0].message.content)

asyncio.run(main())
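
Async pays off for fan-out. A sketch that fires several prompts concurrently with asyncio.gather, using only the surface shown above:

import asyncio
from octoryn import AsyncOctoryn

async def fan_out(prompts: list[str]) -> list[str]:
    async with AsyncOctoryn(api_key="oct_live_...") as client:
        outs = await asyncio.gather(*(
            client.chat.completions.create(
                model="gpt-4o-mini",
                messages=[{"role": "user", "content": p}],
            )
            for p in prompts
        ))
    return [o.choices[0].message.content for o in outs]

print(asyncio.run(fan_out(["ping", "pong"])))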

Errors & retries

Class                      When
OctorynAuthError           401 / 403 — bad, missing, or unauthorized key
OctorynRateLimitedError    429 rate_limited — exposes retry_after_s
OctorynQuotaExceededError  429 quota_exceeded — exposes retry_at
OctorynKsiBlockedError     422 ksi_blocked — exposes reasons, channel
OctorynUpstreamError       503 upstream_* — exposes upstream
OctorynAPIStatusError      any other non-2xx HTTP response
OctorynStreamInterrupted   SSE stream terminated mid-flight

import time
from octoryn import OctorynRateLimitedError

try:
    client.chat.completions.create(model="gpt-4o-mini", messages=[...])
except OctorynRateLimitedError as e:
    time.sleep(e.retry_after_s or 1.0)

The client retries 5xx and transport errors up to max_retries=3 with exponential backoff (0.5 s base factor). Override per client with Octoryn(api_key=..., max_retries=5).
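
To also retry 429s at the application level (the built-in policy covers 5xx and transport errors only), a minimal wrapper, assuming the surface above:

import time
from octoryn import Octoryn, OctorynRateLimitedError

client = Octoryn(api_key="oct_live_...", max_retries=5)

def chat_with_retry(messages: list[dict], attempts: int = 3):
    for i in range(attempts):
        try:
            return client.chat.completions.create(model="gpt-4o-mini", messages=messages)
        except OctorynRateLimitedError as e:
            if i == attempts - 1:
                raise
            time.sleep(e.retry_after_s or 0.5 * 2 ** i)  # fall back to the SDK's backoff factor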

BYOK

Register an upstream credential the gateway should use on your behalf:

created = client.byok.create(
    upstream="openai",       # openai | anthropic | openrouter | ...
    api_key="sk-...",        # never re-displayed by the gateway
    label="prod",
)
print(created.id)

Audit & Usage

runs = client.audit.list(limit=20)
detail = client.audit.get(run_id=runs.items[0].run_id)
verified = client.audit.verify(run_id=detail.run_id)
summary = client.usage.get()
records = client.usage.records(model="gpt-4o-mini", limit=50)

Versioning & support

Octoryn follows semver from 1.0.0 onward; pre-1.0 minor bumps may break. See CHANGELOG.md (added in Phase 7) and https://octopusos.dev.
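
Because pre-1.0 minor bumps may break, pin to a known-good range until 1.0:

pip install "octoryn-llm>=0.1,<0.2"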

License

Proprietary © Octopus Core Pty Ltd (ACN 696 931 236). Octoryn™ is a trademark of Octopus Core Pty Ltd.
