# octoryn-llm

Official Python SDK for Octoryn LLM (Octopus Core Pty Ltd) — an OpenAI-compatible client for the Octoryn Gateway.
octoryn-llm ships `Octoryn` / `AsyncOctoryn` clients that talk to the
Octoryn Gateway (default: `https://api.octopusos.dev/v1`). The chat / images /
audio / embeddings / moderations surface is byte-for-byte OpenAI-compatible,
plus first-class extensions for audit, usage, and BYOK (bring your own key).
## Install

```shell
pip install octoryn-llm
```

Requires Python 3.10+.
## Quickstart

```python
from octoryn import Octoryn

client = Octoryn(api_key="oct_live_...")  # or set OCTORYN_API_KEY

out = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello, Octoryn"}],
)
print(out.choices[0].message.content)
```
## Authentication

### API key (recommended)

Get a key in the web console at https://app.octopusos.dev, then set
`OCTORYN_API_KEY` in your environment:

```python
import os

from octoryn import Octoryn

client = Octoryn(api_key=os.environ["OCTORYN_API_KEY"])
```
### OIDC (refresh token)

For human / SSO flows the SDK ships an OAuth2 refresh-token credential. It
auto-refreshes the short-lived access token (single-flight, thread- and
asyncio-safe) and lets you persist the new token set with `on_refresh`:

```python
import json
import pathlib

from octoryn import Octoryn, OctorynOidcCredential

TOKEN_PATH = pathlib.Path.home() / ".octoryn" / "tokens.json"

def persist(token_set: dict) -> None:
    TOKEN_PATH.parent.mkdir(parents=True, exist_ok=True)
    TOKEN_PATH.write_text(json.dumps(token_set))

cached = json.loads(TOKEN_PATH.read_text()) if TOKEN_PATH.exists() else {}

cred = OctorynOidcCredential(
    identity_url="https://identity.octopusos.dev",
    client_id="cli",
    refresh_token=cached.get("refresh_token", "rt_..."),
    access_token=cached.get("access_token"),
    expires_at=cached.get("expires_at"),
    on_refresh=persist,
)
client = Octoryn(credential=cred)
```

Pass either `api_key=` or `credential=`, never both.
## Use the official openai SDK (drop-in)

The Octoryn Gateway is bytewise OpenAI-compatible for chat, images, audio,
embeddings, and moderations. Point the official `openai` SDK at the Octoryn
base URL and existing code keeps working:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.octopusos.dev/v1",
    api_key="oct_live_...",
)
client.chat.completions.create(model="gpt-4o-mini", messages=[...])
```

If you'd rather import from `octoryn` while keeping the OpenAI surface, use
the bundled alias:

```python
from octoryn.openai_compat import OpenAI  # subclass of Octoryn
```
## Modality cookbook

### Chat

```python
out = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize the OSI model in one sentence."}],
)
print(out.choices[0].message.content)

# Stream
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Count 1..5"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
```
### Images

```python
res = client.images.generate(
    model="dall-e-3",
    prompt="a small octopus drawing a vector logo, flat illustration",
    size="1024x1024",
)
print(res.data[0].url)
```
### Speech (TTS)

```python
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="Octoryn is online.",
    response_format="mp3",
)
with open("out.mp3", "wb") as f:
    f.write(speech.content)
```
### Transcription (STT)

```python
tx = client.audio.transcriptions.create(
    file="meeting.mp3",  # path, bytes, or file-like
    model="whisper-1",
    language="en",
)
print(tx.text)
```
### Video

```python
job = client.videos.generate_and_wait(
    model="veo-3",
    prompt="a slow-motion octopus opening a jar underwater",
    duration_seconds=5,
    resolution="720p",
    aspect_ratio="16:9",
)
print(job.status, job.video_url)
```
### Realtime

```python
session = client.realtime.sessions.create(
    provider="openai",
    model="gpt-4o-realtime-preview",
    voice="verse",
)
print(session.ws_url, session.audio_sample_rate)
```

The SDK does not bundle a WebRTC/WebSocket client. Connect to `ws_url` with
your transport of choice (e.g. `websockets`, `aiortc`) and exchange events
following the underlying provider's realtime protocol:

```python
# import websockets, asyncio, json
# async with websockets.connect(session.ws_url) as ws:
#     await ws.send(json.dumps({"type": "input_audio_buffer.append", "audio": "..."}))
#     async for msg in ws:
#         print(msg)
```
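Realtime audio events such as `input_audio_buffer.append` carry audio as base64 text. A small helper for building such an event from raw PCM bytes (the event shape follows OpenAI's realtime protocol; the helper name is ours, not part of this SDK):

```python
import base64
import json

def audio_append_event(pcm_bytes: bytes) -> str:
    """Serialize raw PCM audio into an input_audio_buffer.append event."""
    return json.dumps({
        "type": "input_audio_buffer.append",
        "audio": base64.b64encode(pcm_bytes).decode("ascii"),
    })

event = audio_append_event(b"\x00\x01\x02\x03")
print(event)  # {"type": "input_audio_buffer.append", "audio": "AAECAw=="}
```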
### Embeddings

```python
emb = client.embeddings.create(
    model="text-embedding-3-small",
    input=["hello", "world"],
)
print(len(emb.data), len(emb.data[0].embedding))
```
### Moderations

```python
mod = client.moderations.create(input="I love clean code.")
print(mod.results[0].flagged)
```
### Models

```python
print(client.models.list())
```
## Async

```python
import asyncio

from octoryn import AsyncOctoryn

async def main() -> None:
    async with AsyncOctoryn(api_key="oct_live_...") as client:
        out = await client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": "ping"}],
        )
        print(out.choices[0].message.content)

asyncio.run(main())
```
## Errors & retries

| Class | When |
|---|---|
| `OctorynAuthError` | 401 / 403 — bad / missing / unauthorized key |
| `OctorynRateLimitedError` | 429 `rate_limited` — exposes `retry_after_s` |
| `OctorynQuotaExceededError` | 429 `quota_exceeded` — exposes `retry_at` |
| `OctorynKsiBlockedError` | 422 `ksi_blocked` — exposes `reasons`, `channel` |
| `OctorynUpstreamError` | 503 `upstream_*` — exposes `upstream` |
| `OctorynAPIStatusError` | Any other non-2xx HTTP status |
| `OctorynStreamInterrupted` | SSE stream terminated mid-flight |

```python
import time

from octoryn import OctorynRateLimitedError

try:
    client.chat.completions.create(model="gpt-4o-mini", messages=[...])
except OctorynRateLimitedError as e:
    time.sleep(e.retry_after_s or 1.0)
```

The client retries 5xx and transport errors up to `max_retries=3` with
exponential backoff (factor 0.5 s). Override per-client with
`Octoryn(api_key=..., max_retries=5)`.
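Read literally, a 0.5 s factor with doubling gives sleeps of 0.5 s, 1 s, and 2 s across the three default retries; whether the SDK also applies jitter or a cap is not stated above. A sketch of that schedule (the doubling rule is our assumption):

```python
def backoff_schedule(max_retries: int = 3, factor: float = 0.5) -> list[float]:
    """Exponential backoff delays: factor * 2**attempt for each retry."""
    return [factor * (2 ** attempt) for attempt in range(max_retries)]

print(backoff_schedule())   # [0.5, 1.0, 2.0]
print(backoff_schedule(5))  # [0.5, 1.0, 2.0, 4.0, 8.0]
```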
## BYOK

Register an upstream credential the gateway should use on your behalf:

```python
created = client.byok.create(
    upstream="openai",  # openai | anthropic | openrouter | ...
    api_key="sk-...",   # never re-displayed by the gateway
    label="prod",
)
print(created.id)
```
## Audit & Usage

```python
runs = client.audit.list(limit=20)
detail = client.audit.get(run_id=runs.items[0].run_id)
verified = client.audit.verify(run_id=detail.run_id)

summary = client.usage.get()
records = client.usage.records(model="gpt-4o-mini", limit=50)
```
## Versioning & support

Octoryn follows semver from 1.0.0; pre-1.0 minor bumps may break.
See CHANGELOG.md (added in Phase 7) and https://octopusos.dev.
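Since pre-1.0 minor releases may break, pinning the minor version in a `requirements.txt` is a reasonable precaution (the range below assumes the current 0.1.x series):

```
octoryn-llm>=0.1,<0.2
```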
## License

Proprietary © Octopus Core Pty Ltd (ACN 696 931 236). Octoryn™ is a trademark of Octopus Core Pty Ltd.
## File details

Details for the file octoryn_llm-0.1.0.tar.gz.

### File metadata

- Download URL: octoryn_llm-0.1.0.tar.gz
- Upload date:
- Size: 65.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `3950cfcb5420c43cfd14f9c171a85b40ac470787fa48ab4b6ada80d23e8649e0` |
| MD5 | `5dc37ecf490ce1042836ded856dc5d3f` |
| BLAKE2b-256 | `87eb5f042400bce7de42b91319cd03f76b419cd2901a3f8dfe1edd13155a9d46` |
## File details

Details for the file octoryn_llm-0.1.0-py3-none-any.whl.

### File metadata

- Download URL: octoryn_llm-0.1.0-py3-none-any.whl
- Upload date:
- Size: 31.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.9

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `b2efa18b85a421f278c5313580413e43d932f8b5afeed634f58bc3c951d42ffa` |
| MD5 | `2db0ecb76f348ec262fe186d77bc6740` |
| BLAKE2b-256 | `ce9fc10e8ec84427384d87c9e02f449d2022b223dc3e98051e101ddabe9d6fa5` |