Unofficial Python SDK for the ChatGPT Codex backend API

codex-backend-sdk

Unofficial Python SDK for the ChatGPT Codex backend API (chatgpt.com/backend-api/codex).

This is a lower-level alternative to the official Codex CLI/SDK. It gives direct access to the underlying HTTP API endpoints on which the CLI relies, so you can build your own agent loop from scratch without inheriting OpenAI's design choices.

Requirements: a ChatGPT Plus, Pro, or Enterprise subscription. Neither an OpenAI API key nor a Codex CLI installation is needed; authentication goes through ChatGPT OAuth directly from Python.

Disclaimer: This is an independent, community-maintained library that reverse-engineers undocumented endpoints of chatgpt.com. It is not affiliated with, endorsed by, or supported by OpenAI. Usage remains subject to OpenAI's Terms of Use. Endpoints may change or break without notice.

Installation

git clone https://github.com/B4PT0R/codex-backend-sdk.git
cd codex-backend-sdk
pip install -e .

For the agent example (examples/agent.py), also install:

pip install pyyaml

Authentication

from codex_backend_sdk import CodexClient

client = CodexClient().authenticate()

authenticate() handles everything automatically:

Tokens present and fresh → used directly, no network call
Tokens stale → silently refreshed in the background
No valid tokens available → opens your browser for the OAuth flow (blocking, first run only)

Tokens are saved to ~/.codex/auth.json (created if it doesn't exist). If the official Codex CLI is also installed, both share the same file.

All other methods (stream(), respond(), list_models(), …) raise immediately if authenticate() was not called — they never trigger the OAuth flow implicitly.
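To predict whether authenticate() is likely to open a browser, you can look for the token file first. This is a minimal sketch that assumes only the path stated above. `has_saved_tokens` is a hypothetical helper, not part of the SDK, and it cannot tell whether the saved tokens are still fresh.

```python
from pathlib import Path

# Path documented above; the file's internal layout is not assumed here.
AUTH_FILE = Path.home() / ".codex" / "auth.json"

def has_saved_tokens(path: Path = AUTH_FILE) -> bool:
    """Return True if a non-empty token file exists at `path`."""
    return path.is_file() and path.stat().st_size > 0

if not has_saved_tokens():
    print("No saved tokens found: authenticate() will probably open a browser.")
```

A check like this can be useful in headless environments, where a blocking browser flow would hang.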

Basic usage

from codex_backend_sdk import CodexClient, TextDelta, ResponseCompleted

client = CodexClient().authenticate()

for event in client.stream("Explain quicksort in one paragraph"):
    if isinstance(event, TextDelta):
        print(event.text, end="", flush=True)
    elif isinstance(event, ResponseCompleted):
        print(f"\n[tokens: in={event.input_tokens} out={event.output_tokens}]")

Or collect the full response at once:

text, completion = client.respond("Explain quicksort in one paragraph")
print(text)
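If you want the full text but still need to see each event, the same result can be assembled by hand from the stream. The sketch below uses stand-in classes so it runs without the SDK; the field names (`text`, `input_tokens`, `output_tokens`) come from the example above, and `collect` is a hypothetical helper, not necessarily how respond() is implemented.

```python
from dataclasses import dataclass

# Stand-ins so the sketch runs without the SDK; the real classes are
# imported from codex_backend_sdk and may carry more fields.
@dataclass
class TextDelta:
    text: str

@dataclass
class ResponseCompleted:
    input_tokens: int
    output_tokens: int

def collect(events):
    """Join text chunks and return (text, completion_event_or_None)."""
    parts = []
    completion = None
    for event in events:
        if isinstance(event, TextDelta):
            parts.append(event.text)
        elif isinstance(event, ResponseCompleted):
            completion = event
    return "".join(parts), completion

fake_stream = [TextDelta("Quick"), TextDelta("sort"), ResponseCompleted(12, 2)]
text, completion = collect(fake_stream)
print(text)  # Quicksort
```

With the real SDK you would pass `client.stream(...)` in place of `fake_stream`.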

Aborting a stream

CodexClient.abort() stops the currently active streaming response. The stream iterator then raises ResponseAborted; any output emitted before the abort remains available to the caller.

from codex_backend_sdk import CodexClient, ResponseAborted, TextDelta

client = CodexClient().authenticate()
events = client.stream("Write a long essay")

try:
    for event in events:
        if isinstance(event, TextDelta):
            print(event.text, end="", flush=True)
            if should_stop():  # your own stop condition; not provided by the SDK
                client.abort()
except ResponseAborted:
    pass

Calling abort() when no stream is active is a no-op.
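The stop condition in the example above is left to the caller. One possible choice is a time limit; the sketch below builds such a predicate. `make_deadline` is a hypothetical helper and not part of the SDK.

```python
import time

def make_deadline(seconds: float):
    """Return a zero-argument predicate that becomes True after `seconds`."""
    deadline = time.monotonic() + seconds

    def should_stop() -> bool:
        return time.monotonic() >= deadline

    return should_stop

should_stop = make_deadline(10.0)  # abort after roughly ten seconds
```

time.monotonic() is used rather than time.time() so that system clock adjustments do not affect the limit.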

Models

models = client.list_models()
for m in models:
    print(m.slug, m.display_name, m.context_window)

info = client.get_model("codex-mini-latest")
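The listing can be used to choose a model programmatically. As an illustration, the sketch below picks the entry with the largest context window. It uses made-up entries in place of real list_models() results and assumes `context_window` is always an integer.

```python
from types import SimpleNamespace

def largest_context(models):
    """Return the model entry with the biggest context window."""
    return max(models, key=lambda m: m.context_window)

# Made-up entries standing in for client.list_models() results.
models = [
    SimpleNamespace(slug="model-a", context_window=128_000),
    SimpleNamespace(slug="model-b", context_window=272_000),
]
print(largest_context(models).slug)  # model-b
```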

Multi-turn conversation

Pass prior turns as conversation_history. Each turn is a raw dict in Responses API format.

history = []

def chat(user_input: str) -> str:
    text, _ = client.respond(user_input, conversation_history=history)
    history.append({
        "type": "message", "role": "user",
        "content": [{"type": "input_text", "text": user_input}],
    })
    history.append({
        "type": "message", "role": "assistant",
        "content": [{"type": "output_text", "text": text}],
    })
    return text

print(chat("My name is Alice."))
print(chat("What's my name?"))  # model remembers
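The two dict shapes above are easy to mistype, so it can help to build them in one place. A small sketch using exactly the item format shown above; the helper names are hypothetical.

```python
def user_item(text: str) -> dict:
    """History item for a user message."""
    return {
        "type": "message", "role": "user",
        "content": [{"type": "input_text", "text": text}],
    }

def assistant_item(text: str) -> dict:
    """History item for an assistant reply."""
    return {
        "type": "message", "role": "assistant",
        "content": [{"type": "output_text", "text": text}],
    }

def record_turn(history: list, user_input: str, reply: str) -> None:
    """Append one complete user/assistant exchange to `history`."""
    history.append(user_item(user_input))
    history.append(assistant_item(reply))

log = []
record_turn(log, "My name is Alice.", "Nice to meet you, Alice.")
print(len(log))  # 2
```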

Tool use

Tool definitions follow the same format as the official OpenAI SDK.

import json
from codex_backend_sdk import CodexClient, TextDelta, ToolCall, ResponseCompleted

client = CodexClient().authenticate()

tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get the current temperature for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    }
]

def get_weather(city: str) -> dict:
    return {"city": city, "temperature": 22, "unit": "celsius"}


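# Record the user turn first so that Turn 2 still sees the original question
history = [{
    "type": "message", "role": "user",
    "content": [{"type": "input_text", "text": "What's the weather in Paris?"}],
}]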

# Turn 1 — model may emit a ToolCall
for event in client.stream("What's the weather in Paris?", tools=tools):
    if isinstance(event, TextDelta):
        print(event.text, end="", flush=True)
    elif isinstance(event, ToolCall):
        result = get_weather(**event.parsed_arguments())
        history.append(event.as_history_item())           # the function_call item
        history.append(event.to_tool_result(json.dumps(result)))  # function_call_output

# Turn 2 — model sees the tool result and replies
for event in client.stream(None, conversation_history=history, tools=tools):
    if isinstance(event, TextDelta):
        print(event.text, end="", flush=True)
    elif isinstance(event, ResponseCompleted):
        print()

tool_choice defaults to "auto". Other values: "none", "required", or {"type": "function", "name": "..."}.
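With more than one tool, a name-to-function registry keeps the streaming loop short. The sketch below is one way to do it. `TOOLS`, `tool`, and `dispatch` are hypothetical names, and reporting errors back to the model as JSON is a design choice rather than something the SDK requires.

```python
import json

TOOLS = {}

def tool(fn):
    """Register `fn` under its own name so the model can call it."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_weather(city: str) -> dict:
    return {"city": city, "temperature": 22, "unit": "celsius"}

def dispatch(name: str, arguments: dict) -> str:
    """Run the named tool and return its result as a JSON string."""
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        return json.dumps(TOOLS[name](**arguments))
    except TypeError as exc:  # wrong or missing arguments from the model
        return json.dumps({"error": str(exc)})

print(dispatch("get_weather", {"city": "Paris"}))
```

Inside the loop this could replace the direct call, for example `event.to_tool_result(dispatch(event.name, event.parsed_arguments()))`. The `event.name` attribute is an assumption and should be checked against the SDK's ToolCall class.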

Image input

from codex_backend_sdk import image_url, image_b64

# From a URL
for event in client.stream(
    ["What's in this image?", image_url("https://example.com/photo.jpg")]
):
    ...

# From a local file (base64)
import base64
with open("photo.jpg", "rb") as f:
    data = base64.b64encode(f.read()).decode()

for event in client.stream(
    ["Describe this image.", image_b64(data, "image/jpeg")]
):
    ...
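The base64 example above hard-codes the media type. A sketch that derives it from the file extension using the standard library; `encode_image` is a hypothetical helper whose two return values are meant to be passed to image_b64() in the order shown above.

```python
import base64
import mimetypes
from pathlib import Path

def encode_image(path):
    """Return (base64_string, mime_type) for a local image file."""
    mime, _ = mimetypes.guess_type(str(path))
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognised image type: {path}")
    data = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return data, mime
```

Usage would then look like `data, mime = encode_image("photo.jpg")` followed by `image_b64(data, mime)`.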

Reasoning

# Enable chain-of-thought (reasoning tokens are billed separately)
for event in client.stream(
    "Solve: if 3x + 7 = 22, what is x?",
    reasoning="medium",
    reasoning_summary="concise",   # "concise" | "detailed" | "auto"
):
    ...

reasoning values: "minimal", "low", "medium", "high", "xhigh".

Structured output (JSON Schema)

import json

schema = {
    "title": "person",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age":  {"type": "integer"},
    },
    "required": ["name", "age"],
    "additionalProperties": False,
}

text, _ = client.respond(
    "Extract: Alice is 30 years old.",
    output_schema=schema,
)
person = json.loads(text)
print(person)  # e.g. {'name': 'Alice', 'age': 30}
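Even with a schema, it is reasonable to verify the parsed result before using it. The sketch below checks the `person` schema above by hand; `check_person` is a hypothetical helper, and a full JSON Schema validator would be the more general option.

```python
import json

def check_person(text: str) -> dict:
    """Parse `text` and verify the fields required by the schema above."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != {"name", "age"}:
        raise ValueError(f"unexpected keys: {sorted(data)}")
    if not isinstance(data["name"], str):
        raise ValueError("name must be a string")
    # bool is a subclass of int in Python, so exclude it explicitly
    if not isinstance(data["age"], int) or isinstance(data["age"], bool):
        raise ValueError("age must be an integer")
    return data

print(check_person('{"name": "Alice", "age": 30}'))
```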

Context compaction

When a conversation grows long, compact it into an encrypted summary the model can still read:

result = client.compact(history)

# result.output_items replaces the full history
for event in client.stream("Continue…", conversation_history=result.output_items):
    ...
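When to compact is up to the caller. One possible rule is to compare the last turn's input token count with the model's context window. The sketch below assumes you take `input_tokens` from ResponseCompleted and `context_window` from the model listing; the 0.8 threshold is an arbitrary illustration.

```python
def should_compact(input_tokens: int, context_window: int,
                   threshold: float = 0.8) -> bool:
    """True once the prompt fills `threshold` of the context window."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    return input_tokens >= threshold * context_window

print(should_compact(90_000, 100_000))  # True
print(should_compact(10_000, 100_000))  # False
```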

Usage / quota

quota = client.usage()
print(quota)  # raw dict from /backend-api/wham/usage

Stream events reference

| Type | Emitted when |
| --- | --- |
| `TextDelta` | incremental text chunk arrives |
| `ReasoningDelta` | reasoning summary chunk (requires `include_reasoning=True`) |
| `ToolCall` | model requests a function call |
| `OutputItem` | a non-tool output item completes (message, compaction_summary, …) |
| `ResponseCompleted` | stream ends successfully; carries full `TokenUsage` |
| `ResponseFailed` | stream ends with an error (`code`, `message`) |
| `ResponseAborted` | exception raised when the caller aborts an active stream |
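Because `ResponseFailed` arrives as an ordinary event rather than an exception, a loop that only looks for text will silently produce an empty result on failure. The sketch below wraps a stream so that failures raise instead. It uses a stand-in class with the `code` and `message` fields listed above, and `StreamError` is a hypothetical exception type.

```python
from dataclasses import dataclass

@dataclass
class ResponseFailed:  # stand-in for the SDK class of the same name
    code: str
    message: str

class StreamError(RuntimeError):
    """Hypothetical exception type; not defined by the SDK."""

def raise_on_failure(events):
    """Yield events unchanged, but raise if the stream reports failure."""
    for event in events:
        if isinstance(event, ResponseFailed):
            raise StreamError(f"{event.code}: {event.message}")
        yield event
```

With the real SDK the wrapper would go around the stream: `for event in raise_on_failure(client.stream(...))`.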

`stream()` parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `user_message` | `str \| list[dict] \| None` | — | User prompt. `None` to continue without a new message (e.g. after tool results). |
| `model` | `str` | `"gpt-5.4"` | Model slug. |
| `instructions` | `str` | `""` | System instructions. |
| `conversation_history` | `list[dict]` | `None` | Prior turns (ResponseItem format). |
| `tools` | `list[dict]` | `None` | Tool definitions (OpenAI function format). |
| `tool_choice` | `str \| dict` | `"auto"` | `"auto"`, `"none"`, `"required"`, or `{"type":"function","name":"..."}`. |
| `parallel_tool_calls` | `bool` | `False` | Allow multiple tool calls in one turn. |
| `reasoning` | `str` | `None` | `"minimal"` / `"low"` / `"medium"` / `"high"` / `"xhigh"`. |
| `reasoning_summary` | `str` | `None` | `"concise"` / `"detailed"` / `"auto"`. |
| `verbosity` | `str` | `None` | `"low"` / `"medium"` / `"high"`. Mutually exclusive with `output_schema`. |
| `output_schema` | `dict` | `None` | JSON Schema for structured output. |
| `include_reasoning` | `bool` | `False` | Emit `ReasoningDelta` events. |
| `web_search` | `str` | `None` | `"cached"` (OpenAI index), `"live"` (real-time fetch), or `"disabled"` / `None` (off). Incompatible with `reasoning="minimal"`. |
| `store` | `bool` | `False` | Persist the response server-side. |
| `service_tier` | `str` | `None` | `"flex"` (higher throughput) or `"fast"` (lower latency). |
| `prompt_cache_key` | `str` | `None` | UUID to share across calls with a common prefix; hits the server-side prompt cache. |
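A few of these parameters constrain each other. The sketch below checks the combinations the table rules out before a request is sent. `check_stream_options` is a hypothetical helper, and whether the SDK performs similar checks itself is not stated here.

```python
REASONING_LEVELS = (None, "minimal", "low", "medium", "high", "xhigh")

def check_stream_options(reasoning=None, verbosity=None,
                         output_schema=None, web_search=None):
    """Raise ValueError for option combinations the table rules out."""
    if reasoning not in REASONING_LEVELS:
        raise ValueError(f"unknown reasoning level: {reasoning!r}")
    if verbosity is not None and output_schema is not None:
        raise ValueError("verbosity and output_schema are mutually exclusive")
    if web_search in ("cached", "live") and reasoning == "minimal":
        raise ValueError('web_search is incompatible with reasoning="minimal"')

check_stream_options(reasoning="medium", web_search="live")  # passes silently
```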

How it works

Authentication uses the same ChatGPT OAuth 2.0 + PKCE flow as the official Codex CLI (codex-rs). Tokens are stored in ~/.codex/auth.json and can be refreshed without re-opening the browser.

The backend (chatgpt.com/backend-api/codex) is distinct from api.openai.com and is accessed via your ChatGPT subscription rather than an API key. All responses are streamed over SSE — the backend does not support non-streaming requests.
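Server-sent events are a line-based text format: `event:` and `data:` fields, with a blank line closing each event. The sketch below is a minimal parser for that framing, to show what the SDK has to decode. It is not the SDK's parser, it ignores the `id:` and `retry:` fields, and the event name and payload in the sample are made up.

```python
import json

def parse_sse(lines):
    """Yield (event_name, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                  # blank line ends one event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):      # comment or keep-alive
            continue
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):   # one leading space is stripped
                value = value[1:]
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)

sample = ["event: delta", 'data: {"delta": "Hi"}', "", ": ping", "data: done", ""]
for name, payload in parse_sse(sample):
    print(name, payload)
```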

Realtime WebRTC

/backend-api/codex/realtime/calls currently returns 404. The WebRTC call creation route used by Codex does work at https://api.openai.com/v1/realtime/calls when called with the saved ChatGPT OAuth bearer token and the ChatGPT-Account-ID header.

Generate an SDP offer in a browser or WebView with an audio transceiver and the oai-events data channel, then create the call:

result = client.create_realtime_call(
    offer_sdp,
    instructions="You are concise.",
    session_id="thread-or-session-id",
)

print(result.answer_sdp)
print(result.call_id)
print(result.sideband_url)

This path intentionally does not read OPENAI_API_KEY; it uses the ChatGPT auth tokens loaded from ~/.codex/auth.json.

Release history

0.3.5: May 17, 2026
0.3.4: May 15, 2026
0.3.3: May 12, 2026
0.3.2: May 12, 2026
0.3.1: May 12, 2026
0.3.0: May 12, 2026
0.2.0: May 12, 2026
0.1.1: May 3, 2026 (this version)
