Unofficial Python SDK for the ChatGPT Codex backend API

codex-backend-sdk

Unofficial Python SDK for the ChatGPT Codex backend API (chatgpt.com/backend-api/codex).

This is a lower-level alternative to the official Codex CLI/SDK. It gives direct access to the underlying HTTP API endpoints on which the CLI relies, so you can build your own agent loop from scratch without inheriting OpenAI's design choices.

Requirements: a ChatGPT Plus, Pro, or Enterprise subscription. No OpenAI API key and no Codex CLI installation needed — authentication goes through ChatGPT OAuth directly from Python.

Warning: This is an independent, community-maintained library that reverse-engineers undocumented endpoints of chatgpt.com. It is not affiliated with, endorsed by, or supported by OpenAI. Usage remains subject to OpenAI's Terms of Use. Endpoints may change or break without notice.


Installation

git clone https://github.com/B4PT0R/codex-backend-sdk.git
cd codex-backend-sdk
pip install -e .

For the agent example (examples/agent.py), also install:

pip install pyyaml

Authentication

from codex_backend_sdk import CodexClient

client = CodexClient().authenticate()

authenticate() handles everything automatically:

  • Tokens present and fresh → used directly, no network call
  • Tokens stale → silently refreshed in the background
  • No valid tokens available → opens your browser for the OAuth flow (blocking, first run only)

Tokens are saved to ~/.codex/auth.json (created if it doesn't exist). If the official Codex CLI is also installed, both share the same file.
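The three cases above amount to a small decision function. The sketch below is only an illustration of that logic, not the SDK's implementation: the `access_token`, `refresh_token`, and `expires_at` keys, the `AuthAction` enum, and the 60-second safety margin are all assumptions, and the real layout of auth.json may differ.

```python
import time
from enum import Enum
from typing import Optional


class AuthAction(Enum):
    USE_CACHED = "use_cached"        # tokens present and fresh
    REFRESH = "refresh"              # tokens stale, refresh token available
    BROWSER_LOGIN = "browser_login"  # nothing usable, run the OAuth flow


def decide_auth_action(tokens: Optional[dict], now: Optional[float] = None,
                       skew_seconds: float = 60.0) -> AuthAction:
    """Pick one of the three branches described above.

    `tokens` is a hypothetical dict with `access_token`, `refresh_token`
    and `expires_at` (Unix seconds); the real auth.json layout may differ.
    """
    now = time.time() if now is None else now
    if not tokens:
        return AuthAction.BROWSER_LOGIN
    # Treat a token as stale slightly before it actually expires.
    fresh = bool(tokens.get("access_token")) and \
        tokens.get("expires_at", 0) - skew_seconds > now
    if fresh:
        return AuthAction.USE_CACHED
    if tokens.get("refresh_token"):
        return AuthAction.REFRESH
    return AuthAction.BROWSER_LOGIN
```

Passing `now` explicitly keeps the function deterministic, which makes the three branches easy to exercise in a unit test.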

All other methods (stream(), respond(), list_models(), …) raise immediately if authenticate() was not called — they never trigger the OAuth flow implicitly.


Basic usage

from codex_backend_sdk import CodexClient, TextDelta, ResponseCompleted

client = CodexClient().authenticate()

for event in client.stream("Explain quicksort in one paragraph"):
    if isinstance(event, TextDelta):
        print(event.text, end="", flush=True)
    elif isinstance(event, ResponseCompleted):
        print(f"\n[tokens: in={event.input_tokens} out={event.output_tokens}]")

Or collect the full response at once:

text, completion = client.respond("Explain quicksort in one paragraph")
print(text)
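Conceptually, collecting a full response is a fold over the event stream. The sketch below shows one way such a helper could be written; it is not the SDK's respond() code. The two dataclasses are local stand-ins that mirror only the attributes used in this README, and `collect_text` is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


# Stand-ins mirroring only the attributes used in this README; the real
# classes come from codex_backend_sdk and may carry more fields.
@dataclass
class TextDelta:
    text: str


@dataclass
class ResponseCompleted:
    input_tokens: int
    output_tokens: int


def collect_text(events: Iterable[object]) -> Tuple[str, Optional[ResponseCompleted]]:
    """Join every TextDelta and return the final completion event, if any."""
    chunks = []
    completion = None
    for event in events:
        if isinstance(event, TextDelta):
            chunks.append(event.text)
        elif isinstance(event, ResponseCompleted):
            completion = event
    return "".join(chunks), completion


text, completion = collect_text([
    TextDelta("Quick"), TextDelta("sort"), ResponseCompleted(12, 2),
])
print(text)  # Quicksort
```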

Aborting a stream

CodexClient.abort() stops the currently active streaming response. The stream iterator raises ResponseAborted, while captured output emitted before the abort remains available to your caller.

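from codex_backend_sdk import CodexClient, ResponseAborted, TextDelta

client = CodexClient().authenticate()
events = client.stream("Write a long essay")

MAX_CHARS = 500  # illustrative cutoff; substitute your own stop condition
received = 0

try:
    for event in events:
        if isinstance(event, TextDelta):
            print(event.text, end="", flush=True)
            received += len(event.text)
            if received >= MAX_CHARS:
                client.abort()  # the iterator then raises ResponseAborted
except ResponseAborted:
    pass  # text printed before the abort has already been handled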

Calling abort() when no stream is active is a no-op.


Models

models = client.list_models()
for m in models:
    print(m.slug, m.display_name, m.context_window)

info = client.get_model("codex-mini-latest")
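Because each model entry exposes slug, display_name, and context_window, picking a model by capability is a one-liner. The sketch below uses a local `ModelInfo` stand-in and made-up slugs so it runs without network access; the SDK's actual model class may have a different name and extra fields.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ModelInfo:
    # Stand-in with the three attributes shown above; slugs below are invented.
    slug: str
    display_name: str
    context_window: int


def largest_context(models: Iterable[ModelInfo]) -> Optional[ModelInfo]:
    """Return the model with the biggest context window, or None if empty."""
    return max(models, key=lambda m: m.context_window, default=None)


catalog = [
    ModelInfo("model-a", "Model A", 128_000),
    ModelInfo("model-b", "Model B", 400_000),
]
print(largest_context(catalog).slug)  # model-b
```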

Multi-turn conversation

Pass prior turns as conversation_history. Each turn is a raw dict in Responses API format.

history = []

def chat(user_input: str) -> str:
    text, _ = client.respond(user_input, conversation_history=history)
    history.append({
        "type": "message", "role": "user",
        "content": [{"type": "input_text", "text": user_input}],
    })
    history.append({
        "type": "message", "role": "assistant",
        "content": [{"type": "output_text", "text": text}],
    })
    return text

print(chat("My name is Alice."))
print(chat("What's my name?"))  # model remembers
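The two dict shapes used above are easy to mistype, so small builder functions can help. This is a convenience sketch: `user_turn`, `assistant_turn`, and `record_exchange` are hypothetical helpers, not part of the SDK, and they produce exactly the dicts shown in the example.

```python
from typing import List


def user_turn(text: str) -> dict:
    """Build a user message item in the format shown above."""
    return {
        "type": "message", "role": "user",
        "content": [{"type": "input_text", "text": text}],
    }


def assistant_turn(text: str) -> dict:
    """Build an assistant message item in the format shown above."""
    return {
        "type": "message", "role": "assistant",
        "content": [{"type": "output_text", "text": text}],
    }


def record_exchange(history: List[dict], user_input: str, reply: str) -> None:
    """Append one user/assistant pair to the running history, in order."""
    history.append(user_turn(user_input))
    history.append(assistant_turn(reply))


history: List[dict] = []
record_exchange(history, "My name is Alice.", "Nice to meet you, Alice.")
print(len(history))  # 2
```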

Tool use

Tool definitions follow the same format as the official OpenAI SDK.

import json
from codex_backend_sdk import CodexClient, TextDelta, ToolCall, ResponseCompleted

client = CodexClient().authenticate()

tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get the current temperature for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    }
]

def get_weather(city: str) -> dict:
    return {"city": city, "temperature": 22, "unit": "celsius"}


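question = "What's the weather in Paris?"

# Keep the user turn in history: turn 2 sends no new message, so the
# question reaches the model only through conversation_history.
history = [{
    "type": "message", "role": "user",
    "content": [{"type": "input_text", "text": question}],
}]

# Turn 1: the model may emit a ToolCall
for event in client.stream(question, tools=tools):
    if isinstance(event, TextDelta):
        print(event.text, end="", flush=True)
    elif isinstance(event, ToolCall):
        result = get_weather(**event.parsed_arguments())
        history.append(event.as_history_item())                   # the function_call item
        history.append(event.to_tool_result(json.dumps(result)))  # function_call_output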

# Turn 2 — model sees the tool result and replies
for event in client.stream(None, conversation_history=history, tools=tools):
    if isinstance(event, TextDelta):
        print(event.text, end="", flush=True)
    elif isinstance(event, ResponseCompleted):
        print()

tool_choice defaults to "auto". Other values: "none", "required", or {"type": "function", "name": "..."}.
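With more than one tool, a name-to-callable registry keeps the event loop short. The sketch below works on plain strings because this README does not show which attribute of ToolCall carries the tool name; `run_tool` and `REGISTRY` are hypothetical, and the error payload shape is just one possible convention.

```python
import json
from typing import Callable, Dict


def get_weather(city: str) -> dict:
    return {"city": city, "temperature": 22, "unit": "celsius"}


REGISTRY: Dict[str, Callable[..., dict]] = {"get_weather": get_weather}


def run_tool(name: str, arguments_json: str,
             registry: Dict[str, Callable[..., dict]]) -> str:
    """Run a registered tool and return a JSON string for the tool result.

    Failures are reported back as JSON instead of raising, so the model
    can see what went wrong and try again.
    """
    handler = registry.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        kwargs = json.loads(arguments_json or "{}")
        return json.dumps(handler(**kwargs))
    except (TypeError, ValueError) as exc:
        # ValueError covers malformed JSON; TypeError covers bad arguments.
        return json.dumps({"error": str(exc)})


print(run_tool("get_weather", '{"city": "Paris"}', REGISTRY))
```

The returned string can then be passed to event.to_tool_result(...) in place of the json.dumps(result) call in the example above.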


Image input

from codex_backend_sdk import image_url, image_b64

# From a URL
for event in client.stream(
    ["What's in this image?", image_url("https://example.com/photo.jpg")]
):
    ...

# From a local file (base64)
import base64
with open("photo.jpg", "rb") as f:
    data = base64.b64encode(f.read()).decode()

for event in client.stream(
    ["Describe this image.", image_b64(data, "image/jpeg")]
):
    ...
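When the file type is not known in advance, the MIME type can be guessed from the file extension instead of being hard-coded. The helper below is a sketch using only the standard library; `encode_image` is a hypothetical name, and its two return values are meant to be passed to image_b64(data, mime) as in the example above.

```python
import base64
import mimetypes
from pathlib import Path
from typing import Tuple


def encode_image(path: str) -> Tuple[str, str]:
    """Return (base64_data, mime_type) for a local image file.

    The MIME type is guessed from the extension only; the file content
    is not inspected.
    """
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognised image type: {path}")
    data = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return data, mime
```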

Reasoning

# Enable chain-of-thought (reasoning tokens are billed separately)
for event in client.stream(
    "Solve: if 3x + 7 = 22, what is x?",
    reasoning="medium",
    reasoning_summary="concise",   # "concise" | "detailed" | "auto"
):
    ...

reasoning values: "minimal", "low", "medium", "high", "xhigh".


Structured output (JSON Schema)

import json

schema = {
    "title": "person",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age":  {"type": "integer"},
    },
    "required": ["name", "age"],
    "additionalProperties": False,
}

text, _ = client.respond(
    "Extract: Alice is 30 years old.",
    output_schema=schema,
)
person = json.loads(text)
print(person)  # {"name": "Alice", "age": 30}
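Since the endpoints are undocumented, it may be prudent to re-check the parsed result locally rather than trusting it blindly. The sketch below hand-checks the person schema from the example using only the standard library; `parse_person` is a hypothetical helper, and a general-purpose JSON Schema validator would be the more scalable choice for larger schemas.

```python
import json


def parse_person(text: str) -> dict:
    """Parse and verify a reply against the `person` schema shown above."""
    obj = json.loads(text)
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")
    if set(obj) != {"name", "age"}:
        # Covers both missing required keys and additionalProperties: False.
        raise ValueError(f"unexpected keys: {sorted(obj)}")
    if not isinstance(obj["name"], str):
        raise ValueError("name must be a string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(obj["age"], bool) or not isinstance(obj["age"], int):
        raise ValueError("age must be an integer")
    return obj


print(parse_person('{"name": "Alice", "age": 30}'))
```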

Context compaction

When a conversation grows long, compact it into an encrypted summary the model can still read:

result = client.compact(history)

# result.output_items replaces the full history
for event in client.stream("Continue…", conversation_history=result.output_items):
    ...
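One plausible trigger for compaction is the ratio between the last turn's input token count (ResponseCompleted.input_tokens) and the model's context_window. The sketch below encodes that rule; `should_compact` and the 0.8 threshold are assumptions chosen for illustration, not SDK behaviour.

```python
def should_compact(input_tokens: int, context_window: int,
                   threshold: float = 0.8) -> bool:
    """Return True once the prompt fills `threshold` of the context window."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    return input_tokens >= threshold * context_window


print(should_compact(90_000, 100_000))  # True
print(should_compact(10_000, 100_000))  # False
```

In a chat loop, this check would run after each ResponseCompleted event, calling client.compact(history) only when it returns True.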

Usage / quota

quota = client.usage()
print(quota)  # raw dict from /backend-api/wham/usage

Stream events reference

| Type | Emitted when |
| --- | --- |
| TextDelta | incremental text chunk arrives |
| ReasoningDelta | reasoning summary chunk (requires include_reasoning=True) |
| ToolCall | model requests a function call |
| OutputItem | a non-tool output item completes (message, compaction_summary, …) |
| ResponseCompleted | stream ends successfully; carries full TokenUsage |
| ResponseFailed | stream ends with an error (code, message) |
| ResponseAborted | exception raised when the caller aborts an active stream |
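ResponseFailed arrives as an ordinary event, so callers who prefer exceptions can wrap the stream. The sketch below shows one way to do that: `ResponseFailed` here is a local stand-in carrying the two fields listed in the table, while `StreamError` and `raise_on_failure` are hypothetical names that do not exist in the SDK.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class ResponseFailed:
    # Stand-in with the two fields listed in the table above.
    code: str
    message: str


class StreamError(RuntimeError):
    """Hypothetical exception type used only in this sketch."""


def raise_on_failure(events: Iterable[object]) -> Iterator[object]:
    """Pass events through unchanged, raising when a failure event appears."""
    for event in events:
        if isinstance(event, ResponseFailed):
            raise StreamError(f"{event.code}: {event.message}")
        yield event
```

Wrapping a stream as raise_on_failure(client.stream(...)) would leave the rest of the loop free to handle only the success-path events.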

stream() parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| user_message | str, list[dict], or None | | User prompt. None to continue without a new message (e.g. after tool results). |
| model | str | "gpt-5.4" | Model slug. |
| instructions | str | "" | System instructions. |
| conversation_history | list[dict] | None | Prior turns (ResponseItem format). |
| tools | list[dict] | None | Tool definitions (OpenAI function format). |
| tool_choice | str or dict | "auto" | "auto", "none", "required", or {"type":"function","name":"..."}. |
| parallel_tool_calls | bool | False | Allow multiple tool calls in one turn. |
| reasoning | str | None | "minimal" / "low" / "medium" / "high" / "xhigh". |
| reasoning_summary | str | None | "concise" / "detailed" / "auto". |
| verbosity | str | None | "low" / "medium" / "high". Mutually exclusive with output_schema. |
| output_schema | dict | None | JSON Schema for structured output. |
| include_reasoning | bool | False | Emit ReasoningDelta events. |
| web_search | str | None | "cached" (OpenAI index), "live" (real-time fetch), or "disabled" / None (off). Incompatible with reasoning="minimal". |
| store | bool | False | Persist the response server-side. |
| service_tier | str | None | "flex" (higher throughput) or "fast" (lower latency). |
| prompt_cache_key | str | None | UUID to share across calls with a common prefix, so that they hit the server-side prompt cache. |
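Two of the constraints in the table (verbosity versus output_schema, and web_search versus reasoning="minimal") can be checked locally before a request is sent. The sketch below does so and also shows generating a single cache key for reuse across calls; `check_stream_options` is a hypothetical helper, and whether the SDK performs similar validation itself is not stated here.

```python
import uuid
from typing import Optional


def check_stream_options(*, reasoning: Optional[str] = None,
                         verbosity: Optional[str] = None,
                         output_schema: Optional[dict] = None,
                         web_search: Optional[str] = None) -> None:
    """Raise ValueError for the incompatible combinations listed above."""
    if verbosity is not None and output_schema is not None:
        raise ValueError("verbosity and output_schema are mutually exclusive")
    if web_search in ("cached", "live") and reasoning == "minimal":
        raise ValueError('web_search is incompatible with reasoning="minimal"')


# Generate one key per conversation and pass the same value on every call
# that shares a prompt prefix.
cache_key = str(uuid.uuid4())

check_stream_options(reasoning="medium", web_search="live")  # passes silently
```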

How it works

Authentication uses the same ChatGPT OAuth 2.0 + PKCE flow as the official Codex CLI (codex-rs). Tokens are stored in ~/.codex/auth.json and can be refreshed without re-opening the browser.
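For readers unfamiliar with PKCE, the client-side part is a random verifier plus its SHA-256 challenge. The sketch below follows the generic S256 method from RFC 7636; it is not taken from the SDK, and the verifier length and function names are illustrative choices.

```python
import base64
import hashlib
import secrets
from typing import Tuple


def challenge_for(verifier: str) -> str:
    """Compute the S256 code challenge: base64url(sha256(verifier)), unpadded."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> Tuple[str, str]:
    """Return (code_verifier, code_challenge) for one authorization attempt."""
    # 64 random bytes encode to 86 URL-safe characters; RFC 7636 allows 43-128.
    verifier = secrets.token_urlsafe(64)
    return verifier, challenge_for(verifier)


verifier, challenge = make_pkce_pair()
print(len(verifier), len(challenge))  # 86 43
```

The challenge goes into the authorization URL and the verifier is sent later with the token request, which lets the server confirm both came from the same client.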

The backend (chatgpt.com/backend-api/codex) is distinct from api.openai.com and is accessed via your ChatGPT subscription rather than an API key. All responses are streamed over SSE — the backend does not support non-streaming requests.
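Server-sent events are plain text: events are separated by blank lines and carry their payload in data: fields. The sketch below parses that generic framing only; the backend's actual event names and payloads are not documented in this README, so the sample lines are invented and `iter_sse_data` is a hypothetical helper.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event.

    Multiple data: lines in one event are joined with newlines. An event
    not terminated by a blank line is dropped, as the SSE spec prescribes.
    """
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)


sample = ["event: x", 'data: {"a": 1}', "", ": ping", "data: one", "data: two", ""]
print(list(iter_sse_data(sample)))  # ['{"a": 1}', 'one\ntwo']
```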

Realtime WebRTC

/backend-api/codex/realtime/calls currently returns 404, but the WebRTC call creation route used by Codex works at https://api.openai.com/v1/realtime/calls with the saved ChatGPT OAuth bearer token and ChatGPT-Account-ID.

Generate an SDP offer in a browser or WebView with an audio transceiver and the oai-events data channel, then create the call:

result = client.create_realtime_call(
    offer_sdp,
    instructions="You are concise.",
    session_id="thread-or-session-id",
)

print(result.answer_sdp)
print(result.call_id)
print(result.sideband_url)

This path intentionally does not read OPENAI_API_KEY; it uses the ChatGPT auth tokens loaded from ~/.codex/auth.json.
