Unofficial Python SDK for the ChatGPT Codex backend API
codex-backend-sdk
Unofficial Python SDK for the ChatGPT Codex backend API (chatgpt.com/backend-api/codex).
This is a lower-level alternative to the official Codex CLI/SDK. It gives direct access to the underlying HTTP API endpoints on which the CLI relies, so you can build your own agent loop from scratch without inheriting OpenAI's design choices.
Requirements: a ChatGPT Plus, Pro, or Enterprise subscription. No OpenAI API key and no Codex CLI installation needed — authentication goes through ChatGPT OAuth directly from Python.
[!WARNING] Disclaimer: This is an independent, community-maintained library that reverse-engineers undocumented endpoints of chatgpt.com. It is not affiliated with, endorsed by, or supported by OpenAI. Usage remains subject to OpenAI's Terms of Use. Endpoints may change or break without notice.
Installation
git clone https://github.com/B4PT0R/codex-backend-sdk.git
cd codex-backend-sdk
pip install -e .
For the agent example (examples/agent.py), also install:
pip install pyyaml
Authentication
from codex_backend_sdk import CodexClient
client = CodexClient().authenticate()
authenticate() handles everything automatically:
- Tokens present and fresh → used directly, no network call
- Tokens stale → silently refreshed in the background
- No valid tokens available → opens your browser for the OAuth flow (blocking, first run only)
Tokens are saved to ~/.codex/auth.json (created if it doesn't exist). If the official Codex CLI is also installed, both share the same file.
All other methods (stream(), respond(), list_models(), …) raise immediately if authenticate() was not called — they never trigger the OAuth flow implicitly.
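The three cases above amount to a small decision procedure. The sketch below only illustrates that logic and is not the SDK's code: `TokenState`, its `expires_at` field, and the 60-second safety margin are hypothetical names and values, and the real layout of auth.json may differ.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenState:
    """Hypothetical view of stored tokens; not the real auth.json layout."""
    access_token: Optional[str]
    refresh_token: Optional[str]
    expires_at: float  # Unix timestamp (assumed field)


def choose_auth_action(tokens: Optional[TokenState],
                       now: Optional[float] = None,
                       margin: float = 60.0) -> str:
    """Return "use", "refresh", or "browser", mirroring the three cases."""
    now = time.time() if now is None else now
    if tokens is None:
        return "browser"   # nothing stored: full OAuth flow
    if tokens.access_token and tokens.expires_at - margin > now:
        return "use"       # fresh token: no network call needed
    if tokens.refresh_token:
        return "refresh"   # stale token: refresh silently
    return "browser"       # stale and not refreshable


print(choose_auth_action(TokenState("access", "refresh", time.time() + 3600)))
```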
Basic usage
from codex_backend_sdk import CodexClient, TextDelta, ResponseCompleted
client = CodexClient().authenticate()
for event in client.stream("Explain quicksort in one paragraph"):
    if isinstance(event, TextDelta):
        print(event.text, end="", flush=True)
    elif isinstance(event, ResponseCompleted):
        print(f"\n[tokens: in={event.input_tokens} out={event.output_tokens}]")
Or collect the full response at once:
text, completion = client.respond("Explain quicksort in one paragraph")
print(text)
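To show how the two styles relate, here is a minimal sketch that folds a stream of events into a `(text, completion)` pair. It does not claim to be how respond() is implemented. The two dataclasses are stand-ins so the snippet runs without the SDK or network access, and `collect` is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


# Stand-ins for illustration only. With the real library, import
# TextDelta and ResponseCompleted from codex_backend_sdk instead.
@dataclass
class TextDelta:
    text: str


@dataclass
class ResponseCompleted:
    input_tokens: int
    output_tokens: int


def collect(events: Iterable[object]) -> Tuple[str, Optional[ResponseCompleted]]:
    """Join all text chunks and return them with the completion event, if any."""
    parts = []
    completion = None
    for event in events:
        if isinstance(event, TextDelta):
            parts.append(event.text)
        elif isinstance(event, ResponseCompleted):
            completion = event
    return "".join(parts), completion


fake_stream = [TextDelta("Quick"), TextDelta("sort"), ResponseCompleted(12, 2)]
text, completion = collect(fake_stream)
print(text)
```

With the real client, the same helper could be fed `client.stream(...)` directly.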
Aborting a stream
CodexClient.abort() stops the currently active streaming response. The stream
iterator then raises ResponseAborted; any output your code captured before the
abort remains available to it.
from codex_backend_sdk import CodexClient, ResponseAborted, TextDelta
client = CodexClient().authenticate()
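events = client.stream("Write a long essay")
try:
    for event in events:
        if isinstance(event, TextDelta):
            print(event.text, end="", flush=True)
        # should_stop() is your own cancellation check; it is not part of the SDK.
        if should_stop():
            client.abort()
except ResponseAborted:
    pass  # text printed before the abort is kept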
Calling abort() when no stream is active is a no-op.
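The abort example relies on a `should_stop()` function supplied by the caller. One possible way to provide it, assuming cancellation is requested from another thread or after a timeout, is a `threading.Event`. The names `stop_event` and `request_stop_after` are hypothetical.

```python
import threading

# Shared flag: any thread can set it to request cancellation.
stop_event = threading.Event()


def should_stop() -> bool:
    """True once cancellation has been requested."""
    return stop_event.is_set()


def request_stop_after(seconds: float) -> threading.Timer:
    """Arrange for should_stop() to turn True after a delay."""
    timer = threading.Timer(seconds, stop_event.set)
    timer.daemon = True  # do not keep the process alive for the timer
    timer.start()
    return timer
```

A UI button or signal handler could call `stop_event.set()` directly instead of using the timer.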
Models
models = client.list_models()
for m in models:
    print(m.slug, m.display_name, m.context_window)
info = client.get_model("codex-mini-latest")
Multi-turn conversation
Pass prior turns as conversation_history. Each turn is a raw dict in Responses API format.
history = []
def chat(user_input: str) -> str:
    text, _ = client.respond(user_input, conversation_history=history)
    history.append({
        "type": "message", "role": "user",
        "content": [{"type": "input_text", "text": user_input}],
    })
    history.append({
        "type": "message", "role": "assistant",
        "content": [{"type": "output_text", "text": text}],
    })
    return text
print(chat("My name is Alice."))
print(chat("What's my name?")) # model remembers
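The two history items differ only in role and content type, so they can be built by a small helper. This is a sketch using the dict shapes shown above; `message_item` and `record_exchange` are hypothetical names, not SDK functions.

```python
from typing import Dict, List

# User text is sent as "input_text", assistant text as "output_text".
_CONTENT_TYPE = {"user": "input_text", "assistant": "output_text"}


def message_item(role: str, text: str) -> Dict:
    """Build one message item in the format used by conversation_history."""
    return {
        "type": "message",
        "role": role,
        "content": [{"type": _CONTENT_TYPE[role], "text": text}],
    }


def record_exchange(history: List[Dict], user_input: str, reply: str) -> None:
    """Append a user turn and the assistant's reply to the history."""
    history.append(message_item("user", user_input))
    history.append(message_item("assistant", reply))


history: List[Dict] = []
record_exchange(history, "My name is Alice.", "Nice to meet you, Alice.")
print(len(history))
```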
Tool use
Tool definitions follow the same format as the official OpenAI SDK.
import json
from codex_backend_sdk import CodexClient, TextDelta, ToolCall, ResponseCompleted
client = CodexClient().authenticate()
tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get the current temperature for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    }
]

def get_weather(city: str) -> dict:
    return {"city": city, "temperature": 22, "unit": "celsius"}
question = "What's the weather in Paris?"
# Record the user turn so that turn 2 sees the question alongside the tool result
history = [{
    "type": "message", "role": "user",
    "content": [{"type": "input_text", "text": question}],
}]
# Turn 1: model may emit a ToolCall
for event in client.stream(question, tools=tools):
    if isinstance(event, TextDelta):
        print(event.text, end="", flush=True)
    elif isinstance(event, ToolCall):
        result = get_weather(**event.parsed_arguments())
        history.append(event.as_history_item())  # the function_call item
        history.append(event.to_tool_result(json.dumps(result)))  # function_call_output
# Turn 2: model sees the tool result and replies
for event in client.stream(None, conversation_history=history, tools=tools):
    if isinstance(event, TextDelta):
        print(event.text, end="", flush=True)
    elif isinstance(event, ResponseCompleted):
        print()
tool_choice defaults to "auto". Other values: "none", "required", or {"type": "function", "name": "..."}.
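With more than one tool, calling each function by hand gets unwieldy. The sketch below shows one possible dispatcher that maps a tool name and a JSON argument string to a registered Python function. The README does not show how a ToolCall exposes the function name, so the dispatcher takes plain strings. `TOOL_REGISTRY`, `tool`, and `run_tool` are hypothetical names.

```python
import json
from typing import Any, Callable, Dict

# Maps tool names (as declared in the tool definitions) to Python callables.
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Decorator that registers a function under its own name."""
    TOOL_REGISTRY[fn.__name__] = fn
    return fn


@tool
def get_weather(city: str) -> dict:
    return {"city": city, "temperature": 22, "unit": "celsius"}


def run_tool(name: str, arguments_json: str) -> str:
    """Run a registered tool; always return a JSON string, even on failure."""
    fn = TOOL_REGISTRY.get(name)
    if fn is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        kwargs = json.loads(arguments_json or "{}")
        return json.dumps(fn(**kwargs))
    except (TypeError, ValueError) as exc:  # bad arguments or invalid JSON
        return json.dumps({"error": str(exc)})


print(run_tool("get_weather", '{"city": "Paris"}'))
```

Returning errors as JSON rather than raising lets the model see what went wrong and retry with corrected arguments.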
Image input
from codex_backend_sdk import image_url, image_b64
# From a URL
for event in client.stream(
    ["What's in this image?", image_url("https://example.com/photo.jpg")]
):
    ...
# From a local file (base64)
import base64
with open("photo.jpg", "rb") as f:
    data = base64.b64encode(f.read()).decode()
for event in client.stream(
    ["Describe this image.", image_b64(data, "image/jpeg")]
):
    ...
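The local-file example hard-codes "image/jpeg". A small helper can guess the MIME type from the file name instead, using only the standard library. `encode_image` is a hypothetical name, and which image formats the backend accepts is not stated here.

```python
import base64
import mimetypes
from pathlib import Path
from typing import Tuple


def encode_image(path: str) -> Tuple[str, str]:
    """Return (base64_data, mime_type) for a local image file."""
    mime, _ = mimetypes.guess_type(path)  # based on the file extension only
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognised image type: {path}")
    data = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return data, mime
```

The result can be passed on as `image_b64(data, mime)`.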
Reasoning
# Enable chain-of-thought (reasoning tokens are billed separately)
for event in client.stream(
    "Solve: if 3x + 7 = 22, what is x?",
    reasoning="medium",
    reasoning_summary="concise",  # "concise" | "detailed" | "auto"
):
    ...
reasoning values: "minimal", "low", "medium", "high", "xhigh".
Structured output (JSON Schema)
import json
schema = {
    "title": "person",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
    "additionalProperties": False,
}
text, _ = client.respond(
    "Extract: Alice is 30 years old.",
    output_schema=schema,
)
person = json.loads(text)
print(person) # {"name": "Alice", "age": 30}
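Even with a schema, it can be worth checking the parsed result before using it. The sketch below is a deliberately small validator for flat object schemas like the one above. It is not a full JSON Schema implementation, and `check_flat_object` is a hypothetical helper.

```python
import json
from typing import Any, Dict, List

# JSON Schema type names mapped to the Python types json.loads produces.
_JSON_TYPES = {
    "string": str, "integer": int, "number": (int, float),
    "boolean": bool, "object": dict, "array": list,
}


def check_flat_object(value: Any, schema: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the value fits."""
    if not isinstance(value, dict):
        return ["value is not an object"]
    problems = []
    props = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in value:
            problems.append(f"missing required key: {key}")
    for key, item in value.items():
        if key not in props:
            if schema.get("additionalProperties") is False:
                problems.append(f"unexpected key: {key}")
            continue
        expected = _JSON_TYPES.get(props[key].get("type"))
        if expected is None:
            continue  # no type constraint this sketch understands
        # bool is a subclass of int in Python, so reject it for numeric types.
        is_stray_bool = isinstance(item, bool) and expected is not bool
        if not isinstance(item, expected) or is_stray_bool:
            problems.append(f"wrong type for key: {key}")
    return problems


schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
    "additionalProperties": False,
}
print(check_flat_object(json.loads('{"name": "Alice", "age": 30}'), schema))
```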
Context compaction
When a conversation grows long, compact it into an encrypted summary the model can still read:
result = client.compact(history)
# result.output_items replaces the full history
for event in client.stream("Continue…", conversation_history=result.output_items):
    ...
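When to compact is left to the caller. One rough heuristic is to compact once the serialized history exceeds a character budget. This is a sketch only: character count is a crude proxy for tokens, the 200,000-character default is an arbitrary assumption, and `approx_size` and `needs_compaction` are hypothetical helpers.

```python
import json
from typing import Dict, List


def approx_size(history: List[Dict]) -> int:
    """Rough size of the history: characters of serialized JSON."""
    return sum(len(json.dumps(item, ensure_ascii=False)) for item in history)


def needs_compaction(history: List[Dict], max_chars: int = 200_000) -> bool:
    """True when the history has outgrown the (assumed) character budget."""
    return approx_size(history) > max_chars


small = [{
    "type": "message", "role": "user",
    "content": [{"type": "input_text", "text": "hi"}],
}]
print(needs_compaction(small))
```

A budget derived from the model's context_window (see Models) would likely be a better fit than a fixed constant.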
Usage / quota
quota = client.usage()
print(quota) # raw dict from /backend-api/wham/usage
Stream events reference
| Type | Emitted when |
|---|---|
| TextDelta | incremental text chunk arrives |
| ReasoningDelta | reasoning summary chunk (requires include_reasoning=True) |
| ToolCall | model requests a function call |
| OutputItem | a non-tool output item completes (message, compaction_summary, …) |
| ResponseCompleted | stream ends successfully; carries full TokenUsage |
| ResponseFailed | stream ends with an error (code, message) |
| ResponseAborted | exception raised when the caller aborts an active stream |
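None of the earlier examples handle ResponseFailed. One option is to turn it into an exception so that failures cannot be silently ignored. The classes below are stand-ins carrying only the fields named in the table, so the sketch runs without the SDK. `StreamError` and `text_or_raise` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable


# Stand-ins for illustration; the real event classes come from the SDK.
@dataclass
class TextDelta:
    text: str


@dataclass
class ResponseFailed:
    code: str
    message: str


class StreamError(RuntimeError):
    """Hypothetical exception raised when a stream ends with ResponseFailed."""


def text_or_raise(events: Iterable[object]) -> str:
    """Return the concatenated text, or raise if the stream reports a failure."""
    parts = []
    for event in events:
        if isinstance(event, TextDelta):
            parts.append(event.text)
        elif isinstance(event, ResponseFailed):
            raise StreamError(f"{event.code}: {event.message}")
    return "".join(parts)


print(text_or_raise([TextDelta("all "), TextDelta("good")]))
```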
stream() parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| user_message | str \| list[dict] \| None | required | User prompt. None to continue without a new message (e.g. after tool results). |
| model | str | "gpt-5.4" | Model slug. |
| instructions | str | "" | System instructions. |
| conversation_history | list[dict] | None | Prior turns (ResponseItem format). |
| tools | list[dict] | None | Tool definitions (OpenAI function format). |
| tool_choice | str \| dict | "auto" | "auto", "none", "required", or {"type":"function","name":"..."}. |
| parallel_tool_calls | bool | False | Allow multiple tool calls in one turn. |
| reasoning | str | None | "minimal" / "low" / "medium" / "high" / "xhigh". |
| reasoning_summary | str | None | "concise" / "detailed" / "auto". |
| verbosity | str | None | "low" / "medium" / "high". Mutually exclusive with output_schema. |
| output_schema | dict | None | JSON Schema for structured output. |
| include_reasoning | bool | False | Emit ReasoningDelta events. |
| web_search | str | None | "cached" (OpenAI index), "live" (real-time fetch), or "disabled" / None (off). Incompatible with reasoning="minimal". |
| store | bool | False | Persist the response server-side. |
| service_tier | str | None | "flex" (higher throughput) or "fast" (lower latency). |
| prompt_cache_key | str | None | UUID to share across calls with a common prefix; hits the server-side prompt cache. |
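The table lists a few constraints between parameters. Whether the SDK checks them before sending a request is not stated here, so a caller may want a pre-flight check of its own. The sketch below encodes only the allowed values and incompatibilities listed above. `ALLOWED` and `check_stream_kwargs` are hypothetical names.

```python
from typing import Any, Dict, List

# Allowed string values, copied from the parameter table.
ALLOWED = {
    "reasoning": ("minimal", "low", "medium", "high", "xhigh"),
    "reasoning_summary": ("concise", "detailed", "auto"),
    "verbosity": ("low", "medium", "high"),
    "web_search": ("cached", "live", "disabled"),
    "service_tier": ("flex", "fast"),
}


def check_stream_kwargs(kwargs: Dict[str, Any]) -> List[str]:
    """Return a list of problems with the keyword arguments; empty means OK."""
    problems: List[str] = []
    for name, allowed in ALLOWED.items():
        value = kwargs.get(name)
        if value is not None and value not in allowed:
            problems.append(f"{name}={value!r} is not one of {allowed}")
    if kwargs.get("verbosity") is not None and kwargs.get("output_schema") is not None:
        problems.append("verbosity and output_schema are mutually exclusive")
    searching = kwargs.get("web_search") in ("cached", "live")
    if searching and kwargs.get("reasoning") == "minimal":
        problems.append('web_search is incompatible with reasoning="minimal"')
    return problems


print(check_stream_kwargs({"web_search": "live", "reasoning": "minimal"}))
```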
How it works
Authentication uses the same ChatGPT OAuth 2.0 + PKCE flow as the official Codex CLI (codex-rs). Tokens are stored in ~/.codex/auth.json and can be refreshed without re-opening the browser.
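For readers unfamiliar with PKCE, the core of it is a random verifier and a challenge derived from it. The sketch below follows the S256 method from RFC 7636 using only the standard library. It illustrates the general technique and is not the SDK's own code; whether the SDK uses S256 or these exact lengths is an assumption. `challenge_for` and `make_pkce_pair` are hypothetical names.

```python
import base64
import hashlib
import secrets
from typing import Tuple


def challenge_for(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) without '=' padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> Tuple[str, str]:
    """Return (code_verifier, code_challenge)."""
    verifier = secrets.token_urlsafe(32)  # 32 random bytes, 43 URL-safe chars
    return verifier, challenge_for(verifier)


verifier, challenge = make_pkce_pair()
print(len(verifier), len(challenge))
```

The challenge is sent with the authorization request and the verifier with the token exchange, so an intercepted authorization code is useless without the verifier.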
The backend (chatgpt.com/backend-api/codex) is distinct from api.openai.com and is accessed via your ChatGPT subscription rather than an API key. All responses are streamed over SSE — the backend does not support non-streaming requests.
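Since everything arrives over SSE, a minimal parser for the wire format may help when debugging raw traffic. This sketch follows the standard Server-Sent Events framing (event and data fields, blank line as separator) and is not the SDK's parser. `iter_sse` is a hypothetical name, and the event name in the sample is illustrative only.

```python
from typing import Dict, Iterable, Iterator


def iter_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield {"event", "data"} dicts from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; skip events with no data.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # one leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


sample = [
    "event: response.output_text.delta",
    'data: {"delta": "Hi"}',
    "",
    ": keep-alive",
    "data: a",
    "data: b",
    "",
]
events = list(iter_sse(sample))
print(events)
```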
Realtime WebRTC
/backend-api/codex/realtime/calls currently returns 404, but the WebRTC call
creation route used by Codex works at https://api.openai.com/v1/realtime/calls
with the saved ChatGPT OAuth bearer token and ChatGPT-Account-ID.
Generate an SDP offer in a browser or WebView with an audio transceiver and the
oai-events data channel, then create the call:
result = client.create_realtime_call(
    offer_sdp,
    instructions="You are concise.",
    session_id="thread-or-session-id",
)
print(result.answer_sdp)
print(result.call_id)
print(result.sideband_url)
This path intentionally does not read OPENAI_API_KEY; it uses the ChatGPT auth
tokens loaded from ~/.codex/auth.json.