Local OpenAI-compatible HTTP proxy backed by Codex CLI

codex-api-proxy

Local OpenAI-compatible HTTP proxy backed by local Codex credentials.

This project exposes a minimal /v1/chat/completions API for local automation. By default, requests are executed through codex exec --json --skip-git-repo-check --ignore-user-config --ignore-rules --sandbox read-only --ephemeral, using the local Codex installation and its existing authentication.

Safety

The proxy defaults to 127.0.0.1 and should not be exposed publicly. Any client with access can spend your local Codex quota and can ask Codex to inspect files that are available to the selected Codex sandbox and workspace.

Set CODEX_PROXY_API_KEY to require Authorization: Bearer <key> on API requests.

If you start with --host 0.0.0.0 or another non-loopback bind address without --api-key, codex-api-proxy prints a warning. Use a bearer token before exposing the service to anything other than a trusted local machine.
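The warning rule above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the project's actual code: the function names is_loopback_host and should_warn are hypothetical, and treating unresolved DNS names as non-loopback is an assumption.

```python
import ipaddress
from typing import Optional


def is_loopback_host(host: str) -> bool:
    """Return True when host is a loopback name or IP literal."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name); assumed non-loopback here.
        return False


def should_warn(host: str, api_key: Optional[str]) -> bool:
    """Warn when the bind address is not loopback and no bearer key is set."""
    return not is_loopback_host(host) and not api_key


if __name__ == "__main__":
    for host in ("127.0.0.1", "::1", "0.0.0.0"):
        print(host, should_warn(host, api_key=None))
```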

With the default exec engine, Codex subprocesses are launched with --ignore-user-config and --ignore-rules. This prevents proxy requests from loading user Codex config, MCP servers, plugins, skills, and rule files.

Codex subprocesses also use --sandbox read-only and --ephemeral by default. This keeps calls closer to one-shot model calls where the caller owns conversation context.
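As a rough sketch, the default exec invocation described above could be assembled like this. The flags are the ones listed in this README; the function name build_exec_command, its parameters, and the flag ordering are assumptions for illustration, and how the prompt and model reach Codex is deliberately left out.

```python
from typing import List, Sequence


def build_exec_command(
    codex_bin: str = "codex",
    codex_configs: Sequence[str] = (),
    ephemeral: bool = True,
) -> List[str]:
    """Build an argv list for a one-shot, read-only codex exec call."""
    cmd = [
        codex_bin, "exec", "--json", "--skip-git-repo-check",
        "--ignore-user-config", "--ignore-rules",
        "--sandbox", "read-only",
    ]
    if ephemeral:
        cmd.append("--ephemeral")
    # Each override is passed as a separate "-c key=value" pair.
    for override in codex_configs:
        cmd += ["-c", override]
    return cmd


if __name__ == "__main__":
    print(build_exec_command(codex_configs=['model_reasoning_effort="low"']))
```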

The experimental app-server engine uses Codex's long-lived app-server protocol to reduce process startup latency and stream assistant deltas. Each API request starts a fresh Codex thread and archives it after completion, so callers must continue sending full chat history in messages.

The app-server process uses an isolated CODEX_HOME at ~/.codex-api-proxy/codex-home by default. codex-api-proxy symlinks only the current Codex auth.json into that isolated home, so the app-server worker can reuse the existing login while not seeing the current user's config.toml, MCP config, or plugins.

The app-server process is also started with --disable apps, --disable plugins, --disable skill_mcp_dependency_install, and -c mcp_servers={}. To keep skills out of the model-visible prompt, codex-api-proxy generates a skills.config=[{name=...,enabled=false}] override for known system skills and locally discovered skill names. Each request uses an empty dynamicTools list, empty environments, approvalPolicy: never, sandbox: read-only, and ephemeral: true by default.
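The isolated CODEX_HOME idea can be illustrated with a small sketch: create an empty directory and link only auth.json into it. The function name prepare_isolated_home and both path parameters are hypothetical; the real project may handle missing files, permissions, or stale links differently.

```python
from pathlib import Path


def prepare_isolated_home(source_home: Path, isolated_home: Path) -> Path:
    """Create isolated_home and symlink only auth.json from source_home."""
    isolated_home.mkdir(parents=True, exist_ok=True)
    link = isolated_home / "auth.json"
    # Replace any earlier link so repeated starts stay idempotent.
    if link.is_symlink() or link.exists():
        link.unlink()
    link.symlink_to(source_home / "auth.json")
    return link
```

Because config.toml and other files are never linked, a process started with CODEX_HOME pointing at the isolated directory sees the login but none of the user's configuration.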

Install

pip3 install codex-api-proxy

For local development from this checkout:

python3 -m pip install -e '.[dev]'

Make targets are available for local build and release tasks:

make build-tools
make test
make build
make release-check
make publish VERSION=0.1.1

make publish VERSION=... first syncs that version into pyproject.toml and src/codex_api_proxy/__init__.py, then runs tests, builds the package, validates the generated artifacts, and uploads them to PyPI.

Run

Start in the background:

codex-api-proxy start

By default, the service listens on 127.0.0.1:8765. The default Codex working directory is an empty workspace at ~/.codex-api-proxy/workspace.

Bind to all interfaces:

codex-api-proxy start --host 0.0.0.0

Check status:

codex-api-proxy status

Show saved runtime settings:

codex-api-proxy status --verbose

Restart with the last successful start settings:

codex-api-proxy restart

Restart and override one setting:

codex-api-proxy restart --proxy=http://127.0.0.1:8118

Start with faster defaults:

codex-api-proxy start --fast

Start with experimental long-lived app-server workers:

codex-api-proxy start --engine app-server --workers 2

Start with an outbound proxy, faster defaults, and multiple app-server workers:

codex-api-proxy start --proxy=http://127.0.0.1:8118 --fast --engine app-server --workers 4

Stop:

codex-api-proxy stop

Run in the foreground for debugging:

codex-api-proxy start --foreground

Configuration

CLI options:

--host: bind host, default 127.0.0.1
--port: bind port, default 8765
--api-key: require bearer auth
--codex-bin: Codex executable, default codex
--proxy: proxy URL passed to Codex as http_proxy and https_proxy
--model: model passed to Codex
--engine: execution engine, exec or app-server, default exec
--workers: number of long-lived app-server workers, default 1
--max-queue-size: maximum queued app-server requests before returning 429, default 64
--queue-timeout-seconds: maximum time to wait for an app-server worker, default 30
--app-server-codex-home: isolated CODEX_HOME used by app-server workers, default ~/.codex-api-proxy/codex-home
--codex-config: Codex config override passed as -c key=value, repeatable
--ephemeral: run codex exec with --ephemeral, enabled by default
--fast: apply fast defaults, equivalent to --codex-config model_reasoning_effort="low"
--default-cwd: default Codex working directory, default ~/.codex-api-proxy/workspace
--allowed-root: allowed cwd root, repeatable, defaults to the value of --default-cwd
--timeout-seconds: per-request timeout, default 300
--max-concurrency: maximum concurrent Codex executions, default 1
--log-level: Uvicorn log level, one of debug, info, warning, or error, default info
--pid-file: daemon pid file, default ~/.codex-api-proxy/codex-api-proxy.pid
--log-file: daemon log file for start, default ~/.codex-api-proxy/codex-api-proxy.log
--state-file: daemon state file, default ~/.codex-api-proxy/codex-api-proxy.state.json
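Since a full app-server queue returns 429 (see --max-queue-size above), clients may want to retry with backoff. The following is a minimal client-side sketch, not part of codex-api-proxy: call_with_retry, its parameters, and the (status, body) return shape of the send callable are all assumptions.

```python
import time
from typing import Callable, Tuple


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-429 status or attempts run out."""
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))
        status, body = send()
    return status, body
```

Passing sleep as a parameter keeps the helper easy to test without real delays.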

start prints the state file path and the effective startup parameters. The state file is written with 0600 permissions and is used by restart to reuse the previous start settings. If --api-key is used, the key is redacted in terminal output but stored in the state file so restart can reuse it.
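One way to get the behavior described above (a 0600 state file, with the key redacted only in terminal output) is sketched below. The function names and the "api_key" dictionary key are hypothetical; the actual state file layout is not documented here.

```python
import json
import os
from pathlib import Path
from typing import Any, Dict


def write_state(path: Path, settings: Dict[str, Any]) -> None:
    """Write settings as JSON, readable and writable by the owner only."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The mode passed to os.open applies only when the file is created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(settings, handle, indent=2)
    # Tighten permissions on files that already existed with a wider mode.
    os.chmod(path, 0o600)


def redact(settings: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy that is safe to print; the stored file keeps the key."""
    shown = dict(settings)
    if shown.get("api_key"):
        shown["api_key"] = "***"
    return shown
```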

Environment variables are also supported when running the FastAPI app directly:

CODEX_PROXY_HOST: bind host, default 127.0.0.1
CODEX_PROXY_PORT: bind port, default 8765
CODEX_PROXY_API_KEY: optional bearer token
CODEX_PROXY_CODEX_BIN: Codex executable, default codex
CODEX_PROXY_PROXY: proxy URL passed to Codex
CODEX_PROXY_MODEL: model passed to Codex
CODEX_PROXY_ENGINE: execution engine, exec or app-server, default exec
CODEX_PROXY_WORKERS: number of long-lived app-server workers, default 1
CODEX_PROXY_MAX_QUEUE_SIZE: maximum queued app-server requests, default 64
CODEX_PROXY_QUEUE_TIMEOUT_SECONDS: maximum time to wait for an app-server worker, default 30
CODEX_PROXY_APP_SERVER_CODEX_HOME: isolated CODEX_HOME used by app-server workers
CODEX_PROXY_CODEX_CONFIGS: ;;-separated Codex config overrides passed as repeated -c
CODEX_PROXY_EPHEMERAL: set to 1, true, or yes to run codex exec with --ephemeral; defaults to true
CODEX_PROXY_DEFAULT_CWD: default Codex working directory, default current directory
CODEX_PROXY_ALLOWED_ROOTS: colon-separated allowed cwd roots, default CODEX_PROXY_DEFAULT_CWD
CODEX_PROXY_TIMEOUT_SECONDS: per-request timeout, default 300
CODEX_PROXY_MAX_CONCURRENCY: maximum concurrent Codex executions, default 1
CODEX_PROXY_LOG_LEVEL: Uvicorn log level, default info
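The separators used by these variables are easy to get wrong, so here is a hedged sketch of how a few of them might be parsed. The helper names and the returned dictionary keys are hypothetical, and case-insensitive matching of 1, true, and yes is an assumption.

```python
import os
from typing import Any, Dict, List, Mapping


def parse_bool(value: str) -> bool:
    """Treat 1, true, and yes as true; anything else is false."""
    return value.strip().lower() in {"1", "true", "yes"}


def parse_codex_configs(value: str) -> List[str]:
    """Split ';;'-separated overrides, each later passed as '-c key=value'."""
    return [part.strip() for part in value.split(";;") if part.strip()]


def parse_allowed_roots(value: str) -> List[str]:
    """Split colon-separated allowed cwd roots."""
    return [part for part in value.split(":") if part]


def load_settings(env: Mapping[str, str]) -> Dict[str, Any]:
    """Read a subset of CODEX_PROXY_* variables with the documented defaults."""
    default_cwd = env.get("CODEX_PROXY_DEFAULT_CWD", os.getcwd())
    roots = env.get("CODEX_PROXY_ALLOWED_ROOTS")
    return {
        "host": env.get("CODEX_PROXY_HOST", "127.0.0.1"),
        "port": int(env.get("CODEX_PROXY_PORT", "8765")),
        "ephemeral": parse_bool(env.get("CODEX_PROXY_EPHEMERAL", "true")),
        "codex_configs": parse_codex_configs(env.get("CODEX_PROXY_CODEX_CONFIGS", "")),
        "default_cwd": default_cwd,
        "allowed_roots": parse_allowed_roots(roots) if roots else [default_cwd],
    }


if __name__ == "__main__":
    print(load_settings(os.environ))
```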

API

Health:

curl -sS http://127.0.0.1:8765/health

Models:

curl -sS http://127.0.0.1:8765/v1/models

Readiness:

curl -sS http://127.0.0.1:8765/ready

Local counters:

curl -sS http://127.0.0.1:8765/metrics

Chat completion:

curl -sS http://127.0.0.1:8765/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"codex-local","messages":[{"role":"user","content":"Reply with exactly: pong"}]}'

Streaming chat completion:

curl -N http://127.0.0.1:8765/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"codex-local","stream":true,"messages":[{"role":"user","content":"Reply with exactly: pong"}]}'

Streaming responses use OpenAI-compatible SSE events:

data: {"object":"chat.completion.chunk",...} for assistant chunks
data: [DONE] when the response is complete
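A client can consume these events with a small parser. The sketch below assumes the standard OpenAI chunk shape (choices[].delta.content) and takes any iterable of text lines, such as the decoded lines of a streaming HTTP response; the function name iter_content is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield assistant text pieces from SSE lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


if __name__ == "__main__":
    sample = [
        'data: {"object":"chat.completion.chunk","choices":[{"delta":{"content":"pong"}}]}',
        "data: [DONE]",
    ]
    print("".join(iter_content(sample)))
```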

With the default exec engine, streaming happens only at the HTTP layer: the proxy emits SSE events, but the assistant answer comes from codex exec --json. If Codex emits only the final assistant text for a request, the streamed content chunk arrives after Codex completes.

With --engine app-server, the proxy maps Codex item/agentMessage/delta notifications to OpenAI-compatible SSE content chunks. This is experimental because Codex's app-server protocol is itself experimental.

Compatibility

codex-api-proxy is OpenAI-compatible for the local chat-completions shape, not a complete OpenAI API implementation.

Supported:

GET /v1/models
POST /v1/chat/completions
model
messages
stream
metadata.cwd for request-scoped working directory selection inside --allowed-root
OpenAI-compatible non-streaming response envelope
OpenAI-compatible SSE chunk envelope for streaming responses
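To illustrate what working directory selection inside --allowed-root means, here is a sketch of a containment check for a requested metadata.cwd. It is an assumption about the general technique, not the project's implementation; resolve_cwd and its parameters are hypothetical names.

```python
from pathlib import Path
from typing import Iterable, Optional


def resolve_cwd(
    requested: Optional[str],
    default_cwd: str,
    allowed_roots: Iterable[str],
) -> Path:
    """Return the resolved cwd, or raise if it escapes every allowed root."""
    candidate = Path(requested or default_cwd).expanduser().resolve()
    for root in allowed_roots:
        root_path = Path(root).expanduser().resolve()
        # Resolving first means ".." segments and symlinks cannot bypass the check.
        if candidate == root_path or root_path in candidate.parents:
            return candidate
    raise ValueError(f"cwd {candidate} is outside the allowed roots")
```

A request would then carry the directory in its body, for example {"model": "codex-local", "messages": [...], "metadata": {"cwd": "/path/inside/allowed/root"}}, where the path shown is a placeholder.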

Accepted but currently ignored:

temperature
top_p
max_tokens
presence_penalty
frequency_penalty

Not supported:

tools and tool_choice
response_format
n greater than one
stop
embeddings, responses, assistants, files, batches, audio, images, and other OpenAI endpoints
accurate token usage; the response currently returns zero token counts because Codex CLI does not expose stable token accounting through this path

The app-server engine starts a fresh Codex thread for each API request and archives it after completion. Callers must include the full chat history in messages; codex-api-proxy does not preserve conversation state between API requests.
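Because the proxy keeps no conversation state, the client has to accumulate history itself. A minimal sketch of that pattern follows; the Conversation class is hypothetical, and send stands in for whatever function posts the messages list to /v1/chat/completions and returns the assistant text.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class Conversation:
    """Client-side chat history that is resent in full on every turn."""

    def __init__(self, send: Callable[[List[Message]], str]) -> None:
        self._send = send
        self.messages: List[Message] = []

    def ask(self, text: str) -> str:
        self.messages.append({"role": "user", "content": text})
        # Send a copy of the whole history, not just the latest message.
        reply = self._send(list(self.messages))
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```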

OpenAI Python SDK smoke test:

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8765/v1", api_key="local-secret")

response = client.chat.completions.create(
    model="codex-local",
    messages=[{"role": "user", "content": "Reply with exactly: pong"}],
)
print(response.choices[0].message.content)

When no --api-key is configured, most OpenAI SDKs still require a placeholder api_key; any non-empty value is fine.

Operations

Use /health for a lightweight process check and /ready for a readiness check that includes the selected engine and Codex executable availability. Use /metrics for local JSON counters:

requests_total
requests_ok
requests_error
errors_by_status
engine
uptime_seconds
app_server_pool_started

Daemon logs are written to ~/.codex-api-proxy/codex-api-proxy.log by default. codex-api-proxy does not rotate logs itself; use your OS log rotation mechanism if you run it long-term.

Latency logs:

Each chat completion writes a single-line JSON log with logger codex_api_proxy.latency and event chat_completion_latency. Streaming responses also write chat_completion_first_sse when the first SSE chunk is yielded.

For background daemon runs, inspect:

rg 'codex_api_proxy.latency|chat_completion_latency|chat_completion_first_sse' ~/.codex-api-proxy/codex-api-proxy.log

Important fields:

request_id: correlates latency lines for the same request
stream: whether the request used stream: true
engine: exec or app-server
phases_ms.cwd_resolve: cwd validation time
phases_ms.prompt_build: OpenAI messages to Codex prompt conversion time
phases_ms.queue_wait: time waiting for local admission before engine execution
phases_ms.codex_exec: time spent inside codex exec
phases_ms.app_server_exec: time spent inside the app-server worker turn
phases_ms.codex_command_build: Codex command construction time
phases_ms.codex_process_spawn: local subprocess spawn time
phases_ms.codex_stdin_write: prompt write and stdin close time
phases_ms.codex_first_stdout_event: elapsed time from Codex IO start until the first non-empty stdout JSONL line
phases_ms.codex_first_assistant_event: elapsed time from Codex IO start until the first assistant message event
phases_ms.codex_stdout_read: total time spent reading Codex stdout until EOF
phases_ms.codex_process_wait: time waiting for the Codex process after stdout EOF
phases_ms.codex_communicate: total Codex subprocess IO time
phases_ms.codex_output_parse: Codex JSONL final-message parse time
phases_ms.response_build: response object/SSE setup time
phases_ms.total: total server-side request time before response is ready
time_to_first_sse_ms: stream request time until the first SSE chunk is yielded
time_to_first_content_sse_ms: app-server stream request time until the first content chunk is yielded
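For quick analysis, the latency lines can be parsed with a short script. This sketch assumes each log line contains one JSON object with an "event" key and a nested "phases_ms" object, possibly preceded by a logger prefix; the exact line layout is not documented here, so adjust the parsing to what your log actually contains. The function names are hypothetical.

```python
import json
from typing import Any, Dict, Iterable, List


def parse_latency_lines(lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Collect chat_completion_latency records from raw log lines."""
    records = []
    for line in lines:
        start = line.find("{")
        if start < 0:
            continue  # no JSON object on this line
        try:
            record = json.loads(line[start:])
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict) and record.get("event") == "chat_completion_latency":
            records.append(record)
    return records


def mean_total_ms(records: Iterable[Dict[str, Any]]) -> float:
    """Average of phases_ms.total over the given records."""
    totals = [r["phases_ms"]["total"] for r in records if "total" in r.get("phases_ms", {})]
    return sum(totals) / len(totals) if totals else 0.0
```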

Chat completion with bearer auth, for a proxy started with an API key:

curl -sS http://127.0.0.1:8765/v1/chat/completions \
  -H 'Authorization: Bearer local-secret' \
  -H 'Content-Type: application/json' \
  -d '{"model":"codex-local","messages":[{"role":"user","content":"Reply with exactly: pong"}]}'

Project details

Release history

0.1.2: Jun 17, 2026
0.1.1: Jun 16, 2026
0.1.0 (this version): Jun 16, 2026

Download files

Source Distribution

codex_api_proxy-0.1.0.tar.gz (36.3 kB view details)

Uploaded Jun 16, 2026 Source

Built Distribution

codex_api_proxy-0.1.0-py3-none-any.whl (25.6 kB view details)

Uploaded Jun 16, 2026 Python 3

File details

Details for the file codex_api_proxy-0.1.0.tar.gz.

File metadata

Download URL: codex_api_proxy-0.1.0.tar.gz
Upload date: Jun 16, 2026
Size: 36.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for codex_api_proxy-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`5aa1974f236715bbf9295e93f7ea217c0bbb1e3dbdb19b4e845ffe096cf36bb7`
MD5	`560b6bc7a9dd17376dae9fe1c53dd1a0`
BLAKE2b-256	`f7dbe61172495c5bc0e83a035cbc4e27bbbca5d10af4b2eeda0659540cf3c4da`

File details

Details for the file codex_api_proxy-0.1.0-py3-none-any.whl.

File metadata

Download URL: codex_api_proxy-0.1.0-py3-none-any.whl
Upload date: Jun 16, 2026
Size: 25.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.11

File hashes

Hashes for codex_api_proxy-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4f2e16fe0b6db49946554a2382c532de5ce85e27c9bf4a03afd5ad23b39d4e6b`
MD5	`b162e5761e4b94824610eeb2d0203838`
BLAKE2b-256	`963efbb1f3b7a2db4552fc80eaf5bcf680693c13f3a44bfa8260c4257a56293f`
