Skip to main content

Responses API ↔ Chat Completions translation bridge for Codex CLI

Project description

codex-relay

A lightweight Rust proxy that translates the OpenAI Responses API (used by Codex CLI) into the Chat Completions API, letting Codex work with any OpenAI-compatible provider — DeepSeek, Kimi, Qwen, Mistral, Groq, xAI, OpenRouter, and more.

Why

Codex CLI speaks the OpenAI Responses API, which is an OpenAI-proprietary stateful protocol. Every other provider exposes the standard Chat Completions API. codex-relay sits between Codex and your chosen provider, translating on the fly — no code changes to Codex required.

Install

# From PyPI — prebuilt binary for your platform
pip install codex-relay

# From crates.io
cargo install codex-relay

Quick start

1. Start the relay

CODEX_RELAY_UPSTREAM=https://api.deepseek.com/v1 \
CODEX_RELAY_API_KEY=$DEEPSEEK_API_KEY \
CODEX_RELAY_PORT=4446 \
codex-relay

On startup, the relay logs the available upstream models and prints a hint:

ℹ upstream models: deepseek-chat, deepseek-reasoner
⚠  To configure Codex with model metadata, run:  codex-relay --print-config --upstream ...

2. Generate your Codex config

codex-relay --print-config \
  --upstream https://api.deepseek.com/v1 \
  --api-key $DEEPSEEK_API_KEY

This prints a ready-to-use ~/.codex/config.toml snippet that includes model_properties for every upstream model, so Codex knows model capabilities and you won't see the "Model metadata … not found" warning.

If you prefer to write the config by hand, here is the minimal form:

model = "deepseek-chat"
model_provider = "deepseek-relay"

[model_providers.deepseek-relay]
name = "DeepSeek"
base_url = "http://127.0.0.1:4446/v1"
wire_api = "responses"
env_key = "DEEPSEEK_API_KEY"

[model_properties."deepseek-chat"]
context_window = 262144
max_context_window = 1048576
supports_parallel_tool_calls = true
supports_reasoning_summaries = false
input_modalities = ["text"]

⚠️ Without model_properties, Codex CLI defaults to fallback metadata for any model it doesn't recognize natively. This can degrade performance, tool-call reliability, and context-window management. The relay logs a reminder at startup and offers --print-config to eliminate this class of problem entirely.

3. Use Codex normally — it routes through the relay transparently.

CLI reference

Flag Env var Default Description
--port CODEX_RELAY_PORT 4444 Listen port
--upstream CODEX_RELAY_UPSTREAM https://openrouter.ai/api/v1 Upstream Chat Completions base URL
--api-key CODEX_RELAY_API_KEY (empty) API key forwarded to upstream
--model-map CODEX_RELAY_MODEL_MAP (empty) Comma-separated source:target model name translations
--print-config (none) Print a Codex config snippet with model_properties and exit
--session-ttl-hours CODEX_RELAY_SESSION_TTL_HOURS 168 Retain idle previous_response_id history and reasoning state for this many hours
--max-sessions CODEX_RELAY_MAX_SESSIONS 256 Maximum completed response histories retained for continuation
--max-session-memory-mb CODEX_RELAY_MAX_SESSION_MEMORY_MB 512 Approximate memory budget for retained session/reasoning state

Supported providers

Provider Base URL Suggested port
DeepSeek https://api.deepseek.com/v1 4446
Kimi (Moonshot) https://api.moonshot.cn/v1 4447
Qwen https://dashscope.aliyuncs.com/compatible-mode/v1 4448
Mistral https://api.mistral.ai/v1 4449
Groq https://api.groq.com/openai/v1 4450
xAI https://api.x.ai/v1 4451
OpenRouter https://openrouter.ai/api/v1 4452

Any OpenAI-compatible endpoint works.

Features

  • Streaming — full SSE streaming with correct event sequencing
  • Tool calls — accumulates streaming deltas and emits structured function_call items
  • Parallel tool calls — consecutive function_call input items merged into one assistant message
  • Reasoning models — preserves reasoning_content across turns (Kimi k2.6, DeepSeek-R1)
  • Model catalog — proxies /v1/models from the upstream provider
  • Auto-config--print-config generates a complete Codex config with model metadata

Configuration

Variable Default Description
CODEX_RELAY_PORT 4444 Port to listen on
CODEX_RELAY_UPSTREAM https://openrouter.ai/api/v1 Upstream Chat Completions base URL
CODEX_RELAY_API_KEY (empty) API key forwarded to upstream
CODEX_RELAY_MODEL_MAP (empty) Comma-separated source:target model name translations (e.g., gpt-5.4:deepseek-v4-pro)
CODEX_RELAY_TOOL_DENYLIST (empty) Comma-separated tool names to remove before forwarding tools to the upstream model
CODEX_RELAY_SESSION_TTL_HOURS 168 Retain idle session/reasoning state for this many hours
CODEX_RELAY_MAX_SESSIONS 256 Maximum completed response histories retained for previous_response_id
CODEX_RELAY_MAX_SESSION_MEMORY_MB 512 Approximate memory budget for retained session/reasoning state
RUST_LOG codex_relay=info Log verbosity

Python API

from codex_relay import start

proc = start(port=4446, upstream="https://api.deepseek.com/v1", api_key="sk-...")
# ... use Codex ...
proc.terminate()

Testing

Two layers — offline tests pin behavior against captured Codex wire-shape; live tests pin behavior against real provider APIs.

Debugging tool round-trips

For tool-routing issues, enable debug logs:

RUST_LOG=codex_relay=debug codex-relay

The relay logs tool names only, never tool arguments or message content:

  • response tools=... — tools received from Codex's Responses API request
  • upstream tools=... — tools forwarded to the Chat Completions upstream
  • upstream function_calls=... — function calls returned by a blocking upstream response
  • upstream stream function_calls=... — function calls returned by a streaming upstream response

These lines are useful for checking whether a tool such as spawn_agent was preserved by the relay, and whether the failure happened before or after the model selected that tool.

Subagent tool routing

Codex subagent tools such as spawn_agent, wait_agent, and close_agent are runtime tools. The relay can preserve them in the tool schema and round-trip the model's selected function call, but it cannot reliably detect whether the local Codex app-server daemon is new enough to execute those calls.

If Codex shows unsupported call: spawn_agent, first verify that the Codex CLI and app-server daemon versions match. A stale daemon can expose a newer tool schema to the model while lacking the handler that executes the returned call. Also check your Codex config: [features] subagents = true is not recognized; use [features] multi_agent = true only if you need to override the default.

As an escape hatch for affected runtimes, remove unsupported tools before they reach the upstream model:

CODEX_RELAY_TOOL_DENYLIST=spawn_agent,wait_agent,close_agent codex-relay

The denylist matches the tool name forwarded to Chat Completions. Namespaced MCP tools use their flattened name, for example mcp__codex_apps__github_fetch_issue.

Offline (always green, default cargo test)

Replays Codex CLI fixtures through the translation layer and asserts role/tool/reasoning behavior. Each fixture pins a Codex CLI version under tests/fixtures/codex_<major>_<minor>_<patch>/.

cargo test

Live (gated on provider API key, #[ignore] by default)

Spawns the relay binary on a random port, points it at the real provider, and exercises /v1/models, blocking + streaming, tool calls, and (for thinking models) the reasoning_content round-trip via an in-process recording proxy.

DEEPSEEK_API_KEY=sk-... cargo test --test compat_deepseek_live -- --ignored --test-threads=1

Regenerating fixtures after a Codex upgrade

  1. Add a debug dump to the relay (write body bytes from handle_responses to a file before parsing).
  2. Run a real codex exec against it; copy inbound_*.json to a new tests/fixtures/codex_<major>_<minor>_<patch>/ folder.
  3. Trim each payload down to the smallest one that exercises the feature you want to lock in.
  4. Add a row to tests/fixtures/VERSIONS.md and a test pointing at the new directory.

The old fixture directory stays as a regression net so the relay keeps working with the previous Codex CLI release.

Disclaimer

This project is not affiliated with, endorsed by, or sponsored by OpenAI. "Codex" refers to OpenAI Codex CLI, an open-source project licensed under Apache-2.0. codex-relay is an independent, community-built translation proxy.

Contributors

  • myk5010 — system/developer message ordering fix and model name mapping (#4)
  • qcnhy — streaming usage and MCP namespace bug reports plus independent verification (#5, #6)

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

codex_relay-0.2.4-py3-none-win_amd64.whl (2.7 MB view details)

Uploaded Python 3Windows x86-64

codex_relay-0.2.4-py3-none-manylinux_2_28_x86_64.whl (3.2 MB view details)

Uploaded Python 3manylinux: glibc 2.28+ x86-64

codex_relay-0.2.4-py3-none-manylinux_2_28_aarch64.whl (3.1 MB view details)

Uploaded Python 3manylinux: glibc 2.28+ ARM64

codex_relay-0.2.4-py3-none-macosx_11_0_arm64.whl (3.0 MB view details)

Uploaded Python 3macOS 11.0+ ARM64

codex_relay-0.2.4-py3-none-macosx_10_12_x86_64.whl (3.0 MB view details)

Uploaded Python 3macOS 10.12+ x86-64

File details

Details for the file codex_relay-0.2.4-py3-none-win_amd64.whl.

File metadata

  • Download URL: codex_relay-0.2.4-py3-none-win_amd64.whl
  • Upload date:
  • Size: 2.7 MB
  • Tags: Python 3, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for codex_relay-0.2.4-py3-none-win_amd64.whl
Algorithm Hash digest
SHA256 cd0e1a4e2bc90cab73c53e8646ffcf103c27bc72d07e9271aaba0b941f045627
MD5 a74125db6683c17d1fcff72b1b97c421
BLAKE2b-256 c348e1e4cc685d57256ad90c00004c9c2f94859ffdc88beba59e1b9a32479f2b

See more details on using hashes here.

Provenance

The following attestation bundles were made for codex_relay-0.2.4-py3-none-win_amd64.whl:

Publisher: publish.yml on MetaFARS/codex-relay

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file codex_relay-0.2.4-py3-none-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for codex_relay-0.2.4-py3-none-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 cc35de98129e72462eb141ed50029d06ced9cffc30fd4140c7659e0a7b92dd0d
MD5 f3b4e865aa929420d1b81f150818a419
BLAKE2b-256 f7311fb5f11b6270cbb7893257e3b76e0e6249846736f69cc7fe2cf6786ea99b

See more details on using hashes here.

Provenance

The following attestation bundles were made for codex_relay-0.2.4-py3-none-manylinux_2_28_x86_64.whl:

Publisher: publish.yml on MetaFARS/codex-relay

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file codex_relay-0.2.4-py3-none-manylinux_2_28_aarch64.whl.

File metadata

File hashes

Hashes for codex_relay-0.2.4-py3-none-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 f5a297f0721224e9dde679ab943e149707f74cae4d23ff39c22f280a87159f38
MD5 3071554bc132ab8005b6ddce0ec8a926
BLAKE2b-256 4ee79e8e5ccf88eb220203bc2f6aa2be8d5eb0b59a594b2948000e8ebdbb55ad

See more details on using hashes here.

Provenance

The following attestation bundles were made for codex_relay-0.2.4-py3-none-manylinux_2_28_aarch64.whl:

Publisher: publish.yml on MetaFARS/codex-relay

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file codex_relay-0.2.4-py3-none-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for codex_relay-0.2.4-py3-none-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 1d004a48377ca4187a1ab5ef94a2e81bb47798a22f3271a0925218b03fe52f8d
MD5 ae02f1eaf756085532d9aa95ac544081
BLAKE2b-256 6f15316e89bdcecde0f5334489404c5d7ba3bf169d5b4464bac3bc834fae7788

See more details on using hashes here.

Provenance

The following attestation bundles were made for codex_relay-0.2.4-py3-none-macosx_11_0_arm64.whl:

Publisher: publish.yml on MetaFARS/codex-relay

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file codex_relay-0.2.4-py3-none-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for codex_relay-0.2.4-py3-none-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 e762e66210bb4aef3ece2e9ed2903d33a2f6fe7115a2982008b90035c205b2ce
MD5 8aa5d7eaff03c1c6c378ae26e69ddeab
BLAKE2b-256 513c7b38767766b50ef728a00b93d651f602cd2c916ee11804bd71beba430df0

See more details on using hashes here.

Provenance

The following attestation bundles were made for codex_relay-0.2.4-py3-none-macosx_10_12_x86_64.whl:

Publisher: publish.yml on MetaFARS/codex-relay

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page