Responses API ↔ Chat Completions translation bridge for Codex CLI

These details have not been verified by PyPI

Project description

codex-relay

A lightweight Rust proxy that translates the OpenAI Responses API (used by Codex CLI) into the Chat Completions API, letting Codex work with any OpenAI-compatible provider — DeepSeek, Kimi, Qwen, Mistral, Groq, xAI, OpenRouter, and more.

Why

Codex CLI speaks the OpenAI Responses API, which is an OpenAI-proprietary stateful protocol. Every other provider exposes the standard Chat Completions API. codex-relay sits between Codex and your chosen provider, translating on the fly — no code changes to Codex required.

Install

# From PyPI — prebuilt binary for your platform
pip install codex-relay

# From crates.io
cargo install codex-relay

Quick start

1. Start the relay

CODEX_RELAY_UPSTREAM=https://api.deepseek.com/v1 \
CODEX_RELAY_API_KEY=$DEEPSEEK_API_KEY \
CODEX_RELAY_PORT=4446 \
codex-relay

On startup, the relay logs the available upstream models and prints a hint:

ℹ upstream models: deepseek-chat, deepseek-reasoner
⚠  To configure Codex with model metadata, run:  codex-relay --print-config --upstream ...

2. Generate your Codex config

codex-relay --print-config \
  --upstream https://api.deepseek.com/v1 \
  --api-key $DEEPSEEK_API_KEY

This prints a ready-to-use ~/.codex/config.toml snippet that includes model_properties for every upstream model, so Codex knows model capabilities and you won't see the "Model metadata … not found" warning.

If you prefer to write the config by hand, here is the minimal form:

model = "deepseek-chat"
model_provider = "deepseek-relay"

[model_providers.deepseek-relay]
name = "DeepSeek"
base_url = "http://127.0.0.1:4446/v1"
wire_api = "responses"
env_key = "DEEPSEEK_API_KEY"

[model_properties."deepseek-chat"]
context_window = 262144
max_context_window = 1048576
supports_parallel_tool_calls = true
supports_reasoning_summaries = false
input_modalities = ["text"]

⚠️ Without model_properties, Codex CLI defaults to fallback metadata for any model it doesn't recognize natively. This can degrade performance, tool-call reliability, and context-window management. The relay logs a reminder at startup and offers --print-config to eliminate this class of problem entirely.

3. Use Codex normally — it routes through the relay transparently.

CLI reference

Flag	Env var	Default	Description
`--port`	`CODEX_RELAY_PORT`	`4444`	Listen port
`--upstream`	`CODEX_RELAY_UPSTREAM`	`https://openrouter.ai/api/v1`	Upstream Chat Completions base URL
`--api-key`	`CODEX_RELAY_API_KEY`	(empty)	API key forwarded to upstream
`--upstream-extra-params`	`CODEX_RELAY_UPSTREAM_EXTRA_PARAMS`	(empty)	JSON object merged into each upstream Chat Completions request
`--drop-upstream-params`	`CODEX_RELAY_DROP_PARAMS`	(empty)	JSON array of top-level upstream request parameters to remove
`--model-map`	`CODEX_RELAY_MODEL_MAP`	(empty)	Comma-separated `source:target` model name translations
`--print-config`	(none)	—	Print a Codex config snippet with `model_properties` and exit
`--session-ttl-hours`	`CODEX_RELAY_SESSION_TTL_HOURS`	`168`	Retain idle `previous_response_id` history and reasoning state for this many hours
`--max-sessions`	`CODEX_RELAY_MAX_SESSIONS`	`256`	Maximum completed response histories retained for continuation
`--max-session-memory-mb`	`CODEX_RELAY_MAX_SESSION_MEMORY_MB`	`512`	Approximate memory budget for retained session/reasoning state

Supported providers

Provider	Base URL	Suggested port
DeepSeek	`https://api.deepseek.com/v1`	4446
Kimi (Moonshot)	`https://api.moonshot.cn/v1`	4447
GLM (Zhipu)	`https://open.bigmodel.cn/api/coding/paas/v4`	4453
Qwen	`https://dashscope.aliyuncs.com/compatible-mode/v1`	4448
Mistral	`https://api.mistral.ai/v1`	4449
Groq	`https://api.groq.com/openai/v1`	4450
xAI	`https://api.x.ai/v1`	4451
OpenRouter	`https://openrouter.ai/api/v1`	4452

Any OpenAI-compatible endpoint works.

Upstream request parameters

Some providers expose non-standard Chat Completions parameters. You can merge top-level JSON fields into every upstream request, and optionally drop generated top-level fields before the merge. For example, to disable DeepSeek V4 thinking/reasoning mode:

CODEX_RELAY_UPSTREAM_EXTRA_PARAMS='{"thinking":{"type":"disabled"}}' \
CODEX_RELAY_DROP_PARAMS='["reasoning_effort"]' \
codex-relay --upstream https://api.deepseek.com/v1 --api-key "$DEEPSEEK_API_KEY"

Features

Streaming — full SSE streaming with correct event sequencing
Tool calls — accumulates streaming deltas and emits structured function_call items
Parallel tool calls — consecutive function_call input items merged into one assistant message
Reasoning models — streams reasoning_content (or the reasoning alias) as Responses reasoning summaries and preserves it across turns (Kimi k2.6, DeepSeek-R1, GLM). For GLM/Zhipu models the relay automatically sends thinking: {"type": "enabled"}, since GLM otherwise suppresses reasoning under Codex's system prompt
Model catalog — proxies /v1/models from the upstream provider
Auto-config — --print-config generates a complete Codex config with model metadata

Configuration

Variable	Default	Description
`CODEX_RELAY_PORT`	`4444`	Port to listen on
`CODEX_RELAY_UPSTREAM`	`https://openrouter.ai/api/v1`	Upstream Chat Completions base URL
`CODEX_RELAY_API_KEY`	(empty)	API key forwarded to upstream
`CODEX_RELAY_UPSTREAM_EXTRA_PARAMS`	(empty)	JSON object merged into each upstream Chat Completions request body
`CODEX_RELAY_DROP_PARAMS`	(empty)	JSON array of top-level upstream request parameter names to remove before forwarding
`CODEX_RELAY_MODEL_MAP`	(empty)	Comma-separated `source:target` model name translations (e.g., `gpt-5.4:deepseek-v4-pro`)
`CODEX_RELAY_TOOL_DENYLIST`	(empty)	Comma-separated tool names to remove before forwarding tools to the upstream model
`CODEX_RELAY_SESSION_TTL_HOURS`	`168`	Retain idle session/reasoning state for this many hours
`CODEX_RELAY_MAX_SESSIONS`	`256`	Maximum completed response histories retained for `previous_response_id`
`CODEX_RELAY_MAX_SESSION_MEMORY_MB`	`512`	Approximate memory budget for retained session/reasoning state
`CODEX_RELAY_HISTORY_STORE`	`memory`	Retained history backend: `memory` or `disk`
`CODEX_RELAY_HISTORY_DIR`	`.codex-relay-history`	Directory for disk-backed history records
`RUST_LOG`	`codex_relay=info`	Log verbosity

Python API

from codex_relay import start

proc = start(port=4446, upstream="https://api.deepseek.com/v1", api_key="sk-...")
# ... use Codex ...
proc.terminate()

Testing

Two layers — offline tests pin behavior against captured Codex wire-shape; live tests pin behavior against real provider APIs.

Debugging tool round-trips

For tool-routing issues, enable debug logs:

RUST_LOG=codex_relay=debug codex-relay

The relay logs tool names only, never tool arguments or message content:

response tools=... — tools received from Codex's Responses API request
upstream tools=... — tools forwarded to the Chat Completions upstream
upstream function_calls=... — function calls returned by a blocking upstream response
upstream stream function_calls=... — function calls returned by a streaming upstream response

These lines are useful for checking whether a tool such as spawn_agent was preserved by the relay, and whether the failure happened before or after the model selected that tool.

Disk-backed history

By default, codex-relay keeps retained previous_response_id histories and reasoning lookups in memory. For longer-running processes or deeper debugging, you can opt into an inspectable on-disk store:

CODEX_RELAY_HISTORY_STORE=disk \
CODEX_RELAY_HISTORY_DIR=.codex-relay-history \
codex-relay

The disk backend writes JSON records under:

.codex-relay-history/
  sessions/
  reasoning/
  turns/

Session records contain the translated Chat Completions messages retained for a response id. Reasoning records keep call-id and turn-fingerprint lookups used to round-trip provider reasoning content. The relay keeps only an in-memory index for disk-backed entries and loads payloads on demand.

Treat this directory as sensitive: records may contain prompts, tool outputs, and other conversation data. The same TTL/count/byte retention knobs apply to disk-backed records, and evicted entries are removed from disk.

Subagent tool routing

Codex subagent tools such as spawn_agent, wait_agent, and close_agent are runtime tools. The relay can preserve them in the tool schema and round-trip the model's selected function call, but it cannot reliably detect whether the local Codex app-server daemon is new enough to execute those calls.

If Codex shows unsupported call: spawn_agent, first verify that the Codex CLI and app-server daemon versions match. A stale daemon can expose a newer tool schema to the model while lacking the handler that executes the returned call. Also check your Codex config: [features] subagents = true is not recognized; use [features] multi_agent = true only if you need to override the default.

As an escape hatch for affected runtimes, remove unsupported tools before they reach the upstream model:

CODEX_RELAY_TOOL_DENYLIST=spawn_agent,wait_agent,close_agent codex-relay

The denylist matches the tool name forwarded to Chat Completions. Namespaced MCP tools use their flattened name, for example mcp__codex_apps__github-_fetch_issue.

Offline (always green, default cargo test)

Replays Codex CLI fixtures through the translation layer and asserts role/tool/reasoning behavior. Each fixture pins a Codex CLI version under tests/fixtures/codex_<major>_<minor>_<patch>/.

cargo test

Live (gated on provider API key, #[ignore] by default)

Spawns the relay binary on a random port, points it at the real provider, and exercises /v1/models, blocking + streaming, tool calls, and (for thinking models) the reasoning_content round-trip via an in-process recording proxy.

DEEPSEEK_API_KEY=sk-... cargo test --test compat_deepseek_live -- --ignored --test-threads=1

Regenerating fixtures after a Codex upgrade

Add a debug dump to the relay (write body bytes from handle_responses to a file before parsing).
Run a real codex exec against it; copy inbound_*.json to a new tests/fixtures/codex_<major>_<minor>_<patch>/ folder.
Trim each payload down to the smallest one that exercises the feature you want to lock in.
Add a row to tests/fixtures/VERSIONS.md and a test pointing at the new directory.

The old fixture directory stays as a regression net so the relay keeps working with the previous Codex CLI release.

Disclaimer

This project is not affiliated with, endorsed by, or sponsored by OpenAI. "Codex" refers to OpenAI Codex CLI, an open-source project licensed under Apache-2.0. codex-relay is an independent, community-built translation proxy.

Contributors

myk5010 — system/developer message ordering fix and model name mapping (#4)
qcnhy — streaming usage, MCP namespace bug reports, namespace tool-routing analysis, and independent verification (#5, #6, #17)
JasonC93 — subagent tool-routing and spawned-agent context isolation reports (#10, #12)
ma-buting — namespace tool-name separator fix (#19)
SaladDay — prompt-cache accounting debug logs (#22)
Cherno76 — prompt-cache hit tokens in Responses API usage (#23)

License

MIT

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

0.4.1

Jul 2, 2026

This version

0.4.0

Jul 2, 2026

0.3.6

Jun 27, 2026

0.3.5

Jun 27, 2026

0.3.4

Jun 26, 2026

0.3.3

Jun 8, 2026

0.3.2

Jun 6, 2026

0.3.1

Jun 4, 2026

0.3.0

Jun 2, 2026

0.2.4

Jun 1, 2026

0.2.3

May 29, 2026

0.2.2

May 28, 2026

0.2.1

May 24, 2026

0.2.0

May 14, 2026

0.1.9

May 11, 2026

0.1.8

May 7, 2026

0.1.7

May 7, 2026

0.1.6

May 7, 2026

0.1.5

May 7, 2026

0.1.4

Apr 24, 2026

0.1.3

Apr 24, 2026

0.1.2

Apr 24, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

codex_relay-0.4.0-py3-none-win_amd64.whl (2.8 MB view details)

Uploaded Jul 2, 2026 Python 3Windows x86-64

codex_relay-0.4.0-py3-none-manylinux_2_28_x86_64.whl (3.3 MB view details)

Uploaded Jul 2, 2026 Python 3manylinux: glibc 2.28+ x86-64

codex_relay-0.4.0-py3-none-manylinux_2_28_aarch64.whl (3.2 MB view details)

Uploaded Jul 2, 2026 Python 3manylinux: glibc 2.28+ ARM64

codex_relay-0.4.0-py3-none-macosx_11_0_arm64.whl (3.0 MB view details)

Uploaded Jul 2, 2026 Python 3macOS 11.0+ ARM64

codex_relay-0.4.0-py3-none-macosx_10_12_x86_64.whl (3.1 MB view details)

Uploaded Jul 2, 2026 Python 3macOS 10.12+ x86-64

File details

Details for the file codex_relay-0.4.0-py3-none-win_amd64.whl.

File metadata

Download URL: codex_relay-0.4.0-py3-none-win_amd64.whl
Upload date: Jul 2, 2026
Size: 2.8 MB
Tags: Python 3, Windows x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for codex_relay-0.4.0-py3-none-win_amd64.whl
Algorithm	Hash digest
SHA256	`ebb42c4d08c8eec963252efe157bd0ef888a610ae2854c0b2f3a0f2188fc00d0`
MD5	`97ac35396e722ba320f0f76d72300a0e`
BLAKE2b-256	`2a6f89427c2b7be743c9d6bca8a29b12b9aadbeae8c85ef69971996016b4eb7c`

See more details on using hashes here.

Provenance

The following attestation bundles were made for codex_relay-0.4.0-py3-none-win_amd64.whl:

Publisher: publish.yml on MetaFARS/codex-relay

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: codex_relay-0.4.0-py3-none-win_amd64.whl
- Subject digest: ebb42c4d08c8eec963252efe157bd0ef888a610ae2854c0b2f3a0f2188fc00d0
- Sigstore transparency entry: 2044278172
- Sigstore integration time: Jul 2, 2026
Source repository:
- Permalink: MetaFARS/codex-relay@5cf9a13e2e84fe3fb73ab8dcb10e2c947f0635b5
- Branch / Tag: refs/tags/v0.4.0
- Owner: https://github.com/MetaFARS
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@5cf9a13e2e84fe3fb73ab8dcb10e2c947f0635b5
- Trigger Event: push

File details

Details for the file codex_relay-0.4.0-py3-none-manylinux_2_28_x86_64.whl.

File metadata

Download URL: codex_relay-0.4.0-py3-none-manylinux_2_28_x86_64.whl
Upload date: Jul 2, 2026
Size: 3.3 MB
Tags: Python 3, manylinux: glibc 2.28+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for codex_relay-0.4.0-py3-none-manylinux_2_28_x86_64.whl
Algorithm	Hash digest
SHA256	`85c446430a62a53129223d7f3832748489442f445853f03e157f8200a0de1b12`
MD5	`0bbaee90365dd730aaae0abac9de5055`
BLAKE2b-256	`9d1da99a0fcdf45c75fc578ceb91b83a2d69c4c361a0866c52e5e05c7835dc95`

See more details on using hashes here.

Provenance

The following attestation bundles were made for codex_relay-0.4.0-py3-none-manylinux_2_28_x86_64.whl:

Publisher: publish.yml on MetaFARS/codex-relay

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: codex_relay-0.4.0-py3-none-manylinux_2_28_x86_64.whl
- Subject digest: 85c446430a62a53129223d7f3832748489442f445853f03e157f8200a0de1b12
- Sigstore transparency entry: 2044278129
- Sigstore integration time: Jul 2, 2026
Source repository:
- Permalink: MetaFARS/codex-relay@5cf9a13e2e84fe3fb73ab8dcb10e2c947f0635b5
- Branch / Tag: refs/tags/v0.4.0
- Owner: https://github.com/MetaFARS
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@5cf9a13e2e84fe3fb73ab8dcb10e2c947f0635b5
- Trigger Event: push

File details

Details for the file codex_relay-0.4.0-py3-none-manylinux_2_28_aarch64.whl.

File metadata

Download URL: codex_relay-0.4.0-py3-none-manylinux_2_28_aarch64.whl
Upload date: Jul 2, 2026
Size: 3.2 MB
Tags: Python 3, manylinux: glibc 2.28+ ARM64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for codex_relay-0.4.0-py3-none-manylinux_2_28_aarch64.whl
Algorithm	Hash digest
SHA256	`21414bcee2f718a52af1f42b4fd64fb9b4e741d46aab8df04ccfaa544e21a0d7`
MD5	`e6aa214c5094515288f2319082c6a248`
BLAKE2b-256	`e8b12cc57206dc9718af90c55ba1d4c942294b54f5de20c5b9d135bff134dcda`

See more details on using hashes here.

Provenance

The following attestation bundles were made for codex_relay-0.4.0-py3-none-manylinux_2_28_aarch64.whl:

Publisher: publish.yml on MetaFARS/codex-relay

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: codex_relay-0.4.0-py3-none-manylinux_2_28_aarch64.whl
- Subject digest: 21414bcee2f718a52af1f42b4fd64fb9b4e741d46aab8df04ccfaa544e21a0d7
- Sigstore transparency entry: 2044278148
- Sigstore integration time: Jul 2, 2026
Source repository:
- Permalink: MetaFARS/codex-relay@5cf9a13e2e84fe3fb73ab8dcb10e2c947f0635b5
- Branch / Tag: refs/tags/v0.4.0
- Owner: https://github.com/MetaFARS
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@5cf9a13e2e84fe3fb73ab8dcb10e2c947f0635b5
- Trigger Event: push

File details

Details for the file codex_relay-0.4.0-py3-none-macosx_11_0_arm64.whl.

File metadata

Download URL: codex_relay-0.4.0-py3-none-macosx_11_0_arm64.whl
Upload date: Jul 2, 2026
Size: 3.0 MB
Tags: Python 3, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for codex_relay-0.4.0-py3-none-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`27c1e02fca6176f5b281da3d3abd2f95c9948e56271b2c0f3b33d5d750fdbf3d`
MD5	`27cd0b3adbe25b7245ca07d11a56635d`
BLAKE2b-256	`5b6c1a4daeec305c2e76779a1e4d6156ac6ff1c90c0951e0b2e186c8d8ccc1ab`

See more details on using hashes here.

Provenance

The following attestation bundles were made for codex_relay-0.4.0-py3-none-macosx_11_0_arm64.whl:

Publisher: publish.yml on MetaFARS/codex-relay

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: codex_relay-0.4.0-py3-none-macosx_11_0_arm64.whl
- Subject digest: 27c1e02fca6176f5b281da3d3abd2f95c9948e56271b2c0f3b33d5d750fdbf3d
- Sigstore transparency entry: 2044278104
- Sigstore integration time: Jul 2, 2026
Source repository:
- Permalink: MetaFARS/codex-relay@5cf9a13e2e84fe3fb73ab8dcb10e2c947f0635b5
- Branch / Tag: refs/tags/v0.4.0
- Owner: https://github.com/MetaFARS
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@5cf9a13e2e84fe3fb73ab8dcb10e2c947f0635b5
- Trigger Event: push

File details

Details for the file codex_relay-0.4.0-py3-none-macosx_10_12_x86_64.whl.

File metadata

Download URL: codex_relay-0.4.0-py3-none-macosx_10_12_x86_64.whl
Upload date: Jul 2, 2026
Size: 3.1 MB
Tags: Python 3, macOS 10.12+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for codex_relay-0.4.0-py3-none-macosx_10_12_x86_64.whl
Algorithm	Hash digest
SHA256	`e0e7524ef0450bdb4293138a701ee32eb49db3a480b1cc3a62dbfb3b15bfafc4`
MD5	`c679fa2e1491ef20173c26701c8d344f`
BLAKE2b-256	`597f6535f53177eda79af707a793eed7765a14de890e7b08776657572cc239a6`

See more details on using hashes here.

Provenance

The following attestation bundles were made for codex_relay-0.4.0-py3-none-macosx_10_12_x86_64.whl:

Publisher: publish.yml on MetaFARS/codex-relay

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: codex_relay-0.4.0-py3-none-macosx_10_12_x86_64.whl
- Subject digest: e0e7524ef0450bdb4293138a701ee32eb49db3a480b1cc3a62dbfb3b15bfafc4
- Sigstore transparency entry: 2044278073
- Sigstore integration time: Jul 2, 2026
Source repository:
- Permalink: MetaFARS/codex-relay@5cf9a13e2e84fe3fb73ab8dcb10e2c947f0635b5
- Branch / Tag: refs/tags/v0.4.0
- Owner: https://github.com/MetaFARS
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@5cf9a13e2e84fe3fb73ab8dcb10e2c947f0635b5
- Trigger Event: push

codex-relay 0.4.0

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers

Project description

codex-relay

Why

Install

Quick start

CLI reference

Supported providers

Upstream request parameters

Features

Configuration

Python API

Testing

Debugging tool round-trips

Disk-backed history

Subagent tool routing

Disclaimer

Contributors

License

Project details

Verified details

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distributions

Built Distributions

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance