Local Responses proxy for OpenAI Codex CLI: folds gpt-5.5 518n-2 reasoning truncation (516 degradation) via the official openai_base_url wiring — no provider change, WebSocket-first, no fallback noise.

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

dzshzx

These details have not been verified by PyPI

Project description

codex-516-guard

English · 简体中文

A tiny local Responses proxy for the OpenAI Codex CLI that cures the gpt-5.5 "516" reasoning-truncation degradation — while leaving your model_provider untouched, so session grouping, remote compaction and remote-control keep working.

uv tool install codex-516-guard      # install
codex-516-guard                      # run (127.0.0.1:8787)
# then add one line to ~/.codex/config.toml:  openai_base_url = "http://127.0.0.1:8787/v1"

Credits. The detection-and-continue idea comes from neteroster/CodexCont (MIT) — thank you. This project is an independent, from-scratch implementation that keeps the built-in provider intact; see Differences.

The problem: gpt-5.5 "516" degradation

On the OpenAI Codex CLI, gpt-5.5's reasoning sometimes gets cut short at a very specific token count — reasoning_tokens == 518 * n − 2 (i.e. 516, 1034, 1552, …). When a turn lands on that fingerprint, the model stops thinking early and the answer quality drops sharply. It is an upstream issue with no official fix (openai/codex#30364).

codex-516-guard sits on 127.0.0.1 between Codex and the upstream Responses API. When it sees a turn truncate on the 518n−2 fingerprint, it makes the model keep thinking and folds the extra rounds into a single downstream response, so Codex sees one clean, complete answer.

How it works

The proxy streams every upstream round and runs a small state machine (guard/fold.py):

Detect. At the end of each round it reads usage.output_tokens_details.reasoning_tokens. If it equals 518n − 2 (with 1 ≤ n ≤ 6, and at most 3 continuation rounds), the round was truncated.
Continue. It discards that round's tentative output (the message / tool calls — they were produced on truncated thinking), then replays the round's reasoning items (including encrypted_content) plus a single phase:"commentary" assistant message ("Continue thinking...") as the next round's input. That nudges the model to resume reasoning where it left off.
Fold. Reasoning is streamed live to Codex the whole time; only the clean final round's output is flushed. The terminal event is rebuilt as if the whole thing were one response — input/cached come from round 1 (so it never looks like a blown context window), reasoning is summed, and the true cumulative cost is recorded under metadata.proxy_billed_usage.

Wiring: why the built-in provider stays intact

Codex is pointed at the proxy with one top-level config key, not a new provider:

# ~/.codex/config.toml  (top level, before the first [table])
openai_base_url = "http://127.0.0.1:8787/v1"

openai_base_url overrides the base URL of the built-in openai provider in place. This is the officially supported key (openai/codex#16719; the same-name [model_providers.openai] override is rejected by the maintainers, and the OPENAI_BASE_URL env var was removed). Because the provider id stays openai:

your conversation history is not re-bucketed/hidden by provider,
remote compaction keeps working (supports_remote_compaction stays true),
remote-control is unaffected (it uses the separate chatgpt_base_url).

Differences from CodexCont

The 518n−2 detection + fold-continuation mechanism is CodexCont's idea; the implementation here is new and diverges on a few deliberate points:

	codex-516-guard	CodexCont
Codex wiring	top-level `openai_base_url` (built-in provider unchanged)	a new `[model_providers]` entry (history hidden per-provider, remote-control unusable, remote compaction lost)
Downstream transport	WebSocket-first — full `responses_websockets` protocol, plus SSE fallback	SSE only (Codex tries ws → 405 → ~5 reconnect warnings per session, then falls back)
zstd request bodies (0.142.x built-in provider)	decompressed natively, no Codex config change	needs `[features] enable_request_compression = false`
`GET /v1/models` (model-catalog refresh)	passed through (`/v1/*`)	not proxied (silently fails, relies on cache)
Continuation	commentary method only	commentary + legacy tool-pair + cross-turn repair, more knobs

Install

Requires uv (which manages Python for you) and the Codex CLI (ChatGPT OAuth login; tested on 0.142.x).

uv tool install codex-516-guard          # from PyPI
# or straight from source:
# uv tool install git+https://github.com/dzshzx/codex-516-guard

uv puts the executable in its bin dir (~/.local/bin on Unix/macOS; on Windows run where.exe codex-516-guard; uv tool update-shell adds it to PATH). Then:

codex-516-guard                          # run in foreground (default 127.0.0.1:8787)
codex-516-guard --port 8790 --log-level debug

Wire Codex to it (one line in ~/.codex/config.toml, see above), and you're done. Disable by commenting out the openai_base_url line and stopping the proxy. (If the key stays but the proxy is down, Codex errors on an unreachable upstream.)

Upgrade / uninstall: uv tool upgrade codex-516-guard / uv tool uninstall codex-516-guard.

Ports

The proxy's port must match the port in Codex's openai_base_url. If the default port (8787) is busy, the proxy exits with a clear message rather than drifting — a wired proxy that silently binds another port would just be unreachable. To use a different port, pass --port N and set openai_base_url to the same N.

--auto-port is for interactive one-off runs only: on a conflict it scans for the next free port and prints which openai_base_url to use. Don't use it for a wired service.

Autostart (optional, off by default)

Installing registers no autostart — it's entirely your choice.

codex-516-guard install-service     # register + start (current platform)
codex-516-guard uninstall-service   # remove

install-service picks the per-user, runs-in-your-session mechanism (a system service runs in a session with no user environment and can't reach the uv executable or your proxy settings under your profile):

Linux / WSL → a systemd user unit (~/.config/systemd/user/). Run loginctl enable-linger once to start it at boot without logging in. Manual equivalent: see systemd/codex-516-guard.service.example.
macOS → a launchd LaunchAgent in ~/Library/LaunchAgents/ (starts at login, in your GUI session). Load with launchctl bootstrap gui/$(id -u) <plist> / launchctl kickstart -k …; remove with launchctl bootout ….
Windows → prints manual steps, registers nothing (see below).

Windows autostart is manual — on purpose

A program that writes an autostart entry (Startup VBS / Run key / scheduled task) and launches a hidden process trips behavioral antivirus as trojan-like persistence — Kaspersky's proactive-defense module flags the launching python.exe as PDM:Trojan.Win32.Generic. A user-created Startup shortcut is trusted by the same AV.

So this package ships a windowless launcher, codex-516-guardw (a Windows GUI-subsystem exe — no console window at logon), and install-service just tells you how to point a shortcut at it:

Win+R → shell:startup (opens the Startup folder).
New → Shortcut → target = the path from where.exe codex-516-guardw (append --port N if you use a custom port).

Delete the shortcut to disable it.

Mirrored-networking shortcut (WSL ↔ Windows)

If your WSL2 uses networkingMode=mirrored, Windows and WSL share 127.0.0.1. Then you only need one proxy on either side — run it in WSL (as a systemd service), and on the Windows side just add the openai_base_url line to ~/.codex/config.toml pointing at the same 127.0.0.1:8787. No second proxy or Windows autostart needed (the only cost is that Windows Codex depends on the WSL proxy being up).

Verify

curl -sS http://127.0.0.1:8787/healthz            # {"ok":true,...}
journalctl --user -u codex-516-guard -f | grep -E 'round|done'   # Linux/WSL

A live fold looks like this (two chained 516s beaten, answer correct):

round 1: in=21550 out=664 reason=516 total=22214 | n=1 buffered=['function_call'] -> continue
round 2: in=22078 out=652 reason=516 total=22730 | n=1 buffered=['function_call'] -> continue
round 3: in=22606 out=566 reason=291 total=23172 | n=None buffered=[...] -> clean
done: 3 round(s) | ... | status=completed stop=natural

Develop

git clone https://github.com/dzshzx/codex-516-guard && cd codex-516-guard
uv sync
uv run python test_fold.py        # fold state-machine self-test → ALL PASS
uv run codex-516-guard            # run locally

Releases go out via PyPI Trusted Publishing (.github/workflows/release.yml, OIDC, no stored token): push a v* tag and it builds + publishes automatically.

Layout:

guard/fold.py — fingerprint detection + fold state machine (transport-agnostic; covered by test_fold.py).
guard/server.py — starlette transport: ws / SSE downstream, SSE upstream, zstd/gzip request decompression, /v1/* passthrough.
guard/cli.py — CLI entry (codex-516-guard; loopback only; auth passthrough, stores no credentials).

Security & disclaimer

The proxy is auth passthrough only: it forwards Codex's Authorization header and never reads, stores, or logs any credential.
It listens on the loopback address only — do not expose it on a non-loopback interface.
Unofficial: it depends on upstream behavior that isn't a public contract (the truncation fingerprint, the ws frame format). An OpenAI-side change may break it. Use at your own risk.
Continuation spends extra real tokens (see metadata.proxy_billed_usage); the guard bounds this with an n window and a 3-round cap.

Community

Built for and shared with the LINUX DO community, where the gpt-5.5 "516" degradation was diagnosed and discussed. Feedback and issues welcome there and on GitHub Issues.

License

MIT. Fully open source, no closed parts.

Mechanism credit: neteroster/CodexCont (MIT) — this project reuses its 518n−2 detect-and-continue idea with an independent, from-scratch implementation, and keeps the built-in provider intact (see Differences). CodexCont's MIT copyright notice is retained in LICENSE.

Project details

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

dzshzx

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.2.8

Jul 3, 2026

0.2.7

Jul 3, 2026

0.2.6

Jul 3, 2026

0.2.5

Jul 3, 2026

0.2.4

Jul 3, 2026

0.2.3

Jul 3, 2026

0.2.2

Jul 3, 2026

0.2.1

Jul 3, 2026

0.2.0

Jul 3, 2026

0.1.0

Jul 3, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

codex_516_guard-0.2.8.tar.gz (56.4 kB view details)

Uploaded Jul 3, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

codex_516_guard-0.2.8-py3-none-any.whl (20.4 kB view details)

Uploaded Jul 3, 2026 Python 3

File details

Details for the file codex_516_guard-0.2.8.tar.gz.

File metadata

Download URL: codex_516_guard-0.2.8.tar.gz
Upload date: Jul 3, 2026
Size: 56.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for codex_516_guard-0.2.8.tar.gz
Algorithm	Hash digest
SHA256	`71e34ffd95e0cf14751794bd8d0f163cdba31a4698f61192316dae3794f3cda9`
MD5	`1fc204f138f3bb7c29f4889f80d09cb2`
BLAKE2b-256	`149ab6127adcc785b7725337b6fb60b2e3d6fa1c8e2879bf382e58b2a56a6c1d`

See more details on using hashes here.

Provenance

The following attestation bundles were made for codex_516_guard-0.2.8.tar.gz:

Publisher: release.yml on dzshzx/codex-516-guard

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: codex_516_guard-0.2.8.tar.gz
- Subject digest: 71e34ffd95e0cf14751794bd8d0f163cdba31a4698f61192316dae3794f3cda9
- Sigstore transparency entry: 2063865883
- Sigstore integration time: Jul 3, 2026
Source repository:
- Permalink: dzshzx/codex-516-guard@c872b441d08322ab2a97e9322ecea47a0c883dec
- Branch / Tag: refs/tags/v0.2.8
- Owner: https://github.com/dzshzx
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@c872b441d08322ab2a97e9322ecea47a0c883dec
- Trigger Event: push

File details

Details for the file codex_516_guard-0.2.8-py3-none-any.whl.

File metadata

Download URL: codex_516_guard-0.2.8-py3-none-any.whl
Upload date: Jul 3, 2026
Size: 20.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for codex_516_guard-0.2.8-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a7e6fd736ea647037ffb7b18124cec7d073708f9986ad0168f699a87c37c8760`
MD5	`f9896c4813939b04741bfd87759fe63d`
BLAKE2b-256	`6f8377dad1be58c38fdd295f2512a06dd5cfb849a09881bad1520876ce8fe939`

See more details on using hashes here.

Provenance

The following attestation bundles were made for codex_516_guard-0.2.8-py3-none-any.whl:

Publisher: release.yml on dzshzx/codex-516-guard

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: codex_516_guard-0.2.8-py3-none-any.whl
- Subject digest: a7e6fd736ea647037ffb7b18124cec7d073708f9986ad0168f699a87c37c8760
- Sigstore transparency entry: 2063865923
- Sigstore integration time: Jul 3, 2026
Source repository:
- Permalink: dzshzx/codex-516-guard@c872b441d08322ab2a97e9322ecea47a0c883dec
- Branch / Tag: refs/tags/v0.2.8
- Owner: https://github.com/dzshzx
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@c872b441d08322ab2a97e9322ecea47a0c883dec
- Trigger Event: push

codex-516-guard 0.2.8

Navigation

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Project description

codex-516-guard

The problem: gpt-5.5 "516" degradation

How it works

Wiring: why the built-in provider stays intact

Differences from CodexCont

Install

Ports

Autostart (optional, off by default)

Windows autostart is manual — on purpose

Mirrored-networking shortcut (WSL ↔ Windows)

Verify

Develop

Security & disclaimer

Community

License

Project details

Verified details

Project links

GitHub Statistics

Maintainers

Unverified details

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance