Security-first self-hosted privacy masking and routing proxy for LLM traffic.

Project description

stronk-gateway

Security-first self-hosted masking and rehydration proxy for large language model traffic.

stronk-gateway sits between callers and upstream providers. It detects sensitive input, applies per-type policy, forwards only masked payloads, and rehydrates model output before returning it to the caller.

This repo now also includes a separate operator control plane:

proxy-api: the caller-facing masking proxy
admin-api: sanitized monitoring API
admin-ui: modern light/dark monitoring workspace
admin-gateway: authenticated reverse-proxy entrypoint for the control plane

Implemented Provider Surfaces

POST /v1/chat/completions
POST /v1/responses
GET /v1/responses websocket upgrade for OpenAI Responses-style realtime clients
POST /anthropic/v1/messages
GET /health

Supported behavior:

non-streaming request masking and response rehydration
streaming SSE rehydration for the supported provider families
downstream OpenAI Responses websocket compatibility for response.create and response.append
real previous_response_id continuations are preserved for upstream /responses requests
local generate:false prewarm stays local and uses memory-only turn-state recovery instead of leaking synthetic response IDs upstream
request-scoped opaque placeholders
config-driven policy per detector type: allow, mask, block, route_local
deny-by-default behavior when upstream egress or local routing is unsafe
sanitized audit-event persistence for the optional control plane

Websocket Deployment Contract

Supported deployment modes:

single-process deployment
multi-process deployment with sticky-session affinity for one websocket turn

Supported reconnect semantics:

any client harness that can speak OpenAI Responses-style JSON text events can use the websocket bridge; the current x-codex-turn-state header name is compatibility carry-over, but the value itself is treated as a generic bounded opaque turn key
x-codex-turn-state is validated locally, kept process-local and memory-only, and never forwarded upstream
the last committed previous_response_id is reused only after a terminal event has been delivered locally
if a socket disconnects before terminal completion, the next reconnect resumes from the last committed state rather than the aborted in-flight turn
same-turn collisions are rejected with deterministic 409 invalid_websocket_turn; they are not serialized
session-cap overflow is rejected; existing sessions are not evicted

Explicitly unsupported in this plan:

non-sticky multi-worker websocket continuity
external shared turn-state storage
generic websocket passthrough

Safety Guarantees

Raw sensitive values are never forwarded upstream when a detection is masked or blocked.
Raw request bodies, response bodies, rehydrated text, headers, provider credentials, and placeholder maps are not persisted by default.
Upstream egress is fixed to configured provider base URLs; the caller cannot choose arbitrary upstream targets.
Secret classes default to block rather than mask.
If policy requires route_local and no local handler exists, the request is rejected instead of falling back upstream.
The admin plane is separate from proxy routes and is disabled by default.
The admin plane now uses first-party session-cookie auth in stronk-gateway-admin, while raw content traces remain behind the private same-origin bridge path.

V1 Privacy Boundary

In-scope privacy actions on caller-controlled, upstream-visible surfaces:

mask instructions
mask developer and system message content
reject top-level tools when sensitive text appears in descriptions, examples, or schema defaults
strip x-codex-turn-metadata
do not forward caller-supplied x-codex-turn-state
preserve authorization, x-api-key, openai-*, anthropic-*, and x-responsesapi-include-timing-metrics as transport/auth headers rather than privacy-scoped text surfaces
preserve payload id and previous_response_id as bounded identifier surfaces
preserve caller x-request-id only when it is already a bounded opaque identifier; otherwise replace it with a proxy-issued opaque request id

Still out of scope in v1:

neutralizing the public x-codex-turn-state header name itself
masking arbitrary provider-defined identifier fields beyond the explicit in-scope surfaces above

Control Plane

The operator surface is intentionally read-only in this phase. Provisioning and materialization happen outside the browser through the active deployment profile (gateway-only or gateway+CLIProxyAPI).

What it shows:

request counts, mask/block/rehydration totals, and error counts
detector and policy action mix
sanitized per-request events with endpoint, model, latency, counts, and touched JSON paths
safe config posture and recent control-plane access logs

What it does not show:

raw request bodies
raw response bodies
rehydrated plaintext
upstream Authorization or X-API-Key headers
placeholder-to-original mappings

Detection Coverage

Deterministic detectors are implemented first and augmented with a local Presidio + spaCy layer by default. The detector interface stays pluggable so richer local NER can still be added later without redesigning the pipeline.

Current detector set:

English and Chinese person names
Company and organization names
English and Chinese addresses
Emails
US and China phone numbers
API keys, including sk-..., sk-proj-..., sk-ant-..., and common provider key prefixes
Bearer tokens
JWT-like tokens

Architecture

src/stronk_gateway/redaction/ - detection, masking, placeholder vault, and structured payload traversal
src/stronk_gateway/policy/ - per-detector policy resolution
src/stronk_gateway/providers/ - fixed provider endpoint specs
src/stronk_gateway/proxy/ - upstream transport, header controls, SSE rehydration, and audit writes
src/stronk_gateway/admin/ - first-party session auth, SQLite-backed sanitized event store, and UI lookup helpers
src/stronk_gateway/app.py - proxy app factory
src/stronk_gateway/admin_app.py - separate admin app factory
web/ - React/Vite operator UI with light and dark mode
compose/ - local proxy + admin + Caddy stack
docs/ - architecture, threat model, and execution plans

Quick Start

Local Python + frontend

python3 -m venv .venv
source .venv/bin/activate
pip install -e .[dev]
python -m spacy download en_core_web_sm
python -m spacy download zh_core_web_sm
make ui-install
make ui-build
make test

Run the proxy:

make run

Run the admin API/UI separately:

STRONK_GATEWAY_AUDIT_STORAGE_ENABLED=true \
STRONK_GATEWAY_AUDIT_DB_PATH=./data/stronk-gateway-audit.sqlite3 \
STRONK_GATEWAY_ENABLE_ADMIN_API=true \
STRONK_GATEWAY_ENABLE_ADMIN_UI=true \
STRONK_GATEWAY_ADMIN_BOOTSTRAP_USER=operator \
STRONK_GATEWAY_ADMIN_BOOTSTRAP_PASSWORD_HASH="$(python - <<'PY'
from stronk_gateway.admin import hash_admin_password
print(hash_admin_password('change-me'))
PY
)" \
STRONK_GATEWAY_ADMIN_UI_DIR=./web/dist \
make admin-run

The admin app now owns first-party web login with a bootstrap username plus scrypt password hash. Running make admin-run is useful for local session/auth/UI work, but content-trace inspection still depends on the same-origin admin gateway path described below because /api/debug/* is bridged to the proxy runtime on a separate private listener.

Local Docker Compose

Choose an operator username:

export STRONK_GATEWAY_ADMIN_USER=operator

Generate an admin scrypt password hash:

export STRONK_GATEWAY_ADMIN_PASSWORD_HASH="$(python - <<'PY'
from stronk_gateway.admin import hash_admin_password
print(hash_admin_password('change-me'))
PY
)"

Start the stack:

docker compose -f compose/docker-compose.yml up --build

Default local endpoints:

proxy: http://127.0.0.1:8787
admin UI: http://127.0.0.1:8788

Default local behavior:

sign in through the admin web UI with STRONK_GATEWAY_ADMIN_USER plus the plaintext password you hashed locally
public :8787 stays inference-only
raw content traces are visible only after authenticated login through the admin gateway on :8788
the trace bridge is internal-only and does not mount raw trace routes on the public proxy listener

Release Publishing

The preferred release path now uses GitHub Actions for both GHCR image publishing and PyPI Trusted Publishing. The recorded release flow, plus the Bitwarden-backed local PyPI fallback, lives in docs/release-publishing.md.

The short version:

cd /Users/eyy/Documents/Work/Dev/repos/stronk-gateway
uv build
BWS_ACCESS_TOKEN="$BWS_STRONK_TERMINAL_ACCESS_TOKEN" \
bws run -- uv publish

Keep GHCR publishing in GitHub Actions; do not add local container-registry tokens to this flow.

Configuration

Core proxy environment variables:

STRONK_GATEWAY_OPENAI_UPSTREAM_BASE_URL
STRONK_GATEWAY_ANTHROPIC_UPSTREAM_BASE_URL
STRONK_GATEWAY_ALLOW_INSECURE_UPSTREAMS=false
STRONK_GATEWAY_ALLOW_NONSTANDARD_UPSTREAM_HOSTS=false
STRONK_GATEWAY_ENABLE_DEBUG_MASK_ENDPOINT=false
STRONK_GATEWAY_PRESIDIO_ENABLED=true
STRONK_GATEWAY_PRESIDIO_ENGLISH_MODEL=en_core_web_sm
STRONK_GATEWAY_PRESIDIO_CHINESE_MODEL=zh_core_web_sm
STRONK_GATEWAY_*_ACTION

Admin plane environment variables:

STRONK_GATEWAY_AUDIT_STORAGE_ENABLED=false
STRONK_GATEWAY_AUDIT_DB_PATH
STRONK_GATEWAY_AUDIT_MAX_EVENTS=2000
STRONK_GATEWAY_ENABLE_ADMIN_API=false
STRONK_GATEWAY_ENABLE_ADMIN_UI=false
STRONK_GATEWAY_ADMIN_AUTH_MODE=session
STRONK_GATEWAY_ADMIN_BOOTSTRAP_USER
STRONK_GATEWAY_ADMIN_BOOTSTRAP_PASSWORD_HASH
STRONK_GATEWAY_ADMIN_BOOTSTRAP_ROLES=admin,operator
STRONK_GATEWAY_ADMIN_TRACE_ALLOWED_ROLES=admin,operator
STRONK_GATEWAY_ADMIN_SESSION_COOKIE_NAME=stronk_admin_session
STRONK_GATEWAY_ADMIN_SESSION_IDLE_TTL_SECONDS=1800
STRONK_GATEWAY_ADMIN_SESSION_ABSOLUTE_TTL_SECONDS=43200
STRONK_GATEWAY_ADMIN_LOGIN_ATTEMPT_LIMIT=5
STRONK_GATEWAY_ADMIN_LOGIN_ATTEMPT_WINDOW_SECONDS=900
STRONK_GATEWAY_UNSAFE_DEBUG_GATEWAY_SECRET
STRONK_GATEWAY_ADMIN_ALLOWED_ROLES=admin,operator,auditor
STRONK_GATEWAY_ADMIN_ACCESS_LOG_MAX_ENTRIES=500
STRONK_GATEWAY_ADMIN_UI_DIR=./web/dist

Default policy:

email, phone, person_name, organization, address -> mask
api_key, bearer_token, jwt -> block

Tests

The suite includes:

unit coverage for detector behavior, overlap resolution, placeholder generation, policy parsing, audit summaries, event-store behavior, and SSE rehydration
integration coverage for all supported provider endpoints
websocket regression coverage for OpenAI Responses response.create, response.append, prewarm, invalid events, and incomplete upstream streams
websocket regression coverage for real previous_response_id continuations, fresh-chain resets, reconnect recovery via x-codex-turn-state, and binary-frame rejection
regression checks proving raw values do not leak into forwarded upstream payloads
admin-plane coverage for 401/403 auth enforcement and sanitized proxy-to-admin event flow
streaming tests for OpenAI chat, OpenAI responses, and Anthropic messages
bypass and collision canaries including zero-width-key variants and literal placeholder collisions

Benchmark Evidence

The local benchmark harness lives at python3 scripts/bench_proxy.py and writes row artifacts under docs/exec-plans/active/privacy-proxy-hardening-and-scalability-v1/artifacts/benchmarks/.

Measured local rows on 2026-03-28:

W1-http-chat: 15.27 requests/s, p95 latency 10649 ms, zero failures, blocked only because no historic pre-change baseline was captured
W2-http-responses: 14.33 requests/s, p95 latency 11030 ms, zero failures, blocked only because no historic pre-change baseline was captured
W3-sse-stream: 14.47 streams/s, p95 first-byte 4863 ms, zero leak counters, blocked only because no historic pre-change baseline was captured
W4-ws-sequential: 80 successful turns at 16 sessions x 5 turns, blocked because the black-box reconnect probe still ended with ConnectionClosedError instead of deterministically proving rollback semantics
W5-ws-abuse: all 24/24 scripted abusive requests were rejected and all 4/4 blocked-budget sockets closed, but the row remains blocked because black-box evidence cannot prove internal pre-parse and pre-redaction ordering
W6-audit-contention: 256.97 requests/s, p95 latency 1577 ms, zero database is locked failures, blocked on the missing historic baseline and because admin access logging means the read side is not a pure-read workload
W7-memory-bound: 9.80 requests/s, 40 truncation hits observed, blocked because the plan never encoded a numeric RSS envelope even though capture limits and truncation markers were exercised

These are local harness measurements, not an SLA. The repo does not claim a p95 regression comparison against pre-change behavior because that historic baseline was not captured before the hardening work.

Stronger Than PasteGuard

This repo now claims stronger behavior only where it is implemented and tested:

Explicit OpenAI responses endpoint coverage, not just chat completions.
OpenAI responses websocket compatibility on the same public path used by Codex-style clients.
Fixed upstream egress targets with request-header allowlists and redirect refusal.
Official upstream host pinning is on by default; non-standard compatible hosts require explicit opt-in.
Default secret handling is block, not best-effort masking.
Raw detection values are not reflected back through the debug path.
Streaming rehydration is covered for all three supported provider surfaces.
Rehydration is limited to human-readable assistant text paths; tool arguments stay masked by default.
Regression tests cover placeholder collisions, bypass attempts, and /responses as a first-class path.
The monitoring plane is separate from proxy routes, disabled by default, and stores sanitized telemetry only.

Threat Model Summary

Caller credentials for upstream providers are part of the data plane and must not be reused for admin authentication.
The admin plane now uses first-party session-cookie auth inside stronk-gateway-admin, backed by a bootstrap username plus scrypt password hash.
Raw content traces stay memory-only in the proxy runtime and are reachable only through the loopback/private admin gateway path plus the shared bridge secret.
The admin gateway should stay loopback-bound or privately networked by default. If you widen it, add real TLS and network controls first.
The SQLite event store persists sanitized events only. It is not a safe place for raw prompts, raw completions, or placeholder vault state.

Known Limitations

Name, organization, and address detection is heuristic. It is materially stronger than regex-only email and key detection, but it is not equivalent to a full local NER model.
Presidio + spaCy are enabled by default in this repo. Fresh environments must install en_core_web_sm and zh_core_web_sm or explicitly disable Presidio.
route_local is a clean policy boundary today, but the local-model execution path is still a scaffold and fails closed.
WebSocket support is intentionally scoped to OpenAI Responses-style JSON text events. stronk-gateway is not a generic websocket tunnel and does not currently support binary or audio frames.
The websocket bridge keeps the upstream side on HTTP plus Server-Sent Events (SSE). It does not yet proxy upstream websocket transports.
The public x-codex-turn-state header name is a compatibility carry-over. Its semantics are harness-neutral, but the header name itself is not yet neutralized.
Websocket continuity still depends on single-process deployment or sticky-session affinity for one turn. Non-sticky multi-worker continuity and shared external turn-state remain out of scope here.
The admin plane is read-only in this phase. There is no browser-based policy editor or request replay workflow.
The admin backend should stay on a private or loopback-bound network. The shared trace-bridge secret is a second trust signal, not a substitute for network boundaries and TLS.
The frontend build currently uses npm-managed assets and should be built as part of image creation or CI before enabling the admin UI.

Project details

Release history Release notifications | RSS feed

0.3.15a1 pre-release

Apr 10, 2026

This version

0.3.14

Apr 8, 2026

0.3.13

Apr 8, 2026

0.3.12

Apr 5, 2026

0.3.11

Apr 5, 2026

0.3.10

Apr 5, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

stronk_gateway-0.3.14.tar.gz (217.1 kB view details)

Uploaded Apr 8, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

stronk_gateway-0.3.14-py3-none-any.whl (76.8 kB view details)

Uploaded Apr 8, 2026 Python 3

File details

Details for the file stronk_gateway-0.3.14.tar.gz.

File metadata

Download URL: stronk_gateway-0.3.14.tar.gz
Upload date: Apr 8, 2026
Size: 217.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for stronk_gateway-0.3.14.tar.gz
Algorithm	Hash digest
SHA256	`876c5969c3dab29eb574731828c440ec54c6b899d95553ad7a89978015c06d4a`
MD5	`d9a6391ab847a24649f0975597c512cb`
BLAKE2b-256	`937a57c09985f63de4572b83c95c1d16b0d0cdfdc2d3f3be4a1cc402ac8ef34c`

See more details on using hashes here.

Provenance

The following attestation bundles were made for stronk_gateway-0.3.14.tar.gz:

Publisher: publish-release.yml on EYYCHEEV/stronk-gateway

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: stronk_gateway-0.3.14.tar.gz
- Subject digest: 876c5969c3dab29eb574731828c440ec54c6b899d95553ad7a89978015c06d4a
- Sigstore transparency entry: 1256504057
- Sigstore integration time: Apr 8, 2026
Source repository:
- Permalink: EYYCHEEV/stronk-gateway@6c22d4b6915ac51d93ba1afd4e1e192fb476a460
- Branch / Tag: refs/tags/v0.3.14
- Owner: https://github.com/EYYCHEEV
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-release.yml@6c22d4b6915ac51d93ba1afd4e1e192fb476a460
- Trigger Event: push

File details

Details for the file stronk_gateway-0.3.14-py3-none-any.whl.

File metadata

Download URL: stronk_gateway-0.3.14-py3-none-any.whl
Upload date: Apr 8, 2026
Size: 76.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for stronk_gateway-0.3.14-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c4f72f2723acb506c09c190a23001f9610ef6a9ef594c634756b59bd7972ff84`
MD5	`4a6a94cb60352b20b899537067b0341b`
BLAKE2b-256	`3b3a3998b555ce10f593eeb150ab8b85b3a807f948d0d9ee0063d64a7750e9b0`

See more details on using hashes here.

Provenance

The following attestation bundles were made for stronk_gateway-0.3.14-py3-none-any.whl:

Publisher: publish-release.yml on EYYCHEEV/stronk-gateway

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: stronk_gateway-0.3.14-py3-none-any.whl
- Subject digest: c4f72f2723acb506c09c190a23001f9610ef6a9ef594c634756b59bd7972ff84
- Sigstore transparency entry: 1256504180
- Sigstore integration time: Apr 8, 2026
Source repository:
- Permalink: EYYCHEEV/stronk-gateway@6c22d4b6915ac51d93ba1afd4e1e192fb476a460
- Branch / Tag: refs/tags/v0.3.14
- Owner: https://github.com/EYYCHEEV
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-release.yml@6c22d4b6915ac51d93ba1afd4e1e192fb476a460
- Trigger Event: push

stronk-gateway 0.3.14

Navigation

Verified details

Maintainers

Unverified details

Meta

Project description

stronk-gateway

Implemented Provider Surfaces

Websocket Deployment Contract

Safety Guarantees

V1 Privacy Boundary

Control Plane

Detection Coverage

Architecture

Quick Start

Local Python + frontend

Local Docker Compose

Release Publishing

Configuration

Tests

Benchmark Evidence

Stronger Than PasteGuard

Threat Model Summary

Known Limitations

Project details

Verified details

Maintainers

Unverified details

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance