Security-first self-hosted privacy masking and routing proxy for LLM traffic.
stronk-gateway
Security-first self-hosted masking and rehydration proxy for large language model traffic.
stronk-gateway sits between callers and upstream providers. It detects sensitive input, applies per-type policy, forwards only masked payloads, and rehydrates model output before returning it to the caller.
This repo now also includes a separate operator control plane:
- proxy-api: the caller-facing masking proxy
- admin-api: sanitized monitoring API
- admin-ui: modern light/dark monitoring workspace
- admin-gateway: authenticated reverse-proxy entrypoint for the control plane
Implemented Provider Surfaces
- POST /v1/chat/completions
- POST /v1/responses
- GET /v1/responses websocket upgrade for OpenAI Responses-style realtime clients
- POST /anthropic/v1/messages
- GET /health
Supported behavior:
- non-streaming request masking and response rehydration
- streaming SSE rehydration for the supported provider families
- downstream OpenAI Responses websocket compatibility for response.create and response.append
- real previous_response_id continuations are preserved for upstream /responses requests
- local generate:false prewarm stays local and uses memory-only turn-state recovery instead of leaking synthetic response IDs upstream
- request-scoped opaque placeholders
- config-driven policy per detector type: allow, mask, block, route_local
- deny-by-default behavior when upstream egress or local routing is unsafe
- sanitized audit-event persistence for the optional control plane
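The mask-then-rehydrate round trip with request-scoped opaque placeholders can be illustrated with a minimal sketch. Everything named here (PlaceholderVault, mask_emails, the [[KIND_nonce_n]] placeholder shape, and the simplified email pattern) is a hypothetical stand-in for illustration, not the project's actual API or placeholder format.

```python
import re
import secrets

# Hypothetical detector: a deliberately simple email pattern for illustration.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class PlaceholderVault:
    """Request-scoped, in-memory map between placeholders and original values."""

    def __init__(self) -> None:
        # A random per-request nonce keeps placeholders opaque and unlinkable
        # across requests.
        self._nonce = secrets.token_hex(4)
        self._by_placeholder = {}
        self._by_value = {}

    def placeholder_for(self, kind: str, value: str) -> str:
        # Reuse the same placeholder when a value repeats within one request.
        if value not in self._by_value:
            index = len(self._by_value) + 1
            token = f"[[{kind}_{self._nonce}_{index}]]"
            self._by_value[value] = token
            self._by_placeholder[token] = value
        return self._by_value[value]

    def rehydrate(self, text: str) -> str:
        # Swap every known placeholder back to its original value.
        for token, value in self._by_placeholder.items():
            text = text.replace(token, value)
        return text


def mask_emails(text: str, vault: PlaceholderVault) -> str:
    return EMAIL_RE.sub(lambda m: vault.placeholder_for("EMAIL", m.group(0)), text)


vault = PlaceholderVault()
masked = mask_emails("Contact alice@example.com or bob@example.org.", vault)
# Only `masked` would be forwarded upstream; the model reply is rehydrated locally.
reply = vault.rehydrate(f"I will write to {masked.split()[1]} first.")
```

The vault lives only as long as the request, which is what keeps placeholder-to-original mappings out of any persisted state.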
Websocket Deployment Contract
Supported deployment modes:
- single-process deployment
- multi-process deployment with sticky-session affinity for one websocket turn
Supported reconnect semantics:
- any client harness that can speak OpenAI Responses-style JSON text events can use the websocket bridge; the current x-codex-turn-state header name is a compatibility carry-over, but the value itself is treated as a generic bounded opaque turn key
- x-codex-turn-state is validated locally, kept process-local and memory-only, and never forwarded upstream
- the last committed previous_response_id is reused only after a terminal event has been delivered locally
- if a socket disconnects before terminal completion, the next reconnect resumes from the last committed state rather than the aborted in-flight turn
- same-turn collisions are rejected with deterministic 409 invalid_websocket_turn; they are not serialized
- session-cap overflow is rejected; existing sessions are not evicted
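The reconnect rules above can be read as a small state machine. The following is a sketch under stated assumptions, not the gateway's implementation: TurnStateStore, TurnRejected, and the 429 session_cap_exceeded code for cap overflow are hypothetical names chosen for illustration (the source only specifies 409 invalid_websocket_turn for collisions).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class TurnRejected(Exception):
    """Raised when a websocket turn must be refused."""

    def __init__(self, status: int, code: str) -> None:
        super().__init__(f"{status} {code}")
        self.status = status
        self.code = code


@dataclass
class TurnState:
    committed_response_id: Optional[str] = None
    in_flight: bool = False


@dataclass
class TurnStateStore:
    """Process-local, memory-only turn state keyed by an opaque turn key."""

    max_sessions: int = 2
    _sessions: Dict[str, TurnState] = field(default_factory=dict)

    def begin_turn(self, turn_key: str) -> Optional[str]:
        state = self._sessions.get(turn_key)
        if state is None:
            if len(self._sessions) >= self.max_sessions:
                # Overflow is rejected; existing sessions are never evicted.
                # The status and code here are hypothetical.
                raise TurnRejected(429, "session_cap_exceeded")
            state = self._sessions[turn_key] = TurnState()
        if state.in_flight:
            # Same-turn collisions are rejected, not queued.
            raise TurnRejected(409, "invalid_websocket_turn")
        state.in_flight = True
        return state.committed_response_id

    def commit(self, turn_key: str, response_id: str) -> None:
        # Call only after the terminal event was delivered to the client.
        state = self._sessions[turn_key]
        state.committed_response_id = response_id
        state.in_flight = False

    def abort(self, turn_key: str) -> None:
        # Disconnect before terminal completion: keep the last committed id.
        self._sessions[turn_key].in_flight = False


store = TurnStateStore(max_sessions=1)
store.begin_turn("turn-a")
try:
    store.begin_turn("turn-a")  # second concurrent use of the same turn key
    collision = None
except TurnRejected as exc:
    collision = (exc.status, exc.code)
```

Because the store is a plain in-process dictionary, it also shows why continuity needs a single process or sticky-session affinity: another worker would simply not see the committed id.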
Explicitly unsupported in this plan:
- non-sticky multi-worker websocket continuity
- external shared turn-state storage
- generic websocket passthrough
Safety Guarantees
- Raw sensitive values are never forwarded upstream when a detection is masked or blocked.
- Raw request bodies, response bodies, rehydrated text, headers, provider credentials, and placeholder maps are not persisted by default.
- Upstream egress is fixed to configured provider base URLs; the caller cannot choose arbitrary upstream targets.
- Secret classes default to block rather than mask.
- If policy requires route_local and no local handler exists, the request is rejected instead of falling back upstream.
- The admin plane is separate from proxy routes and is disabled by default.
- The admin plane now uses first-party session-cookie auth in stronk-gateway-admin, while raw content traces remain behind the private same-origin bridge path.
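To make the fixed-egress guarantee concrete, here is a minimal sketch of validating a configured upstream base URL at startup. The function names and the OFFICIAL_HOSTS table are assumptions for illustration; the two keyword flags loosely mirror the STRONK_GATEWAY_ALLOW_INSECURE_UPSTREAMS and STRONK_GATEWAY_ALLOW_NONSTANDARD_UPSTREAM_HOSTS settings, but the real checks may differ.

```python
from urllib.parse import urlsplit

# Assumption for illustration: the official API hosts of the two provider families.
OFFICIAL_HOSTS = {"openai": "api.openai.com", "anthropic": "api.anthropic.com"}


def validate_upstream_base_url(
    provider: str,
    base_url: str,
    *,
    allow_insecure: bool = False,
    allow_nonstandard_hosts: bool = False,
) -> str:
    """Return the normalized base URL or raise ValueError (deny by default)."""
    parts = urlsplit(base_url)
    if parts.scheme not in ("https", "http") or not parts.hostname:
        raise ValueError("upstream base URL must be an absolute http(s) URL")
    if parts.scheme == "http" and not allow_insecure:
        raise ValueError("plain-HTTP upstreams require explicit opt-in")
    if parts.username or parts.password or parts.query or parts.fragment:
        raise ValueError("credentials, query, and fragment are not allowed")
    if parts.hostname != OFFICIAL_HOSTS[provider] and not allow_nonstandard_hosts:
        raise ValueError("non-standard upstream hosts require explicit opt-in")
    return base_url.rstrip("/")


def upstream_url(base_url: str, fixed_path: str) -> str:
    # The path comes from a fixed provider spec, never from the caller.
    return f"{base_url}{fixed_path}"
```

The caller never supplies either argument of upstream_url; both come from configuration and the provider spec, which is what prevents arbitrary upstream targets.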
V1 Privacy Boundary
In-scope privacy actions on caller-controlled, upstream-visible surfaces:
- mask instructions
- mask developer and system message content
- reject top-level tools when sensitive text appears in descriptions, examples, or schema defaults
- strip x-codex-turn-metadata
- do not forward caller-supplied x-codex-turn-state
- preserve authorization, x-api-key, openai-*, anthropic-*, and x-responsesapi-include-timing-metrics as transport/auth headers rather than privacy-scoped text surfaces
- preserve payload id and previous_response_id as bounded identifier surfaces
- preserve caller x-request-id only when it is already a bounded opaque identifier; otherwise replace it with a proxy-issued opaque request id
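The header rules in this list can be sketched as an allowlist filter. This is an illustrative sketch only: build_upstream_headers, the 8 to 64 character bound for an "opaque identifier", and the req_ prefix are assumptions, and a real allowlist would also carry transport headers such as content-type.

```python
import re
import secrets

# Assumption: "bounded opaque identifier" means a short token over a safe alphabet.
OPAQUE_ID_RE = re.compile(r"[A-Za-z0-9_-]{8,64}")

DROPPED_HEADERS = {"x-codex-turn-metadata", "x-codex-turn-state"}
PRESERVED_EXACT = {"authorization", "x-api-key", "x-responsesapi-include-timing-metrics"}
PRESERVED_PREFIXES = ("openai-", "anthropic-")


def build_upstream_headers(caller_headers: dict) -> dict:
    """Allowlist-based header forwarding; anything not named is dropped."""
    upstream = {}
    for name, value in caller_headers.items():
        key = name.lower()
        if key in DROPPED_HEADERS:
            continue
        if key in PRESERVED_EXACT or key.startswith(PRESERVED_PREFIXES):
            upstream[key] = value
    request_id = next(
        (v for k, v in caller_headers.items() if k.lower() == "x-request-id"), ""
    )
    if not OPAQUE_ID_RE.fullmatch(request_id):
        # Replace free-form caller ids with a proxy-issued opaque id.
        request_id = f"req_{secrets.token_hex(12)}"
    upstream["x-request-id"] = request_id
    return upstream
```

Treating x-request-id as untrusted text matters because callers sometimes embed user names or emails in it, which would otherwise bypass payload masking.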
Still out of scope in v1:
- neutralizing the public x-codex-turn-state header name itself
- masking arbitrary provider-defined identifier fields beyond the explicit in-scope surfaces above
Control Plane
The operator surface is intentionally read-only in this phase.
Provisioning and materialization happen outside the browser through the active deployment profile (gateway-only or gateway+CLIProxyAPI).
What it shows:
- request counts, mask/block/rehydration totals, and error counts
- detector and policy action mix
- sanitized per-request events with endpoint, model, latency, counts, and touched JSON paths
- safe config posture and recent control-plane access logs
What it does not show:
- raw request bodies
- raw response bodies
- rehydrated plaintext
- upstream Authorization or X-API-Key headers
- placeholder-to-original mappings
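As a rough illustration of what a sanitized event can look like on disk, the sketch below stores only endpoint, model, latency, counts, and touched JSON paths in SQLite and prunes to a maximum row count. The table layout and the open_event_store and record_event helpers are hypothetical; only the 2000-row default is taken from STRONK_GATEWAY_AUDIT_MAX_EVENTS.

```python
import json
import sqlite3
import time


def open_event_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS events (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               ts REAL NOT NULL,
               endpoint TEXT NOT NULL,
               model TEXT NOT NULL,
               latency_ms INTEGER NOT NULL,
               counts_json TEXT NOT NULL,
               paths_json TEXT NOT NULL
           )"""
    )
    return conn


def record_event(conn, *, endpoint, model, latency_ms, counts, touched_paths,
                 max_events=2000):
    """Persist counts and JSON paths only; never values, bodies, or headers."""
    with conn:  # one transaction covers the insert and the prune
        conn.execute(
            "INSERT INTO events (ts, endpoint, model, latency_ms, counts_json, paths_json)"
            " VALUES (?, ?, ?, ?, ?, ?)",
            (time.time(), endpoint, model, latency_ms,
             json.dumps(counts, sort_keys=True), json.dumps(sorted(touched_paths))),
        )
        # Keep only the newest max_events rows.
        conn.execute(
            "DELETE FROM events WHERE id NOT IN"
            " (SELECT id FROM events ORDER BY id DESC LIMIT ?)",
            (max_events,),
        )
```

A touched path such as input[0].content tells an operator where masking happened without recording what was masked.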
Detection Coverage
Deterministic detectors are implemented first and augmented with a local Presidio + spaCy layer by default. The detector interface stays pluggable so richer local NER can still be added later without redesigning the pipeline.
Current detector set:
- English and Chinese person names
- Company and organization names
- English and Chinese addresses
- Emails
- US and China phone numbers
- API keys, including sk-..., sk-proj-..., sk-ant-..., and common provider key prefixes
- Bearer tokens
- JWT-like tokens
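A pluggable detector interface with deterministic detectors might look roughly like the sketch below. The Detector protocol, RegexDetector, detect_all, and all three patterns are illustrative assumptions; real key formats vary by provider, and the project's detectors are likely stricter.

```python
import re
from dataclasses import dataclass
from typing import List, Protocol


@dataclass(frozen=True)
class Detection:
    kind: str
    start: int
    end: int


class Detector(Protocol):
    def detect(self, text: str) -> List[Detection]: ...


class RegexDetector:
    def __init__(self, kind: str, pattern: str) -> None:
        self.kind = kind
        self._pattern = re.compile(pattern)

    def detect(self, text: str) -> List[Detection]:
        return [Detection(self.kind, m.start(), m.end())
                for m in self._pattern.finditer(text)]


# Illustrative patterns only.
DETECTORS: List[Detector] = [
    RegexDetector("api_key", r"\bsk-(?:proj-|ant-)?[A-Za-z0-9_-]{16,}"),
    RegexDetector("jwt", r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
    RegexDetector("email", r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
]


def detect_all(text: str) -> List[Detection]:
    found = [d for detector in DETECTORS for d in detector.detect(text)]
    # Longest match first at each start offset, then drop overlapping spans.
    found.sort(key=lambda d: (d.start, -(d.end - d.start)))
    result: List[Detection] = []
    for d in found:
        if not result or d.start >= result[-1].end:
            result.append(d)
    return result
```

Any object with a matching detect method satisfies the protocol, so an NER-backed detector could be appended to the list without touching detect_all.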
Architecture
- src/stronk_gateway/redaction/ - detection, masking, placeholder vault, and structured payload traversal
- src/stronk_gateway/policy/ - per-detector policy resolution
- src/stronk_gateway/providers/ - fixed provider endpoint specs
- src/stronk_gateway/proxy/ - upstream transport, header controls, SSE rehydration, and audit writes
- src/stronk_gateway/admin/ - first-party session auth, SQLite-backed sanitized event store, and UI lookup helpers
- src/stronk_gateway/app.py - proxy app factory
- src/stronk_gateway/admin_app.py - separate admin app factory
- web/ - React/Vite operator UI with light and dark mode
- compose/ - local proxy + admin + Caddy stack
- docs/ - architecture, threat model, and execution plans
Quick Start
Local Python + frontend
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
python -m spacy download en_core_web_sm
python -m spacy download zh_core_web_sm
make ui-install
make ui-build
make test
Run the proxy:
make run
Run the admin API/UI separately:
STRONK_GATEWAY_AUDIT_STORAGE_ENABLED=true \
STRONK_GATEWAY_AUDIT_DB_PATH=./data/stronk-gateway-audit.sqlite3 \
STRONK_GATEWAY_ENABLE_ADMIN_API=true \
STRONK_GATEWAY_ENABLE_ADMIN_UI=true \
STRONK_GATEWAY_ADMIN_BOOTSTRAP_USER=operator \
STRONK_GATEWAY_ADMIN_BOOTSTRAP_PASSWORD_HASH="$(python - <<'PY'
from stronk_gateway.admin import hash_admin_password
print(hash_admin_password('change-me'))
PY
)" \
STRONK_GATEWAY_ADMIN_UI_DIR=./web/dist \
make admin-run
The admin app now owns first-party web login with a bootstrap username plus scrypt password hash. Running make admin-run is useful for local session/auth/UI work, but content-trace inspection still depends on the same-origin admin gateway path described below because /api/debug/* is bridged to the proxy runtime on a separate private listener.
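For readers unfamiliar with scrypt password hashing, the following sketch shows the general technique using only the standard library. It is not the project's hash_admin_password: the cost parameters, the scrypt$N$r$p$salt$key encoding, and both function names are hypothetical, so hashes produced here should not be assumed compatible with STRONK_GATEWAY_ADMIN_BOOTSTRAP_PASSWORD_HASH.

```python
import base64
import hashlib
import hmac
import os

# Hypothetical parameters and encoding; the project's own hash format may differ.
N, R, P, KEY_LEN = 2**14, 8, 1, 32


def _b64(raw: bytes) -> str:
    return base64.b64encode(raw).decode()


def hash_password(password: str) -> str:
    # A fresh random salt per password defeats precomputed tables.
    salt = os.urandom(16)
    key = hashlib.scrypt(password.encode(), salt=salt, n=N, r=R, p=P, dklen=KEY_LEN)
    return f"scrypt${N}${R}${P}${_b64(salt)}${_b64(key)}"


def verify_password(password: str, encoded: str) -> bool:
    scheme, n, r, p, salt_b64, key_b64 = encoded.split("$")
    if scheme != "scrypt":
        return False
    salt, expected = base64.b64decode(salt_b64), base64.b64decode(key_b64)
    key = hashlib.scrypt(password.encode(), salt=salt,
                         n=int(n), r=int(r), p=int(p), dklen=len(expected))
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(key, expected)
```

Storing the cost parameters inside the encoded string lets them be raised later without invalidating existing hashes.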
Local Docker Compose
- Choose an operator username:
export STRONK_GATEWAY_ADMIN_USER=operator
- Generate an admin scrypt password hash:
export STRONK_GATEWAY_ADMIN_PASSWORD_HASH="$(python - <<'PY'
from stronk_gateway.admin import hash_admin_password
print(hash_admin_password('change-me'))
PY
)"
- Start the stack:
docker compose -f compose/docker-compose.yml up --build
Default local endpoints:
- proxy: http://127.0.0.1:8787
- admin UI: http://127.0.0.1:8788
Default local behavior:
- sign in through the admin web UI with STRONK_GATEWAY_ADMIN_USER plus the plaintext password you hashed locally
- public :8787 stays inference-only
- raw content traces are visible only after authenticated login through the admin gateway on :8788
- the trace bridge is internal-only and does not mount raw trace routes on the public proxy listener
Release Publishing
The preferred release path now uses GitHub Actions for both GHCR image publishing and PyPI Trusted Publishing. The recorded release flow, plus the Bitwarden-backed local PyPI fallback, lives in docs/release-publishing.md.
The short version:
cd /Users/eyy/Documents/Work/Dev/repos/stronk-gateway
uv build
BWS_ACCESS_TOKEN="$BWS_STRONK_TERMINAL_ACCESS_TOKEN" \
bws run -- uv publish
Keep GHCR publishing in GitHub Actions; do not add local container-registry tokens to this flow.
Configuration
Core proxy environment variables:
- STRONK_GATEWAY_OPENAI_UPSTREAM_BASE_URL
- STRONK_GATEWAY_ANTHROPIC_UPSTREAM_BASE_URL
- STRONK_GATEWAY_ALLOW_INSECURE_UPSTREAMS=false
- STRONK_GATEWAY_ALLOW_NONSTANDARD_UPSTREAM_HOSTS=false
- STRONK_GATEWAY_ENABLE_DEBUG_MASK_ENDPOINT=false
- STRONK_GATEWAY_PRESIDIO_ENABLED=true
- STRONK_GATEWAY_PRESIDIO_ENGLISH_MODEL=en_core_web_sm
- STRONK_GATEWAY_PRESIDIO_CHINESE_MODEL=zh_core_web_sm
- STRONK_GATEWAY_*_ACTION
Admin plane environment variables:
- STRONK_GATEWAY_AUDIT_STORAGE_ENABLED=false
- STRONK_GATEWAY_AUDIT_DB_PATH
- STRONK_GATEWAY_AUDIT_MAX_EVENTS=2000
- STRONK_GATEWAY_ENABLE_ADMIN_API=false
- STRONK_GATEWAY_ENABLE_ADMIN_UI=false
- STRONK_GATEWAY_ADMIN_AUTH_MODE=session
- STRONK_GATEWAY_ADMIN_BOOTSTRAP_USER
- STRONK_GATEWAY_ADMIN_BOOTSTRAP_PASSWORD_HASH
- STRONK_GATEWAY_ADMIN_BOOTSTRAP_ROLES=admin,operator
- STRONK_GATEWAY_ADMIN_TRACE_ALLOWED_ROLES=admin,operator
- STRONK_GATEWAY_ADMIN_SESSION_COOKIE_NAME=stronk_admin_session
- STRONK_GATEWAY_ADMIN_SESSION_IDLE_TTL_SECONDS=1800
- STRONK_GATEWAY_ADMIN_SESSION_ABSOLUTE_TTL_SECONDS=43200
- STRONK_GATEWAY_ADMIN_LOGIN_ATTEMPT_LIMIT=5
- STRONK_GATEWAY_ADMIN_LOGIN_ATTEMPT_WINDOW_SECONDS=900
- STRONK_GATEWAY_UNSAFE_DEBUG_GATEWAY_SECRET
- STRONK_GATEWAY_ADMIN_ALLOWED_ROLES=admin,operator,auditor
- STRONK_GATEWAY_ADMIN_ACCESS_LOG_MAX_ENTRIES=500
- STRONK_GATEWAY_ADMIN_UI_DIR=./web/dist
Default policy:
- email, phone, person_name, organization, address -> mask
- api_key, bearer_token, jwt -> block
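The default policy and its environment overrides can be sketched as follows. Several things here are assumptions rather than documented behavior: that the wildcard in STRONK_GATEWAY_*_ACTION is the upper-cased detector type, that unknown detector types fall back to block, and that a single block detection rejects the whole request. The resolve_action and decide helpers are hypothetical names.

```python
import os
from typing import Callable, Mapping, Optional, Set

DEFAULT_ACTIONS = {
    "email": "mask", "phone": "mask", "person_name": "mask",
    "organization": "mask", "address": "mask",
    "api_key": "block", "bearer_token": "block", "jwt": "block",
}
VALID_ACTIONS = {"allow", "mask", "block", "route_local"}


def resolve_action(kind: str, env: Mapping[str, str] = os.environ) -> str:
    # Assumption: the override variable is STRONK_GATEWAY_<TYPE>_ACTION.
    override = env.get(f"STRONK_GATEWAY_{kind.upper()}_ACTION")
    # Assumption: unknown detector types are blocked (deny by default).
    action = override.strip().lower() if override else DEFAULT_ACTIONS.get(kind, "block")
    if action not in VALID_ACTIONS:
        raise ValueError(f"invalid action {action!r} for detector {kind!r}")
    return action


def decide(kinds: Set[str], local_handler: Optional[Callable] = None,
           env: Mapping[str, str] = os.environ) -> str:
    """Collapse per-detection actions into one request-level decision."""
    actions = {resolve_action(k, env) for k in kinds}
    if "block" in actions:
        return "reject"
    if "route_local" in actions:
        # Fail closed: never fall back upstream when no local handler exists.
        return "route_local" if local_handler else "reject"
    return "forward_masked" if "mask" in actions else "forward"
```

Checking block before route_local keeps the strictest action in charge when a request triggers several detectors at once.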
Tests
The suite includes:
- unit coverage for detector behavior, overlap resolution, placeholder generation, policy parsing, audit summaries, event-store behavior, and SSE rehydration
- integration coverage for all supported provider endpoints
- websocket regression coverage for OpenAI Responses response.create, response.append, prewarm, invalid events, and incomplete upstream streams
- websocket regression coverage for real previous_response_id continuations, fresh-chain resets, reconnect recovery via x-codex-turn-state, and binary-frame rejection
- regression checks proving raw values do not leak into forwarded upstream payloads
- admin-plane coverage for 401/403 auth enforcement and sanitized proxy-to-admin event flow
- streaming tests for OpenAI chat, OpenAI responses, and Anthropic messages
- bypass and collision canaries including zero-width-key variants and literal placeholder collisions
Benchmark Evidence
The local benchmark harness is scripts/bench_proxy.py (run it with python3 scripts/bench_proxy.py). It writes row artifacts under docs/exec-plans/active/privacy-proxy-hardening-and-scalability-v1/artifacts/benchmarks/.
Measured local rows on 2026-03-28:
- W1-http-chat: 15.27 requests/s, p95 latency 10649 ms, zero failures, blocked only because no historic pre-change baseline was captured
- W2-http-responses: 14.33 requests/s, p95 latency 11030 ms, zero failures, blocked only because no historic pre-change baseline was captured
- W3-sse-stream: 14.47 streams/s, p95 first-byte 4863 ms, zero leak counters, blocked only because no historic pre-change baseline was captured
- W4-ws-sequential: 80 successful turns at 16 sessions x 5 turns, blocked because the black-box reconnect probe still ended with ConnectionClosedError instead of deterministically proving rollback semantics
- W5-ws-abuse: all 24/24 scripted abusive requests were rejected and all 4/4 blocked-budget sockets closed, but the row remains blocked because black-box evidence cannot prove internal pre-parse and pre-redaction ordering
- W6-audit-contention: 256.97 requests/s, p95 latency 1577 ms, zero "database is locked" failures, blocked on the missing historic baseline and because admin access logging means the read side is not a pure-read workload
- W7-memory-bound: 9.80 requests/s, 40 truncation hits observed, blocked because the plan never encoded a numeric RSS envelope even though capture limits and truncation markers were exercised
These are local harness measurements, not an SLA. The repo does not claim a p95 regression comparison against pre-change behavior because that historic baseline was not captured before the hardening work.
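For context on how figures like these are typically derived, the sketch below computes a nearest-rank p95 and a throughput value from raw samples. This is an assumption about method, not a description of scripts/bench_proxy.py, which may use a different percentile definition.

```python
def p95(samples_ms):
    """Nearest-rank 95th percentile: the smallest sample with at least
    95% of all samples at or below it."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def throughput(completed: int, wall_seconds: float) -> float:
    # Completed operations per second of wall-clock time.
    return completed / wall_seconds
```

With only a handful of samples the nearest-rank p95 is simply the maximum, which is one reason small local runs should not be read as an SLA.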
Stronger Than PasteGuard
This repo now claims stronger behavior only where it is implemented and tested:
- Explicit OpenAI responses endpoint coverage, not just chat completions.
- OpenAI responses websocket compatibility on the same public path used by Codex-style clients.
- Fixed upstream egress targets with request-header allowlists and redirect refusal.
- Official upstream host pinning is on by default; non-standard compatible hosts require explicit opt-in.
- Default secret handling is block, not best-effort masking.
- Raw detection values are not reflected back through the debug path.
- Streaming rehydration is covered for all three supported provider surfaces.
- Rehydration is limited to human-readable assistant text paths; tool arguments stay masked by default.
- Regression tests cover placeholder collisions, bypass attempts, and /responses as a first-class path.
- The monitoring plane is separate from proxy routes, disabled by default, and stores sanitized telemetry only.
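Streaming rehydration is harder than the non-streaming case because a placeholder can be split across two text deltas. The sketch below shows one way to buffer across chunk boundaries; the StreamingRehydrator class, the [[...]] delimiters, and the 64-character bound are assumptions for illustration and do not reflect the project's placeholder format.

```python
class StreamingRehydrator:
    """Rehydrate text deltas even when a placeholder is split across chunks."""

    OPEN, CLOSE = "[[", "]]"
    MAX_TOKEN_LEN = 64  # bound on buffered text so a stray "[[" cannot stall the stream

    def __init__(self, mapping: dict) -> None:
        self._mapping = mapping
        self._buffer = ""

    def feed(self, delta: str) -> str:
        self._buffer += delta
        out = []
        while self._buffer:
            start = self._buffer.find(self.OPEN)
            if start == -1:
                # Hold back a trailing "[" that might begin a placeholder.
                keep = 1 if self._buffer.endswith("[") else 0
                cut = len(self._buffer) - keep
                out.append(self._buffer[:cut])
                self._buffer = self._buffer[cut:]
                break
            out.append(self._buffer[:start])
            self._buffer = self._buffer[start:]
            end = self._buffer.find(self.CLOSE)
            if end == -1:
                if len(self._buffer) > self.MAX_TOKEN_LEN:
                    # Too long to be a placeholder: emit as literal text.
                    out.append(self._buffer)
                    self._buffer = ""
                break  # otherwise wait for more data
            # Use the innermost opener so "[[x [[TOKEN]]" still resolves TOKEN.
            token_start = self._buffer.rfind(self.OPEN, 0, end)
            out.append(self._buffer[:token_start])
            token = self._buffer[token_start:end + 2]
            out.append(self._mapping.get(token, token))
            self._buffer = self._buffer[end + 2:]
        return "".join(out)

    def flush(self) -> str:
        # Emit whatever is still buffered when the stream ends.
        rest, self._buffer = self._buffer, ""
        return rest


rehydrator = StreamingRehydrator({"[[EMAIL_ab12_1]]": "alice@example.com"})
chunks = ["Write to [[EMA", "IL_ab12_1]", "] today"]
streamed = "".join(rehydrator.feed(c) for c in chunks) + rehydrator.flush()
# streamed == "Write to alice@example.com today"
```

Unknown tokens pass through unchanged, so text that merely looks like a placeholder is never replaced with someone else's value.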
Threat Model Summary
- Caller credentials for upstream providers are part of the data plane and must not be reused for admin authentication.
- The admin plane now uses first-party session-cookie auth inside stronk-gateway-admin, backed by a bootstrap username plus scrypt password hash.
- Raw content traces stay memory-only in the proxy runtime and are reachable only through the loopback/private admin gateway path plus the shared bridge secret.
- The admin gateway should stay loopback-bound or privately networked by default. If you widen it, add real TLS and network controls first.
- The SQLite event store persists sanitized events only. It is not a safe place for raw prompts, raw completions, or placeholder vault state.
Known Limitations
- Name, organization, and address detection is heuristic. It is materially stronger than regex-only email and key detection, but it is not equivalent to a full local NER model.
- Presidio + spaCy are enabled by default in this repo. Fresh environments must install en_core_web_sm and zh_core_web_sm or explicitly disable Presidio.
- route_local is a clean policy boundary today, but the local-model execution path is still a scaffold and fails closed.
- WebSocket support is intentionally scoped to OpenAI Responses-style JSON text events. stronk-gateway is not a generic websocket tunnel and does not currently support binary or audio frames.
- The websocket bridge keeps the upstream side on HTTP plus Server-Sent Events (SSE). It does not yet proxy upstream websocket transports.
- The public x-codex-turn-state header name is a compatibility carry-over. Its semantics are harness-neutral, but the header name itself is not yet neutralized.
- Websocket continuity still depends on single-process deployment or sticky-session affinity for one turn. Non-sticky multi-worker continuity and shared external turn-state remain out of scope here.
- The admin plane is read-only in this phase. There is no browser-based policy editor or request replay workflow.
- The admin backend should stay on a private or loopback-bound network. The shared trace-bridge secret is a second trust signal, not a substitute for network boundaries and TLS.
- The frontend build currently uses npm-managed assets and should be built as part of image creation or CI before enabling the admin UI.