Redact PII and secrets from AI prompts, traces and tool-call arguments before they reach your loggers.
traceredact
Redact PII and secrets from AI prompts, agent traces and tool-call arguments before they reach your loggers / observability backend.
LLM apps log everything (prompts, agent traces, tool-call arguments) into
Langfuse, Helicone, Datadog, or your own DB. Customer PII and API keys leak into
those traces. traceredact is a small, dependency-light library that detects
and redacts that data deterministically, in-process, before it leaves your application.
It is content-based: it catches a sk-… key or a credit-card number even
when it sits under an innocuous JSON key — not just well-known field names.
A missed secret is a real incident, so detection is treated as safety-critical: bounded (ReDoS-safe) patterns, entropy fallback, Luhn/IBAN validation, and adversarial evasion fixtures.
Install
pip install traceredact # or: uv add traceredact
Usage (3 lines)
from traceredact import redact
result = redact({"args": {"email": "a@b.com", "key": "sk-1234567890abcdefABCDEFGH"}})
print(result.value) # {'args': {'email': '[REDACTED:pii]', 'key': '[REDACTED:secret]'}}
print(result.findings) # [Finding(detector_id='pii.email', json_path='args.email', ...), ...]
redact() accepts a string, dict, list, or any nested mix. The input is never
mutated; result.value is a redacted copy and result.findings lists every hit
with its detector_id, category, confidence, json_path and span.
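To make the copy-and-report behaviour concrete, here is a minimal standalone sketch of the idea. It is not traceredact's implementation: `redact_copy`, `Hit`, the simplified email pattern and the `[i]` notation for list indices are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Any, List, Tuple

# Simplified, bounded email pattern (an assumption, not the library's detector).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]{1,64}@[A-Za-z0-9.-]{1,253}\.[A-Za-z]{2,24}")


@dataclass(frozen=True)
class Hit:
    """Hypothetical stand-in for a finding: where the match was and its span."""
    json_path: str
    span: Tuple[int, int]


def redact_copy(value: Any):
    """Return (redacted_copy, hits) without modifying `value`."""
    hits: List[Hit] = []

    def walk(node: Any, path: str) -> Any:
        if isinstance(node, str):
            hits.extend(Hit(path, m.span()) for m in EMAIL.finditer(node))
            return EMAIL.sub("[REDACTED:pii]", node)
        if isinstance(node, dict):
            # Build a new dict; the caller's dict is only read.
            return {k: walk(v, f"{path}.{k}" if path else str(k)) for k, v in node.items()}
        if isinstance(node, list):
            return [walk(v, f"{path}[{i}]") for i, v in enumerate(node)]
        return node  # numbers, None, etc. pass through unchanged

    return walk(value, ""), hits
```

Because every container is rebuilt rather than edited, the caller can keep using the original object for the actual API call and hand only the copy to the logger.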
CLI (CI-gateable)
traceredact scan ./logs/ # report findings as a table; exit 1 if any
traceredact scan trace.json -f json # machine-readable output for CI
traceredact redact trace.json -o redacted.json
scan exits non-zero when anything is found, so you can gate a CI job on it.
SDK integrations
from openai import OpenAI
from traceredact.integrations.openai import wrap_openai
client = wrap_openai(OpenAI()) # prompts + completions now redacted in-flight
Also: traceredact.integrations.anthropic.wrap_anthropic(client) and
traceredact.integrations.langchain.RedactingCallbackHandler().
Async clients are supported via wrap_async_openai / wrap_async_anthropic.
Streaming
Redact a stream of text deltas without buffering the whole response — a secret spanning chunk boundaries is still caught (carry-over window):
from traceredact import redact_stream
for piece in redact_stream(token_deltas):  # also: redact_stream_async(...)
    log(piece)
# OpenAI async streams:
from traceredact.integrations.openai import redact_content_stream
async for safe_text in redact_content_stream(await client.chat.completions.create(..., stream=True)):
    ...
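The carry-over window can be illustrated with a small standalone generator. This is a sketch of the general technique, not the library's code: `redact_stream_sketch`, the `sk-` pattern and the window size of 64 are assumptions. The window must be at least as long as the longest possible match, so an incomplete secret at the end of the buffer is always held back until the next chunk arrives.

```python
import re
from typing import Iterable, Iterator

# Assumed pattern: bounded so the longest match (51 chars) fits in the window.
SECRET = re.compile(r"sk-[A-Za-z0-9]{20,48}")
PLACEHOLDER = "[REDACTED:secret]"


def redact_stream_sketch(chunks: Iterable[str], window: int = 64) -> Iterator[str]:
    """Yield redacted text while holding back the last `window` characters."""
    buf = ""
    for chunk in chunks:
        buf += chunk
        cut = max(0, len(buf) - window)  # everything before `cut` may be emitted
        for m in SECRET.finditer(buf):
            if m.start() < cut < m.end():
                cut = m.start()  # never split a complete match across the cut
                break
        if cut:
            yield SECRET.sub(PLACEHOLDER, buf[:cut])
            buf = buf[cut:]  # carry the tail over to the next chunk
    if buf:
        yield SECRET.sub(PLACEHOLDER, buf)  # flush what is left at end of stream
```

The trade-off is latency: output lags the input by up to `window` characters, which is the price of never emitting half a secret.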
Structured objects
pydantic models, dataclasses and attrs instances are traversed automatically
(redacted to dicts). Disable with Policy(traverse_objects=False).
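As a rough illustration of what "traversed and redacted to dicts" can mean, the sketch below flattens dataclass instances into plain dicts and lists that a redactor can then walk. It covers only stdlib dataclasses; `to_plain` is a hypothetical helper, and how traceredact handles pydantic and attrs objects internally is not shown here.

```python
import dataclasses
from typing import Any


def to_plain(obj: Any) -> Any:
    """Convert dataclass instances (recursively) into dicts and lists."""
    # is_dataclass() is also true for the class itself, so exclude types.
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        return {f.name: to_plain(getattr(obj, f.name)) for f in dataclasses.fields(obj)}
    if isinstance(obj, dict):
        return {k: to_plain(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_plain(v) for v in obj]
    return obj  # scalars and unknown objects are returned as-is
```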
Encoded payloads (opt-in)
Policy(decode_payloads=True) base64-decodes blobs one layer and, if the decoded
text contains a high-confidence secret, redacts the whole blob.
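The one-layer decode can be sketched as follows. This is an illustration under stated assumptions, not the library's logic: `redact_encoded`, the blob pattern (24 or more base64 characters) and the `sk-` secret pattern are all hypothetical.

```python
import base64
import re

SECRET = re.compile(r"sk-[A-Za-z0-9]{20,48}")      # assumed high-confidence pattern
B64_BLOB = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")  # assumed "looks like base64" test


def redact_encoded(text: str) -> str:
    """Replace a base64 blob entirely if its decoded text contains a secret."""
    def replace(match: "re.Match[str]") -> str:
        blob = match.group(0)
        try:
            # Decode exactly one layer; strict validation rejects near-misses.
            decoded = base64.b64decode(blob, validate=True).decode("utf-8")
        except ValueError:  # covers binascii.Error and UnicodeDecodeError
            return blob
        return "[REDACTED:secret]" if SECRET.search(decoded) else blob

    return B64_BLOB.sub(replace, text)
```

Redacting the whole blob rather than re-encoding a partially redacted payload avoids producing a string that looks valid but no longer decodes to what the consumer expects.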
Policy file (traceredact.yml)
Drop a traceredact.yml in your repo root (auto-discovered) or pass --policy:
entropy_threshold: 4.0
min_entropy_len: 20
disabled_detectors:
  - pii.phone
allowlist:
  - "noreply@example.com"
allow_patterns:
  - ".*@example\\.com"
placeholder: "[REDACTED:{category}]"
hash_correlation: false  # set true + hash_key to emit correlation tags
custom_patterns:
  - id: custom.internal_user_id
    category: pii
    regex: "ACME-USR-[0-9]{8}"
    confidence: 0.95
See traceredact.yml in this repo for a fully-commented example.
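For intuition about `entropy_threshold` and `min_entropy_len`, the sketch below computes Shannon entropy in bits per character and applies both settings. How traceredact actually tokenises input and compares against the threshold is an assumption here; `shannon_entropy` and `looks_high_entropy` are hypothetical names.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Shannon entropy of `s` in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def looks_high_entropy(token: str, threshold: float = 4.0, min_len: int = 20) -> bool:
    # Assumed semantics: short tokens are never flagged, long ones are flagged
    # when their entropy reaches the threshold.
    return len(token) >= min_len and shannon_entropy(token) >= threshold
```

A string of one repeated character scores 0 bits and a string of 32 distinct characters scores 5 bits, so a threshold of 4.0 sits between ordinary words and random-looking keys.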
Detectors
Secrets: secrets.openai_key, secrets.anthropic_key,
secrets.aws_access_key, secrets.github_token, secrets.slack_token,
secrets.slack_webhook, secrets.discord_webhook, secrets.google_api_key,
secrets.stripe_key, secrets.sendgrid_key, secrets.twilio_key,
secrets.huggingface_token, secrets.npm_token, secrets.pypi_token,
secrets.azure_storage_key, secrets.jwt, secrets.private_key,
secrets.pgp_private_key, secrets.basic_auth_url, secrets.bearer_token,
secrets.env_assignment, secrets.high_entropy.
PII: pii.email, pii.credit_card (Luhn), pii.iban (mod-97),
pii.ipv4, pii.phone, pii.us_ssn.
Secret pattern hits are deterministic (confidence 1.0); fuzzy heuristics
(entropy, phone, IP) carry lower confidence so policy thresholds can gate them.
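The Luhn and mod-97 checks named above are standard algorithms; a minimal sketch of each follows. The function names and the length limits are illustrative choices, not taken from traceredact.

```python
def luhn_valid(number: str) -> bool:
    """Luhn checksum for card numbers; spaces and dashes are ignored."""
    s = number.replace(" ", "").replace("-", "")
    if not (s.isascii() and s.isdigit() and 13 <= len(s) <= 19):
        return False
    total = 0
    for i, ch in enumerate(reversed(s)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def iban_valid(iban: str) -> bool:
    """ISO 13616 mod-97 check: move the first four chars to the end, map letters to 10..35."""
    s = iban.replace(" ", "").upper()
    if not (s.isascii() and s.isalnum() and 15 <= len(s) <= 34):
        return False
    if not (s[:2].isalpha() and s[2:4].isdigit()):
        return False
    rearranged = s[4:] + s[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)  # 'A' -> 10 ... 'Z' -> 35
    return int(numeric) % 97 == 1
```

Validating the checksum after the pattern match is what keeps arbitrary 16-digit numbers, such as order IDs, from being reported as credit cards.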
Design & safety
- Deterministic, no data retained. Pure functions; nothing is stored.
- Copy, never mutate. Your objects are untouched.
- ReDoS-safe. Cheap literal prefilters gate bounded regexes; no nested quantifiers; input length is capped.
- Fail-closed. Hash correlation without a key, or exceeding
max_depth, raises rather than silently leaking.
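The ReDoS bullet describes a common pattern: a cheap substring check decides whether a bounded regex runs at all. The sketch below shows that shape. The detector IDs come from the list above, but the patterns, the `scan` function, the cap value and the choice to raise when the cap is exceeded are assumptions.

```python
import re

MAX_INPUT_LEN = 1_000_000  # hypothetical cap

# (detector_id, literal prefilter, bounded regex with no nested quantifiers)
DETECTORS = [
    ("secrets.openai_key", "sk-", re.compile(r"sk-[A-Za-z0-9]{20,48}")),
    ("secrets.aws_access_key", "AKIA", re.compile(r"AKIA[0-9A-Z]{16}")),
]


def scan(text: str):
    """Return (detector_id, span) pairs for every match in `text`."""
    if len(text) > MAX_INPUT_LEN:
        raise ValueError("input exceeds length cap")
    found = []
    for detector_id, literal, pattern in DETECTORS:
        if literal not in text:
            continue  # substring search is linear; skip the regex entirely
        found.extend((detector_id, m.span()) for m in pattern.finditer(text))
    return found
```

Because each quantifier has a fixed upper bound and none is nested, the work the regex engine does per starting position is bounded regardless of input.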
Detectors were hardened against adversarial evasion cases (see
tests/test_evasion.py).
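The fail-closed rule for hash correlation can be pictured with a keyed hash. This is a sketch of the general approach only: `correlation_tag`, the HMAC-SHA256 choice, the 12-character truncation and the tag format are assumptions, not traceredact's documented output.

```python
import hashlib
import hmac
from typing import Optional


def correlation_tag(value: str, category: str, hash_key: Optional[bytes]) -> str:
    """Return a placeholder that is stable per value but reveals nothing without the key."""
    if not hash_key:
        # Fail closed: an unkeyed hash of a low-entropy value (a phone number,
        # say) could be brute-forced, so refuse rather than emit one.
        raise ValueError("hash correlation requires a key")
    digest = hmac.new(hash_key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"[REDACTED:{category}:{digest[:12]}]"
```

The same input always maps to the same tag under one key, so two log lines that mention the same customer can still be joined after redaction.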
License
Apache-2.0.