Scrub personal info, secrets, and API keys from Claude Code transcripts before publishing
🧼 claude-code-scrubber🫧
Scrub API keys, secrets, and personal information from Claude Code transcripts before publishing them to public repos.
Inspired by Simon Willison's claude-code-transcripts — this tool sits between your raw session data and your public GitHub Pages, giving you peace of mind that you're not leaking credentials.
What it catches
| Severity | What's detected |
|---|---|
| 🔴 High | Anthropic, OpenAI, GitHub, AWS, Google, Slack, Stripe, HuggingFace, SendGrid, Twilio, npm, PyPI, Vercel, Supabase, Netlify API keys & tokens. Bearer/Basic auth headers. JWT tokens. SSH private keys. Database connection strings. Generic SECRET/TOKEN/PASSWORD assignments. |
| 🟡 Medium | Email addresses. Private IP addresses (10.x, 192.168.x, 172.16-31.x). |
| 🔵 Low | OS usernames in file paths (/Users/you/, /home/you/). Encoded Claude project paths. Shell prompts with user@host.local. |
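To make the severity tiers concrete, here is a minimal Python sketch of tiered detection. The three patterns and the `find_matches` helper are illustrative assumptions, one per tier, and are not the tool's actual rule set or API.

```python
import re

# Hypothetical patterns, one per severity tier; the real tool ships its own set.
PATTERNS = [
    ("AWS access key", "high", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("Email address", "medium", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("Home directory path", "low", re.compile(r"/(?:Users|home)/[^/\s]+")),
]


def find_matches(text, severities=("high", "medium", "low")):
    """Return (name, severity, matched_text) for every hit of an enabled pattern."""
    hits = []
    for name, severity, pattern in PATTERNS:
        if severity not in severities:
            continue  # skip tiers the caller did not ask for
        for match in pattern.finditer(text):
            hits.append((name, severity, match.group(0)))
    return hits
```

Passing `severities=("high",)` restricts the scan to the high tier, which mirrors the idea of scrubbing only selected levels.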
Supported formats
- JSONL — Local Claude Code sessions from ~/.claude/projects/
- JSON — Claude Code for web session exports
- HTML — Output from claude-code-transcripts (preserves HTML structure)
Install
# With pip
pip install claude-code-scrubber
# With uv (no install needed)
uvx claude-code-scrubber scan session.jsonl
# From source
git clone https://github.com/yanndebray/claude-code-scrubber
cd claude-code-scrubber
pip install -e .
Quick start
# 1. Scan first (dry-run) — see what would be scrubbed
claude-code-scrubber scan my-session.jsonl -u $(whoami)
# 2. Scrub and write clean output
claude-code-scrubber scrub my-session.jsonl -u $(whoami)
# → writes my-session.scrubbed.jsonl
# 3. Scrub HTML transcripts into an output directory
claude-code-scrubber scrub transcripts/*.html -u $(whoami) -o clean/
Usage
scan — Dry-run detection
# Scan with verbose output showing every match
claude-code-scrubber scan session.jsonl -u myuser -v
# Scan only high-severity items
claude-code-scrubber scan session.jsonl -s high
# Scan multiple files
claude-code-scrubber scan *.jsonl *.html
Example output:
📄 Scanning session.jsonl ...
🔴 [HIGH] Anthropic API key at line 1.message.content → sk-ant-***…
🔴 [HIGH] AWS access key at line 3.message.content → AKIAIOSFO…
🟡 [MEDIUM] Email address at line 3.message.content → yann.dup…
🔵 [LOW] Home directory path at line 2.message.content → /Users/yan…
🧼 Found 23 item(s) to scrub across 1 file(s):
🔴 High: 19 (API keys, tokens, passwords)
🟡 Medium: 2 (emails, private IPs)
🔵 Low: 2 (usernames in paths)
The scan command exits with code 1 if findings exist — useful in CI.
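A pre-publish script can rely on that exit code. The sketch below is one possible wrapper, not part of the tool: `has_findings` is a hypothetical helper, exit code 0 meaning "clean" is an assumption, and treating any other non-zero code as an error is this sketch's own choice.

```python
import subprocess


def has_findings(command):
    """Run a scan command and map its exit code to a boolean.

    Assumes exit code 0 means clean and 1 means findings were reported.
    Any other exit code is treated as a failure of the scan itself.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode == 0:
        return False
    if result.returncode == 1:
        return True
    raise RuntimeError(f"scan failed with exit code {result.returncode}")
```

For example, `has_findings(["claude-code-scrubber", "scan", "session.jsonl", "-s", "high"])` would return `True` when the scan reports high-severity items, provided the CLI is installed and on the PATH.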
scrub — Redact and write clean files
# Write to a suffixed file (default: .scrubbed)
claude-code-scrubber scrub session.jsonl -u myuser
# → session.scrubbed.jsonl
# Write to an output directory
claude-code-scrubber scrub session.jsonl -o clean/
# Overwrite originals (careful!)
claude-code-scrubber scrub session.jsonl --in-place
# Custom suffix
claude-code-scrubber scrub session.jsonl --suffix .clean
init — Create a config file
claude-code-scrubber init # creates .claude-code-scrubber.json
claude-code-scrubber init -f toml # creates .claude-code-scrubber.toml
Configuration
Create a .claude-code-scrubber.json (or .toml) in your project root:
{
  "username": "yann",
  "severity": ["high", "medium", "low"],
  "output_suffix": ".scrubbed",
  "allowlist": [
    "sk-ant-this-is-a-dummy-key-for-docs"
  ],
  "string_replacements": {
    "MyCompanyName": "ACME",
    "my-secret-project": "project-x"
  },
  "patterns": [
    {
      "name": "Internal ticket ID",
      "regex": "PROJ-[0-9]{4,}",
      "replacement": "PROJ-XXXX",
      "severity": "medium"
    }
  ]
}
| Field | Description |
|---|---|
| username | Your OS username, used to redact paths like /Users/you/ |
| severity | Which levels to scrub. Default: all three. |
| allowlist | Strings that should never be redacted (false positive prevention) |
| string_replacements | Exact string → replacement pairs, applied after regex |
| patterns | Extra regex patterns with name, regex, replacement, severity |
| output_suffix | Suffix for output files. Default: .scrubbed |
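The interaction between `allowlist`, `patterns`, and `string_replacements` can be sketched in a few lines of Python. This is an illustration under assumptions, not the tool's implementation: `apply_config` is a hypothetical helper, and protecting allowlisted strings with temporary placeholders is just one way to guarantee they are never redacted.

```python
import re


def apply_config(text, config):
    """Apply custom patterns, then exact replacements, sparing allowlisted strings."""
    # Swap allowlisted strings for placeholders so no later step can touch them.
    placeholders = {}
    for index, safe in enumerate(config.get("allowlist", [])):
        token = f"\x00ALLOW{index}\x00"
        placeholders[token] = safe
        text = text.replace(safe, token)

    # Regex patterns run first.
    for pattern in config.get("patterns", []):
        text = re.sub(pattern["regex"], pattern["replacement"], text)

    # Exact string replacements run after the regex pass.
    for old, new in config.get("string_replacements", {}).items():
        text = text.replace(old, new)

    # Restore the allowlisted strings unchanged.
    for token, safe in placeholders.items():
        text = text.replace(token, safe)
    return text
```

The `config` argument is the parsed JSON shown above, for example the result of `json.load` on the config file.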
Config files are auto-discovered by walking up from the current directory.
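Walking up the directory tree can be sketched as follows. `find_config` is a hypothetical helper, and checking the JSON name before the TOML name in each directory is an assumption; the tool's actual lookup order may differ.

```python
from pathlib import Path

# File names taken from the init command; the lookup order is an assumption.
CONFIG_NAMES = (".claude-code-scrubber.json", ".claude-code-scrubber.toml")


def find_config(start=None):
    """Return the nearest config file at or above `start`, or None if there is none."""
    current = Path(start or Path.cwd()).resolve()
    # Check the starting directory first, then each parent up to the filesystem root.
    for directory in (current, *current.parents):
        for name in CONFIG_NAMES:
            candidate = directory / name
            if candidate.is_file():
                return candidate
    return None
```

With this approach, a config placed in the repository root applies to commands run from any subdirectory of that repository.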
Workflow: Claude Code → Public GitHub
# 1. Generate HTML transcripts with Simon's tool
claude-code-transcripts local -o transcripts/
# 2. Scrub them
claude-code-scrubber scrub transcripts/*.html -u $(whoami) -o docs/
# 3. Push to GitHub Pages
git add docs/
git commit -m "Add scrubbed transcripts"
git push
CI gate (GitHub Actions)
- name: Check transcripts for secrets
  run: |
    # Enable globstar so ** also matches HTML files directly under docs/
    shopt -s globstar
    pip install claude-code-scrubber
    claude-code-scrubber scan docs/**/*.html -u runner -s high
How it works
- Pattern matching — A curated set of 30+ regex patterns detects common API key formats, PII, and secrets
- Format-aware parsing — JSONL/JSON files are parsed and scrubbed recursively at the value level (preserving valid JSON structure). HTML is split on tags to avoid breaking markup.
- Layered severity — You control what gets scrubbed. Need to keep email addresses but redact API keys? Use -s high
- Allowlisting — Known-safe strings (like dummy keys in docs) are never touched
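The format-aware step can be illustrated with a short Python sketch. All function names here are hypothetical, `scrub_text` stands in for whatever redaction function is applied to plain strings, and leaving JSON keys and HTML tag attributes untouched is a simplification of this sketch rather than a statement about the tool.

```python
import json
import re


def scrub_value(value, scrub_text):
    """Recursively scrub strings inside parsed JSON, leaving the structure intact."""
    if isinstance(value, str):
        return scrub_text(value)
    if isinstance(value, list):
        return [scrub_value(item, scrub_text) for item in value]
    if isinstance(value, dict):
        # Keys are left untouched in this sketch; only values are scrubbed.
        return {key: scrub_value(item, scrub_text) for key, item in value.items()}
    return value  # numbers, booleans, and None pass through unchanged


def scrub_jsonl_line(line, scrub_text):
    """Parse one JSONL line, scrub its values, and serialize it back to valid JSON."""
    return json.dumps(scrub_value(json.loads(line), scrub_text), ensure_ascii=False)


# The capturing group makes re.split keep the tags in the result list.
TAG_SPLIT = re.compile(r"(<[^>]+>)")


def scrub_html(html, scrub_text):
    """Scrub the text between tags while passing the tags through unchanged."""
    parts = TAG_SPLIT.split(html)
    # Even indices hold text, odd indices hold the captured tags.
    return "".join(
        part if index % 2 else scrub_text(part) for index, part in enumerate(parts)
    )
```

Scrubbing parsed values instead of raw lines means a replacement can never break quoting or escaping, so the output always parses as JSON.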
License
MIT
Download files
File details
Details for the file claude_code_scrubber-0.1.0.tar.gz.
File metadata
- Download URL: claude_code_scrubber-0.1.0.tar.gz
- Size: 17.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7cf100a4becd78aada72198b6a3c8f979190b5ee8fa845d2df4e95dbca39e911 |
| MD5 | 95e48de674aa2c70fcfefc568c4599e2 |
| BLAKE2b-256 | f9b6d3fb0837b9cc94f00c342e0414e97bd6a56aee126365c1d0b4d536fa0306 |
File details
Details for the file claude_code_scrubber-0.1.0-py3-none-any.whl.
File metadata
- Download URL: claude_code_scrubber-0.1.0-py3-none-any.whl
- Size: 17.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.10.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7ff21a90826d54bdcf6b52657751cc5b56ca46f158940364ca6abc15a9dc1257 |
| MD5 | eb958977e23924febbabd44d4b6b9e85 |
| BLAKE2b-256 | dda5947f41ac375524f56ae9cac69528035707739662b3baccb88cdfa797d2ce |