Find and redact secrets in AI coding agent histories (Claude Code and more).

AgentSweep

Find and redact secrets (API keys, tokens, private keys, DB URLs) that got pasted into your AI coding agent's local history. Runs fully offline — your files never leave your machine.

License: MIT | Python 3.11+ | Status: Alpha | Offline

Status: alpha. Works on Claude Code and OpenAI Codex today. Aider, Cursor, and Continue are meant to be covered by contributed Source adapters; see CONTRIBUTING.md.

The problem

Claude Code (and other AI coding CLIs) stores your full conversation history as plain-text JSONL on disk: under ~/.claude/projects/ for Claude Code, and under ~/.codex/sessions/ for OpenAI Codex. Anything you paste (an AWS key, a .env file, a database URL) sits in clear text indefinitely. A typical developer's history accumulates dozens of secrets over months, often without the developer realizing it.

agentsweep scans that history, tells you what leaked, and can redact the secret values in place while preserving the JSONL structure byte-for-byte. It also tells you which keys to rotate, with the right revocation URL for each provider.

Scope of protection: agentsweep itself is fully local and offline — it reads and writes only files on your machine and makes zero network calls. It removes one attack vector: secrets sitting in local history files. It does not affect what your AI provider already received: when you paste a key into Claude Code, Cursor, or any cloud-backed agent, that key already transited the provider's servers before it hit disk. If that concerns you, consider a locally-hosted model (Ollama, LM Studio, OpenCode) where nothing leaves your machine at all — agentsweep pairs especially well with local-model setups.

Install

pip install agentsweep

Requires Python 3.11+. One dependency: rich, for the pipeline terminal UI. Output degrades to plain text automatically when piped, and --json is always styling-free.

Usage

Interactive mode

Run with no arguments in a terminal and you get the full experience — banner, numbered menu, typed confirmations before anything destructive, and one-key undo (restores the .bak backups). Any interactive scan that finds secrets ends with an offer to redact them on the spot (type REDACT to confirm):

agentsweep

Scripting is unaffected: any flag, or a piped/redirected stream, skips the menu entirely and behaves exactly as documented below.

Flags

Scan (read-only, safe):

agentsweep --source claude-code
agentsweep --source codex

Redact in place (creates .bak backups):

agentsweep --fix --allow-production

Point at an arbitrary folder (e.g. a copy of your history):

agentsweep --root /path/to/jsonl-files --fix

Machine-readable output:

agentsweep --json

Corruption-prevention guarantees

A redactor that corrupts your history is strictly worse than the leak it's fixing. agentsweep enforces these invariants on every --fix:

  1. Redaction happens in parsed JSON, not on raw bytes. Secrets are replaced as string values inside the parsed structure, then re-serialized. Structural damage is impossible by construction.
  2. Atomic writes. Every rewrite goes: temp file → fsync() → os.replace() over the original. A crash at any instant leaves either the complete old file or the complete new file, never a torn write.
  3. Post-write validation. Before the new content replaces the original, every non-empty line in it must parse as JSON, and the line count must match the original. If either check fails, the write aborts and the original is untouched.
  4. .bak backup by default. Refuses to run if a .bak already exists (so prior backups can't be clobbered).
  5. Path containment. Refuses any target that doesn't resolve inside the source's root.
  6. Symlink rejection. Refuses symlinks outright.
  7. mtime window. Refuses files modified in the last 60 seconds (likely an active session). --force overrides.
  8. Running-process check. Refuses if a Claude Code process appears to be running. --force overrides.
  9. Alpha-stage production gate. --fix against the default ~/.claude/projects/ root requires --allow-production until v1.0.
  10. Audit log. Every write appends SHA256 before/after and path to ~/.claude/agentsweep-audit.jsonl.
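
The invariants above can be illustrated with a short Python sketch. This is not agentsweep's actual code: every function name here (preflight, redact_value, redact_jsonl_text, is_valid_rewrite, atomic_write, redact_file) and the "[REDACTED]" placeholder are hypothetical, chosen only for illustration. It covers the symlink, containment, existing-backup, and mtime checks, redaction inside parsed JSON, validation before commit, and the temp file → fsync() → os.replace() sequence. Because the sketch re-serializes with json.dumps, whitespace inside a rewritten line may differ from the original.

```python
import json
import os
import shutil
import tempfile
import time
from pathlib import Path


def preflight(path, root, now=None, window=60):
    """Refuse unsafe targets; raises ValueError on refusal (hypothetical helper)."""
    path, root = Path(path), Path(root)
    if path.is_symlink():
        raise ValueError("symlink refused")
    if not path.resolve().is_relative_to(root.resolve()):
        raise ValueError("target is outside the source root")
    if Path(str(path) + ".bak").exists():
        raise ValueError("a .bak backup already exists")
    now = time.time() if now is None else now
    if now - path.stat().st_mtime < window:
        raise ValueError("file modified too recently")


def redact_value(node, secret, placeholder="[REDACTED]"):
    """Replace `secret` inside every string value of a parsed JSON tree."""
    if isinstance(node, str):
        return node.replace(secret, placeholder)
    if isinstance(node, list):
        return [redact_value(item, secret, placeholder) for item in node]
    if isinstance(node, dict):
        return {key: redact_value(val, secret, placeholder) for key, val in node.items()}
    return node  # numbers, booleans and None pass through unchanged


def redact_jsonl_text(text, secret):
    """Redact line by line; blank lines are kept so the line count is stable."""
    out = []
    for line in text.split("\n"):
        if line.strip():
            line = json.dumps(redact_value(json.loads(line), secret), ensure_ascii=False)
        out.append(line)
    return "\n".join(out)


def is_valid_rewrite(original, candidate):
    """Every non-empty line must parse as JSON and the line count must match."""
    if len(original.split("\n")) != len(candidate.split("\n")):
        return False
    try:
        for line in candidate.split("\n"):
            if line.strip():
                json.loads(line)
    except json.JSONDecodeError:
        return False
    return True


def atomic_write(path, text):
    """Write to a temp file in the same directory, fsync, then os.replace()."""
    path = Path(path)
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def redact_file(path, secret):
    """Validate first, back up to .bak, then commit atomically."""
    path = Path(path)
    original = path.read_bytes().decode("utf-8")
    candidate = redact_jsonl_text(original, secret)
    if not is_valid_rewrite(original, candidate):
        raise ValueError("validation failed; original left untouched")
    shutil.copy2(path, str(path) + ".bak")
    atomic_write(path, candidate)
```

In this sketch a caller would run preflight(path, root) before redact_file(path, secret). The temp file is created in the same directory as the target so that os.replace() stays on one filesystem, which is what makes the final rename atomic.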

Recovery

Every redacted file has a sibling *.bak with the original bytes. To undo:

mv session.jsonl.bak session.jsonl
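
If many files were redacted at once, the same mv can be scripted. The following is a minimal sketch, assuming backups are named by appending .bak to the original file name (as in session.jsonl.bak above); restore_backups is a hypothetical helper, not part of agentsweep.

```python
import os
from pathlib import Path


def restore_backups(root):
    """Move every *.jsonl.bak under `root` back over its redacted sibling."""
    restored = []
    for backup in sorted(Path(root).rglob("*.jsonl.bak")):
        target = backup.with_suffix("")  # "session.jsonl.bak" -> "session.jsonl"
        os.replace(backup, target)       # same effect as `mv` for a single file
        restored.append(target)
    return restored
```

Note that restoring brings the secrets back onto disk, so rotate any affected keys regardless.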

What's detected

189 high-confidence patterns plus a checksum-validated crypto seed-phrase detector. BIP-39 mnemonics (12/15/18/21/24 words; the wallet format behind BTC, ETH, SOL, BNB, ADA, DOGE, LTC, DOT, AVAX and virtually every major chain) and Electrum seeds are confirmed cryptographically (BIP-39 checksum / Electrum version tag), so English prose that happens to use wallet words is very unlikely to produce a false positive.
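
To show what the BIP-39 checksum confirmation involves, here is a minimal sketch of the standard check. It is not agentsweep's implementation, and bip39_checksum_ok is a hypothetical name. To stay self-contained it takes 11-bit word indices rather than words; mapping words to indices requires the official 2048-word list, which is omitted here.

```python
import hashlib


def bip39_checksum_ok(indices):
    """Check the BIP-39 checksum for a mnemonic given as word indices (0..2047)."""
    count = len(indices)
    if count not in (12, 15, 18, 21, 24):
        return False
    if any(not 0 <= index < 2048 for index in indices):
        return False
    # Concatenate the 11-bit indices into one big integer.
    bits = 0
    for index in indices:
        bits = (bits << 11) | index
    checksum_len = count * 11 // 33            # 4 bits for 12 words, 8 for 24
    entropy_len = count * 11 - checksum_len    # 128 bits for 12 words, 256 for 24
    entropy = bits >> checksum_len
    checksum = bits & ((1 << checksum_len) - 1)
    digest = hashlib.sha256(entropy.to_bytes(entropy_len // 8, "big")).digest()
    # The checksum is the first `checksum_len` bits of SHA-256(entropy).
    return checksum == digest[0] >> (8 - checksum_len)
```

For example, the well-known test mnemonic of eleven "abandon" words followed by "about" corresponds to indices [0] * 11 + [3] and passes, while twelve "abandon" words ([0] * 12) fails. A 12-word phrase carries only a 4-bit checksum, so a random sequence of wordlist words still passes about 1 time in 16.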

The patterns: AWS access keys, GitHub tokens (PAT/OAuth/App/fine-grained), Stripe live/test, OpenAI, Anthropic, Google API, Slack bot/user/webhook, Hugging Face, JWT, PEM private keys, DB URLs with embedded passwords, npm/PyPI/SendGrid/Twilio tokens — plus 167 rules ported from the gitleaks pack covering GitLab, Grafana, HashiCorp Vault/Terraform, DigitalOcean, Shopify, PlanetScale, Databricks, Atlassian, Azure AD, 1Password, Sentry, New Relic, Mailgun, Datadog, Twilio, Twitter/X, Twitch, Yandex, JFrog, Snyk, Mailchimp, curl credentials on the command line, and many more. Patterns are high-precision — false positives are rare, and provider-context rules are keyword-gated so large pastes stay fast.
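
Keyword gating can be sketched as follows: a cheap substring test decides whether the more expensive regex runs at all. The three rules below are simplified illustrations based on widely known token formats, not agentsweep's actual rule set, and RULES, iter_strings, and scan_jsonl are hypothetical names.

```python
import json
import re

# (rule name, gating keyword, pattern); illustrative only.
RULES = [
    ("aws-access-key-id", "AKIA", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("github-pat", "ghp_", re.compile(r"\bghp_[A-Za-z0-9]{36}\b")),
    ("pem-private-key", "PRIVATE KEY", re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----")),
]


def iter_strings(node):
    """Yield every string value found in a parsed JSON tree."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)


def scan_jsonl(text):
    """Return (line number, rule name, matched text) for each finding."""
    findings = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        for value in iter_strings(record):
            for name, keyword, pattern in RULES:
                if keyword not in value:
                    continue  # gate: skip the regex when the keyword is absent
                for match in pattern.finditer(value):
                    findings.append((line_no, name, match.group(0)))
    return findings
```

Scanning the parsed string values rather than the raw line means escaped characters in the JSON encoding do not hide a match.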

What's NOT detected

  • Custom/proprietary secrets without a recognizable prefix.
  • Monero seed phrases (25 words from Monero's own wordlist — planned).
  • Unknown tokens that look like arbitrary base64.
  • Secrets split across multiple messages.
  • Anything inside a binary/non-UTF-8 file.

For deeper detection, run gitleaks or trufflehog alongside agentsweep — their rule packs are more exhaustive. agentsweep's value is the agent-history-specific surface, not the detection engine.

License

MIT. See LICENSE.
