Scan for secrets in files you plan to share

scan-for-secrets

Installation

Install this tool using pip:

pip install scan-for-secrets

Or uv:

uv tool install scan-for-secrets

Or use without installing via uvx:

uvx scan-for-secrets --help

Usage

This tool scans all of the text files in a directory (ignoring binary files) to check whether they contain specified secret strings. For example, run it before publishing the logs from a coding agent session to confirm that no secrets from environment variables are exposed in those logs.

Basic usage looks like this:

scan-for-secrets $OPENAI_API_KEY $ANTHROPIC_API_KEY

This scans text files in the current folder and all sub-folders for the values passed as positional arguments. It also checks common escaped forms of each value, which a direct string match would miss.
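To illustrate why a plain string search is not enough, here is a minimal Python sketch that prints a few encoded forms of one made-up secret using only the standard library. The function name escaped_variants and the labels are hypothetical, and this is not the tool's own implementation; it only shows how the same value can look different once a log writer has encoded it.

```python
import html
import json
from urllib.parse import quote


def escaped_variants(secret: str) -> dict[str, str]:
    """Return a few encoded forms of `secret`, keyed by an illustrative label."""
    return {
        "literal": secret,
        # json.dumps adds surrounding quotes; strip them to keep the string body.
        "json": json.dumps(secret)[1:-1],
        # safe="" makes quote() percent-encode "/" as well.
        "url": quote(secret, safe=""),
        # html.escape converts &, <, > and (by default) quotes to entities.
        "html": html.escape(secret),
    }


# A made-up value containing characters that each scheme treats differently.
variants = escaped_variants('pa&ss"wo/rd')
for label, value in variants.items():
    print(f"{label}: {value}")
```

None of the three encoded forms contains the literal string, so a search for the raw value alone would miss all of them.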

To scan for a secret that can be accessed using another command, use $(command) syntax:

scan-for-secrets "$(llm keys get openai)"

Add -d/--directory to specify a different directory to scan:

scan-for-secrets $OPENAI_API_KEY -d ~/my-project

You can also pipe a list of newline-separated secrets to the tool:

cat secrets.txt | scan-for-secrets

This can be combined with secrets passed as positional arguments.

Output

If no secrets are found, the tool exits with code 0 and outputs nothing. If secrets are found, it exits with code 1 and lists the file, line number and first few characters of each secret that was spotted.
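The exit codes make the tool easy to call from a script. The following is a rough Python sketch, assuming the scan-for-secrets command is installed and on your PATH; the helper names build_command, safe_to_share and scan are hypothetical. How the tool behaves for exit codes other than 0 and 1 is not described here, so the sketch treats anything else as an error.

```python
import subprocess


def build_command(secrets: list[str], directory: str) -> list[str]:
    """Build the argument list: secrets as positional arguments, then -d DIR."""
    return ["scan-for-secrets", *secrets, "-d", directory]


def safe_to_share(returncode: int) -> bool:
    """Map the two documented exit codes to a yes/no answer."""
    if returncode == 0:
        return True  # nothing found
    if returncode == 1:
        return False  # at least one secret found
    # Other codes are not documented above; assume they indicate a failure.
    raise RuntimeError(f"unexpected exit code {returncode}")


def scan(secrets: list[str], directory: str) -> bool:
    """Run the CLI and report whether the directory is clean."""
    completed = subprocess.run(
        build_command(secrets, directory), capture_output=True, text=True
    )
    if completed.returncode == 1:
        # Show the list of files, line numbers and hints printed by the tool.
        print(completed.stdout, end="")
    return safe_to_share(completed.returncode)
```

A caller could then write something like scan([os.environ["OPENAI_API_KEY"]], "agent-logs") and refuse to publish when it returns False.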

Example output:

logs/2024-03-15.jsonl:42: sk-a... (literal)
logs/2024-03-15.jsonl:108: sk-a... (json)
config/debug.html:7: ghp_... (html)
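If you want to post-process this output, the lines can be split into fields. This Python sketch assumes every line follows the path:line: hint (encoding) shape seen in the example above; that shape is inferred from the example rather than from a documented format, and the Finding type and parse_line function are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class Finding(NamedTuple):
    path: str
    line: int
    hint: str
    encoding: str


# Assumed shape: path:line: hint (encoding)
LINE_RE = re.compile(
    r"^(?P<path>.+):(?P<line>\d+): (?P<hint>.+) \((?P<encoding>[^()]+)\)$"
)


def parse_line(text: str) -> Optional[Finding]:
    """Parse one output line, or return None if it does not fit the shape."""
    m = LINE_RE.match(text.strip())
    if m is None:
        return None
    return Finding(m["path"], int(m["line"]), m["hint"], m["encoding"])


sample = """\
logs/2024-03-15.jsonl:42: sk-a... (literal)
logs/2024-03-15.jsonl:108: sk-a... (json)
config/debug.html:7: ghp_... (html)
"""

findings = [f for f in map(parse_line, sample.splitlines()) if f is not None]
for finding in findings:
    print(finding)
```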

Configuration file

If you run scan-for-secrets without any arguments or piped data, the command looks for a default configuration file that tells it what to scan for.

This file lives at ~/.scan-for-secrets.conf.sh and contains commands that are executed to retrieve secrets. Each line should be a shell command that outputs a single secret to stdout, a blank line, or a comment.

# API keys
echo $OPENAI_API_KEY
echo $ANTHROPIC_API_KEY

# AWS (using xargs to strip whitespace)
awk -F= '/aws_secret_access_key/{print $2}' ~/.aws/credentials | xargs

# 1Password
op read "op://Vault/API Key/password"

# LLM keys
llm keys get gemini

Blank lines and lines starting with # are ignored. By default the file is executed with sh. Add a shebang line (e.g. #!/bin/bash or #!/usr/bin/env python3) to use a different interpreter.
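One way to picture these rules is the Python sketch below. It is a guess at the general approach, not the tool's actual code: it assumes the file is run as a whole with its interpreter and that each non-empty line of output is one secret. The helper names interpreter_for and load_secrets are hypothetical, and the demo needs sh on your PATH.

```python
import subprocess
import tempfile
from pathlib import Path


def interpreter_for(script: Path) -> list[str]:
    """Use the shebang line if present, otherwise fall back to sh."""
    first_line = script.read_text().splitlines()[:1]
    if first_line and first_line[0].startswith("#!"):
        return first_line[0][2:].split()
    return ["sh"]


def load_secrets(script: Path) -> list[str]:
    """Run the script and treat each non-empty line of stdout as one secret."""
    completed = subprocess.run(
        [*interpreter_for(script), str(script)],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line.strip() for line in completed.stdout.splitlines() if line.strip()]


# Demo with placeholder values rather than real secrets.
with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / "scan.sh"
    config.write_text(
        "# demo values, not real secrets\n"
        "echo first-demo-value\n"
        "\n"
        "echo second-demo-value\n"
    )
    demo_secrets = load_secrets(config)

print(demo_secrets)
```

Under these assumptions the comment and the blank line produce no output, so only the two echoed values are collected.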

With a configuration file set up, you can run scan-for-secrets like this:

cd agent-logs/
scan-for-secrets

Or this:

scan-for-secrets -d agent-logs

You can also pass a path to a configuration file using the -c/--config option:

scan-for-secrets -c scan.sh

Unlike the default configuration file, a file passed with -c is combined with any piped data or additional positional arguments.

Using this as a Python library

This package can also be used as a Python library. Add scan-for-secrets as a dependency and use it like this:

from scan_for_secrets import scan_directory

result = scan_directory("./logs", ["sk-abc123...", "ghp_secret..."])

if result.has_secrets:
    for match in result.matches:
        print(f"{match.file_path}:{match.line_number}: {match.secret_hint} ({match.encoding})")

API reference

scan_directory(directory: str | Path, secrets: list[str]) -> ScanResult

Recursively scans all text files in directory for the given secrets, checking both literal matches and common escaped variants (JSON, URL percent-encoding, HTML entities, backslash-doubled, Unicode escapes and Python repr).

  • directory: Root directory to scan. Can be a string path or a pathlib.Path.
  • secrets: List of secret strings to search for. Empty strings are ignored.

Binary files (detected by null bytes in the first 8192 bytes) are skipped. The following directories are also skipped: .git, .hg, .svn, node_modules, __pycache__, .venv, venv.
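The file selection rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the library's source: looks_binary and iter_text_files are hypothetical names, and matching skipped directories by exact name at any depth is an assumption.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterator

SKIP_DIRS = {".git", ".hg", ".svn", "node_modules", "__pycache__", ".venv", "venv"}
SNIFF_BYTES = 8192


def looks_binary(path: Path) -> bool:
    """Treat a file as binary if a null byte appears in its first 8192 bytes."""
    with path.open("rb") as f:
        return b"\x00" in f.read(SNIFF_BYTES)


def iter_text_files(root: Path) -> Iterator[Path]:
    """Yield non-binary files under root, skipping the listed directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = Path(dirpath) / name
            if not looks_binary(path):
                yield path


# Demo: one text file, one binary file, and one file inside .git.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes.txt").write_text("hello\n")
    (root / "image.bin").write_bytes(b"\x89PNG\x00\x00")
    (root / ".git").mkdir()
    (root / ".git" / "config").write_text("[core]\n")
    found = sorted(p.relative_to(root).as_posix() for p in iter_text_files(root))

print(found)
```

Only notes.txt survives both filters in this demo.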

ScanResult

@dataclass
class ScanResult:
    matches: list[Match]  # All matches found across all files
    files_scanned: int    # Number of text files checked

    @property
    def has_secrets(self) -> bool:
        """True if any matches were found."""

Match

@dataclass
class Match:
    file_path: str     # Path relative to the scanned directory
    line_number: int   # 1-based line number where the match was found
    secret_hint: str   # First 4 characters of the original secret + "..."
    encoding: str      # How the secret was encoded: "literal", "json", "url",
                       # "html", "backslash-doubled", "unicode-escape", or "repr"
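As a small example of working with these fields, the sketch below groups matches by file. So that it runs without the package installed, it defines local stand-ins for Match and ScanResult that mirror the fields documented above; the defaults on ScanResult and the lines_by_file helper are hypothetical additions for illustration. With the real package you would pass the object returned by scan_directory instead.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Match:  # stand-in mirroring the documented fields
    file_path: str
    line_number: int
    secret_hint: str
    encoding: str


@dataclass
class ScanResult:  # stand-in mirroring the documented fields
    matches: list[Match] = field(default_factory=list)
    files_scanned: int = 0

    @property
    def has_secrets(self) -> bool:
        return bool(self.matches)


def lines_by_file(result: ScanResult) -> dict[str, list[int]]:
    """Group matched line numbers by file path, sorted within each file."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for match in result.matches:
        grouped[match.file_path].append(match.line_number)
    return {path: sorted(lines) for path, lines in grouped.items()}


result = ScanResult(
    matches=[
        Match("logs/a.jsonl", 108, "sk-a...", "json"),
        Match("logs/a.jsonl", 42, "sk-a...", "literal"),
        Match("config/debug.html", 7, "ghp_...", "html"),
    ],
    files_scanned=12,
)

print(lines_by_file(result))
```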

Help

For help, run:

scan-for-secrets --help

You can also use:

python -m scan_for_secrets --help

Development

To contribute to this tool, first check out the code. Then run the tests:

cd scan-for-secrets
uv run pytest

To run the development version of the command itself:

uv run scan-for-secrets --help
