entro-scan
Entropy-based secret scanner for source code — detects API keys, tokens, passwords, and other sensitive data leaks before they reach production.
Features
- Shannon entropy analysis — finds high-entropy strings that look like secrets
- Known pattern detection — regex matching for JWT, AWS keys, GitHub tokens, private keys, DB URLs, and more
- Git history scanning — scan commit history for accidentally committed secrets
- Multiple output formats — terminal (colorized), JSON, CSV, SARIF (GitHub code scanning compatible)
- Fast parallel scanning — multi-process worker pool for large codebases
- Low false-positive rate — smart filters reject common strings, hex dumps, and natural language
- Configurable — TOML-based config with custom thresholds, excludes, and file types
- Zero external dependencies — pure Python 3.11+, uses only the standard library
- Pre-commit hook ready — catch secrets before they're committed
Installation
pip install entro-scan
Or install from source:
git clone https://github.com/vyofgod/entro-scan.git
cd entro-scan
pip install -e .
Usage
# Scan current directory
entro-scan
# Scan a specific path
entro-scan /path/to/project
# Custom entropy threshold (lower = more findings)
entro-scan /path --threshold 4.0
# Output in JSON format
entro-scan /path --format json
# Output to file
entro-scan /path --format json -o results.json
# Scan git history (last 100 commits)
entro-scan /path --git
# Scan git history with custom depth
entro-scan /path --git --max-commits 500
# Parallel scan with 8 workers
entro-scan /path --workers 8
# Quiet mode (only findings, no banner)
entro-scan /path --quiet
# Generate a default config file
entro-scan --init
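To illustrate what scanning git history involves, here is a minimal sketch, not entro-scan's actual implementation. It assumes the history is read with `git log -p` and that only lines added by a commit need to be checked. The function names `git_patch_text` and `added_lines` are hypothetical.

```python
import subprocess


def git_patch_text(repo: str, max_commits: int = 100) -> str:
    """Return the combined patch text of the most recent commits (hypothetical helper)."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "-p", "--no-color", f"--max-count={max_commits}"],
        capture_output=True,
        text=True,
        errors="replace",  # tolerate non-UTF-8 bytes in old commits
        check=True,
    )
    return result.stdout


def added_lines(patch: str) -> list[str]:
    """Extract lines added by a patch, skipping the '+++ b/file' headers.

    Limitation of this sketch: an added line whose own content starts
    with '++' is indistinguishable from a header and is skipped too.
    """
    return [
        line[1:]
        for line in patch.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]
```

Each string returned by `added_lines` could then be passed to the same checks used for files on disk, so a secret that was committed and later deleted is still reported.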
Output Formats
Terminal (default)
Color-coded output with severity levels:
- Red (score > 4.5): Critical — likely a secret
- Yellow (score > 3.9): High — suspicious
- Green (score <= 3.9): Medium — low-confidence finding
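The boundaries above can be read as a simple cascade, where a score is checked against the highest band first. The sketch below only restates the listed thresholds; the function name `severity_for_score` is hypothetical and not part of entro-scan's documented API.

```python
def severity_for_score(score: float) -> tuple[str, str]:
    """Map an entropy score to a (color, severity) pair using the bands listed above."""
    if score > 4.5:
        return ("red", "Critical")
    if score > 3.9:
        # Reached only when 3.9 < score <= 4.5
        return ("yellow", "High")
    return ("green", "Medium")
```

Note that a score of exactly 4.5 falls in the yellow band and a score of exactly 3.9 falls in the green band.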
JSON
Machine-readable output for CI/CD pipelines.
CSV
Spreadsheet-friendly output for reporting.
SARIF
Static Analysis Results Interchange Format — compatible with GitHub code scanning.
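For readers unfamiliar with SARIF, the sketch below shows the minimal shape of a SARIF 2.1.0 log that GitHub code scanning accepts. It is an illustration, not entro-scan's actual output: the finding fields (`rule`, `message`, `path`, `line`) and the function name `to_sarif` are assumptions made for this example.

```python
def to_sarif(findings: list[dict]) -> dict:
    """Wrap findings in a minimal SARIF 2.1.0 log (hypothetical finding schema)."""
    results = []
    for finding in findings:
        results.append({
            "ruleId": finding["rule"],
            "level": "error",  # SARIF levels: none, note, warning, error
            "message": {"text": finding["message"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": finding["path"]},
                    "region": {"startLine": finding["line"]},
                }
            }],
        })
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": "entro-scan"}},
            "results": results,
        }],
    }
```

A document of this shape can be serialized with `json.dumps` and uploaded to GitHub code scanning, where each result appears as an alert on the given file and line.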
Configuration
Create .entro-scan.toml in your project root:
threshold = 3.5
workers = 4
quiet = false
verbose = false
output_format = "terminal"
git_enabled = false
max_commits = 100
exclude_dirs = [
".git", "node_modules", "venv", "__pycache__",
".idea", ".vscode", "build", "dist", "target",
]
exclude_files = [
"package-lock.json", "yarn.lock", "pnpm-lock.yaml",
"cargo.lock", "go.sum",
]
include_extensions = [
".py", ".rs", ".js", ".ts", ".go", ".java", ".kt", ".swift",
".rb", ".php", ".sh", ".json", ".yaml", ".yml", ".toml", ".env",
]
Alternatively, config can live under [tool.entro-scan] in your pyproject.toml.
Supported Patterns
| Pattern | Severity |
|---|---|
| JWT (JSON Web Tokens) | Critical |
| AWS Access Key ID | Critical |
| AWS Secret Key | Critical |
| GitHub Token | Critical |
| Slack Token | Critical |
| Private Keys (RSA/DSA/EC/OpenSSH) | Critical |
| GitLab Token | High |
| Heroku API Key | High |
| Database URLs (Postgres, MySQL, MongoDB, Redis) | High |
| Generic API Keys / Secrets | Medium |
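As an illustration of how known-pattern detection works, the sketch below matches a few of the formats in the table using widely documented token shapes. These regular expressions are assumptions written for this example and are not necessarily the ones entro-scan ships; `PATTERNS` and `match_patterns` are hypothetical names.

```python
import re

# Illustrative patterns based on publicly documented token formats.
PATTERNS = {
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    "AWS Access Key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # JWTs: three base64url segments; header and payload start with "eyJ".
    "JWT": re.compile(r"\beyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
    # PEM private key headers, with an optional key-type prefix.
    "Private Key": re.compile(r"-----BEGIN (?:RSA |DSA |EC |OPENSSH )?PRIVATE KEY-----"),
    # GitHub tokens: ghp_, gho_, ghu_, ghs_, or ghr_ prefix.
    "GitHub Token": re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b"),
}


def match_patterns(line: str) -> list[str]:
    """Return the names of all patterns found in a single line of text."""
    return [name for name, rx in PATTERNS.items() if rx.search(line)]


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
print(match_patterns('aws_key = "AKIAIOSFODNN7EXAMPLE"'))
```

Pattern matching complements entropy analysis: a fixed prefix such as `AKIA` identifies the secret type and severity, while the entropy check catches random strings that follow no known format.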
Pre-commit Hook
Add to your .pre-commit-config.yaml:
repos:
  - repo: https://github.com/vyofgod/entro-scan
    rev: v1.0.0
    hooks:
      - id: entro-scan
CI/CD Integration
GitHub Actions
See .github/workflows/ci.yml for a complete example that runs entro-scan on every push and PR.
Exit Codes
- 0: No secrets found (or scan completed successfully)
- 1: Error (config issue, path not found)
Development
# Install dev dependencies
pip install pytest ruff mypy
# Run tests
pytest tests/ -v
# Lint
ruff check .
# Type check
mypy entro_scan/
Why entropy?
Secrets like API keys, tokens, and passwords are typically random strings, so their characters are spread almost uniformly and their entropy (information density) is high. Natural language text and code identifiers reuse a small set of characters and score much lower. By measuring the Shannon entropy of strings in your codebase, entro-scan flags those that score above the configured threshold as potential secrets.
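For a string whose distinct characters occur with relative frequencies p_i, the Shannon entropy is H = -sum(p_i * log2(p_i)), measured in bits per character. The sketch below computes it with the standard library only. It illustrates the measure itself and is not necessarily how entro-scan tokenizes or scores strings; the function name `shannon_entropy` is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    # log2(total / n) equals -log2(p) for p = n / total.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Compare a repetitive string, a dictionary word, and the placeholder
# secret key used in AWS documentation.
for sample in ["aaaaaaaa", "password", "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"]:
    print(f"{sample!r}: {shannon_entropy(sample):.2f} bits/char")
```

A string made of one repeated character scores 0, and "password" scores 2.75 because only one of its eight characters repeats. Longer random strings draw on a larger alphabet and score higher, which is what the `--threshold` option selects on.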
License
MIT — see LICENSE for details.