entro-scan

Entropy-based secret scanner for source code — detects API keys, tokens, passwords, and other sensitive data leaks before they reach production.

Features

AI-triaged scanning — local qwen2.5 LLM classifies each finding as REAL or NOISE; generates pattern-specific remediation advice (rotate, restrict, move to env var, etc.)
Shannon entropy analysis — finds high-entropy strings that look like secrets
60+ provider patterns — AWS (incl. session tokens), GitHub (classic, fine-grained, OAuth, app install), GitLab, Slack (bot + webhook), Discord (bot + webhook), Telegram bot, Stripe (live, test, publishable, webhook), GCP service-account JSON, Azure Storage / SAS / client secret, OpenAI (sk-/sk-proj-/sk-svcacct-), Anthropic (sk-ant-), Supabase service role + JWT, Vercel, Clerk (publishable + secret), Linear, Notion, Figma, HuggingFace, PyPI, Shopify, PlanetScale, Netlify, Asana, Atlassian, Twilio, SendGrid, Mailgun, Mailchimp, Dropbox, Square, Heroku, database connection URLs with embedded credentials, private keys, and more
Confidence + risk scoring — each finding has severity, confidence (0-100), and a composite risk_score; gate CI on --min-confidence
Finding fingerprints — stable 16-char IDs for dedup, baselining, and SARIF partialFingerprints
Git history / staged / diff scanning — pre-commit ready, with --diff-filter=ACMR
Reporters — terminal, JSON, CSV, SARIF 2.1.0 (with security-severity), Markdown (PR-ready), and GitHub Actions workflow-command annotations
GitHub-native integration — --pr-comment (create-or-update with marker), --github-step-summary, --github-annotations, auto-detected inside Actions
Triage workflows — explain findings by fingerprint, install pre-commit hooks, generate .env.example placeholders, group monorepo results, and run safe provider verification
Dashboard/editor outputs — single-file HTML dashboard and VS Code problem-matcher friendly output
Low false-positive rate — filters for placeholders, UUIDs, hex blobs, regex source, template/format strings, URLs without credentials, and inline # entro-scan: ignore directives (ignore, ignore-line, ignore-next-line)
Smart file walker — .gitignore honored, binary sniffer, configurable max_file_size_mb, generated reports (scan.json, scan.sarif, baselines) auto-skipped
Redacted baselines — by default baselines store sha256 + fingerprint only; opt-in raw storage with --no-redact-baseline
Configurable — TOML config (.entro-scan.toml or [tool.entro-scan] in pyproject.toml)
Zero dependencies — pure Python 3.11+ standard library, including the GitHub API client
Pre-commit hook ready — --git-staged --fail-on-severity critical
CI/CD friendly — --fail-on-severity, exit codes 0/1/2
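
The fingerprint and redacted-baseline features listed above can be pictured with a short sketch. This README does not document how entro-scan derives its 16-character IDs, so the hashing scheme, the function names, and the record keys below are all assumptions made for illustration.

```python
import hashlib


def fingerprint(rule_id: str, path: str, secret: str) -> str:
    """Hypothetical stable ID: the first 16 hex chars of a SHA-256 digest.

    The line number is deliberately left out so the ID survives edits
    that only shift the finding up or down in the file.
    """
    material = "\x00".join((rule_id, path, secret)).encode("utf-8")
    return hashlib.sha256(material).hexdigest()[:16]


def baseline_entry(rule_id: str, path: str, secret: str) -> dict:
    """Redacted record: a hash and a fingerprint, never the raw secret."""
    return {
        "sha256": hashlib.sha256(secret.encode("utf-8")).hexdigest(),
        "fingerprint": fingerprint(rule_id, path, secret),
    }
```

Because both values are one-way hashes, a baseline built this way can be committed to the repository without exposing the secrets it suppresses.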

Installation

Recommended: pipx

Install entro-scan as an isolated global CLI tool.

pipx install entro-scan
entro-scan --help

Python / pip

pip install entro-scan

npm / npx

npx entro-scan

The npm package is currently a lightweight wrapper around the Python CLI.

Install from source

git clone https://github.com/vyofgod/entro-scan.git
cd entro-scan
pip install -e .

Usage

# Scan current directory
entro-scan

# Scan a specific path
entro-scan /path/to/project

# Custom entropy threshold (lower = more findings)
entro-scan /path --threshold 4.0

# Output in JSON format
entro-scan /path --format json

# Output to file
entro-scan /path --format json -o results.json

# Scan git history (last 100 commits)
entro-scan /path --git

# Scan git history with custom depth
entro-scan /path --git --max-commits 500

# Scan only staged git files (--staged alias)
entro-scan /path --staged

# Scan only modified git files (--diff alias)
entro-scan /path --diff

# Parallel scan with 8 workers
entro-scan /path --workers 8

# Quiet mode (only findings, no banner)
entro-scan /path --quiet

# Mask secrets in output (default: true, --no-mask-secrets to disable)
entro-scan /path --mask-secrets

# Show unmasked secrets
entro-scan /path --no-mask-secrets

# Fail CI if critical findings are found (--fail-on-findings alias)
entro-scan /path --fail-on-severity critical

# Use a baseline file to ignore known findings
entro-scan /path --baseline .entro-scan.baseline.json

# Save current findings as a new baseline (--update-baseline alias)
entro-scan /path --save-baseline .entro-scan.baseline.json

# Generate a default config file
entro-scan --init

# Markdown PR-style report
entro-scan . --format markdown -o entro-scan-report.md

# Single-file HTML dashboard report
entro-scan . --format html -o entro-scan-report.html

# VS Code / problem matcher friendly output
entro-scan . --format vscode

# GitHub Actions: annotations + PR comment + step summary (all auto-on in Actions)
entro-scan . --github-annotations --github-step-summary --pr-comment

# Filter low-confidence noise
entro-scan . --min-confidence 70

# Disable .gitignore honoring
entro-scan . --no-use-gitignore

# Skip files larger than 1 MB
entro-scan . --max-file-size 1

# Apply scan profiles
entro-scan . --profile ci
entro-scan . --profile paranoid
entro-scan . --profile frontend
entro-scan . --profile backend

# Only report findings not present in a baseline
entro-scan . --only-new --baseline .entro-scan.baseline.json

# Attach git blame / first-seen metadata
entro-scan . --blame --format json

# Group monorepo findings
entro-scan . --group-by package

# Verify supported provider tokens (GitHub, OpenAI, Slack bot tokens)
entro-scan verify .

AI-Triaged Scanning

New in v1.2.0: Use the local AI model to automatically classify findings and get pattern-specific recommendations.

# Setup: check system requirements and download qwen2.5:0.5b (398 MB)
entro-scan ai

# Scan with AI triage (REAL vs NOISE) + pattern-specific advice
entro-scan ai /path/to/project

# Force model re-download or setup
entro-scan ai --setup

The AI triage:

Detects common false positives (CSS values, test files, minified JS, placeholders)
Classifies findings as REAL (rotate) or NOISE (ignore)
Generates actionable recommendations per pattern (e.g., "Rotate Google API key in Cloud Console, restrict by referrer")
Analyzes repo context to infer project type and credential risk level

Requires Ollama and about 1 GB of free RAM (installed automatically on first run).

Inline ignore directives

Add a comment on the same line, or the line above, to suppress a finding:

TOKEN = "ghp_..." # entro-scan: ignore
# entro-scan: ignore-next-line
TOKEN = "ghp_..."

Supported comment leaders: #, //, /* */, --.

Workflow Commands

# Explain why a finding was reported
entro-scan explain abc123def4567890 .

# Include first-seen git metadata and safe provider verification in the explanation
entro-scan explain abc123def4567890 . --blame --verify

# Review findings with remediation and fingerprints
entro-scan fix .

# Accept current findings into a redacted baseline
entro-scan fix . --action baseline --baseline .entro-scan.baseline.json

# Add placeholder keys to .env.example for detected providers
entro-scan fix . --action env-example

# Install a pre-commit hook that scans staged files
entro-scan install-hook

Exit Codes

Code	Meaning
0	Success, no findings or findings don't meet severity threshold
1	Error (invalid config, file not found, etc.)
2	Blocking findings detected
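
A wrapper script can branch on these codes. The sketch below is one possible way to do that from Python; it assumes entro-scan is on PATH and uses only the --fail-on-severity flag shown earlier. The helper names are hypothetical.

```python
import subprocess

# Meanings copied from the exit-code table above.
EXIT_MEANINGS = {
    0: "clean: no findings, or findings below the severity threshold",
    1: "error: invalid config, file not found, etc.",
    2: "blocked: blocking findings detected",
}


def describe_exit(code: int) -> str:
    """Translate an entro-scan exit code into a short message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_scan(path: str = ".", fail_on: str = "critical") -> int:
    """Run entro-scan and return its exit code without raising on non-zero."""
    result = subprocess.run(
        ["entro-scan", path, "--fail-on-severity", fail_on],
        check=False,
    )
    return result.returncode
```

Calling `describe_exit(run_scan("."))` then gives a message suitable for a CI log, while the raw code can still be passed on to the pipeline.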

Output Formats

Terminal (default)

Color-coded output with severity levels:

Red (score > 4.5): Critical, likely a secret
Yellow (3.9 < score <= 4.5): High, suspicious
Green (score <= 3.9): Medium, low-confidence finding
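
Read as a function, the colour bands above amount to the following sketch. It assumes the score is compared against exactly the thresholds listed; the function name is hypothetical and not part of entro-scan.

```python
def terminal_band(score: float) -> tuple[str, str]:
    """Map a score to the (colour, severity) pair used in terminal output."""
    if score > 4.5:
        return ("red", "critical")
    if score > 3.9:
        return ("yellow", "high")
    return ("green", "medium")


for value in (4.8, 4.5, 3.9):
    print(value, terminal_band(value))
```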

JSON

Machine-readable output for CI/CD pipelines.

CSV

Spreadsheet-friendly output for reporting.

SARIF

Static Analysis Results Interchange Format 2.1.0 — compatible with GitHub code scanning. Each rule includes a security-severity property and each result carries a stable partialFingerprints entry so GitHub deduplicates findings across scans.
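
Because SARIF is a standard format, the report can be post-processed with a few lines of standard-library Python. The sketch below relies only on fields defined by SARIF 2.1.0 (runs, results, ruleId, partialFingerprints); the rule ID and fingerprint key in the sample log are made up for illustration and are not taken from real entro-scan output.

```python
import json
from collections import Counter


def summarize_sarif(sarif: dict) -> Counter:
    """Count results per ruleId across all runs in a SARIF 2.1.0 log."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "<no ruleId>")] += 1
    return counts


def fingerprints(sarif: dict) -> set[str]:
    """Collect every partialFingerprints value, e.g. to diff two scans."""
    seen: set[str] = set()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            seen.update(result.get("partialFingerprints", {}).values())
    return seen


# Tiny hand-written log, not real scanner output.
example = json.loads("""
{
  "version": "2.1.0",
  "runs": [{
    "tool": {"driver": {"name": "entro-scan"}},
    "results": [{
      "ruleId": "example-rule",
      "level": "error",
      "message": {"text": "Possible secret"},
      "partialFingerprints": {"exampleKey/v1": "abc123def4567890"}
    }]
  }]
}
""")
print(summarize_sarif(example))
print(fingerprints(example))
```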

Markdown

Pull-request friendly summary with severity badges, a table, and a remediation appendix. Used by --pr-comment and --github-step-summary.

HTML

Single-file dashboard with severity cards, filter buttons, remediation text, and optional verification / first-seen metadata.

VS Code

Problem-matcher friendly file:line:column: severity: message output for editor and CI log integrations.

GitHub Annotations

Emits ::error file=...:: workflow commands so GitHub renders inline annotations on the PR diff. Auto-enabled when running inside Actions, or via --github-annotations.
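
For readers unfamiliar with workflow commands, the sketch below formats one ::error line following GitHub's documented syntax, including the percent-escaping GitHub expects in property values and message text. The function names and the sample path and message are illustrative; this is not entro-scan's own reporter.

```python
def _escape_data(value: str) -> str:
    """Escape the message part of a workflow command."""
    return value.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")


def _escape_property(value: str) -> str:
    """Escape a property value, which must also protect ':' and ','."""
    return _escape_data(value).replace(":", "%3A").replace(",", "%2C")


def error_annotation(path: str, line: int, message: str, title: str = "") -> str:
    """Build one ::error workflow command for a finding."""
    props = [f"file={_escape_property(path)}", f"line={line}"]
    if title:
        props.append(f"title={_escape_property(title)}")
    return "::error " + ",".join(props) + "::" + _escape_data(message)


print(error_annotation("src/settings.py", 12, "Possible GitHub token"))
# ::error file=src/settings.py,line=12::Possible GitHub token
```

Printing such a line to standard output inside an Actions job is enough for GitHub to attach the message to that file and line.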

Configuration

Create .entro-scan.toml in your project root:

threshold = 3.5
workers = 2
quiet = false
verbose = false
output_format = "terminal"
git_enabled = false
git_staged = false
git_diff = false
max_commits = 100
mask_secrets = true
# fail_on_severity = "critical"  # Options: critical, high, medium, any
# baseline_path = ".entro-scan.baseline.json"

exclude_dirs = [
    ".git", "node_modules", "venv", "__pycache__",
    ".idea", ".vscode", "build", "dist", "target",
]

exclude_files = [
    "package-lock.json", "yarn.lock", "pnpm-lock.yaml",
    "cargo.lock", "go.sum", "poetry.lock",
]

include_extensions = [
    ".py", ".rs", ".js", ".ts", ".go", ".java", ".kt", ".swift",
    ".rb", ".php", ".sh", ".json", ".yaml", ".yml", ".toml", ".env",
]

# Allowlist patterns (text containing these will be ignored)
# allowlist_patterns = [
#     "test-secret",
#     "example-key",
# ]

# Allowlist hashes (sha256 of the secret text)
# allowlist_hashes = [
#     "abc123...",
# ]

Alternatively, config can live under [tool.entro-scan] in your pyproject.toml.

Supported Patterns

Pattern	Severity
JWT (JSON Web Tokens)	Critical
AWS Access Key ID	Critical
AWS Secret Key	Critical
GitHub Token	Critical
Slack Token	Critical
Private Keys (RSA/DSA/EC/OpenSSH)	Critical
Stripe API Key	Critical
Mailchimp API Key	Critical
SendGrid API Key	Critical
Dropbox API Key	Critical
PayPal Braintree Access Token	Critical
NPM Token	Critical
Docker Hub Token	Critical
OpenAI API Key	Critical
Anthropic API Key	Critical
Supabase Key	Critical
Vercel Token	Critical
Linear API Key	Critical
Facebook Access Token	Critical
GitLab Token	High
Heroku API Key	High
Database URLs (Postgres, MySQL, MongoDB, Redis, SQLite, MariaDB, Oracle)	High
Square Access Token	High
Square OAuth Secret	High
Twitter API Key	High
Twitter Access Token	High
Google API Key	High
Twilio API Key	High
Twilio Account SID	High
Basic Auth	High
Clerk Key	High
API Key in URL	Medium
Generic API Keys / Secrets	Medium

Pre-commit Hook

Add to your .pre-commit-config.yaml:

repos:
  - repo: https://github.com/vyofgod/entro-scan
    rev: v1.0.0
    hooks:
      - id: entro-scan
        args: ["--staged", "--fail-on-severity", "critical"]

CI/CD Integration

GitHub Actions

Create .github/workflows/entro-scan.yml:

name: Secret Scan

on:
  push:
    branches: [main, master]
  pull_request:
    branches: [main, master]

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          
      - name: Install entro-scan
        run: pip install entro-scan
        
      - name: Run entro-scan
        run: |
          entro-scan . \
            --format sarif \
            -o entro-scan-results.sarif \
            --fail-on-severity critical
            
      - name: Upload SARIF to GitHub
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: entro-scan-results.sarif

Or use the official GitHub Action — it wires PR comments, annotations, the step summary, and SARIF upload in one step:

name: Secret Scan
on:
  pull_request:
  push:
    branches: [main]

permissions:
  contents: read
  pull-requests: write   # for --pr-comment
  security-events: write # for SARIF upload to code scanning

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0   # full history for --git
      - name: Run entro-scan
        uses: vyofgod/entro-scan@v1
        with:
          fail-on-severity: high
          format: sarif
          output: entro-scan.sarif
          pr-comment: "true"
          github-annotations: "true"
          step-summary: "true"
          upload-sarif: "true"

The action exports findings_count and critical_count outputs you can branch on in downstream steps.

Development

# Install dev dependencies
pip install pytest ruff mypy

# Run tests
pytest tests/ -v

# Lint
ruff check .

# Type check
mypy entro_scan/

Why entropy?

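Secrets like API keys, tokens, and passwords are typically random strings, so their characters are spread almost evenly and their entropy (information per character) is high. Natural-language text and code identifiers repeat characters far more often and score much lower.

By measuring the Shannon entropy of strings in your codebase, entro-scan can flag candidates that look random; the provider patterns and false-positive filters described under Features then narrow those candidates down.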
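
For reference, Shannon entropy in bits per character can be computed as below. This is the textbook formula, H = -Σ p(c) · log2 p(c) over the characters c of the string, written as a minimal sketch; it is not entro-scan's internal implementation, and the sample strings are made up.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# A word-like string repeats characters, so it scores low.
print(round(shannon_entropy("password_password"), 2))
# A random-looking token uses many distinct characters, so it scores higher.
print(round(shannon_entropy("q7Zk1LmX9vT4bR2eYw8Ns5Hd"), 2))
```

A string of n distinct characters reaches the maximum of log2(n) bits, which is why long random tokens stand out against ordinary identifiers.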

License

MIT — see LICENSE for details.
