ahh-leek

Runtime Privacy Leak Scanner - Detects and optionally blocks sensitive data leakage at runtime.

Features

  • Secret Detection: API keys, tokens, credentials, private keys
  • PII Detection: Credit cards (Luhn validated), SSN, email, phone numbers
  • Exfiltration Detection: Base64-encoded secrets, env var exposure, large payloads
  • Multiple Modes: Alert, Block, or Redact sensitive data
  • Runtime Hooks: Intercepts network requests, logging, and file writes
  • Low False Positives: Entropy validation, Luhn algorithm, and format validation filter out look-alike matches
  • CLI Tool: Scan files or wrap scripts with leak detection

Installation

pip install ahh-leek

Or install from source:

pip install -e .

Quick Start

Library Mode

import ahhleek

# Enable with defaults (alert only)
ahhleek.enable()

# Enable with blocking (raises exception on leak)
ahhleek.enable(mode="block")

# Custom configuration
ahhleek.enable(
    detect=["secrets", "pii"],  # Categories to scan
    mode="alert",               # alert | block | redact
    on_leak=my_callback,        # Custom handler: a callable you define that receives the leak event
    exclude=[r"/health", r"localhost"],  # URL patterns (regex) to ignore
    confidence_threshold=0.8,   # Min confidence (0.0-1.0)
)

# Disable when done
ahhleek.disable()

One-Shot Scanning

import ahhleek

# Scan a string
result = ahhleek.scan("my api key is sk_live_abc123xyz456")

if result.has_leaks:
    for leak in result:
        print(f"Found: {leak.pattern_name}")
        print(f"  Redacted: {leak.redacted_value}")
        print(f"  Confidence: {leak.confidence:.0%}")

CLI Mode

# Scan a file
ahh-leek scan config.py

# Scan text directly
ahh-leek scan --text "AKIAIOSFODNN7EXAMPLE"

# JSON output
ahh-leek scan file.py --json

# Run a script with leak detection
ahh-leek run script.py --mode alert

# Run with blocking enabled
ahh-leek run script.py --mode block --detect secrets --detect pii

Detection Patterns

Secrets (High Confidence)

Pattern          Example
AWS Access Key   AKIAIOSFODNN7EXAMPLE
AWS Secret Key   wJalrXUtnFEMI/K7MDENG...
GitHub Token     ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Slack Token      xoxb-1234567890-abcdefghijk
Stripe Key       sk_live_1234567890abcdefghij
JWT Token        eyJhbGciOiJIUzI1NiIs...
Private Key      -----BEGIN RSA PRIVATE KEY-----
Database URI     postgres://user:pass@host/db

PII (Validated)

Pattern       Validation
Credit Card   Luhn algorithm checksum
SSN           Format + range validation
Email         Format validation
Phone         Country code patterns
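
The two checks named in the table can be sketched in a few lines. This is an illustration of the general techniques (the Luhn checksum and the publicly documented never-issued SSN ranges), not ahh-leek's actual implementation; the function names luhn_valid and ssn_plausible are hypothetical.

```python
import re


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    # Payment card numbers are 13 to 19 digits long.
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


SSN_RE = re.compile(r"(\d{3})-(\d{2})-(\d{4})")


def ssn_plausible(ssn: str) -> bool:
    """Format check plus range check on area, group, and serial."""
    m = SSN_RE.fullmatch(ssn)
    if not m:
        return False
    area, group, serial = m.groups()
    # Areas 000, 666, and 900-999 are never issued.
    if area in ("000", "666") or area >= "900":
        return False
    # Group 00 and serial 0000 are never issued either.
    return group != "00" and serial != "0000"
```

A candidate that matches a credit card or SSN regex but fails these checks can be discarded before it is ever reported, which is what keeps random digit runs from triggering alerts.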

Exfiltration

  • Base64-encoded secrets
  • Sensitive env vars in outbound data
  • Large POST payloads
  • Writes to suspicious paths (/tmp, /public, etc.)
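
To make the first item concrete, here is a minimal sketch of how Base64-encoded secrets could be detected: find Base64-looking runs, decode them, and rescan the decoded text. The regexes and the find_encoded_secrets helper are assumptions made for illustration; ahh-leek's own patterns and internals may differ.

```python
import base64
import binascii
import re

# Illustrative secret pattern: "AKIA" followed by 16 uppercase alphanumerics.
AWS_KEY_RE = re.compile(r"AKIA[0-9A-Z]{16}")
# Candidate Base64 runs: 20+ characters of the Base64 alphabet, optional padding.
B64_RE = re.compile(r"[A-Za-z0-9+/]{20,}={0,2}")


def find_encoded_secrets(payload: str) -> list[str]:
    """Decode Base64-looking substrings and rescan the decoded text."""
    found = []
    for candidate in B64_RE.findall(payload):
        try:
            decoded = base64.b64decode(candidate, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            # Not valid Base64, or not text once decoded: skip it.
            continue
        found.extend(AWS_KEY_RE.findall(decoded))
    return found
```

Usage: encoding the example key from the table above and embedding it in a form body, find_encoded_secrets returns the decoded key even though the raw payload never contains the literal "AKIA..." string.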

Operating Modes

Alert Mode (Default)

Logs warnings but allows data through:

[ahh-leek] LEAK DETECTED: aws_access_key
  Location: requests.post() -> api.example.com
  Match: AKIA****************
  Confidence: 95%
  Action: Allowed (alert mode)

Block Mode

Raises LeakBlockedError to prevent data from leaving:

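import requests
import ahhleek

ahhleek.enable(mode="block")
url = "https://api.example.com/"  # placeholder endpoint

try:
    # The payload contains an AWS access key, so the request is blocked
    requests.post(url, data={"key": "AKIAIOSFODNN7EXAMPLE"})
except ahhleek.LeakBlockedError as e:
    print(f"Blocked: {e.event.pattern_name}")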

Redact Mode

Automatically replaces sensitive data with placeholders:

ahhleek.enable(mode="redact")

# "sk_live_abc123" becomes "[REDACTED:stripe_secret_key]"
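
As a rough illustration of the placeholder format shown above, redaction can be thought of as a regex substitution keyed by pattern name. The PATTERNS table and redact function below are hypothetical and simplified; they are not the library's real pattern set.

```python
import re

# Hypothetical, simplified patterns; names follow the placeholders shown above.
PATTERNS = {
    "stripe_secret_key": re.compile(r"sk_live_[0-9a-zA-Z]{6,}"),
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
}


def redact(text: str) -> str:
    """Replace every match with a [REDACTED:<pattern_name>] placeholder."""
    for name, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{name}]", text)
    return text


print(redact("key=sk_live_abc123"))
```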

Configuration Options

ahhleek.enable(
    # Detection categories
    detect=["secrets", "pii", "exfil"],

    # Operating mode
    mode="alert",  # alert | block | redact

    # Custom leak handler
    on_leak=lambda event: print(f"Leak: {event}"),

    # URL patterns to exclude (regex)
    exclude=[r"localhost", r"/health"],

    # Patterns to allowlist (won't trigger)
    allowlist=[r"test_", r"example\.com"],

    # Minimum confidence threshold
    confidence_threshold=0.7,

    # Hook controls
    hook_network=True,   # Intercept HTTP requests
    hook_logging=True,   # Intercept logging/print
    hook_files=False,    # Intercept file writes (more invasive)
)

Hooks

Network Hooks

Intercepts outbound network requests:

  • urllib.request.urlopen
  • requests.Session.request
  • httpx.Client.request / AsyncClient.request
  • aiohttp.ClientSession._request
  • socket.socket.send / sendall

Logging Hooks

Intercepts log output:

  • logging.Handler.emit
  • builtins.print
  • sys.stdout / sys.stderr

File Hooks

Monitors file writes (disabled by default):

  • builtins.open (write modes)
  • pathlib.Path.write_text / write_bytes
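
Hooks of this kind are commonly built by swapping a method for a wrapper that inspects its arguments before delegating to the original. The sketch below shows that general monkey-patching technique on a stand-in class; hooked, LeakFound, and FakeClient are hypothetical names, and this is not a description of how ahh-leek installs its hooks.

```python
import functools
import re
from contextlib import contextmanager

SECRET_RE = re.compile(r"AKIA[0-9A-Z]{16}")


class LeakFound(Exception):
    """Hypothetical exception raised by this sketch when a secret is seen."""


@contextmanager
def hooked(owner, attr_name):
    """Temporarily wrap owner.attr_name so string arguments are scanned."""
    original = getattr(owner, attr_name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        for value in list(args) + list(kwargs.values()):
            if isinstance(value, str) and SECRET_RE.search(value):
                raise LeakFound(f"blocked call to {attr_name}")
        return original(*args, **kwargs)

    setattr(owner, attr_name, wrapper)
    try:
        yield
    finally:
        # Always restore the original, even if the body raised.
        setattr(owner, attr_name, original)


class FakeClient:
    """Stand-in for an HTTP client so the sketch needs no network."""

    def send(self, body):
        return f"sent {len(body)} bytes"
```

Inside `with hooked(FakeClient, "send"):`, a call such as `FakeClient().send("AKIAIOSFODNN7EXAMPLE")` raises LeakFound, while harmless payloads pass through unchanged; once the block exits, the original method is back in place.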

False Positive Reduction Strategy

  1. Entropy Validation: Generic patterns require high entropy (>3.5 bits/char)
  2. Luhn Algorithm: Credit cards must pass checksum validation
  3. Format + Range: SSN must have valid area/group/serial
  4. Allowlists: User-defined patterns to ignore
  5. Confidence Scoring: Only alert above threshold
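
The entropy check in step 1 can be illustrated with Shannon entropy over the characters of a candidate string. This is a minimal sketch, assuming per-character Shannon entropy is the measure meant by "bits/char"; the function names are hypothetical and the 3.5 default simply mirrors the threshold quoted above.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Shannon entropy of `s` in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def looks_random(token: str, threshold: float = 3.5) -> bool:
    """True when the token's entropy exceeds the threshold."""
    return shannon_entropy(token) > threshold
```

Ordinary words and repeated characters score low (an eight-letter word like "password" comes out well under 3.5), while a 16-character string with no repeated characters scores exactly 4.0 bits/char, so a generic "long alphanumeric string" pattern only fires on values that actually look machine-generated.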

Development

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run specific test file
pytest tests/test_patterns.py -v

Examples

Custom Callback

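import logging

import ahhleek

# Dedicated logger for security events (route it to your SIEM or log pipeline)
security_log = logging.getLogger("security")

def my_leak_handler(event):
    # Send to security logging system
    security_log.warning(
        "Data leak detected",
        extra={
            "pattern": event.pattern_name,
            "location": event.location,
            "confidence": event.confidence,
        }
    )

ahhleek.enable(on_leak=my_leak_handler)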

Context Manager

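from contextlib import contextmanager

import requests
import ahhleek

@contextmanager
def leak_protection():
    ahhleek.enable(mode="block")
    try:
        yield
    finally:
        # Always remove the hooks, even if the body raised
        ahhleek.disable()

with leak_protection():
    # All network/logging activity is monitored.
    # url and sensitive_data stand in for your application's own values.
    requests.post(url, data=sensitive_data)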

Integration with Web Frameworks

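# Flask request hooks
from flask import Flask
import ahhleek

app = Flask(__name__)

@app.before_request
def enable_leak_detection():
    if app.config.get("LEAK_DETECTION"):
        ahhleek.enable(mode="alert")

@app.teardown_request
def disable_leak_detection(exception):
    # Only disable if this request enabled detection
    if app.config.get("LEAK_DETECTION"):
        ahhleek.disable()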

License

MIT
