A CLI tool to scan files for configurable regex patterns (PHI identifiers) and optionally replace matches with deterministic pseudonyms

Scan files for PHI (Protected Health Information) patterns and replace matches with deterministic pseudonyms. Integrates with pre-commit hooks.

Installation

pip install shred-guard
# or with uv
uv add shred-guard

Quick Start

Run the interactive setup wizard:

shredguard init

This walks you through:

  • Selecting PHI patterns to detect (SSNs, emails, MRNs, custom patterns)
  • Configuring file restrictions
  • Setting up pre-commit hooks

Commands

shredguard init

Interactive setup wizard. Creates your configuration and optionally sets up pre-commit integration.

shredguard check

Scan for PHI patterns:

shredguard check .                    # Scan current directory
shredguard check data/ notes.txt     # Scan specific paths

Output uses ruff-style formatting (path:line:column: code, description, and the matched text in brackets):

patient_notes.txt:1:9: SG001 Subject ID [SUB-1234]
patient_notes.txt:2:6: SG002 SSN [123-45-6789]
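
If you want to post-process these lines in a script (for example, to count findings per file in CI), a minimal sketch of a parser follows. It assumes every finding line has the same shape as the two samples above; the `Finding` type and `parse_finding` helper are hypothetical names for illustration and are not part of shredguard.

```python
import re
from typing import NamedTuple, Optional


class Finding(NamedTuple):
    """One parsed finding line (hypothetical helper type, not a shredguard API)."""
    path: str
    line: int
    column: int
    code: str
    description: str
    match: str


# Assumed shape, inferred from the sample output:
#   <path>:<line>:<column>: <code> <description> [<matched text>]
_FINDING_RE = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+):(?P<column>\d+): "
    r"(?P<code>SG\d+) (?P<description>.*) \[(?P<match>.*)\]$"
)


def parse_finding(text: str) -> Optional[Finding]:
    """Parse one output line; return None if it does not look like a finding."""
    m = _FINDING_RE.match(text.strip())
    if m is None:
        return None
    return Finding(
        path=m["path"],
        line=int(m["line"]),
        column=int(m["column"]),
        code=m["code"],
        description=m["description"],
        match=m["match"],
    )


sample = "patient_notes.txt:1:9: SG001 Subject ID [SUB-1234]"
finding = parse_finding(sample)
```

Lines that do not fit the assumed shape (verbose messages, for instance) come back as None, so they can be skipped rather than raising.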

shredguard fix

Replace PHI with pseudonyms:

shredguard fix .                                    # Replace with REDACTED-0, REDACTED-1, ...
shredguard fix --prefix ANON .                     # Custom prefix: ANON-0, ANON-1, ...
shredguard fix --output-map mapping.json .         # Save original -> pseudonym mapping

Replacements are deterministic: the same value always gets the same pseudonym within a run.
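
To illustrate what "deterministic within a run" means, here is a minimal sketch and not shredguard's actual implementation. It assumes pseudonyms are numbered in order of first appearance; the `Pseudonymizer` class and the layout of the saved mapping are hypothetical.

```python
import json
import re


class Pseudonymizer:
    """Hypothetical illustration: one mapping shared across a whole run."""

    def __init__(self, prefix="REDACTED"):
        self.prefix = prefix
        self.mapping = {}  # original value -> pseudonym

    def pseudonym_for(self, value):
        # A value seen before reuses its pseudonym; a new value gets the next index.
        if value not in self.mapping:
            self.mapping[value] = f"{self.prefix}-{len(self.mapping)}"
        return self.mapping[value]

    def redact(self, text, regex):
        # Replace every regex match in the text with its pseudonym.
        return re.sub(regex, lambda m: self.pseudonym_for(m.group(0)), text)


pseudonymizer = Pseudonymizer(prefix="ANON")
redacted = pseudonymizer.redact(
    "SUB-1234 met SUB-5678; SUB-1234 returned.", r"SUB-\d{4,6}"
)
# An original -> pseudonym mapping could then be serialized as JSON
# (the real --output-map file format may differ).
mapping_json = json.dumps(pseudonymizer.mapping, indent=2)
```

Because the mapping lives only for the lifetime of the object, a second run starts from an empty mapping, which matches the "within a run" qualifier above.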

shredguard audit

Scan every commit on every local branch for PHI patterns:

shredguard audit                          # Audit all local branches
shredguard audit --include-remotes        # Also scan remote-tracking branches
shredguard audit --output report.json     # Custom output file path

Configuration and .gitignore are locked to the current working-tree state so that results are reproducible; both files must therefore have no uncommitted changes before you run the audit. By default, output is written to a timestamped JSON file (shredguard-audit-<timestamp>.json); pass --output to choose a different path.

Configuration

Configuration lives in pyproject.toml (or a standalone shredguard.toml):

[tool.shredguard]

[[tool.shredguard.patterns]]
regex = "SUB-\\d{4,6}"
description = "Subject ID"

[[tool.shredguard.patterns]]
regex = "\\b\\d{3}-\\d{2}-\\d{4}\\b"
description = "SSN"

Each pattern can optionally include files and exclude_files globs to control which files are scanned.
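
Before committing a new pattern, it can help to try the regex against sample text. The sketch below applies the two patterns from the configuration above using Python's `re` module. It is an approximation under stated assumptions: the `SG001`/`SG002` codes are assumed to follow pattern order, and `is_selected` is a hypothetical stand-in for the `files` / `exclude_files` filtering, whose exact glob semantics are not documented here.

```python
import re
from fnmatch import fnmatch

# The two patterns from the configuration above.
# In TOML, "\\d" inside a basic string is the regex \d.
PATTERNS = [
    {"regex": r"SUB-\d{4,6}", "description": "Subject ID"},
    {"regex": r"\b\d{3}-\d{2}-\d{4}\b", "description": "SSN"},
]


def is_selected(path, files=None, exclude_files=None):
    """Hypothetical glob filter; shredguard's real matching rules may differ."""
    if files and not any(fnmatch(path, glob) for glob in files):
        return False
    if exclude_files and any(fnmatch(path, glob) for glob in exclude_files):
        return False
    return True


def scan_text(path, text, patterns=PATTERNS):
    """Return findings formatted as 'path:line:col: CODE description [match]'."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for index, pattern in enumerate(patterns, start=1):
            for m in re.finditer(pattern["regex"], line):
                code = f"SG{index:03d}"  # assumed: codes follow pattern order
                findings.append(
                    f"{path}:{line_no}:{m.start() + 1}: "
                    f"{code} {pattern['description']} [{m.group(0)}]"
                )
    return findings


sample_text = "Patient SUB-1234 enrolled\nSSN: 123-45-6789\n"
findings = scan_text("patient_notes.txt", sample_text)
```

The `\b` anchors in the SSN pattern keep it from matching inside longer digit runs, which is worth checking with a few negative samples of your own.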

Pre-commit

Add to .pre-commit-config.yaml:

repos:
  - repo: local
    hooks:
      - id: shredguard-check
        name: shredguard check
        entry: shredguard check
        language: system
        types: [text]

Or let shredguard init set this up for you.

CLI Reference

shredguard check [OPTIONS] [FILES]...

Option            Description
--all-files       Scan all files recursively
--no-gitignore    Don't respect .gitignore patterns
--config PATH     Path to config file
-v, --verbose     Show verbose output (skipped files, etc.)

shredguard fix [OPTIONS] [FILES]...

Option               Description
--prefix TEXT        Prefix for pseudonyms (default: REDACTED)
--output-map PATH    Write JSON mapping of originals to pseudonyms
--all-files          Scan all files recursively
--no-gitignore       Don't respect .gitignore patterns
--config PATH        Path to config file
-v, --verbose        Show verbose output

shredguard audit [OPTIONS]

Option               Description
--include-remotes    Also scan remote-tracking branches
--output PATH        Path for audit JSON output (default: shredguard-audit-<timestamp>.json)
--no-gitignore       Don't respect .gitignore patterns
--config PATH        Path to config file
-v, --verbose        Show verbose output (skipped binary files, etc.)

Exit Codes

Code    Meaning
0       Success (no matches found)
1       Matches found or error
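
When calling shredguard from a script, the exit code is the simplest signal to act on. The following is a minimal sketch using Python's standard `subprocess` module; `run_command` and `is_clean` are hypothetical helper names. Because code 1 covers both "matches found" and "error", the sketch only separates clean from not clean and leaves the captured output for a human to inspect.

```python
import subprocess
import sys


def run_command(command):
    """Run a command and return (exit_code, stdout, stderr) as text."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


def is_clean(exit_code):
    """Per the table above, 0 means success with no matches found."""
    return exit_code == 0


# Intended usage (requires shredguard to be installed and on PATH):
#   code, out, err = run_command(["shredguard", "check", "."])
#   if not is_clean(code):
#       print(out or err)

# Stand-in command so the sketch runs without shredguard installed.
demo_code, demo_out, demo_err = run_command(
    [sys.executable, "-c", "import sys; print('finding'); sys.exit(1)"]
)
```

A wrapper like this should not try to guess whether a non-zero code was a match or an error from the code alone; reading the captured output is the safer choice.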

License

MIT
