AI-powered HIPAA compliance scanner for healthcare software teams

These details have not been verified by PyPI

Project description

HIPAA-Guard

A multi-agent AI system that scans healthcare software codebases for HIPAA violations at commit time. Catches PHI exposure, hardcoded FHIR credentials, overly broad OAuth scopes, missing audit controls, and more — before code ships.

How It Works

HIPAA-Guard runs a 7-agent pipeline triggered by git pre-commit hooks, CI/CD events, or manual CLI commands:

[Trigger] → Scanner → Analyst → Fix Agent ──────────────────→ Verifier
                                     │                              │
                               Reviewer Agent ← REJECT/MODIFY      │
                               (adversarial)                        │
                                     │                              │
                               APPROVE ────────────────────→ [Clean? Loop max 3x]

                    Educator Agent (parallel — regulatory citations)
                    Trend Agent (weekly posture reports + drift detection)

Agent	Role	LLM?
Scanner	Regex + AST detection across 15 HIPAA rules	No — pure Python, <10s
Analyst	Disambiguates ambiguous findings (test fixture vs real PHI?)	Yes
Fix Agent	Proposes minimal diffs for confirmed violations	Yes
Reviewer	Adversarially critiques proposed fixes — finds problems, not validates	Yes
Verifier	Re-scans fixed file in memory to confirm resolution, no new issues	No
Educator	Plain-English regulatory explanation with CFR citation per finding	Yes
Trend	30-day posture tracking, drift detection, weekly reports	Yes

What It Detects

Rule	Description	Severity
CRED-001	Hardcoded FHIR `client_secret`	CRITICAL
CRED-002	Hardcoded API key or bearer token	CRITICAL
CRED-003	FHIR URL with embedded credentials	CRITICAL
FHIR-003	FHIR endpoint over HTTP (not HTTPS)	CRITICAL
PHI-001	SSN pattern in source code	HIGH
PHI-003	MRN in string literal	HIGH
PHI-004	Patient name passed directly to logger	HIGH
FHIR-001	SMART on FHIR `patient/*` wildcard scope	HIGH
FHIR-002	SMART on FHIR `user/*` wildcard scope	HIGH
AUDIT-001	PHI accessed in function without audit log call	HIGH
PHI-002	Date of birth in string literal	MEDIUM
PHI-006	PHI interpolated in f-string	MEDIUM
AUDIT-002	PHI returned in HTTP response without encryption check	MEDIUM
ENC-001	PHI written to unencrypted file	HIGH
PHI-005	NPI number in source code	MEDIUM

Quick Start

Requirements: Python 3.9+

# Clone and scan a directory
git clone https://github.com/slindelow/hipaa-guard
cd hipaa-guard

python3 cli.py scan /path/to/your/healthcare/app

# Install pre-commit hook (blocks commits with CRITICAL/HIGH violations)
python3 cli.py install-hook

# Full repo audit + compliance document
python3 cli.py audit

# Check dependency BAA status
python3 cli.py baa

Enable AI pipeline (Analyst, Fix Agent, Educator, Reviewer):

export ANTHROPIC_API_KEY=your_key_here
python3 cli.py scan /path/to/your/app

Without an API key, the scanner runs standalone — fast, no API calls, catches all high-confidence violations.

PHI Safety Architecture

The system reads source code that may contain PHI. Three layers prevent leakage:

phi_safe_handler.py wraps all file I/O — PHI values are replaced with typed placeholders ([REDACTED_SSN], [REDACTED_MRN]) before reaching any log, state file, or LLM prompt. Code structure is preserved; values are not.
Ephemeral session state — scan state lives in a UUID-scoped temp directory, deleted on completion. Never committed, never uploaded.
LLM prompts use minimum context — agents receive ±50 lines around a finding with PHI redacted. The system prompt for all LLM agents explicitly prohibits PHI reconstruction.

HIPAA BAA note: If your Anthropic account does not have a signed HIPAA BAA, set "has_hipaa_baa": false in hipaa-guard.config.json. The system will route all LLM analysis through a local Ollama endpoint instead, keeping all data on-premise.

Differentiating Features

Dependency BAA Checker

Parses your requirements.txt / package.json / pyproject.toml, cross-references a maintained BAA registry, and identifies which packages actually touch PHI-handling code. No existing tool does this automatically.

python3 cli.py baa

FHIR Vendor Profile Validation

Validates FHIR resource structures against Epic, Cerner, and base R4 profiles. Catches vendor-specific extension gaps at commit time instead of during EHR sandbox testing (where they cost days to debug).

Auto-Generated Compliance Documentation

On demand or on release tag, generates a structured HIPAA Technical Safeguards Summary — PHI access points, encryption usage, audit log coverage, FHIR endpoints — mapped to CFR sections. A compliance officer-ready draft in seconds.

False Positive Learning Loop

Mark a finding as a false positive:

python3 cli.py fp <finding_id> --reason "test fixture"

After 30 days, rules with >30% FP rate have their confidence thresholds automatically adjusted. The scanner self-calibrates without any model fine-tuning.

Configuration

hipaa-guard.config.json — project-level settings (commit this):

{
  "fhir_targets": ["epic", "cerner"],
  "llm": {
    "provider": "anthropic",
    "model": "claude-sonnet-4-6",
    "has_hipaa_baa": false
  },
  "scan": {
    "onboarding_mode": true
  }
}

Set onboarding_mode: true on first run — findings are reported but commits are not blocked while you triage false positives.

config/rule_config.json — per-rule severity, risk tier, confidence thresholds. Updated automatically by the FP feedback loop.

CLI Reference

python3 cli.py scan <path> [path2 ...]   Scan files or directories
python3 cli.py audit                     Full repo audit + compliance doc
python3 cli.py fp <finding_id>           Mark finding as false positive
python3 cli.py fn <file> <line>          Report a missed finding
python3 cli.py install-hook              Install pre-commit git hook
python3 cli.py baa                       Check dependency BAA status

Project Structure

cli.py                      Entry point
hipaa-guard.config.json     Project config (commit this)
config/
  rule_config.json          Rule thresholds and risk tiers
  baa_registry.json         Package → BAA status registry
tools/
  scanner.py                Core detection engine (no LLM)
  phi_pattern_library.py    15 HIPAA detection rules
  phi_safe_handler.py       PHI redaction for all file I/O
  state_manager.py          Typed shared state between agents
  baa_checker.py            Dependency BAA analysis
  compliance_doc_generator.py  HIPAA safeguard summary generator
  git_hook_installer.py     Pre-commit hook setup
  rule_config_updater.py    FP feedback → confidence adjustment
agent/
  orchestrator.py           Deterministic pipeline dispatcher
  analyst_agent.py          LLM: disambiguates ambiguous findings
  fix_agent.py              LLM: proposes minimal diffs
  reviewer_agent.py         LLM: adversarial fix critique
  educator_agent.py         LLM: CFR citations + plain-English explanations
  verifier.py               Re-scans fixed files to confirm resolution
  trend_agent.py            Posture tracking + weekly reports
workflows/
  pre_commit_scan.md        SOP: pre-commit trigger
  fix_loop.md               SOP: Fix → Reviewer → Verifier loop
  manual_audit.md           SOP: full repo audit
  false_positive_intake.md  SOP: FP recording and feedback
  weekly_trend_report.md    SOP: weekly posture report

Architecture: WAT Framework

Built on the WAT (Workflows, Agents, Tools) pattern:

Workflows (workflows/) — Markdown SOPs defining what to do and how
Agents (agent/) — AI orchestration and reasoning
Tools (tools/) — Deterministic Python scripts for execution

Probabilistic AI handles reasoning; deterministic code handles execution. Agent failures fall conservatively (e.g., Analyst failure → treat all findings as confirmed violations).

Disclaimer

HIPAA-Guard identifies code-level compliance gaps. It is not a certified HIPAA compliance audit and does not constitute legal advice. Consult a qualified HIPAA compliance officer before certifying compliance. Scan results should be retained per 45 CFR §164.316(b)(2) requirements.

Project details

These details have not been verified by PyPI

Release history Release notifications | RSS feed

This version

0.1.0

May 15, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

hipaa_guard-0.1.0-py3-none-any.whl (49.9 kB view details)

Uploaded May 15, 2026 Python 3

File details

Details for the file hipaa_guard-0.1.0-py3-none-any.whl.

File metadata

Download URL: hipaa_guard-0.1.0-py3-none-any.whl
Upload date: May 15, 2026
Size: 49.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.6

File hashes

Hashes for hipaa_guard-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`202bd60e1537b9af348b6f88ebfd553de05c7bf9a324fef4423fa5f5c30f4b9f`
MD5	`104d9cba7360ed8f778bf06c5679f15b`
BLAKE2b-256	`76dacc2fce6520759077b3a7a2e5b0b560f680bd23c4f25ff97bc15d111f27df`

See more details on using hashes here.

hipaa-guard 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Meta

Classifiers