AI Compliance Middleware: PII protection, tamper-evident audit logs, and EU AI Act compliance for LLM gateways

🛡️ CloakLLM

Cloak your prompts. Prove your compliance.

Every prompt you send to an LLM provider is visible in plaintext: names, emails, SSNs, API keys, medical records. CloakLLM intercepts, cloaks, and audits every call.

┌──────────────┐    ┌─────────────────────┐    ┌──────────────┐
│   Your App   │───▶│     CLOAKLLM        │───▶│  Claude/GPT  │
│              │    │                     │    │  /Gemini     │
│  "Email      │    │  "Email [PERSON_0]  │    │              │
│   john@..."  │    │   [EMAIL_0]..."     │    │  Never sees  │
│              │◀───│                     │◀───│  real data   │
└──────────────┘    └─────────────────────┘    └──────────────┘
                               │
                       ┌──────────────┐
                       │  Hash-Chain  │
                       │  Audit Log   │
                       │  (EU AI Act) │
                       └──────────────┘

> **Also available for JavaScript/TypeScript:** `npm install cloakllm` (zero dependencies, OpenAI SDK integration). See [CloakLLM JS](https://github.com/cloakllm/CloakLLM-JS). | [Project Hub](https://github.com/cloakllm/CloakLLM)

⏰ Why Now?

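Most EU AI Act obligations for high-risk AI systems start to apply on August 2, 2026. Article 12 requires those systems to record events automatically (logs) over their lifetime, and a tamper-evident log lets an auditor verify mathematically that the records were not altered afterwards. Fines under the Act go up to 7% of global annual turnover for the most serious violations.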

Your current logging (logger.info()) won't survive an audit. CloakLLM provides:

  • 🔒 PII Detection: Names, emails, SSNs, API keys, IPs, credit cards, IBANs via NER + regex
  • 🎭 Context-Preserving Cloaking: John Smith → [PERSON_0] (the LLM still understands the prompt)
  • ⛓️ Tamper-Evident Audit Chain: Every event is hash-linked. Any tampering breaks the chain.
  • ⚡ One-Line Middleware: Drop-in protection for OpenAI SDK and LiteLLM (100+ providers)

🚀 Quick Start

Install

pip install cloakllm                  # standalone usage
pip install "cloakllm[litellm]"       # with LiteLLM integration (quoted so shells like zsh don't expand the brackets)
python -m spacy download en_core_web_sm

Option A: With OpenAI SDK (one line)

from cloakllm import enable_openai
from openai import OpenAI

client = OpenAI()
enable_openai(client)  # Done. All calls are now cloaked.

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Email john@acme.com about Project X"}]
)
# Provider never sees "john@acme.com", only "[EMAIL_0]"
# Response is automatically uncloaked before you see it
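
How a one-line wrapper of this kind can work is easiest to see with a stand-in client. The sketch below is not CloakLLM's implementation: `FakeClient`, `cloak_emails`, `uncloak`, and `enable_cloaking` are hypothetical names, and only email addresses are handled. It wraps a `create`-style method so the prompt is cloaked on the way out and the reply is uncloaked on the way back.

```python
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def cloak_emails(text, mapping):
    """Replace each email with a stable token and record it in `mapping`."""
    def repl(match):
        value = match.group(0)
        for token, original in mapping.items():
            if original == value:
                return token          # same value, same token
        token = f"[EMAIL_{len(mapping)}]"
        mapping[token] = value
        return token
    return EMAIL_RE.sub(repl, text)


def uncloak(text, mapping):
    """Put the original values back into a reply."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


class FakeClient:
    """Stand-in for an LLM client; records what the 'provider' received."""
    def __init__(self):
        self.seen = []

    def create(self, messages):
        self.seen.append(messages)
        # Echo the second word of the last message, as a toy "completion".
        return "Drafted a note to " + messages[-1]["content"].split()[1]


def enable_cloaking(client):
    """Wrap client.create so every call is cloaked and uncloaked."""
    original_create = client.create

    def wrapped(messages):
        mapping = {}
        cloaked = [{**m, "content": cloak_emails(m["content"], mapping)} for m in messages]
        return uncloak(original_create(cloaked), mapping)

    client.create = wrapped
    return client


client = enable_cloaking(FakeClient())
reply = client.create([{"role": "user", "content": "Email john@acme.com about Project X"}])
```

Here `client.seen` holds only the cloaked prompt, while `reply` contains the real address again. A real integration also has to deal with streaming responses and non-string message content, which this sketch ignores.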

Option B: With LiteLLM (one line)

import cloakllm
cloakllm.enable()  # Done. All LiteLLM calls are now cloaked.

import litellm
response = litellm.completion(
    model="anthropic/claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Email john@acme.com about Project X"}]
)
# Provider never sees "john@acme.com", only "[EMAIL_0]"
# Response is automatically uncloaked before you see it

Option C: Standalone

from cloakllm import Shield

shield = Shield()

# Cloak
cloaked, token_map = shield.sanitize(
    "Send report to john@acme.com, SSN 123-45-6789"
)
# cloaked: "Send report to [EMAIL_0], SSN [SSN_0]"

# ... send cloaked prompt to any LLM ...

# Uncloak response
clean = shield.desanitize(llm_response, token_map)
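
The round trip above can be pictured with plain regexes. This is a minimal sketch, assuming regex-only detection of emails and US SSNs; the `sanitize` and `desanitize` functions here are standalone stand-ins rather than the `Shield` methods, and the token map is a plain dict rather than CloakLLM's token map object.

```python
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def sanitize(text):
    token_map = {}   # token -> original value
    reverse = {}     # (category, value) -> token, so repeats reuse a token
    counters = {}    # category -> next free index

    def make_repl(category):
        def repl(match):
            key = (category, match.group(0))
            if key not in reverse:
                index = counters.get(category, 0)
                counters[category] = index + 1
                token = f"[{category}_{index}]"
                reverse[key] = token
                token_map[token] = match.group(0)
            return reverse[key]
        return repl

    for category, pattern in PATTERNS.items():
        text = pattern.sub(make_repl(category), text)
    return text, token_map


def desanitize(text, token_map):
    for token, value in token_map.items():
        text = text.replace(token, value)
    return text


cloaked, token_map = sanitize("Send report to john@acme.com, SSN 123-45-6789")
restored = desanitize(f"Done: {cloaked}", token_map)
```

Because the same value always maps to the same token within a call, the LLM can still tell that two mentions refer to the same entity.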

Redaction Mode (irreversible)

from cloakllm import Shield, ShieldConfig

shield = Shield(ShieldConfig(mode="redact"))
redacted, _ = shield.sanitize("Email john@acme.com about Sarah Johnson")
# redacted: "Email [EMAIL_REDACTED] about [PERSON_REDACTED]"
# No token map stored, so the redaction cannot be reversed
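
What makes this mode irreversible is that nothing links a placeholder back to a value. A rough sketch, assuming regex-only detection (names such as "Sarah Johnson" need NER and are not covered here); `redact` is a hypothetical helper, not the library's API:

```python
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text):
    """Replace every match with a category placeholder; keep no mapping."""
    for category, pattern in PATTERNS.items():
        text = pattern.sub(f"[{category}_REDACTED]", text)
    return text


redacted = redact("Email john@acme.com, SSN 123-45-6789")
```

Every email collapses to the same placeholder, so the output cannot even tell two different addresses apart.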

Entity Details (compliance metadata)

from cloakllm import Shield

shield = Shield()
sanitized, token_map = shield.sanitize("Email john@acme.com, SSN 123-45-6789")

# Per-entity metadata (no original text, so it is PII-safe)
token_map.entity_details
# [
#   {"category": "EMAIL", "start": 6, "end": 19, "length": 13, "confidence": 0.95, "source": "regex", "token": "[EMAIL_0]"},
#   {"category": "SSN", "start": 25, "end": 36, "length": 11, "confidence": 0.95, "source": "regex", "token": "[SSN_0]"}
# ]

# Full report for dashboards
token_map.to_report()
# {"entity_count": 2, "categories": {...}, "tokens": [...], "mode": "tokenize", "entity_details": [...]}
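
Since each entry carries only offsets and labels, the metadata can be aggregated freely without touching the original text. The sketch below works on a plain list shaped like the `entity_details` output shown above; `summarize` and its output keys are illustrative and not part of `to_report()`.

```python
from collections import Counter

entity_details = [
    {"category": "EMAIL", "start": 6, "end": 19, "length": 13,
     "confidence": 0.95, "source": "regex", "token": "[EMAIL_0]"},
    {"category": "SSN", "start": 25, "end": 36, "length": 11,
     "confidence": 0.95, "source": "regex", "token": "[SSN_0]"},
]


def summarize(details):
    """Aggregate per-entity metadata into dashboard-friendly counts."""
    return {
        "entity_count": len(details),
        "categories": dict(Counter(d["category"] for d in details)),
        "sources": dict(Counter(d["source"] for d in details)),
        "min_confidence": min((d["confidence"] for d in details), default=None),
        # Sanity check: the recorded length should equal end - start.
        "offsets_consistent": all(d["end"] - d["start"] == d["length"] for d in details),
    }


summary = summarize(entity_details)
```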

Option D: CLI

# Scan text for sensitive data
python -m cloakllm scan "Email john@acme.com, SSN 123-45-6789"

# Verify audit chain integrity
python -m cloakllm verify ./cloakllm_audit/

# View audit statistics
python -m cloakllm stats ./cloakllm_audit/

⛓️ Tamper-Evident Audit Chain

Every cloaking event is recorded in a hash-chained append-only log:

{
  "seq": 42,
  "event_id": "a1b2c3d4-...",
  "timestamp": "2026-02-27T14:30:00+00:00",
  "event_type": "sanitize",
  "model": "claude-sonnet-4-20250514",
  "entity_count": 3,
  "categories": {"PERSON": 1, "EMAIL": 1, "SSN": 1},
  "tokens_used": ["[PERSON_0]", "[EMAIL_0]", "[SSN_0]"],
  "prompt_hash": "sha256:9f86d0...",
  "sanitized_hash": "sha256:a3f2b1...",
  "entity_details": [
    {"category": "PERSON", "start": 0, "end": 10, "length": 10, "confidence": 0.85, "source": "spacy", "token": "[PERSON_0]"},
    {"category": "EMAIL", "start": 12, "end": 25, "length": 13, "confidence": 0.95, "source": "regex", "token": "[EMAIL_0]"},
    {"category": "SSN", "start": 27, "end": 38, "length": 11, "confidence": 0.95, "source": "regex", "token": "[SSN_0]"}
  ],
  "latency_ms": 4.2,
  "prev_hash": "sha256:7c4d2e...",
  "entry_hash": "sha256:b5e8f3..."
}

Chain verification:

python -m cloakllm verify ./cloakllm_audit/
# ✅ Audit chain integrity verified - no tampering detected.

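If anyone modifies a single entry, its hash no longer matches and every subsequent link in the chain breaks:

Entry #40 ✅ → #41 ✅ → #42 ❌ TAMPERED → #43 ❌ BROKEN → ...

This is the kind of verifiable record-keeping CloakLLM provides for the logging obligations of EU AI Act Article 12.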
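
The mechanism behind such a chain fits in a few lines. This is a minimal sketch of a SHA-256 hash chain, not CloakLLM's actual log format: the all-zero genesis value, the canonical JSON encoding, and the function names are assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "sha256:" + "0" * 64  # assumed prev_hash for the very first entry


def entry_hash(entry):
    """Hash every field except entry_hash itself, in canonical JSON form."""
    body = {k: v for k, v in entry.items() if k != "entry_hash"}
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(payload).hexdigest()


def append(chain, event):
    """Link a new entry to the hash of the previous one."""
    prev = chain[-1]["entry_hash"] if chain else GENESIS
    entry = {"seq": len(chain), **event, "prev_hash": prev}
    entry["entry_hash"] = entry_hash(entry)
    chain.append(entry)
    return entry


def first_broken(chain):
    """Return the seq of the first entry that fails verification, or None."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev or entry["entry_hash"] != entry_hash(entry):
            return entry["seq"]
        prev = entry["entry_hash"]
    return None


chain = []
for n in range(4):
    append(chain, {"event_type": "sanitize", "entity_count": n})

assert first_broken(chain) is None
chain[2]["entity_count"] = 99      # tamper with one stored entry
broken_at = first_broken(chain)    # verification now fails at seq 2
```

A chain like this only detects edits that leave the later hashes untouched. Someone able to rewrite the whole file could recompute every hash, which is why anchoring the latest hash outside the log (for example with signed snapshots or trusted timestamps) matters.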

⚙️ Configuration

from cloakllm import Shield, ShieldConfig

shield = Shield(config=ShieldConfig(
    # Detection
    spacy_model="en_core_web_lg",       # Larger model = better accuracy
    detect_emails=True,
    detect_phones=True,
    detect_api_keys=True,
    custom_patterns=[                    # Your own regex patterns
        ("PROJECT_CODE", r"PRJ-\d{4}-\w+"),
        ("INTERNAL_ID", r"EMP-\d{6}"),
    ],

    # Audit
    log_dir="./compliance_audit",
    log_original_values=False,           # Never log original PII

    # Middleware
    skip_models=["ollama/", "local/"],   # Don't cloak local model calls
))
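
Before shipping custom patterns it can help to try them against sample text with the standard `re` module. The sketch below reuses the two patterns from the configuration above; `find_custom` and the sample string are made up for illustration and do not reflect how CloakLLM applies the patterns internally.

```python
import re

custom_patterns = [
    ("PROJECT_CODE", r"PRJ-\d{4}-\w+"),
    ("INTERNAL_ID", r"EMP-\d{6}"),
]


def find_custom(text, patterns):
    """Return (category, matched_text, start, end) tuples in reading order."""
    hits = []
    for category, regex in patterns:
        for match in re.finditer(regex, text):
            hits.append((category, match.group(0), match.start(), match.end()))
    return sorted(hits, key=lambda hit: hit[2])


sample = "PRJ-2026-ATLAS is owned by EMP-004217; PRJ-26-X is not a valid code."
hits = find_custom(sample, custom_patterns)
```

Note that `EMP-\d{6}` would also match the first six digits of a longer number such as `EMP-0042175`; adding `\b` at both ends of a pattern avoids that kind of partial match.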

LLM Detection (opt-in): uses a local Ollama instance to catch semantic PII (addresses, medical info, etc.):

shield = Shield(config=ShieldConfig(
    llm_detection=True,                  # Enable LLM-based detection
    llm_model="llama3.2",               # Ollama model to use
    llm_ollama_url="http://localhost:11434",  # Ollama endpoint
    llm_timeout=10.0,                   # Timeout in seconds
    llm_confidence=0.85,                # Confidence score for LLM detections
))
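
For orientation, this is roughly what a detection round trip against a local Ollama server could look like. It is a sketch under assumptions: the prompt wording, the `{"entities": [...]}` reply schema, the `"source": "llm"` label, and the function names are invented here, and CloakLLM's real prompt and parsing may differ. Only Ollama's `/api/generate` endpoint with `stream` and `format` fields is taken as given.

```python
import json
import urllib.request


def build_request(text, model="llama3.2", base_url="http://localhost:11434"):
    """Build a non-streaming, JSON-formatted generate request for Ollama."""
    prompt = (
        "List any personal data in the text below as a JSON object of the form "
        '{"entities": [{"category": "...", "text": "..."}]}.\n\n' + text
    )
    body = {"model": model, "prompt": prompt, "stream": False, "format": "json"}
    return urllib.request.Request(
        base_url + "/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_entities(raw_reply, confidence=0.85):
    """Parse the model's JSON reply; return [] if it is not usable."""
    try:
        entities = json.loads(raw_reply).get("entities", [])
    except (json.JSONDecodeError, AttributeError):
        return []
    if not isinstance(entities, list):
        return []
    return [
        {"category": e["category"], "text": e["text"],
         "confidence": confidence, "source": "llm"}
        for e in entities
        if isinstance(e, dict) and "category" in e and "text" in e
    ]


def detect(text, timeout=10.0):
    """Call the local server; needs a running Ollama instance."""
    with urllib.request.urlopen(build_request(text), timeout=timeout) as resp:
        return parse_entities(json.load(resp)["response"])


request = build_request("I live at 742 Evergreen Terrace")
parsed = parse_entities('{"entities": [{"category": "ADDRESS", "text": "742 Evergreen Terrace"}]}')
```

Treating an unparseable reply as "no entities" keeps the regex and NER results usable when the local model misbehaves or times out.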

Environment variables:

CLOAKLLM_LOG_DIR=./audit
CLOAKLLM_SPACY_MODEL=en_core_web_sm
CLOAKLLM_OTEL_ENABLED=true
CLOAKLLM_LLM_DETECTION=true
CLOAKLLM_LLM_MODEL=llama3.2
CLOAKLLM_OLLAMA_URL=http://localhost:11434
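
Variables like these usually map one-to-one onto configuration fields. The sketch below shows one plausible mapping using only `os.environ`; the field names follow the `ShieldConfig` examples above where they exist, while `otel_enabled`, the default values, and the accepted boolean spellings are guesses, not documented behaviour.

```python
import os


def _as_bool(value):
    """Interpret common truthy spellings; everything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def settings_from_env(env=os.environ):
    """Read CLOAKLLM_* variables; the defaults here are illustrative guesses."""
    return {
        "log_dir": env.get("CLOAKLLM_LOG_DIR", "./cloakllm_audit"),
        "spacy_model": env.get("CLOAKLLM_SPACY_MODEL", "en_core_web_sm"),
        "otel_enabled": _as_bool(env.get("CLOAKLLM_OTEL_ENABLED", "false")),
        "llm_detection": _as_bool(env.get("CLOAKLLM_LLM_DETECTION", "false")),
        "llm_model": env.get("CLOAKLLM_LLM_MODEL", "llama3.2"),
        "llm_ollama_url": env.get("CLOAKLLM_OLLAMA_URL", "http://localhost:11434"),
    }


# Pass a dict instead of the real environment to see the effect in isolation.
settings = settings_from_env({"CLOAKLLM_LOG_DIR": "./audit", "CLOAKLLM_LLM_DETECTION": "true"})
```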

🔍 What Gets Detected

| Category | Examples | Method |
|---|---|---|
| PERSON | John Smith, Sarah Johnson | spaCy NER |
| ORG | Acme Corp, Google | spaCy NER |
| GPE | New York, Israel | spaCy NER |
| EMAIL | john@acme.com | Regex |
| PHONE | +1-555-0142, 050-123-4567 | Regex |
| SSN | 123-45-6789 | Regex |
| CREDIT_CARD | 4111111111111111 | Regex |
| IP_ADDRESS | 192.168.1.100 | Regex |
| API_KEY | sk-abc123..., AKIA... | Regex |
| IBAN | DE89370400440532013000 | Regex |
| JWT | eyJhbGciOi... | Regex |
| Custom | Your patterns | Regex |
| ADDRESS | 742 Evergreen Terrace | LLM (Local) |
| DATE_OF_BIRTH | 1990-01-15 | LLM (Local) |
| MEDICAL | diabetes mellitus | LLM (Local) |
| FINANCIAL | account 4521-XXX | LLM (Local) |
| NATIONAL_ID | TZ 12345678 | LLM (Local) |
| BIOMETRIC | fingerprint hash | LLM (Local) |
| USERNAME | @johndoe42 | LLM (Local) |
| PASSWORD | P@ssw0rd123 | LLM (Local) |
| VEHICLE | plate ABC-1234 | LLM (Local) |
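
Regex detectors for numeric categories tend to over-match, since any long digit run looks like a card number. One common way to cut false positives is a Luhn checksum after the regex hit. The sketch below illustrates that technique; whether CloakLLM applies a Luhn check is not stated in this document, and `CARD_RE`, `luhn_valid`, and `find_cards` are hypothetical.

```python
import re

CARD_RE = re.compile(r"\b\d{13,19}\b")  # bare digit runs of card-like length


def luhn_valid(number):
    """Luhn checksum: double every second digit from the right, sum, mod 10."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        digit = int(ch)
        if i % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_cards(text):
    """Keep only digit runs that also pass the checksum."""
    return [m.group(0) for m in CARD_RE.finditer(text) if luhn_valid(m.group(0))]


cards = find_cards("Card 4111111111111111, order id 1234567890123456")
```

The test number from the table passes the checksum, while the order id of the same length does not and is left alone.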

🗺️ Roadmap

  • PII detection (NER + regex)
  • Deterministic tokenization
  • Hash-chain audit logging
  • LiteLLM middleware integration
  • OpenAI SDK middleware integration
  • CLI tool
  • Redaction / scrubbing mode
  • Field-level PII metadata (entity_details)
  • OpenTelemetry span emission (with auto-redaction)
  • RFC 3161 trusted timestamping
  • Signed audit snapshots
  • MCP security gateway (tool validation, permission enforcement)
  • Local LLM detection (opt-in, via Ollama)
  • Sensitivity-based routing (PII โ†’ local model, general โ†’ cloud)
  • Admin dashboard
  • EU AI Act conformity report generator

📜 License

MIT

🤝 Contributing

PRs welcome. Highest-impact areas:

  1. Non-English NER: Hebrew, Arabic, Chinese PII detection
  2. De-tokenization accuracy: handling LLM paraphrasing
  3. OpenTelemetry integration: GenAI semantic conventions
  4. MCP security: tool validation middleware

Built for the EU AI Act deadline. Ships before the auditors do.
