
AI Compliance Middleware: PII protection, tamper-evident audit logs, and EU AI Act compliance for LLM gateways


🛡️ CloakLLM

Cloak your prompts. Prove your compliance.

Every prompt you send to an LLM provider is visible in plaintext: names, emails, SSNs, API keys, medical records. CloakLLM intercepts, cloaks, and audits every call.

┌──────────────┐    ┌─────────────────────┐     ┌──────────────┐
│   Your App   │───▶│     CLOAKLLM        │────▶│  Claude/GPT  │
│              │    │                     │     │  /Gemini     │
│  "Email      │    │  "Email [PERSON_0]  │     │              │
│   john@..."  │    │   [EMAIL_0]..."     │     │  Never sees  │
│              │◀───│                     │◀────│  real data   │
└──────────────┘    └─────────────────────┘     └──────────────┘
                          │
                    ┌──────────────┐
                    │  Hash-Chain  │
                    │  Audit Log   │
                    │  (EU AI Act) │
                    └──────────────┘
> **Also available for JavaScript/TypeScript:** `npm install cloakllm` (zero dependencies, OpenAI SDK integration). See [CloakLLM JS](https://github.com/cloakllm/CloakLLM-JS). | [Project Hub](https://github.com/cloakllm/CloakLLM)

โฐ Why Now?

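Key EU AI Act obligations for high-risk AI systems start applying on August 2, 2026. Article 12 requires those systems to automatically record events (logs) over their lifetime, and a tamper-evident, hash-chained log lets auditors verify those records mathematically. Penalties under the Act reach up to 3% of global annual turnover for breaching obligations such as record-keeping, and up to 7% for prohibited practices.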

Your current logging (logger.info()) won't survive an audit. CloakLLM provides:

  • 🔒 PII Detection: names, emails, SSNs, API keys, IPs, credit cards, and IBANs via NER + regex
  • 🎭 Context-Preserving Cloaking: John Smith → [PERSON_0] (the LLM still understands the prompt)
  • ⛓️ Tamper-Evident Audit Chain: every event is hash-linked, so any tampering breaks the chain
  • ⚡ One-Line Middleware: drop-in protection for the OpenAI SDK and LiteLLM (100+ providers)

🚀 Quick Start

Install

pip install cloakllm                  # standalone usage
pip install "cloakllm[litellm]"       # with LiteLLM integration (quotes keep zsh from globbing the brackets)
python -m spacy download en_core_web_sm

Option A: With OpenAI SDK (one line)

from cloakllm import enable_openai
from openai import OpenAI

client = OpenAI()
enable_openai(client)  # Done. All calls are now cloaked.

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Email john@acme.com about Project X"}]
)
# Provider never sees "john@acme.com", only "[EMAIL_0]"
# Response is automatically uncloaked before you see it
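
To make the "cloak on the way out, uncloak on the way back" flow concrete, here is a minimal sketch of the wrapping pattern in plain Python. It is not CloakLLM's implementation: `cloaked`, `fake_provider`, and the string-replace cloak/uncloak functions are hypothetical stand-ins, and no real API call is made.

```python
from functools import wraps
from typing import Callable

def cloaked(cloak: Callable[[str], str], uncloak: Callable[[str], str]):
    """Decorator factory: cloak the prompt before the call, uncloak the reply after it."""
    def decorator(send: Callable[[str], str]) -> Callable[[str], str]:
        @wraps(send)
        def wrapper(prompt: str) -> str:
            reply = send(cloak(prompt))  # the provider only receives cloaked text
            return uncloak(reply)        # the caller gets the real values back
        return wrapper
    return decorator

# Hypothetical stand-ins for PII detection and for the provider call.
SECRET, TOKEN = "john@acme.com", "[EMAIL_0]"
seen_by_provider = []

@cloaked(lambda s: s.replace(SECRET, TOKEN), lambda s: s.replace(TOKEN, SECRET))
def fake_provider(prompt: str) -> str:
    seen_by_provider.append(prompt)               # record what "leaves the building"
    return f"Draft sent to {prompt.split()[1]}"   # echo the second word back

result = fake_provider("Email john@acme.com about Project X")
print(seen_by_provider[0])  # Email [EMAIL_0] about Project X
print(result)               # Draft sent to john@acme.com
```

The calling code never changes its own call site, which is the property a one-line `enable_*` hook relies on.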

Option B: With LiteLLM (one line)

import cloakllm
cloakllm.enable()  # Done. All LiteLLM calls are now cloaked.

import litellm
response = litellm.completion(
    model="anthropic/claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Email john@acme.com about Project X"}]
)
# Provider never sees "john@acme.com", only "[EMAIL_0]"
# Response is automatically uncloaked before you see it

Option C: Standalone

from cloakllm import Shield

shield = Shield()

# Cloak
cloaked, token_map = shield.sanitize(
    "Send report to john@acme.com, SSN 123-45-6789"
)
# cloaked: "Send report to [EMAIL_0], SSN [SSN_0]"

# ... send cloaked prompt to any LLM ...

# Uncloak response
clean = shield.desanitize(llm_response, token_map)
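
For readers who want to see the round trip without installing anything, the following is a rough, standard-library sketch of regex-based tokenization. The two patterns, the plain-dict token map, and these `sanitize`/`desanitize` functions are illustrative assumptions, not the library's internals, and real detectors need far broader coverage.

```python
import re

# Illustrative patterns only (assumptions, not CloakLLM's own regexes).
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def sanitize(text: str) -> tuple[str, dict[str, str]]:
    """Replace each match with a numbered placeholder; return the text and a token map."""
    token_map: dict[str, str] = {}
    for category, pattern in PATTERNS.items():
        seen: dict[str, str] = {}  # original value -> token, so repeats reuse one token

        def substitute(match: re.Match) -> str:
            value = match.group(0)
            if value not in seen:
                seen[value] = f"[{category}_{len(seen)}]"
                token_map[seen[value]] = value
            return seen[value]

        text = pattern.sub(substitute, text)
    return text, token_map

def desanitize(text: str, token_map: dict[str, str]) -> str:
    """Swap every placeholder in the model's reply back to its original value."""
    for token, value in token_map.items():
        text = text.replace(token, value)
    return text

cloaked_text, tokens = sanitize("Send report to john@acme.com, SSN 123-45-6789")
print(cloaked_text)                                  # Send report to [EMAIL_0], SSN [SSN_0]
print(desanitize("Report sent to [EMAIL_0].", tokens))  # Report sent to john@acme.com.
```

Reusing one token per distinct value keeps the prompt coherent for the model: two mentions of the same address stay recognizably the same entity.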

Redaction Mode (irreversible)

from cloakllm import Shield, ShieldConfig

shield = Shield(ShieldConfig(mode="redact"))
redacted, _ = shield.sanitize("Email john@acme.com about Sarah Johnson")
# redacted: "Email [EMAIL_REDACTED] about [PERSON_REDACTED]"
# No token map stored, so the redaction cannot be reversed

Entity Details (compliance metadata)

from cloakllm import Shield

shield = Shield()
sanitized, token_map = shield.sanitize("Email john@acme.com, SSN 123-45-6789")

# Per-entity metadata (no original text, so it is PII-safe)
token_map.entity_details
# [
#   {"category": "EMAIL", "start": 6, "end": 19, "length": 13, "confidence": 0.95, "source": "regex", "token": "[EMAIL_0]"},
#   {"category": "SSN", "start": 25, "end": 36, "length": 11, "confidence": 0.95, "source": "regex", "token": "[SSN_0]"}
# ]

# Full report for dashboards
token_map.to_report()
# {"entity_count": 2, "categories": {...}, "tokens": [...], "mode": "tokenize", "entity_details": [...]}
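
Because `entity_details` carries no original text, it can be aggregated freely. The sketch below rolls the two example records shown above into dashboard-style counts; `summarize` and its output keys are hypothetical helpers written for illustration, not part of the `to_report()` output.

```python
from collections import Counter

# The two records from the example above (offsets and categories only, no PII).
entity_details = [
    {"category": "EMAIL", "start": 6, "end": 19, "length": 13,
     "confidence": 0.95, "source": "regex", "token": "[EMAIL_0]"},
    {"category": "SSN", "start": 25, "end": 36, "length": 11,
     "confidence": 0.95, "source": "regex", "token": "[SSN_0]"},
]

def summarize(details: list[dict]) -> dict:
    """Hypothetical helper: roll per-entity metadata up into simple counts."""
    return {
        "entity_count": len(details),
        "by_category": dict(Counter(d["category"] for d in details)),
        "by_source": dict(Counter(d["source"] for d in details)),
        "min_confidence": min((d["confidence"] for d in details), default=None),
        # Sanity check: each span's length should equal end - start.
        "spans_consistent": all(d["end"] - d["start"] == d["length"] for d in details),
    }

summary = summarize(entity_details)
print(summary)
```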

Option D: CLI

# Scan text for sensitive data
python -m cloakllm scan "Email john@acme.com, SSN 123-45-6789"

# Verify audit chain integrity
python -m cloakllm verify ./cloakllm_audit/

# View audit statistics
python -m cloakllm stats ./cloakllm_audit/

⛓️ Tamper-Evident Audit Chain

Every cloaking event is recorded in a hash-chained append-only log:

{
  "seq": 42,
  "event_id": "a1b2c3d4-...",
  "timestamp": "2026-02-27T14:30:00+00:00",
  "event_type": "sanitize",
  "model": "claude-sonnet-4-20250514",
  "entity_count": 3,
  "categories": {"PERSON": 1, "EMAIL": 1, "SSN": 1},
  "tokens_used": ["[PERSON_0]", "[EMAIL_0]", "[SSN_0]"],
  "prompt_hash": "sha256:9f86d0...",
  "sanitized_hash": "sha256:a3f2b1...",
  "entity_details": [
    {"category": "PERSON", "start": 0, "end": 10, "length": 10, "confidence": 0.85, "source": "spacy", "token": "[PERSON_0]"},
    {"category": "EMAIL", "start": 12, "end": 25, "length": 13, "confidence": 0.95, "source": "regex", "token": "[EMAIL_0]"},
    {"category": "SSN", "start": 27, "end": 38, "length": 11, "confidence": 0.95, "source": "regex", "token": "[SSN_0]"}
  ],
  "latency_ms": 4.2,
  "prev_hash": "sha256:7c4d2e...",
  "entry_hash": "sha256:b5e8f3..."
}

Chain verification:

python -m cloakllm verify ./cloakllm_audit/
# ✅ Audit chain integrity verified - no tampering detected.

If anyone modifies a single entry, every subsequent hash breaks:

Entry #40 ✅ → #41 ✅ → #42 ❌ TAMPERED → #43 ❌ BROKEN → ...

This tamper-evidence is what makes the automatic event logs required by EU AI Act Article 12 independently verifiable.
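
The mechanism can be sketched in a few lines of standard-library Python. The field names `seq`, `prev_hash`, and `entry_hash` follow the sample entry above; the canonical JSON encoding, the all-zero genesis value, and the function names are assumptions made for this illustration and may differ from what CloakLLM actually does.

```python
import hashlib
import json
from typing import Optional

GENESIS = "sha256:" + "0" * 64  # assumed prev_hash for the very first entry

def entry_digest(entry: dict) -> str:
    """Hash every field except entry_hash, using a canonical JSON encoding."""
    body = {k: v for k, v in entry.items() if k != "entry_hash"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def append(chain: list[dict], event: dict) -> None:
    """Link a new event to the hash of the previous entry, then seal it."""
    prev_hash = chain[-1]["entry_hash"] if chain else GENESIS
    entry = {**event, "seq": len(chain), "prev_hash": prev_hash}
    entry["entry_hash"] = entry_digest(entry)
    chain.append(entry)

def first_broken(chain: list[dict]) -> Optional[int]:
    """Return the seq of the first entry that fails verification, or None."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash or entry["entry_hash"] != entry_digest(entry):
            return entry["seq"]
        prev_hash = entry["entry_hash"]
    return None

chain: list[dict] = []
for count in range(4):
    append(chain, {"event_type": "sanitize", "entity_count": count})

print(first_broken(chain))       # None: chain is intact
chain[2]["entity_count"] = 99    # tamper with one entry
print(first_broken(chain))       # 2: its stored hash no longer matches
```

If an attacker also recomputes the tampered entry's `entry_hash`, the next entry's `prev_hash` stops matching, so the break simply moves one step down the chain rather than disappearing.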

⚙️ Configuration

from cloakllm import Shield, ShieldConfig

shield = Shield(config=ShieldConfig(
    # Detection
    spacy_model="en_core_web_lg",       # Larger model = better accuracy
    detect_emails=True,
    detect_phones=True,
    detect_api_keys=True,
    custom_patterns=[                    # Your own regex patterns
        ("PROJECT_CODE", r"PRJ-\d{4}-\w+"),
        ("INTERNAL_ID", r"EMP-\d{6}"),
    ],

    # Audit
    log_dir="./compliance_audit",
    log_original_values=False,           # Never log original PII

    # Middleware
    skip_models=["ollama/", "local/"],   # Don't cloak local model calls
))
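
Before wiring custom patterns into the config, it can help to try them against sample text with the standard `re` module. The sketch below does that for the two patterns above, and also shows one plausible reading of `skip_models` as a list of model-name prefixes; that prefix behaviour, and the helper names `find_custom` and `should_skip`, are assumptions for illustration.

```python
import re

# Same (category, regex) pairs as in the configuration example above.
custom_patterns = [
    ("PROJECT_CODE", r"PRJ-\d{4}-\w+"),
    ("INTERNAL_ID", r"EMP-\d{6}"),
]

def find_custom(text: str) -> list[tuple[str, str]]:
    """Return (category, matched_text) pairs for every custom pattern hit."""
    hits = []
    for category, pattern in custom_patterns:
        hits.extend((category, m.group(0)) for m in re.finditer(pattern, text))
    return hits

def should_skip(model: str, skip_models: list[str]) -> bool:
    """Assumption: each skip_models entry is matched as a prefix of the model name."""
    return any(model.startswith(prefix) for prefix in skip_models)

hits = find_custom("Ticket for PRJ-2024-apollo filed by EMP-004217.")
print(hits)  # [('PROJECT_CODE', 'PRJ-2024-apollo'), ('INTERNAL_ID', 'EMP-004217')]
print(should_skip("ollama/llama3.2", ["ollama/", "local/"]))  # True
print(should_skip("gpt-4o-mini", ["ollama/", "local/"]))      # False
```

Note that `EMP-\d{6}` also matches the first six digits of a longer number such as `EMP-1234567`; adding `\b` at the end tightens the pattern if that matters for your IDs.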

LLM Detection (opt-in) uses a local Ollama instance to catch semantic PII (addresses, medical info, etc.):

shield = Shield(config=ShieldConfig(
    llm_detection=True,                  # Enable LLM-based detection
    llm_model="llama3.2",               # Ollama model to use
    llm_ollama_url="http://localhost:11434",  # Ollama endpoint
    llm_timeout=10.0,                   # Timeout in seconds
    llm_confidence=0.85,                # Confidence score for LLM detections
))

Environment variables:

CLOAKLLM_LOG_DIR=./audit
CLOAKLLM_SPACY_MODEL=en_core_web_sm
CLOAKLLM_OTEL_ENABLED=true
CLOAKLLM_LLM_DETECTION=true
CLOAKLLM_LLM_MODEL=llama3.2
CLOAKLLM_OLLAMA_URL=http://localhost:11434
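
If your own deployment scripts need to mirror these settings, a small loader like the one below is one way to read them. The variable names come from the list above; the fallback values are guesses taken from the examples in this README, and the truthy-string parsing and the returned key names are assumptions rather than CloakLLM's documented behaviour.

```python
import os
from typing import Mapping

TRUTHY = {"1", "true", "yes", "on"}  # assumed spellings for boolean flags

def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read the documented CLOAKLLM_* variables; every default here is an assumption."""
    def flag(name: str) -> bool:
        return env.get(name, "").strip().lower() in TRUTHY

    return {
        "log_dir": env.get("CLOAKLLM_LOG_DIR", "./cloakllm_audit"),
        "spacy_model": env.get("CLOAKLLM_SPACY_MODEL", "en_core_web_sm"),
        "otel_enabled": flag("CLOAKLLM_OTEL_ENABLED"),
        "llm_detection": flag("CLOAKLLM_LLM_DETECTION"),
        "llm_model": env.get("CLOAKLLM_LLM_MODEL", "llama3.2"),
        "ollama_url": env.get("CLOAKLLM_OLLAMA_URL", "http://localhost:11434"),
    }

settings = load_settings({"CLOAKLLM_LOG_DIR": "./audit", "CLOAKLLM_LLM_DETECTION": "true"})
print(settings["log_dir"], settings["llm_detection"], settings["otel_enabled"])
```

Passing the environment in as a mapping keeps the loader easy to test without touching the real process environment.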

🔍 What Gets Detected

| Category | Examples | Method |
|---|---|---|
| PERSON | John Smith, Sarah Johnson | spaCy NER |
| ORG | Acme Corp, Google | spaCy NER |
| GPE | New York, Israel | spaCy NER |
| EMAIL | john@acme.com | Regex |
| PHONE | +1-555-0142, 050-123-4567 | Regex |
| SSN | 123-45-6789 | Regex |
| CREDIT_CARD | 4111111111111111 | Regex |
| IP_ADDRESS | 192.168.1.100 | Regex |
| API_KEY | sk-abc123..., AKIA... | Regex |
| IBAN | DE89370400440532013000 | Regex |
| JWT | eyJhbGciOi... | Regex |
| Custom | Your patterns | Regex |
| ADDRESS | 742 Evergreen Terrace | LLM (Local) |
| DATE_OF_BIRTH | 1990-01-15 | LLM (Local) |
| MEDICAL | diabetes mellitus | LLM (Local) |
| FINANCIAL | account 4521-XXX | LLM (Local) |
| NATIONAL_ID | TZ 12345678 | LLM (Local) |
| BIOMETRIC | fingerprint hash | LLM (Local) |
| USERNAME | @johndoe42 | LLM (Local) |
| PASSWORD | P@ssw0rd123 | LLM (Local) |
| VEHICLE | plate ABC-1234 | LLM (Local) |
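
Regex hits on long digit runs are prone to false positives, so a common companion to a credit-card pattern is a Luhn checksum. Whether CloakLLM applies one is not stated here; the function below is a generic, standalone sketch of the technique, and the 13-digit minimum is an assumption chosen for illustration.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 13:  # assumed minimum length for a card number
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0

print(luhn_valid("4111111111111111"))  # True: the test number from the table
print(luhn_valid("4111111111111112"))  # False: last digit changed
```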

🗺️ Roadmap

  • PII detection (NER + regex)
  • Deterministic tokenization
  • Hash-chain audit logging
  • LiteLLM middleware integration
  • OpenAI SDK middleware integration
  • CLI tool
  • Redaction / scrubbing mode
  • Field-level PII metadata (entity_details)
  • OpenTelemetry span emission (with auto-redaction)
  • RFC 3161 trusted timestamping
  • Signed audit snapshots
  • MCP security gateway (tool validation, permission enforcement)
  • Local LLM detection (opt-in, via Ollama)
  • Sensitivity-based routing (PII → local model, general → cloud)
  • Admin dashboard
  • EU AI Act conformity report generator

📜 License

MIT

🤝 Contributing

PRs welcome. Highest-impact areas:

  1. Non-English NER: Hebrew, Arabic, Chinese PII detection
  2. De-tokenization accuracy: handling LLM paraphrasing
  3. OpenTelemetry integration: GenAI semantic conventions
  4. MCP security: tool validation middleware

Built for the EU AI Act deadline. Ships before the auditors do.
