AI Compliance Middleware: PII protection, tamper-evident audit logs, and EU AI Act compliance for LLM gateways

๐Ÿ›ก๏ธ CloakLLM

Cloak your prompts. Prove your compliance.

Every prompt you send to an LLM provider is visible in plaintext: names, emails, SSNs, API keys, medical records. CloakLLM intercepts, cloaks, and audits every call.

┌──────────────┐    ┌─────────────────────┐    ┌──────────────┐
│   Your App   │───▶│      CLOAKLLM       │───▶│  Claude/GPT  │
│              │    │                     │    │  /Gemini     │
│  "Email      │    │  "Email [PERSON_0]  │    │              │
│   john@..."  │    │   [EMAIL_0]..."     │    │  Never sees  │
│              │◀───│                     │◀───│  real data   │
└──────────────┘    └─────────────────────┘    └──────────────┘
                               │
                        ┌──────────────┐
                        │  Hash-Chain  │
                        │  Audit Log   │
                        │  (EU AI Act) │
                        └──────────────┘
> **Also available for JavaScript/TypeScript:** `npm install cloakllm` (zero dependencies, OpenAI SDK integration). See [CloakLLM JS](https://github.com/cloakllm/CloakLLM-JS). | [Project Hub](https://github.com/cloakllm/CloakLLM)

โฐ Why Now?

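Most EU AI Act obligations, including those for high-risk AI systems, start to apply on August 2, 2026. Article 12 requires high-risk AI systems to support automatic recording of events (logs) over their lifetime. A tamper-evident log that can be verified cryptographically is a strong way to show those records are intact. Fines under the Act reach up to 7% of global annual turnover for the most serious violations, and up to 3% for most other obligations.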

Your current logging (logger.info()) won't survive an audit. CloakLLM provides:

  • 🔒 PII Detection: Names, emails, SSNs, API keys, IPs, credit cards, IBANs via NER + regex
  • 🎭 Context-Preserving Cloaking: John Smith → [PERSON_0] (the LLM still understands the prompt)
  • ⛓️ Tamper-Evident Audit Chain: Every event is hash-linked. Any tampering breaks the chain.
  • ⚡ One-Line Middleware: Drop-in protection for OpenAI SDK and LiteLLM (100+ providers)

🚀 Quick Start

Install

pip install cloakllm                  # standalone usage
pip install cloakllm[litellm]         # with LiteLLM integration
python -m spacy download en_core_web_sm

Option A: With OpenAI SDK (one line)

from cloakllm import enable_openai
from openai import OpenAI

client = OpenAI()
enable_openai(client)  # Done. All calls are now cloaked.

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Email john@acme.com about Project X"}]
)
# Provider never sees "john@acme.com", only "[EMAIL_0]"
# Response is automatically uncloaked before you see it

Option B: With LiteLLM (one line)

import cloakllm
cloakllm.enable()  # Done. All LiteLLM calls are now cloaked.

import litellm
response = litellm.completion(
    model="anthropic/claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Email john@acme.com about Project X"}]
)
# Provider never sees "john@acme.com", only "[EMAIL_0]"
# Response is automatically uncloaked before you see it
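
Both middleware options follow the same pattern: cloak the outgoing messages, call the provider, then uncloak the reply. The sketch below illustrates that pattern using only the standard library. It is not CloakLLM's implementation: `cloaked_completion`, `fake_provider`, and the single email regex are hypothetical names and simplifications chosen for this example.

```python
import re

# Simplified email pattern, for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def cloaked_completion(send, messages):
    """Cloak emails in each message, call `send`, then uncloak its reply."""
    token_map = {}  # token -> original value

    def replace(match):
        value = match.group(0)
        for token, original in token_map.items():
            if original == value:        # reuse the token for a repeated value
                return token
        token = f"[EMAIL_{len(token_map)}]"
        token_map[token] = value
        return token

    safe = [{**m, "content": EMAIL.sub(replace, m["content"])} for m in messages]
    reply = send(safe)                   # the provider only ever sees tokens
    for token, original in token_map.items():
        reply = reply.replace(token, original)
    return reply

# Hypothetical stand-in for a provider call; it records what it received.
seen_by_provider = []

def fake_provider(messages):
    seen_by_provider.append(messages[0]["content"])
    return "Drafted a note to [EMAIL_0]."

result = cloaked_completion(
    fake_provider,
    [{"role": "user", "content": "Email john@acme.com about Project X"}],
)
# seen_by_provider[0] == "Email [EMAIL_0] about Project X"
# result == "Drafted a note to john@acme.com."
```

The token map lives only for the duration of one call, so nothing sensitive needs to be stored between requests in this sketch.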

Option C: Standalone

from cloakllm import Shield

shield = Shield()

# Cloak
cloaked, token_map = shield.sanitize(
    "Send report to john@acme.com, SSN 123-45-6789"
)
# cloaked: "Send report to [EMAIL_0], SSN [SSN_0]"

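# ... send the cloaked prompt to any LLM and capture its reply text ...
llm_response = "I'll send the report to [EMAIL_0]."  # placeholder reply for illustration

# Uncloak response
clean = shield.desanitize(llm_response, token_map)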
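
To make the round trip concrete, here is a minimal standard-library sketch of reversible tokenization. It is an assumption-laden illustration, not the library's code: the `cloak` and `uncloak` helpers and the two regexes are hypothetical, and the token map here is a plain dict rather than CloakLLM's token map object.

```python
import re

# Illustrative patterns only; a real detector covers far more cases.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def cloak(text):
    """Replace each match with a numbered token; return (cloaked, token_map)."""
    token_map = {}  # token -> original value
    seen = {}       # original value -> token, so repeats share one token
    counts = {}     # category -> next index

    def make_replacer(category):
        def replace(match):
            value = match.group(0)
            if value not in seen:
                index = counts.get(category, 0)
                counts[category] = index + 1
                seen[value] = f"[{category}_{index}]"
                token_map[seen[value]] = value
            return seen[value]
        return replace

    for category, pattern in PATTERNS.items():
        text = pattern.sub(make_replacer(category), text)
    return text, token_map

def uncloak(text, token_map):
    """Substitute every token in the text with its original value."""
    for token, value in token_map.items():
        text = text.replace(token, value)
    return text

cloaked, token_map = cloak("Send report to john@acme.com, SSN 123-45-6789")
# cloaked == "Send report to [EMAIL_0], SSN [SSN_0]"
restored = uncloak("I'll send the report to [EMAIL_0].", token_map)
# restored == "I'll send the report to john@acme.com."
```

Because the same value always maps to the same token within one call, the model can still tell that two mentions refer to the same entity.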

Redaction Mode (irreversible)

from cloakllm import Shield, ShieldConfig

shield = Shield(ShieldConfig(mode="redact"))
redacted, _ = shield.sanitize("Email john@acme.com about Sarah Johnson")
# redacted: "Email [EMAIL_REDACTED] about [PERSON_REDACTED]"
# No token map stored, so the redaction cannot be reversed
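
The difference from tokenization can be sketched in a few lines. This is a hypothetical standard-library illustration, not CloakLLM's implementation: it covers only regex categories (person names would need NER), and the `redact` helper is a name invented for this example.

```python
import re

# Illustrative patterns only.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text):
    """Replace matches with a fixed per-category label; nothing is kept to undo it."""
    for category, pattern in PATTERNS.items():
        text = pattern.sub(f"[{category}_REDACTED]", text)
    return text

redacted = redact("Email john@acme.com, SSN 123-45-6789")
# redacted == "Email [EMAIL_REDACTED], SSN [SSN_REDACTED]"
```

Since every match in a category collapses to the same label, distinct values become indistinguishable, which is what makes the result irreversible.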

Entity Details (compliance metadata)

from cloakllm import Shield

shield = Shield()
sanitized, token_map = shield.sanitize("Email john@acme.com, SSN 123-45-6789")

# Per-entity metadata (no original text, so it is PII-safe)
token_map.entity_details
# [
#   {"category": "EMAIL", "start": 6, "end": 19, "length": 13, "confidence": 0.95, "source": "regex", "token": "[EMAIL_0]"},
#   {"category": "SSN", "start": 25, "end": 36, "length": 11, "confidence": 0.95, "source": "regex", "token": "[SSN_0]"}
# ]

# Full report for dashboards
token_map.to_report()
# {"entity_count": 2, "categories": {...}, "tokens": [...], "mode": "tokenize", "entity_details": [...]}
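
The key property of this metadata is that it records where and what kind of entity was found without storing the value itself. Below is a rough standard-library sketch of that idea. The `entity_details` function and its regexes are hypothetical, the `confidence` field is omitted because its scoring is library-specific, and token numbering here does not deduplicate repeated values.

```python
import re

# Illustrative patterns only.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def entity_details(text):
    """Return offsets and categories for each match, never the matched text."""
    matches = []
    for category, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            matches.append((m.start(), m.end(), category))

    details = []
    counts = {}  # category -> next token index
    for start, end, category in sorted(matches):
        index = counts.get(category, 0)
        counts[category] = index + 1
        details.append({
            "category": category,
            "start": start,
            "end": end,
            "length": end - start,
            "source": "regex",
            "token": f"[{category}_{index}]",
        })
    return details

details = entity_details("Email john@acme.com, SSN 123-45-6789")
# details[0]: category "EMAIL", start 6, end 19, length 13, token "[EMAIL_0]"
# details[1]: category "SSN", start 25, end 36, length 11, token "[SSN_0]"
```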

Option D: CLI

# Scan text for sensitive data
python -m cloakllm scan "Email john@acme.com, SSN 123-45-6789"

# Verify audit chain integrity
python -m cloakllm verify ./cloakllm_audit/

# View audit statistics
python -m cloakllm stats ./cloakllm_audit/

โ›“๏ธ Tamper-Evident Audit Chain

Every cloaking event is recorded in a hash-chained append-only log:

{
  "seq": 42,
  "event_id": "a1b2c3d4-...",
  "timestamp": "2026-02-27T14:30:00+00:00",
  "event_type": "sanitize",
  "model": "claude-sonnet-4-20250514",
  "entity_count": 3,
  "categories": {"PERSON": 1, "EMAIL": 1, "SSN": 1},
  "tokens_used": ["[PERSON_0]", "[EMAIL_0]", "[SSN_0]"],
  "prompt_hash": "sha256:9f86d0...",
  "sanitized_hash": "sha256:a3f2b1...",
  "entity_details": [
    {"category": "PERSON", "start": 0, "end": 10, "length": 10, "confidence": 0.85, "source": "spacy", "token": "[PERSON_0]"},
    {"category": "EMAIL", "start": 12, "end": 25, "length": 13, "confidence": 0.95, "source": "regex", "token": "[EMAIL_0]"},
    {"category": "SSN", "start": 27, "end": 38, "length": 11, "confidence": 0.95, "source": "regex", "token": "[SSN_0]"}
  ],
  "latency_ms": 4.2,
  "prev_hash": "sha256:7c4d2e...",
  "entry_hash": "sha256:b5e8f3..."
}

Chain verification:

python -m cloakllm verify ./cloakllm_audit/
# ✅ Audit chain integrity verified - no tampering detected.

If anyone modifies a single entry, every subsequent hash breaks:

Entry #40 ✅ → #41 ✅ → #42 ❌ TAMPERED → #43 ❌ BROKEN → ...

This kind of verifiable record supports the logging obligations in EU AI Act Article 12.
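
For readers who want to see why a single edit is detectable, here is a minimal hash-chain sketch using only the standard library. It is not CloakLLM's on-disk format or hashing scheme: the canonical JSON serialization, the all-zero genesis value, and the `append` and `first_broken` helpers are assumptions made for this example.

```python
import hashlib
import json

GENESIS = "sha256:" + "0" * 64  # assumed placeholder prev_hash for the first entry

def entry_hash(entry):
    """Hash every field except entry_hash itself, using canonical JSON."""
    body = {k: v for k, v in entry.items() if k != "entry_hash"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def append(chain, event):
    """Link a new entry to the hash of the previous one."""
    prev = chain[-1]["entry_hash"] if chain else GENESIS
    entry = {"seq": len(chain), **event, "prev_hash": prev}
    entry["entry_hash"] = entry_hash(entry)
    chain.append(entry)

def first_broken(chain):
    """Return the seq of the first entry that fails verification, or None."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev or entry["entry_hash"] != entry_hash(entry):
            return entry["seq"]
        prev = entry["entry_hash"]
    return None

chain = []
for count in (3, 1, 2):
    append(chain, {"event_type": "sanitize", "entity_count": count})

intact = first_broken(chain)      # None: every link verifies
chain[1]["entity_count"] = 99     # tamper with one stored entry
tampered = first_broken(chain)    # 1: the stored hash no longer matches
```

Recomputing the hash of the edited entry does not hide the change, because the next entry's `prev_hash` still points at the old value. An attacker would have to rewrite every later entry, which is why periodically anchoring the latest hash somewhere external strengthens the scheme.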

โš™๏ธ Configuration

from cloakllm import Shield, ShieldConfig

shield = Shield(config=ShieldConfig(
    # Detection
    spacy_model="en_core_web_lg",       # Larger model = better accuracy
    detect_emails=True,
    detect_phones=True,
    detect_api_keys=True,
    custom_patterns=[                    # Your own regex patterns
        ("PROJECT_CODE", r"PRJ-\d{4}-\w+"),
        ("INTERNAL_ID", r"EMP-\d{6}"),
    ],

    # Audit
    log_dir="./compliance_audit",
    log_original_values=False,           # Never log original PII

    # Middleware
    skip_models=["ollama/", "local/"],   # Don't cloak local model calls
))
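
Before adding custom patterns to the config, it can help to check what they match. The snippet below is a small standard-library sketch for doing that. `find_custom` is a hypothetical helper, and how CloakLLM itself applies these patterns (flags, overlap handling) is not specified here.

```python
import re

# The same (category, regex) pairs as in the configuration example.
custom_patterns = [
    ("PROJECT_CODE", r"PRJ-\d{4}-\w+"),
    ("INTERNAL_ID", r"EMP-\d{6}"),
]

def find_custom(text):
    """Return (category, matched_text) pairs in order of appearance."""
    hits = []
    for category, pattern in custom_patterns:
        for m in re.finditer(pattern, text):
            hits.append((m.start(), category, m.group(0)))
    return [(category, value) for _, category, value in sorted(hits)]

found = find_custom("EMP-004217 owns PRJ-2026-atlas; EMP-12 is too short")
# found == [('INTERNAL_ID', 'EMP-004217'), ('PROJECT_CODE', 'PRJ-2026-atlas')]
```

Note that `EMP-\d{6}` also matches the first six digits of a longer number such as `EMP-1234567`. If that is not intended, adding `\b` word boundaries around the pattern is a common fix.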

LLM Detection (opt-in): uses a local Ollama instance to catch semantic PII (addresses, medical info, etc.):

shield = Shield(config=ShieldConfig(
    llm_detection=True,                  # Enable LLM-based detection
    llm_model="llama3.2",               # Ollama model to use
    llm_ollama_url="http://localhost:11434",  # Ollama endpoint
    llm_timeout=10.0,                   # Timeout in seconds
    llm_confidence=0.85,                # Confidence score for LLM detections
))

Environment variables:

CLOAKLLM_LOG_DIR=./audit
CLOAKLLM_SPACY_MODEL=en_core_web_sm
CLOAKLLM_OTEL_ENABLED=true
CLOAKLLM_LLM_DETECTION=true
CLOAKLLM_LLM_MODEL=llama3.2
CLOAKLLM_OLLAMA_URL=http://localhost:11434

๐Ÿ” What Gets Detected

| Category      | Examples                   | Method      |
|---------------|----------------------------|-------------|
| PERSON        | John Smith, Sarah Johnson  | spaCy NER   |
| ORG           | Acme Corp, Google          | spaCy NER   |
| GPE           | New York, Israel           | spaCy NER   |
| EMAIL         | john@acme.com              | Regex       |
| PHONE         | +1-555-0142, 050-123-4567  | Regex       |
| SSN           | 123-45-6789                | Regex       |
| CREDIT_CARD   | 4111111111111111           | Regex       |
| IP_ADDRESS    | 192.168.1.100              | Regex       |
| API_KEY       | sk-abc123..., AKIA...      | Regex       |
| IBAN          | DE89370400440532013000     | Regex       |
| JWT           | eyJhbGciOi...              | Regex       |
| Custom        | Your patterns              | Regex       |
| ADDRESS       | 742 Evergreen Terrace      | LLM (Local) |
| DATE_OF_BIRTH | 1990-01-15                 | LLM (Local) |
| MEDICAL       | diabetes mellitus          | LLM (Local) |
| FINANCIAL     | account 4521-XXX           | LLM (Local) |
| NATIONAL_ID   | TZ 12345678                | LLM (Local) |
| BIOMETRIC     | fingerprint hash           | LLM (Local) |
| USERNAME      | @johndoe42                 | LLM (Local) |
| PASSWORD      | P@ssw0rd123                | LLM (Local) |
| VEHICLE       | plate ABC-1234             | LLM (Local) |

๐Ÿ—บ๏ธ Roadmap

  • PII detection (NER + regex)
  • Deterministic tokenization
  • Hash-chain audit logging
  • LiteLLM middleware integration
  • OpenAI SDK middleware integration
  • CLI tool
  • Redaction / scrubbing mode
  • Field-level PII metadata (entity_details)
  • OpenTelemetry span emission (with auto-redaction)
  • RFC 3161 trusted timestamping
  • Signed audit snapshots
  • MCP security gateway (tool validation, permission enforcement)
  • Local LLM detection (opt-in, via Ollama)
  • Sensitivity-based routing (PII → local model, general → cloud)
  • Admin dashboard
  • EU AI Act conformity report generator

📜 License

MIT

๐Ÿค Contributing

PRs welcome. Highest-impact areas:

  1. Non-English NER: Hebrew, Arabic, Chinese PII detection
  2. De-tokenization accuracy: handling LLM paraphrasing
  3. OpenTelemetry integration: GenAI semantic conventions
  4. MCP security: tool validation middleware

Built for the EU AI Act deadline. Ships before the auditors do.
