AI Compliance Middleware: PII protection, tamper-evident audit logs, and EU AI Act compliance for LLM gateways
# 🛡️ CloakLLM

**Cloak your prompts. Prove your compliance.**

Every prompt you send to an LLM provider is visible in plaintext: names, emails, SSNs, API keys, medical records. CloakLLM intercepts, cloaks, and audits every call.
```
┌───────────────┐     ┌──────────────────────┐     ┌───────────────┐
│  Your App     │────▶│       CLOAKLLM       │────▶│  Claude/GPT   │
│               │     │                      │     │   /Gemini     │
│  "Email       │     │  "Email [PERSON_0]   │     │               │
│  john@..."    │     │   [EMAIL_0]..."      │     │  Never sees   │
│               │     │          │           │     │  real data    │
└───────────────┘     └──────────┼───────────┘     └───────────────┘
                                 ▼
                         ┌──────────────┐
                         │  Hash-Chain  │
                         │  Audit Log   │
                         │ (EU AI Act)  │
                         └──────────────┘
```
> **Also available for JavaScript/TypeScript:** `npm install cloakllm` (zero dependencies, OpenAI SDK integration). See [CloakLLM JS](https://github.com/cloakllm/CloakLLM-JS) | [Project Hub](https://github.com/cloakllm/CloakLLM)
## ⏰ Why Now?

EU AI Act enforcement begins August 2, 2026. Article 12 requires tamper-evident audit logs that regulators can mathematically verify. Non-compliance risks fines of up to 7% of global annual revenue.

Your current logging (`logger.info()`) won't survive an audit. CloakLLM provides:
- 🔍 **PII Detection** – names, emails, SSNs, API keys, IPs, credit cards, and IBANs via NER + regex
- 🎭 **Context-Preserving Cloaking** – `John Smith` → `[PERSON_0]` (the LLM still understands the prompt)
- ⛓️ **Tamper-Evident Audit Chain** – every event is hash-linked; any tampering breaks the chain
- ⚡ **One-Line Middleware** – drop-in protection for the OpenAI SDK and LiteLLM (100+ providers)
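The cloak/uncloak round trip can be sketched in a few lines. This is an illustrative toy, not CloakLLM's actual implementation: the email regex and `[EMAIL_n]` token format below are simplified assumptions.

```python
import re

def cloak(text: str) -> tuple[str, dict[str, str]]:
    """Replace each email with an indexed placeholder; return cloaked text and a reversal map."""
    token_map: dict[str, str] = {}

    def repl(match: re.Match) -> str:
        token = f"[EMAIL_{len(token_map)}]"
        token_map[token] = match.group(0)
        return token

    cloaked = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", repl, text)
    return cloaked, token_map

def uncloak(text: str, token_map: dict[str, str]) -> str:
    """Restore the original values from the token map."""
    for token, original in token_map.items():
        text = text.replace(token, original)
    return text

cloaked, token_map = cloak("Email john@acme.com about Project X")
# cloaked == "Email [EMAIL_0] about Project X"
# uncloak(cloaked, token_map) round-trips to the original text
```

Because the placeholder keeps its category and position, the LLM can still reason about "the email" even though it never sees the real address.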
## 🚀 Quick Start

### Install

```bash
pip install cloakllm            # standalone usage
pip install cloakllm[litellm]   # with LiteLLM integration
python -m spacy download en_core_web_sm
```
### Option A: With OpenAI SDK (one line)

```python
from cloakllm import enable_openai
from openai import OpenAI

client = OpenAI()
enable_openai(client)  # Done. All calls are now cloaked.

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Email john@acme.com about Project X"}],
)
# The provider never sees "john@acme.com", only "[EMAIL_0]".
# The response is automatically uncloaked before you see it.
```
### Option B: With LiteLLM (one line)

```python
import cloakllm

cloakllm.enable()  # Done. All LiteLLM calls are now cloaked.

import litellm

response = litellm.completion(
    model="anthropic/claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Email john@acme.com about Project X"}],
)
# The provider never sees "john@acme.com", only "[EMAIL_0]".
# The response is automatically uncloaked before you see it.
```
### Option C: Standalone

```python
from cloakllm import Shield

shield = Shield()

# Cloak
cloaked, token_map = shield.sanitize(
    "Send report to john@acme.com, SSN 123-45-6789"
)
# cloaked: "Send report to [EMAIL_0], SSN [SSN_0]"

# ... send the cloaked prompt to any LLM ...

# Uncloak the response
clean = shield.desanitize(llm_response, token_map)
```
### Redaction Mode (irreversible)

```python
from cloakllm import Shield, ShieldConfig

shield = Shield(ShieldConfig(mode="redact"))
redacted, _ = shield.sanitize("Email john@acme.com about Sarah Johnson")
# redacted: "Email [EMAIL_REDACTED] about [PERSON_REDACTED]"
# No token map is stored, so the redaction cannot be reversed.
```
### Entity Details (compliance metadata)

```python
from cloakllm import Shield

shield = Shield()
sanitized, token_map = shield.sanitize("Email john@acme.com, SSN 123-45-6789")

# Per-entity metadata (contains no original text, so it is PII-safe)
token_map.entity_details
# [
#   {"category": "EMAIL", "start": 6, "end": 19, "length": 13, "confidence": 0.95, "source": "regex", "token": "[EMAIL_0]"},
#   {"category": "SSN", "start": 25, "end": 36, "length": 11, "confidence": 0.95, "source": "regex", "token": "[SSN_0]"}
# ]

# Full report for dashboards
token_map.to_report()
# {"entity_count": 2, "categories": {...}, "tokens": [...], "mode": "tokenize", "entity_details": [...]}
```
### Option D: CLI

```bash
# Scan text for sensitive data
python -m cloakllm scan "Email john@acme.com, SSN 123-45-6789"

# Verify audit chain integrity
python -m cloakllm verify ./cloakllm_audit/

# View audit statistics
python -m cloakllm stats ./cloakllm_audit/
```
## ⛓️ Tamper-Evident Audit Chain

Every cloaking event is recorded in a hash-chained, append-only log:

```json
{
  "seq": 42,
  "event_id": "a1b2c3d4-...",
  "timestamp": "2026-02-27T14:30:00+00:00",
  "event_type": "sanitize",
  "model": "claude-sonnet-4-20250514",
  "entity_count": 3,
  "categories": {"PERSON": 1, "EMAIL": 1, "SSN": 1},
  "tokens_used": ["[PERSON_0]", "[EMAIL_0]", "[SSN_0]"],
  "prompt_hash": "sha256:9f86d0...",
  "sanitized_hash": "sha256:a3f2b1...",
  "entity_details": [
    {"category": "PERSON", "start": 0, "end": 10, "length": 10, "confidence": 0.85, "source": "spacy", "token": "[PERSON_0]"},
    {"category": "EMAIL", "start": 12, "end": 25, "length": 13, "confidence": 0.95, "source": "regex", "token": "[EMAIL_0]"},
    {"category": "SSN", "start": 27, "end": 38, "length": 11, "confidence": 0.95, "source": "regex", "token": "[SSN_0]"}
  ],
  "latency_ms": 4.2,
  "prev_hash": "sha256:7c4d2e...",
  "entry_hash": "sha256:b5e8f3..."
}
```
Chain verification:

```bash
python -m cloakllm verify ./cloakllm_audit/
# ✅ Audit chain integrity verified – no tampering detected.
```
If anyone modifies a single entry, every subsequent hash breaks:

```
Entry #40 ✓ → #41 ✓ → #42 ✗ TAMPERED → #43 ✗ BROKEN → ...
```
This is what EU AI Act Article 12 requires.
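The verification idea can be sketched with standard hashing. This is a minimal illustration using the `prev_hash`/`entry_hash` field names from the entry above; the canonicalization (sorted-key JSON) and the genesis value are assumptions, not CloakLLM's actual scheme.

```python
import hashlib
import json

def entry_hash(entry: dict) -> str:
    """Hash an entry's content (excluding its own entry_hash) over canonical JSON."""
    payload = {k: v for k, v in entry.items() if k != "entry_hash"}
    digest = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
    return f"sha256:{digest}"

def append(chain: list[dict], event: dict) -> None:
    """Link the new event to the previous entry's hash, then seal it."""
    event["prev_hash"] = chain[-1]["entry_hash"] if chain else "sha256:genesis"
    event["entry_hash"] = entry_hash(event)
    chain.append(event)

def verify(chain: list[dict]) -> bool:
    """Walk the chain: each link must point at its predecessor and hash to itself."""
    prev = "sha256:genesis"
    for entry in chain:
        if entry["prev_hash"] != prev or entry["entry_hash"] != entry_hash(entry):
            return False
        prev = entry["entry_hash"]
    return True

chain: list[dict] = []
append(chain, {"seq": 0, "event_type": "sanitize"})
append(chain, {"seq": 1, "event_type": "desanitize"})
assert verify(chain)

chain[0]["event_type"] = "tampered"  # modify a single past entry...
assert not verify(chain)             # ...and verification fails
```

Because each `prev_hash` covers the predecessor's sealed content, an attacker would have to rewrite every subsequent entry to hide a change, which is exactly what an externally anchored log makes detectable.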
## ⚙️ Configuration

```python
from cloakllm import Shield, ShieldConfig

shield = Shield(config=ShieldConfig(
    # Detection
    spacy_model="en_core_web_lg",      # Larger model = better accuracy
    detect_emails=True,
    detect_phones=True,
    detect_api_keys=True,
    custom_patterns=[                  # Your own regex patterns
        ("PROJECT_CODE", r"PRJ-\d{4}-\w+"),
        ("INTERNAL_ID", r"EMP-\d{6}"),
    ],
    # Audit
    log_dir="./compliance_audit",
    log_original_values=False,         # Never log original PII
    # Middleware
    skip_models=["ollama/", "local/"], # Don't cloak local model calls
))
```
LLM Detection (opt-in) uses a local Ollama instance to catch semantic PII (addresses, medical info, etc.):

```python
shield = Shield(config=ShieldConfig(
    llm_detection=True,                       # Enable LLM-based detection
    llm_model="llama3.2",                     # Ollama model to use
    llm_ollama_url="http://localhost:11434",  # Ollama endpoint
    llm_timeout=10.0,                         # Timeout in seconds
    llm_confidence=0.85,                      # Confidence score for LLM detections
))
```
Environment variables:

```bash
CLOAKLLM_LOG_DIR=./audit
CLOAKLLM_SPACY_MODEL=en_core_web_sm
CLOAKLLM_OTEL_ENABLED=true
CLOAKLLM_LLM_DETECTION=true
CLOAKLLM_LLM_MODEL=llama3.2
CLOAKLLM_OLLAMA_URL=http://localhost:11434
```
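Boolean flags like `CLOAKLLM_OTEL_ENABLED` are typically parsed leniently from the environment. The helper below is a hypothetical sketch of that pattern, not part of the CloakLLM API:

```python
import os

def env_flag(name: str, default: bool = False) -> bool:
    """Interpret '1', 'true', or 'yes' (any case) as True; anything else as False."""
    return os.environ.get(name, str(default)).strip().lower() in {"1", "true", "yes"}

os.environ["CLOAKLLM_LLM_DETECTION"] = "true"
env_flag("CLOAKLLM_LLM_DETECTION")   # True
env_flag("CLOAKLLM_OTEL_ENABLED")    # False, unless set in your environment
```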
## 📊 What Gets Detected

| Category | Examples | Method |
|---|---|---|
| PERSON | John Smith, Sarah Johnson | spaCy NER |
| ORG | Acme Corp, Google | spaCy NER |
| GPE | New York, Israel | spaCy NER |
| EMAIL | john@acme.com | Regex |
| PHONE | +1-555-0142, 050-123-4567 | Regex |
| SSN | 123-45-6789 | Regex |
| CREDIT_CARD | 4111111111111111 | Regex |
| IP_ADDRESS | 192.168.1.100 | Regex |
| API_KEY | sk-abc123..., AKIA... | Regex |
| IBAN | DE89370400440532013000 | Regex |
| JWT | eyJhbGciOi... | Regex |
| Custom | Your patterns | Regex |
| ADDRESS | 742 Evergreen Terrace | LLM (local) |
| DATE_OF_BIRTH | 1990-01-15 | LLM (local) |
| MEDICAL | diabetes mellitus | LLM (local) |
| FINANCIAL | account 4521-XXX | LLM (local) |
| NATIONAL_ID | TZ 12345678 | LLM (local) |
| BIOMETRIC | fingerprint hash | LLM (local) |
| USERNAME | @johndoe42 | LLM (local) |
| PASSWORD | P@ssw0rd123 | LLM (local) |
| VEHICLE | plate ABC-1234 | LLM (local) |
## 🗺️ Roadmap
- PII detection (NER + regex)
- Deterministic tokenization
- Hash-chain audit logging
- LiteLLM middleware integration
- OpenAI SDK middleware integration
- CLI tool
- Redaction / scrubbing mode
- Field-level PII metadata (entity_details)
- OpenTelemetry span emission (with auto-redaction)
- RFC 3161 trusted timestamping
- Signed audit snapshots
- MCP security gateway (tool validation, permission enforcement)
- Local LLM detection (opt-in, via Ollama)
- Sensitivity-based routing (PII โ local model, general โ cloud)
- Admin dashboard
- EU AI Act conformity report generator
## 📄 License
MIT
## 🤝 Contributing

PRs welcome. Highest-impact areas:

- **Non-English NER** – Hebrew, Arabic, and Chinese PII detection
- **De-tokenization accuracy** – handling LLM paraphrasing
- **OpenTelemetry integration** – GenAI semantic conventions
- **MCP security** – tool validation middleware
Built for the EU AI Act deadline. Ships before the auditors do.
## File details

### cloakllm-0.6.3.tar.gz (source distribution)

- Size: 150.2 kB
- Tags: Source
- Uploaded using Trusted Publishing: Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | `4d4fd5142156fa76cc2a4227e42770f5f59e65f98f460b6f646367eaeeb28d84` |
| MD5 | `b13c4b8e724a132d8ae6e5e9290c407d` |
| BLAKE2b-256 | `0559906e8663b985fe879f3677cf309bc564d8225aecaaaf915380e60a42cf6f` |
#### Provenance

The following attestation bundles were made for `cloakllm-0.6.3.tar.gz`:

- Publisher: `publish.yml` on cloakllm/CloakLLM-PY
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: cloakllm-0.6.3.tar.gz
- Subject digest: `4d4fd5142156fa76cc2a4227e42770f5f59e65f98f460b6f646367eaeeb28d84`
- Sigstore transparency entry: 1340702971
- Permalink: cloakllm/CloakLLM-PY@f599cda8e4fc5b8ea2b106b9fd2f425451af3cc1
- Branch / Tag: refs/tags/v0.6.3
- Owner: https://github.com/cloakllm
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@f599cda8e4fc5b8ea2b106b9fd2f425451af3cc1
- Trigger Event: push
### cloakllm-0.6.3-py3-none-any.whl (built distribution)

- Size: 82.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing: Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | `d1211c3eded12a6d8da6115fbb8843d4d5d290a62c2034336d3945c0ae4062a7` |
| MD5 | `c53963ed93ef534a3e6f338f7a5209bc` |
| BLAKE2b-256 | `8ba0895d1d7cd67b747eb0d8434d6cc2d9495dd0c15c19631c6380ab3f67168d` |
#### Provenance

The following attestation bundles were made for `cloakllm-0.6.3-py3-none-any.whl`:

- Publisher: `publish.yml` on cloakllm/CloakLLM-PY
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: cloakllm-0.6.3-py3-none-any.whl
- Subject digest: `d1211c3eded12a6d8da6115fbb8843d4d5d290a62c2034336d3945c0ae4062a7`
- Sigstore transparency entry: 1340702972
- Permalink: cloakllm/CloakLLM-PY@f599cda8e4fc5b8ea2b106b9fd2f425451af3cc1
- Branch / Tag: refs/tags/v0.6.3
- Owner: https://github.com/cloakllm
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@f599cda8e4fc5b8ea2b106b9fd2f425451af3cc1
- Trigger Event: push