Skip to main content

Local privacy kernel for frontier LLMs — Gemma 4 E2B detects, redacts, and restores PII before any byte crosses the wire.

Project description

CloakBot

CloakBot — A Local Privacy Kernel for Frontier LLMs

Use Claude / GPT / Gemini without your data ever leaving your laptop.

Pre-wire Enforcement Gemma 4 Trust Layer Remote LLM Claude GPT Gemini Python 3.11+ MIT License

English | 简体中文

Built on nanobot · Submitted to the Gemma 4 Good Hackathon (Kaggle, May 2026)


📋 TL;DR

Frontier LLM use is now load-bearing — but the data that crosses the wire is non-revocable. CloakBot moves enforcement before the wire: a local privacy kernel on Gemma 4 E2B that detects sensitive spans, assigns stable typed placeholders, redacts images, chunks long documents, and restores outputs locally from a per-session vault. The remote LLM is interchangeable — Claude, GPT, and Gemini all accept the sanitised stream unchanged.

2,872 entity-test instances of receipts across three leak-eval layers — 7.98% pair leak (text) · 1.11% span leak (visual) · 6.26% pair leak (long-document) · 97.14% cross-turn alias consistency.


🔍 How it works

User message (text + optional images / documents)
  └─► [ pre_llm_hook → PrivacyRuntime ]
        • Local Gemma 4 E2B detectors run concurrently (general + digit)
        • Images → OCR + visual bbox redaction + placeholder overlay
        • Long documents → content-aware chunker + per-chunk detection + vault coalesce
        • Tool I/O → severity-gated approval for non-local tools
        • Sensitive spans rewritten into <<TYPE_N>> placeholders, stored in the per-session Vault
  └─► [ Remote LLM ]   (Claude / GPT / Gemini — sanitised payload only)
        • Math turns: remote model emits <python_snippet_N>; real arithmetic happens locally
        • Tool calls: arguments restored locally, outputs re-sanitised before reuse
  └─► [ post_llm_hook → local restoration ]
        • Placeholder restoration via vault map
        • Per-turn transparency report
  └─► User sees original values in the final reply

Streaming output is buffered until restoration completes — the user never sees raw placeholders.


What gets detected

Category Examples Severity
Personal & contact Names, phone, email, address High
Unique identifiers SSN, passport, account, license plate High
Secrets & access Passwords, API keys, tokens, private URLs High
Organisation & network Company names, school names, IPs High
Medical & narrative PHI, treatments, diagnoses, code names High
Numeric & temporal Money, dates, percentages, counts, measurements, coordinates High

Detection is split into GeneralPrivacyDetector (non-computable text spans) and DigitPrivacyDetector (numeric/temporal values normalised for later local math). Each span becomes an indexed token — <<ENTITY_TYPE_INDEX>> — so the remote LLM can still track relationships (PERSON_1PERSON_2) without knowing who they are: e.g. Alice Chen → <<PERSON_1>>, 555-123-4567 → <<PHONE_1>>, $142,500 → <<FINANCE_1>>, Metformin 500mg → <<MEDICAL_1>>.


Why a small LLM, not regex or BERT-NER?

TL;DR — regex catches the easy 20%; the other 80% needs context. CloakBot uses both: regex on the fast path (emails, invoice numbers, transaction IDs, file paths — in privacy/core/detection/), and Gemma 4 E2B for everything regex and BERT-NER cannot do.

Failure mode Regex BERT-NER (Presidio, spaCy) Gemma 4 E2B
Known formats — email, SSN, credit card
Disambiguate "John" as a placeholder vs a real customer
Instructional numbers — "give me 3 bullet points about Q4 earnings" tokenizes 3 (breaks the request) varies by tag set ✓ kept as task structure
Combination identifiers — "67-year-old male diabetic in ZIP 90210"
Cross-turn entity disambiguation — "someone else surnamed Lin" ≠ existing <<PERSON_1>> Lin Zhiyuan n/a n/a ✓ emits new
Indirect identifiers — "the patient I mentioned earlier"
User-defined entities — "also redact our codename Falcon" edit regex retrain edit prompt
Domain shift — chat logs vs news-trained NER n/a recall drops 20–40% resilient
Multilingual (CN / JP / KR / EN) on one model one regex set per locale 600 MB+ per language one 2B model
Computable normalization — $1,200.501200.5 (ready for local math) string-only string-only ✓ typed numeric

A PII proxy that only catches the easy stuff is strictly worse than no proxy — users trust it. The real bar is reasoning about whether a token should be redacted in this specific conversation, a generative-LLM-shaped problem. Gemma 4 E2B is the one commercially-redistributable model that fits consumer hardware (~5 GB quantised, runs on a MacBook via Ollama), returns parseable JSON at T=0, and is multimodal and multilingual in one weight set — the trust layer is the model, not a chat rewriter bolted onto Presidio. The honest cost: ~50–200 ms per detector call vs regex's <1 ms, mitigated by concurrent general+digit detectors, regex on the fast path, and per-chunk concurrency. Full rationale, latency, and methodology in the hackathon writeup.


Multi-agent architecture

┌─────────────────────────────── LOCAL TRUST ZONE ─────────────────────────────┐
│                                                                              │
│   User input  ──►  [ pre_llm_hook ]  ──►  [ PrivacyRuntime ]                 │
│                                                  │                           │
│           ┌──────────────────┬───────────────────┼─────────────────┐         │
│           ▼                  ▼                   ▼                 ▼         │
│      PiiDetector       ToolPrivacy        VisualPrivacy        DocChunker    │
│   (general + digit)    Interceptor          Pipeline          (long docs)    │
│           │            (tool I/O)        (OCR + bbox)              │         │
│           └──────────────────┬───────────────────────┬─────────────┘         │
│                              ▼                       ▼                       │
│                  [ Session Vault ]         [ Local Math Executor ]           │
│              (placeholder ↔ raw map,         (arithmetic-only AST,           │
│               per-session, on disk)            sandboxed)                    │
│                                                                              │
└──────────────────────────────────┬───────────────────────────────────────────┘
                                   │  sanitised payload only
   ────────────────────────────────┼─────────── REMOTE BOUNDARY ───────────────
                                   ▼
                          [ Remote LLM ]   (Claude / GPT / Gemini)
                                   │  response re-enters local zone
   ────────────────────────────────┼───────────────────────────────────────────
                                   ▼
                         [ post_llm_hook ]  →  restore + per-turn report  →  User
Component Role Backend
PrivacyRuntime Per-turn coordinator: sanitise, route, restore, audit Python
PiiDetector General + digit + intent detectors run concurrently, then deduplicated Gemma 4 E2B
ToolPrivacyInterceptor (+ chunking/) Tool I/O restoration, severity-gated approval, output sanitisation; long-document content-aware chunkers with cross-chunk vault coalesce Gemma 4 E2B + rules
VisualPrivacyPipeline OCR + bbox redaction + placeholder overlay + cross-modal recall bridge Gemma 4 E2B + Pillow + Tesseract
Session Vault Audit-traceable placeholder ↔ raw mapping with cross-turn alias reuse JSON on disk
Math Executor Local execution of remote-generated <python_snippet_N> blocks; AST-validated, arithmetic-only Python AST sandbox

Full component list and file tree in docs/ARCHITECTURE.md.


📊 Evals — trust by measurement

We refused to ship trust-by-assertion. Three end-to-end leak eval layers run against the production pipeline and answer one question per run: did any ground-truth identifying token reach the upstream payload?

Layer Coverage Headline Cross-turn alias
A1 — text input 4 domains × 20 sessions × 902 entity-turn pairs 7.98% pair leak · 5.88% token leak 97.14%
A2 — visual 10 invoice seeds × 180 PII spans × 197 redaction boxes 1.11% span leak · 1.01% token leak n/a
A3 — long-document 3 domains × 20 sessions × 1,790 pairs via chunker 6.26% pair leak · 6.63% token leak 93.86%
  • 100% pair recall cross-domain on EMAIL · PHONE · FINANCE · IP · URL
  • MEDICAL recall: 20% → 95% via type-driven prompt iteration (rules → adjacent examples)
  • 0 of 226 A3 seam leaks fall within the 300-char chunker overlap band — every long-doc leak is an intra-chunk detector miss, not a boundary failure

Full per-template breakdown, methodology, and self-caught eval bugs in docs/HACKATHON_WRITEUP.md. Reproducibility: one command per layer in tests/eval/runners/.

All p95 latency numbers measured with Gemma 4 E2B served via vLLM on an RTX 5090. The MacBook (Ollama) path is functionally end-to-end but slower — MacBook is the target hardware, not the measurement rig.


🛠️ Setup

Install

From PyPI (recommended) — the WebUI is bundled, so no Node build is needed:

pip install --pre cloakbot      # 0.2.1 beta (privacy kernel); drop --pre once 0.2.1 is stable

Prefer an isolated CLI install? uv tool install --prerelease allow cloakbot or pipx install --pip-args=--pre cloakbot. Optional chat channels: pip install --pre 'cloakbot[matrix,discord,msteams]'.

From source (latest main / development) — needs Node ≥24 to build the WebUI:

git clone https://github.com/spire-studio/cloakbot.git && cd cloakbot
uv sync
cd webui && npm install && npm run build && cd ..

From a source checkout, prefix the cloakbot commands below with uv run.

1. Bring your own local Gemma 4 detector

CloakBot's privacy guarantee depends on the PII detector running on a host you control. Any OpenAI-compatible server works (vLLM, Ollama, llama.cpp, …); we recommend Gemma 4 E2B. Pointing the detector at a remote endpoint (e.g. OpenRouter) ships raw, unsanitised input there for detection — that's TEST-ONLY and defeats the boundary.

Ollama (macOS / Linux / WSL, no GPU):

curl -fsSL https://ollama.com/install.sh | sh
ollama pull gemma4:e2b        # ships the model + an OpenAI-compatible endpoint in one tool
# → base URL http://127.0.0.1:11434/v1 · API key "ollama" (any value) · model gemma4:e2b

vLLM (GPU machine — fast, reproducible; the A1/A2/A3 eval path):

uv sync --extra vllm
uv run huggingface-cli login   # accept the Gemma license at hf.co/google/gemma-4-E2B-it
vllm serve google/gemma-4-E2B-it --port 8000 --dtype bfloat16 --api-key <your-token>
# → base URL http://127.0.0.1:8000/v1 · API key <your-token> · model google/gemma-4-E2B-it

2. Configure & launch

cloakbot onboard      # set the remote LLM + [D] Privacy Detector
cloakbot gateway      # serves the WebUI — open the URL it prints (default http://127.0.0.1:8765/)

Set the detector's base URL / API key / model under onboard[D] Privacy Detector (or later in the WebUI under Settings → Privacy), and the remote LLM (Claude / GPT / Gemini) in the same flow (or Settings → Models). The detector exposes an OpenAI-compatible surface used only for local PII detection — the remote LLM call is entirely separate.

Then drag docs/demo/demo_onboarding_memo.md into the Composer to watch 20 PII entities masked end-to-end — click Diff on any bubble for the Local↔Remote view.


🗺️ Roadmap

✅ Shipped — privacy kernel (0.2.1b1, 2026)

Core privacy runtime — April

  • Split local detectors (general + digit) via Gemma 4 E2B
  • Session Vault with JSON persistence + cross-turn alias reuse
  • Math snippet contract + local AST-validated arithmetic executor
  • IntentAnalyzer + chat/math routing
  • ToolPrivacyInterceptor for tool I/O sanitisation + severity-gated approval

Trust boundary expansion — May

  • ✓ Long-document chunker path (ToolPrivacyDetector + 4 content-aware chunkers: plaintext / JSON / HTML / Markdown)
  • ✓ Visual pipeline: OCR + bbox redaction + placeholder overlay + cross-modal recall bridge
  • ✓ WebUI document upload (text/plain, text/markdown ≤ 64 KB) via the same chunker-backed sanitizer
  • ✓ Local↔Remote diff dialog with per-document entity highlighting
  • ✓ Ollama as a first-class backend (no GPU required) — ollama pull gemma4:e2b ships the model + endpoint in one tool

Trust by measurement — May

  • ✓ End-to-end leak eval harness (tests/eval/runners/)
  • ✓ A1 / A2 / A3 layers — 2,872 entity-test instances of receipts
  • ✓ Type-driven detector prompts (MEDICAL recall 20% → 95%)
  • ✓ Self-caught eval bugs surfaced and fixed (token-level scoring; full-value appearance tightening)

Streaming restoration — June

  • StreamingSanitizer with a carry-over window — streamed tool output (exec / shell / long-task) is sanitised incrementally, never buffered whole
  • ✓ Live placeholder→original restoration as the remote reply streams, with highlight / hover / diff in the WebUI

🚀 Future

  • Domain-specific LoRA adapters — fine-tune Gemma 4 E2B on vertical corpora (healthcare, legal, finance) to lift recall on domain-specific phrases. The same kernel, three adapters: pick by tenant.
  • ORG short / hyphenated name recall (71.67% → 90% target) — the largest remaining A1 gap, addressable with the LoRA path above
  • Bilingual coverage — Chinese-language eval templates + zh-CN detector prompt iteration
  • Per-turn detector batching — Medical p95 6.2 s → < 2 s target
  • Encrypted Vault persistence option for shared-machine deployments
  • Policy-driven severity tiers beyond the current registry defaults (all high today)
  • Dataset / table-specific structured chunker (CSV / Parquet) for analytics tool outputs

Hackathon tracks

  • Main Track — Gemma 4 Good (Safety & Trust direction) — Gemma 4 E2B as a local privacy kernel that enforces a pre-wire boundary before any byte reaches the remote LLM. Backed by 2,872 entity-test instances across A1 (text), A2 (visual), and A3 (long-document) leak evals — see docs/HACKATHON_WRITEUP.md.
  • Ollama Special Technologyollama pull gemma4:e2b ships the model + the OpenAI-compatible endpoint in one tool, then point the detector at http://127.0.0.1:11434/v1. Gemma 4 is the trust layer; Ollama is the deployment layer.

Looking for the code exactly as submitted to the hackathon? main has since been rebased onto upstream nanobot — browse the hackathon-gemma4-2026-05 tag for the submission snapshot.


⭐ Star History

Star History Chart

Credits & license

CloakBot is built on nanobot (MIT License) by HKUDS. The channel integrations, session management, memory system, and CLI come from the upstream framework. CloakBot's privacy-specific work lives primarily under cloakbot/privacy/ (detection, vaulting, egress gates, streaming restoration, and the WebUI privacy surface), the local-detector client cloakbot/providers/detector.py, the detector config (PrivacyDetectorConfig) in cloakbot/config/schema.py, and the runtime seams in cloakbot/agent/loop.py.

Architecture, reliability, security, privacy-domain notes, and design decisions live under docs/ — start with AGENTS.md.

MIT License — see LICENSE.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cloakbot-0.2.1b1.tar.gz (2.8 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

cloakbot-0.2.1b1-py3-none-any.whl (3.1 MB view details)

Uploaded Python 3

File details

Details for the file cloakbot-0.2.1b1.tar.gz.

File metadata

  • Download URL: cloakbot-0.2.1b1.tar.gz
  • Upload date:
  • Size: 2.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for cloakbot-0.2.1b1.tar.gz
Algorithm Hash digest
SHA256 e6d62f38e3369eaca03883b7d116a4a538df279486a5b2649d71d0d548890b42
MD5 c457050d35c894ae7db549a7540d0e36
BLAKE2b-256 9315a4463114b5a29d2d46ccbb7f643e577954785c8e4102d34c9ab044c3a4a3

See more details on using hashes here.

Provenance

The following attestation bundles were made for cloakbot-0.2.1b1.tar.gz:

Publisher: release.yml on spire-studio/cloakbot

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file cloakbot-0.2.1b1-py3-none-any.whl.

File metadata

  • Download URL: cloakbot-0.2.1b1-py3-none-any.whl
  • Upload date:
  • Size: 3.1 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for cloakbot-0.2.1b1-py3-none-any.whl
Algorithm Hash digest
SHA256 e806db09230603b81118cfcc1204ee93c253f80edf1b593064f097d5cc759e9e
MD5 f4612a17fa26265f9f8fccd7ef97ca60
BLAKE2b-256 714319c8b1a4a7ce45c1232e91410325d0bf1f4a1cf250492fbd4bc19e0730a0

See more details on using hashes here.

Provenance

The following attestation bundles were made for cloakbot-0.2.1b1-py3-none-any.whl:

Publisher: release.yml on spire-studio/cloakbot

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page