PrivacyLens
Transparent PII masking for LLM clients — keep sensitive data out of your AI prompts.
Why?
Every prompt you send to an LLM can leak PII: names, emails, phone numbers, SSNs. PrivacyLens intercepts your prompts, replaces detected PII with anonymous tokens, and restores the original values when the response comes back. The LLM sees only the tokens, never the values behind them.
Input: "Email john@example.com about the project"
Sent: "Email [EMAIL_1] about the project" ← LLM sees this
Output: "I've emailed john@example.com" ← Your app sees this
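The round trip above can be sketched in a few lines. This is a minimal illustration that handles only emails with a simplified regex; it is not PrivacyLens's actual implementation, and the function names `mask` and `unmask` are hypothetical.

```python
import re

# Simplified email pattern, for illustration only.
EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


def mask(text: str) -> tuple[str, dict[str, str]]:
    """Replace each distinct email with a numbered token and return the mapping."""
    token_for_value: dict[str, str] = {}

    def substitute(match: re.Match) -> str:
        value = match.group(0)
        if value not in token_for_value:
            token_for_value[value] = f"[EMAIL_{len(token_for_value) + 1}]"
        return token_for_value[value]

    masked = EMAIL_RE.sub(substitute, text)
    # Invert the mapping so tokens can be looked up when restoring.
    return masked, {token: value for value, token in token_for_value.items()}


def unmask(text: str, mapping: dict[str, str]) -> str:
    """Put the original values back wherever a known token appears."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


masked, mapping = mask("Email john@example.com about the project")
print(masked)                                     # Email [EMAIL_1] about the project
print(unmask("I've emailed [EMAIL_1]", mapping))  # I've emailed john@example.com
```

Restoration is a plain string replacement here, so it only works when the LLM echoes the token unchanged.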
Install
pip install privacylens
Usage
Step 1: Wrap your client
from privacylens import shield
# Pick your LLM client — wrap it with shield()
import openai
client = shield(openai.OpenAI())
Step 2: Use it normally
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "My name is John Doe and my email is john@example.com. Write me a welcome email."
    }],
)
print(response.choices[0].message.content)
# Output contains "John Doe" and "john@example.com" — restored automatically
That's it. No other code changes needed. The PII is masked before it reaches the LLM and unmasked in the response.
Works With Every Major LLM Client
import anthropic
import openai
from privacylens import shield
# OpenAI
client = shield(openai.OpenAI())
client = shield(openai.AsyncOpenAI())
# Anthropic
client = shield(anthropic.Anthropic())
client = shield(anthropic.AsyncAnthropic())
# LangChain — returns a callback handler
handler = shield(my_langchain_chat_model)
# CrewAI
adapter = shield(my_crewai_agent)
# Strands
wrapper = shield(my_strands_model)
Each wrapped client behaves exactly like the original. Same methods, same parameters, same return types.
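One common way for a wrapper to keep the original interface is attribute delegation: intercept the one call that carries prompts and forward everything else untouched. The sketch below shows that pattern against a stand-in client. `FakeClient`, `MaskingProxy`, and the email-only masking are hypothetical and do not describe PrivacyLens's internals.

```python
import re

EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


class FakeClient:
    """Stand-in for an LLM client: echoes the prompt it received."""

    model_name = "fake-model"

    def __init__(self) -> None:
        self.seen: list[str] = []  # records what the "LLM" actually received

    def complete(self, prompt: str) -> str:
        self.seen.append(prompt)
        return f"Reply to: {prompt}"


class MaskingProxy:
    """Masks emails on the way in and restores them on the way out."""

    def __init__(self, client: FakeClient) -> None:
        self._client = client

    def __getattr__(self, name: str):
        # Called only for attributes not defined on the proxy itself,
        # so everything except complete() passes straight through.
        return getattr(self._client, name)

    def complete(self, prompt: str) -> str:
        mapping: dict[str, str] = {}

        def substitute(match: re.Match) -> str:
            token = f"[EMAIL_{len(mapping) + 1}]"
            mapping[token] = match.group(0)
            return token

        response = self._client.complete(EMAIL_RE.sub(substitute, prompt))
        for token, value in mapping.items():
            response = response.replace(token, value)
        return response


client = MaskingProxy(FakeClient())
reply = client.complete("Email john@example.com about the project")
print(client.seen[0])     # Email [EMAIL_1] about the project
print(reply)              # Reply to: Email john@example.com about the project
print(client.model_name)  # fake-model
```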
What Gets Detected
Built-in (regex, zero dependencies)
| Entity | Example Input | What LLM Sees |
|---|---|---|
| Email | john@example.com | [EMAIL_1] |
| Phone | 555-123-4567 | [PHONE_1] |
| SSN | 123-45-6789 | [SSN_1] |
Optional: Presidio (50+ entity types)
pip install privacylens[pii]
Detects names, addresses, credit card numbers, dates of birth, passport numbers, and more using Microsoft Presidio.
Optional: GLiNER (ML-based semantic detection)
pip install privacylens[semantic]
Uses a neural model to detect entities that regex can't catch.
Custom Detectors
Add your own patterns via privacylens.yaml in your project root:
detectors:
  regex:
    patterns:
      - entity_type: EMAIL
        pattern: '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
      - entity_type: EMPLOYEE_ID
        pattern: 'EMP-\d{5,}'
      - entity_type: PROJECT_CODE
        pattern: 'PROJ-[A-Z]{2,4}-\d{3,}'
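To check what the two custom patterns above match, they can be compiled with Python's `re` module directly. This sketch assumes the configured patterns follow Python regex syntax and uses a plain dict in place of the parsed YAML. `find_entities` is a hypothetical helper, not part of PrivacyLens.

```python
import re

# Mirrors the custom entries of the YAML config above as a plain Python structure.
config = {
    "detectors": {
        "regex": {
            "patterns": [
                {"entity_type": "EMPLOYEE_ID", "pattern": r"EMP-\d{5,}"},
                {"entity_type": "PROJECT_CODE", "pattern": r"PROJ-[A-Z]{2,4}-\d{3,}"},
            ]
        }
    }
}


def find_entities(text: str, config: dict) -> list[tuple[str, str]]:
    """Return (entity_type, matched_text) pairs in order of appearance."""
    found = []
    for entry in config["detectors"]["regex"]["patterns"]:
        for match in re.finditer(entry["pattern"], text):
            found.append((match.start(), entry["entity_type"], match.group(0)))
    return [(entity_type, value) for _, entity_type, value in sorted(found)]


text = "EMP-00421 is assigned to PROJ-ML-117, not EMP-12."
print(find_entities(text, config))
# [('EMPLOYEE_ID', 'EMP-00421'), ('PROJECT_CODE', 'PROJ-ML-117')]
```

`EMP-12` is skipped because the pattern requires at least five digits, which shows how the quantifiers keep short look-alikes from being masked.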
Vault Backends
Tokens are stored in a vault so they can be restored later. Three backends are available:
# In-memory (default) — fast, lost on restart
vault: memory
# SQLite — persists to disk
vault: sqlite
# Redis — shared across processes
vault: redis
For Redis:
pip install privacylens[redis]
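The difference between the in-memory and SQLite backends comes down to where the token-to-value mapping lives. The sketch below illustrates this with two hypothetical classes sharing a `put`/`get` interface. The class names, method names, and table schema are assumptions for illustration, not PrivacyLens's API.

```python
import sqlite3
from typing import Optional


class MemoryVault:
    """Keeps the mapping in a dict; contents vanish when the process exits."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def put(self, token: str, value: str) -> None:
        self._data[token] = value

    def get(self, token: str) -> Optional[str]:
        return self._data.get(token)


class SqliteVault:
    """Keeps the mapping in a SQLite table so it can survive restarts."""

    def __init__(self, path: str) -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS vault (token TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

    def put(self, token: str, value: str) -> None:
        # Using the connection as a context manager commits on success.
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO vault (token, value) VALUES (?, ?)",
                (token, value),
            )

    def get(self, token: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT value FROM vault WHERE token = ?", (token,)
        ).fetchone()
        return row[0] if row else None


# ":memory:" keeps this demo self-contained; pass a file path to persist to disk.
for vault in (MemoryVault(), SqliteVault(":memory:")):
    vault.put("[EMAIL_1]", "john@example.com")
    print(type(vault).__name__, vault.get("[EMAIL_1]"))
```

Because both classes expose the same two methods, the rest of a pipeline would not need to know which one is in use.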
Inspect Without Masking
See what would be detected without actually masking anything:
from privacylens import inspect
entities = inspect("Contact john@example.com or call 555-123-4567")
for entity in entities:
    print(f"{entity.entity_type}: '{entity.value}' at [{entity.start}:{entity.end}]")
# EMAIL: 'john@example.com' at [8:24]
# PHONE: '555-123-4567' at [33:45]
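The offsets in that output are ordinary Python slice indices into the input string. A minimal sketch of span-based detection reproduces them, using simplified patterns and a hypothetical `Entity` dataclass whose field names mirror the attributes used above. It is an illustration, not the library's detector.

```python
import re
from dataclasses import dataclass

# Simplified patterns for illustration; real detectors would need to be stricter.
PATTERNS = {
    "EMAIL": re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


@dataclass(frozen=True)
class Entity:
    entity_type: str
    value: str
    start: int
    end: int


def detect(text: str) -> list[Entity]:
    """Scan with every pattern and return matches sorted by position."""
    entities = [
        Entity(entity_type, m.group(0), m.start(), m.end())
        for entity_type, pattern in PATTERNS.items()
        for m in pattern.finditer(text)
    ]
    return sorted(entities, key=lambda e: e.start)


text = "Contact john@example.com or call 555-123-4567"
for e in detect(text):
    print(f"{e.entity_type}: {e.value!r} at [{e.start}:{e.end}]")
    assert text[e.start:e.end] == e.value  # each span slices back to its match
# EMAIL: 'john@example.com' at [8:24]
# PHONE: '555-123-4567' at [33:45]
```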
Low-Level API
For full control over the pipeline:
from privacylens.core.pipeline import Pipeline
from privacylens.core.config import load_config
config = load_config()
pipeline = Pipeline(config)
# Tokenize
messages = [{"role": "user", "content": "Email john@example.com"}]
tokenized = pipeline.tokenize_messages(messages, session_id="s1")
# Send to LLM (tokenized messages have PII replaced)
llm_response = call_your_llm(tokenized)
# Detokenize
restored = pipeline.detokenize(llm_response, session_id="s1")
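The `session_id` argument ties a tokenize call to its matching detokenize call. One plausible reason, sketched below with a hypothetical `SessionTokenizer` class (not the library's `Pipeline`), is that token numbering is kept per session, so `[EMAIL_1]` can stand for different addresses in different conversations. The scoping behavior shown is an assumption, not confirmed by the source.

```python
import re
from collections import defaultdict

EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


class SessionTokenizer:
    """Keeps a separate token-to-value mapping for each session id."""

    def __init__(self) -> None:
        self._sessions: dict[str, dict[str, str]] = defaultdict(dict)

    def tokenize_messages(self, messages: list[dict], session_id: str) -> list[dict]:
        mapping = self._sessions[session_id]

        def substitute(match: re.Match) -> str:
            value = match.group(0)
            for token, known in mapping.items():
                if known == value:
                    return token  # reuse the token within the same session
            token = f"[EMAIL_{len(mapping) + 1}]"
            mapping[token] = value
            return token

        # Build new dicts so the caller's messages are left unmodified.
        return [{**m, "content": EMAIL_RE.sub(substitute, m["content"])} for m in messages]

    def detokenize(self, text: str, session_id: str) -> str:
        for token, value in self._sessions[session_id].items():
            text = text.replace(token, value)
        return text


tok = SessionTokenizer()
a = tok.tokenize_messages([{"role": "user", "content": "Email john@example.com"}], "s1")
b = tok.tokenize_messages([{"role": "user", "content": "Email jane@example.org"}], "s2")
print(a[0]["content"], "|", b[0]["content"])      # Email [EMAIL_1] | Email [EMAIL_1]
print(tok.detokenize("Sent to [EMAIL_1]", "s1"))  # Sent to john@example.com
print(tok.detokenize("Sent to [EMAIL_1]", "s2"))  # Sent to jane@example.org
```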
Links
- GitHub: github.com/Madan2248c/privacylens
- TypeScript SDK: npm install privacylens (docs)
- Contributing: CONTRIBUTING.md
License
MIT © 2026 Madan Gopal