LangChain Blindfold
LangChain integration for Blindfold PII detection and protection. Tokenize PII before it reaches your LLM, then restore originals in the response.
| Developed by | Blindfold |
| License | MIT |
| Input/Output | String, Document |
Installation
pip install langchain-blindfold
Set your Blindfold API key:
export BLINDFOLD_API_KEY=your-api-key
Get a free API key at app.blindfold.dev.
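The components in this README accept `api_key=None` and fall back to the environment variable. A minimal sketch of that fallback pattern, using only the standard library, can help you fail fast when the key is missing. `resolve_api_key` is a hypothetical helper written for illustration; it is not part of langchain-blindfold, and the package's own lookup logic may differ.

```python
import os


def resolve_api_key(api_key=None):
    """Return an explicit key if given, else fall back to BLINDFOLD_API_KEY.

    Hypothetical helper: it only illustrates the fallback order.
    """
    key = api_key or os.environ.get("BLINDFOLD_API_KEY")
    if not key:
        raise RuntimeError(
            "No API key found: pass api_key=... or export BLINDFOLD_API_KEY"
        )
    return key
```

Calling this once at startup surfaces a missing key before any chain is built.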
Quick Start
Protect a LangChain Chain
from langchain_blindfold import blindfold_protect
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
tokenize, detokenize = blindfold_protect(policy="basic")
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}"),
])
llm = ChatOpenAI(model="gpt-4o-mini")
chain = tokenize | prompt | llm | (lambda msg: msg.content) | detokenize
# PII is tokenized before the LLM sees it, then restored in the response
result = chain.invoke("Write a follow-up email to John Doe at john@example.com")
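To see what each stage of the chain does without calling any API, here is a minimal offline sketch. Plain functions stand in for the real components: `toy_tokenize`, `toy_detokenize`, and `fake_llm` are hypothetical, the email-only regex is far cruder than Blindfold's detection, and the token format simply mimics the `<Email Address_1>` style used elsewhere in this README.

```python
import re

# Deliberately simple email pattern, for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def toy_tokenize(text, mapping):
    """Replace each distinct email with a numbered token; record it in mapping."""
    def _sub(match):
        original = match.group(0)
        for token, value in mapping.items():
            if value == original:
                return token  # reuse the token for a repeated value
        token = f"<Email Address_{len(mapping) + 1}>"
        mapping[token] = original
        return token
    return EMAIL_RE.sub(_sub, text)


def toy_detokenize(text, mapping):
    """Swap every known token back to its original value."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


def fake_llm(prompt):
    """Stand-in for the model: it only ever sees the tokenized prompt."""
    return "Sure. Here is a draft for: " + prompt


mapping = {}
safe_prompt = toy_tokenize("Write a follow-up email to john@example.com", mapping)
reply = fake_llm(safe_prompt)             # contains the token, not the address
result = toy_detokenize(reply, mapping)   # address restored for the end user
```

The point of the pattern is that the mapping never leaves your side of the call: the model works with placeholders, and the originals are reinserted only in the final output.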
Transform Documents for RAG
from langchain_blindfold import BlindfoldPIITransformer
from langchain_core.documents import Document
transformer = BlindfoldPIITransformer(pii_method="redact", policy="hipaa_us", region="us")
docs = [Document(page_content="Patient John Smith, SSN 123-45-6789")]
safe_docs = transformer.transform_documents(docs)
# safe_docs[0].page_content → "Patient [REDACTED], SSN [REDACTED]"
Components
blindfold_protect()
Convenience function that returns a paired tokenizer and detokenizer:
tokenize, detokenize = blindfold_protect(
    api_key=None,          # Falls back to BLINDFOLD_API_KEY env var
    region=None,           # "eu" or "us" for data residency
    policy="basic",        # Detection policy
    entities=None,         # Specific entity types to detect
    score_threshold=None,  # Confidence threshold (0.0-1.0)
)
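One way to picture how `entities` and `score_threshold` narrow the results is as a filter over detections. The sketch below is an assumption about the semantics (a detection is kept only if its type is requested and its confidence meets the threshold, inclusive); the `Detection` record, the `filter_detections` helper, and the `"Location"` type name are hypothetical and not part of the package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    entity_type: str  # e.g. "Person", "Email Address"
    text: str
    score: float      # detector confidence in [0.0, 1.0]


def filter_detections(detections, entities=None, score_threshold=None):
    """Keep detections matching the requested types and confidence floor.

    None means "no restriction", mirroring the defaults shown above.
    """
    kept = []
    for d in detections:
        if entities is not None and d.entity_type not in entities:
            continue
        if score_threshold is not None and d.score < score_threshold:
            continue
        kept.append(d)
    return kept


found = [
    Detection("Person", "John Doe", 0.95),
    Detection("Email Address", "john@example.com", 0.99),
    Detection("Location", "Reading", 0.40),
]
high_confidence = filter_detections(found, score_threshold=0.5)
emails_only = filter_detections(found, entities=["Email Address"])
```

Under this reading, raising the threshold trades recall for precision: the ambiguous "Reading" detection is dropped, at the risk of missing a real location.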
BlindfoldTokenizer
A LangChain Runnable that tokenizes PII in text:
from langchain_blindfold import BlindfoldTokenizer
tokenizer = BlindfoldTokenizer(policy="gdpr_eu", region="eu")
safe_text = tokenizer.invoke("Contact Hans at hans@example.de")
# → "Contact <Person_1> at <Email Address_1>"
BlindfoldDetokenizer
A LangChain Runnable that restores original PII from tokenized text:
from langchain_blindfold import BlindfoldTokenizer, BlindfoldDetokenizer
tokenizer = BlindfoldTokenizer(api_key="...")
detokenizer = BlindfoldDetokenizer(tokenizer=tokenizer)
tokenizer.invoke("Hi John") # stores mapping
result = detokenizer.invoke("Response to <Person_1>")
# → "Response to John"
BlindfoldPIITransformer
A LangChain DocumentTransformer for protecting PII in documents:
from langchain_blindfold import BlindfoldPIITransformer
transformer = BlindfoldPIITransformer(
    api_key=None,           # Falls back to BLINDFOLD_API_KEY env var
    region=None,            # "eu" or "us" for data residency
    policy="basic",         # Detection policy
    pii_method="tokenize",  # tokenize, redact, mask, hash, synthesize, encrypt
    entities=None,          # Specific entity types to detect
    score_threshold=None,   # Confidence threshold (0.0-1.0)
)
When pii_method="tokenize", the mapping is stored in doc.metadata["blindfold_mapping"].
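Because the mapping travels with each document, originals can be restored later without the transformer instance. A minimal sketch, assuming the mapping is a dict from token to original value (this README does not specify its exact shape); plain dicts stand in for `Document` objects, and `restore_page_content` is a hypothetical helper.

```python
def restore_page_content(page_content, metadata):
    """Rebuild original text from tokenized content and its stored mapping."""
    # Assumed shape: {"<Person_1>": "John Smith", ...}
    mapping = metadata.get("blindfold_mapping", {})
    for token, original in mapping.items():
        page_content = page_content.replace(token, original)
    return page_content


# A dict standing in for a tokenized Document.
doc = {
    "page_content": "Patient <Person_1>, contact <Email Address_1>",
    "metadata": {
        "blindfold_mapping": {
            "<Person_1>": "John Smith",
            "<Email Address_1>": "john.smith@example.com",
        }
    },
}
restored = restore_page_content(doc["page_content"], doc["metadata"])
```

Since the mapping contains the original PII, treat document metadata as sensitive wherever it is stored or logged.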
Policies
| Policy | Entities | Best For |
|---|---|---|
| basic | Names, emails, phones, locations | General PII protection |
| gdpr_eu | EU-specific: IBANs, addresses, dates of birth | GDPR compliance |
| hipaa_us | PHI: SSNs, MRNs, medical terms | HIPAA compliance |
| pci_dss | Card numbers, CVVs, expiry dates | PCI DSS compliance |
| strict | All entity types, lower threshold | Maximum detection |
PII Methods
| Method | Output | Reversible |
|---|---|---|
| tokenize | <Person_1>, <Email Address_1> | Yes |
| redact | PII removed entirely | No |
| mask | J****oe, j****om | No |
| hash | HASH_abc123 | No |
| synthesize | Jane Smith, jane@example.org | No |
| encrypt | AES-256 encrypted value | Yes (with key) |
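To make two of the irreversible methods concrete, here is a local sketch that reproduces the shape of the `mask` and `hash` outputs in the table. The rules are inferred from the examples alone (first character, four asterisks, last two characters; `HASH_` plus a short hex digest) and are assumptions, as is the choice of SHA-256. Blindfold's actual algorithms may differ.

```python
import hashlib


def mask_value(value):
    """Keep the first character and last two, hiding the rest behind ****."""
    if len(value) <= 3:
        return "*" * len(value)  # too short to reveal any characters
    return f"{value[0]}****{value[-2:]}"


def hash_value(value, length=6):
    """Return a deterministic, non-reversible placeholder for a value."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return f"HASH_{digest[:length]}"
```

Masking keeps a hint of the original for human readers, while hashing yields the same placeholder for the same input, which lets records be joined without exposing the value.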
Data Residency
Use the region parameter to ensure PII is processed in a specific jurisdiction:
- region="eu": processed in Frankfurt, Germany
- region="us": processed in Virginia, US
tokenize, detokenize = blindfold_protect(policy="gdpr_eu", region="eu")