LangChain Blindfold

LangChain integration for Blindfold PII detection and protection. Tokenize PII before it reaches your LLM, then restore originals in the response.

Developed by: Blindfold
License: MIT
Input/Output: String, Document

Installation

pip install langchain-blindfold

Set your Blindfold API key:

export BLINDFOLD_API_KEY=your-api-key

Get a free API key at app.blindfold.dev.
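
Because the key is read from the environment, it can help to fail early with a clear message when it is missing. The following is a minimal sketch using only the standard library; the helper name require_blindfold_key is hypothetical and not part of langchain-blindfold:

```python
import os
from typing import Mapping, Optional


def require_blindfold_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return BLINDFOLD_API_KEY or raise a clear error (hypothetical helper)."""
    env = os.environ if env is None else env
    key = env.get("BLINDFOLD_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "BLINDFOLD_API_KEY is not set; export it before building the chain."
        )
    return key


# Demonstration with an explicit mapping instead of the real environment.
print(require_blindfold_key({"BLINDFOLD_API_KEY": "your-api-key"}))
```

Calling require_blindfold_key() with no argument checks the real process environment.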

Quick Start

Protect a LangChain Chain

from langchain_blindfold import blindfold_protect
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

tokenize, detokenize = blindfold_protect(policy="basic")

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}"),
])
llm = ChatOpenAI(model="gpt-4o-mini")

chain = tokenize | prompt | llm | (lambda msg: msg.content) | detokenize

# PII is tokenized before the LLM sees it, then restored in the response
result = chain.invoke("Write a follow-up email to John Doe at john@example.com")
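
To make the data flow concrete, here is an offline sketch of the same tokenize, model, detokenize order. Everything in it is a local stand-in: fake_tokenize, fake_llm, and fake_detokenize are hypothetical names, detection is a simple email regex, and none of it reflects how Blindfold actually detects PII. Only the token style (<Email Address_1>) is taken from this README.

```python
import re
from typing import Dict, Tuple

# Deliberately simple pattern; real PII detection is far broader than this.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def fake_tokenize(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace each distinct email with a numbered token; return text and mapping."""
    mapping: Dict[str, str] = {}  # token -> original value
    seen: Dict[str, str] = {}     # original value -> token

    def repl(match: re.Match) -> str:
        value = match.group(0)
        if value not in seen:
            token = f"<Email Address_{len(seen) + 1}>"
            seen[value] = token
            mapping[token] = value
        return seen[value]

    return EMAIL_RE.sub(repl, text), mapping


def fake_llm(prompt: str) -> str:
    """Stand-in for the model call: it only ever receives tokenized text."""
    return f"Draft reply regarding: {prompt}"


def fake_detokenize(text: str, mapping: Dict[str, str]) -> str:
    """Swap tokens in the model output back to the original values."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


safe_prompt, mapping = fake_tokenize("Email john@example.com today")
reply = fake_llm(safe_prompt)
print(safe_prompt)                      # the model input contains no address
print(fake_detokenize(reply, mapping))  # the address is restored afterwards
```

The point of the ordering is that the original value exists only before the first step and after the last one; the model in the middle sees the token.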

Transform Documents for RAG

from langchain_blindfold import BlindfoldPIITransformer
from langchain_core.documents import Document

transformer = BlindfoldPIITransformer(pii_method="redact", policy="hipaa_us", region="us")

docs = [Document(page_content="Patient John Smith, SSN 123-45-6789")]
safe_docs = transformer.transform_documents(docs)
# safe_docs[0].page_content → "Patient [REDACTED], SSN [REDACTED]"

Components

blindfold_protect()

Convenience function that returns a paired tokenizer and detokenizer:

tokenize, detokenize = blindfold_protect(
    api_key=None,         # Falls back to BLINDFOLD_API_KEY env var
    region=None,          # "eu" or "us" for data residency
    policy="basic",       # Detection policy
    entities=None,        # Specific entity types to detect
    score_threshold=None, # Confidence threshold (0.0-1.0)
)
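
As a rough mental model for entities and score_threshold, the sketch below filters a list of detections by type and confidence. The Detection record, the entity type names, and the inclusive comparison are all assumptions made for illustration; this README does not describe Blindfold's response format or its exact threshold semantics.

```python
from typing import Iterable, List, NamedTuple, Optional


class Detection(NamedTuple):
    """Hypothetical detection record; not Blindfold's response format."""
    entity_type: str
    text: str
    score: float


def filter_detections(
    detections: Iterable[Detection],
    entities: Optional[List[str]] = None,
    score_threshold: Optional[float] = None,
) -> List[Detection]:
    """Keep detections of the requested types at or above the threshold."""
    kept = []
    for d in detections:
        if entities is not None and d.entity_type not in entities:
            continue  # type was not requested
        if score_threshold is not None and d.score < score_threshold:
            continue  # confidence too low
        kept.append(d)
    return kept


found = [
    Detection("person", "John Doe", 0.95),
    Detection("email address", "john@example.com", 0.99),
    Detection("person", "May", 0.40),  # low confidence, possibly a false positive
]
print(filter_detections(found, entities=["person"], score_threshold=0.5))
```

Leaving both parameters as None keeps every detection, which matches the defaults shown above.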

BlindfoldTokenizer

A LangChain Runnable that tokenizes PII in text:

from langchain_blindfold import BlindfoldTokenizer

tokenizer = BlindfoldTokenizer(policy="gdpr_eu", region="eu")
safe_text = tokenizer.invoke("Contact Hans at hans@example.de")
# → "Contact <Person_1> at <Email Address_1>"

BlindfoldDetokenizer

A LangChain Runnable that restores original PII from tokenized text:

from langchain_blindfold import BlindfoldTokenizer, BlindfoldDetokenizer

tokenizer = BlindfoldTokenizer(api_key="...")
detokenizer = BlindfoldDetokenizer(tokenizer=tokenizer)

tokenizer.invoke("Hi John")  # stores mapping
result = detokenizer.invoke("Response to <Person_1>")
# → "Response to John"

BlindfoldPIITransformer

A LangChain DocumentTransformer for protecting PII in documents:

from langchain_blindfold import BlindfoldPIITransformer

transformer = BlindfoldPIITransformer(
    api_key=None,           # Falls back to BLINDFOLD_API_KEY env var
    region=None,            # "eu" or "us" for data residency
    policy="basic",         # Detection policy
    pii_method="tokenize",  # tokenize, redact, mask, hash, synthesize, encrypt
    entities=None,          # Specific entity types to detect
    score_threshold=None,   # Confidence threshold (0.0-1.0)
)

When pii_method="tokenize", the mapping is stored in doc.metadata["blindfold_mapping"].
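
As an illustration of using that stored mapping, the sketch below restores the original text of a tokenized document. It assumes the mapping is a dict from token to original value, which this README does not specify, and it uses a small stand-in dataclass instead of LangChain's Document so that it runs without dependencies. restore_page_content is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Doc:
    """Minimal stand-in for langchain_core.documents.Document."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def restore_page_content(doc: Doc) -> str:
    """Replace tokens using doc.metadata['blindfold_mapping'].

    Assumes the mapping has the shape {token: original_value}.
    """
    mapping = doc.metadata.get("blindfold_mapping", {})
    text = doc.page_content
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


doc = Doc(
    page_content="Patient <Person_1> called",
    metadata={"blindfold_mapping": {"<Person_1>": "John Smith"}},
)
print(restore_page_content(doc))
```

If the real mapping turns out to have a different shape, only the loop inside restore_page_content would need to change.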

Policies

| Policy   | Entities                                      | Best For               |
|----------|-----------------------------------------------|------------------------|
| basic    | Names, emails, phones, locations              | General PII protection |
| gdpr_eu  | EU-specific: IBANs, addresses, dates of birth | GDPR compliance        |
| hipaa_us | PHI: SSNs, MRNs, medical terms                | HIPAA compliance       |
| pci_dss  | Card numbers, CVVs, expiry dates              | PCI DSS compliance     |
| strict   | All entity types, lower threshold             | Maximum detection      |

PII Methods

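| Method     | Output                        | Reversible     |
|------------|-------------------------------|----------------|
| tokenize   | <Person_1>, <Email Address_1> | Yes            |
| redact     | PII removed entirely          | No             |
| mask       | J****oe, j****om              | No             |
| hash       | HASH_abc123                   | No             |
| synthesize | Jane Smith, jane@example.org  | No             |
| encrypt    | AES-256 encrypted value       | Yes (with key) |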
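
To show why mask and hash are not reversible, here is a local sketch that reproduces the output shapes listed for those two methods. The exact masking and hashing rules Blindfold applies are not documented in this README, so mask and hash_token below are hypothetical stand-ins, not the service's implementation.

```python
import hashlib


def mask(value: str) -> str:
    """Keep the first character and the last two, hiding everything between."""
    if len(value) <= 3:
        return "*" * len(value)  # too short to reveal any characters safely
    return f"{value[0]}****{value[-2:]}"


def hash_token(value: str) -> str:
    """Deterministic one-way label built from a truncated SHA-256 digest."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return f"HASH_{digest[:6]}"


print(mask("John Doe"))          # J****oe
print(mask("john@example.com"))  # j****om
# Same input always yields the same label, but the label cannot be inverted.
print(hash_token("John Doe") == hash_token("John Doe"))  # True
```

Both functions discard information, so the original value cannot be rebuilt from the output. Tokenization differs because a mapping from token to original is kept alongside the text.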

Data Residency

Use the region parameter to ensure PII is processed in a specific jurisdiction:

  • region="eu": processed in Frankfurt, Germany
  • region="us": processed in Virginia, US

tokenize, detokenize = blindfold_protect(policy="gdpr_eu", region="eu")
