Python SDK for Blindfold Gateway - Privacy API for AI


Blindfold Python SDK

Detect, redact, tokenize, and mask PII in Python. 80+ entity types, 30+ countries, works offline with zero dependencies.


Why Blindfold?

  • Works offline, zero dependencies — No API key needed for local detection. No network calls. No external packages.
  • 80+ PII entity types across 30+ countries with checksum validation (Luhn, IBAN mod-97, Verhoeff, etc.)
  • 85x faster than Presidio — 0.4s vs 34s on 3,000 samples (benchmark)
  • Higher accuracy — F1 58.6% vs Presidio 38.8% on AI4Privacy multilingual benchmark
  • 8 operations: detect, redact, tokenize, detokenize, mask, hash, encrypt, synthesize
  • Compliance-ready — Built-in GDPR, HIPAA, PCI-DSS policies
  • Optional NLP upgrade — Add API key to detect names, addresses, organizations (60+ additional entities)
  • Batch processing, async support, typed errors
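Checksum validation is what lets a pattern-based detector reject digit runs that only look like identifiers. As an illustration, here is the textbook Luhn algorithm in plain Python. This is a sketch, not Blindfold's internal code, and `luhn_valid` is a hypothetical helper name:

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_valid("4532015112830366"))  # True
print(luhn_valid("4532015112830367"))  # False
```

A 16-digit number that fails this check can be discarded before it is ever reported as a credit card, which keeps false positives down without any NLP model.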

Quick Comparison

Feature | Blindfold | Presidio | regex-only
Entity types (local) | 80+ | ~20 | Custom
Countries | 30+ | ~5 | Custom
Checksum validation | Luhn, mod-97, Verhoeff, ... | Partial | No
Speed (3K samples) | 0.4s | 34s | Varies
Zero dependencies | Yes | No (spaCy) | Yes
NLP upgrade path | Yes (API) | Yes (built-in) | No
Tokenize/detokenize | Yes | No | No

Common Use Cases

  • Sanitize LLM prompts — Strip PII before sending to OpenAI, Anthropic, etc.
  • PII-safe RAG pipelines — Redact before embedding, restore after retrieval
  • Log scrubbing — Anonymize data in logs and data pipelines
  • GDPR/HIPAA compliance — Built-in policies for AI applications
  • Synthetic test data — Format-preserving fake data generation

Install

pip install blindfold-sdk

Quick Start (no API key needed)

from blindfold import Blindfold

client = Blindfold()

# Detect PII locally — no API key, no network call
result = client.detect("Email john@acme.com, SSN 123-45-6789")
for entity in result.detected_entities:
    print(f"{entity.type}: {entity.text} (score: {entity.score})")
# Email Address: john@acme.com (score: 0.95)
# Social Security Number: 123-45-6789 (score: 1.0)

# Redact PII locally
result = client.redact("Email john@acme.com, SSN 123-45-6789")
print(result.text)
# "Email, SSN"

Protect AI Prompts

Tokenize PII before sending to any LLM. The AI never sees real data.

OpenAI

from blindfold import Blindfold
from openai import OpenAI

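bf = Blindfold()  # Free local mode; names need an API key: Blindfold(api_key="sk-...")
openai_client = OpenAI()

# 1. Tokenize PII
safe = bf.tokenize("My name is John Smith, email john@acme.com")
# With an API key, safe.text → "My name is <Person_1>, email <Email Address_1>"
# In local (regex) mode only the email is tokenized; the name is left as-is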

# 2. Send to GPT — PII never reaches OpenAI
response = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": safe.text}]
)

# 3. Restore original data
result = bf.detokenize(response.choices[0].message.content, safe.mapping)
print(result.text)

Anthropic Claude

from blindfold import Blindfold
import anthropic

bf = Blindfold()
client = anthropic.Anthropic()

safe = bf.tokenize("My name is John Smith, email john@acme.com")

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": safe.text}]
)

result = bf.detokenize(response.content[0].text, safe.mapping)
print(result.text)

Works with any AI provider: OpenAI, Anthropic Claude, Google Gemini, AWS Bedrock, Azure OpenAI, LangChain, LlamaIndex, Vercel AI SDK, CrewAI — see all integrations.
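The tokenize, call, detokenize pattern is the same for every provider, so it can be factored into one helper. The sketch below assumes only the `tokenize()` / `detokenize()` calls and the `.text` / `.mapping` attributes used in the examples above. `ask_with_tokenized_pii` and the `call_llm` callback are hypothetical names, not part of the SDK:

```python
from typing import Callable


def ask_with_tokenized_pii(bf, prompt: str, call_llm: Callable[[str], str]) -> str:
    """Tokenize `prompt`, pass the safe text to `call_llm`, then restore tokens.

    `bf` is any object exposing tokenize() and detokenize() as used above;
    `call_llm` is a hypothetical callback wrapping whichever provider you use.
    """
    safe = bf.tokenize(prompt)
    reply = call_llm(safe.text)  # the provider receives the tokenized text only
    return bf.detokenize(reply, safe.mapping).text
```

With the OpenAI client from the earlier example, `call_llm` could be a small function that calls `chat.completions.create` and returns `response.choices[0].message.content`.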

Upgrade to Blindfold API (optional)

For names, addresses, organizations, and 60+ additional entity types, add your API key:

  1. Sign up at blindfold.dev
  2. Get your API key at app.blindfold.dev/api-keys
  3. Set environment variable: BLINDFOLD_API_KEY=sk-***

# With an API key, the client auto-switches to the NLP-powered API
client = Blindfold(api_key="sk-...")
result = client.detect("John Smith lives at 123 Oak Street")
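If you would rather not hard-code the key, you can resolve it from the environment yourself and fall back to local mode when it is absent. This is a minimal sketch: `blindfold_kwargs_from_env` is a hypothetical helper, and how the SDK itself reads `BLINDFOLD_API_KEY` is not shown here.

```python
import os


def blindfold_kwargs_from_env(env=os.environ) -> dict:
    """Build constructor kwargs: API mode if a key is set, local mode otherwise."""
    api_key = env.get("BLINDFOLD_API_KEY", "").strip()
    return {"api_key": api_key} if api_key else {}


# Usage (requires the SDK): client = Blindfold(**blindfold_kwargs_from_env())
print(blindfold_kwargs_from_env({"BLINDFOLD_API_KEY": "sk-example"}))  # {'api_key': 'sk-example'}
print(blindfold_kwargs_from_env({}))                                   # {}
```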

Initialization

from blindfold import Blindfold

# Local mode (no API key) — regex-based detection
client = Blindfold()

# API mode (with API key) — NLP-powered detection
client = Blindfold(api_key="sk-...")

# Force local mode even with an API key (useful for latency-critical paths)
client = Blindfold(api_key="sk-...", mode="local")

Operations

Tokenize (Reversible)

Replace sensitive data with reversible tokens (e.g., <Person_1>).

response = client.tokenize(
    text="Contact John Doe at john@example.com",
    policy="gdpr_eu",  # Optional: 'hipaa_us', 'basic', 'pci_dss', 'strict'
    entities=["person", "email address"],  # Optional: filter entities
    score_threshold=0.4  # Optional: confidence threshold
)

print(response.text)
# "Contact <Person_1> at <Email Address_1>"

print(response.mapping)
# { "<Person_1>": "John Doe", "<Email Address_1>": "john@example.com" }
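To make the token/mapping relationship concrete, here is a standalone sketch of reversible tokenization for a single entity type. It is not the SDK's implementation: the email pattern is deliberately simple and `tokenize_emails` is a hypothetical name. It only shows how numbered tokens and the mapping are built together.

```python
import re

# Deliberately simple pattern for illustration; real detectors are stricter.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def tokenize_emails(text: str) -> tuple[str, dict[str, str]]:
    """Replace each distinct email with a numbered token and return the mapping."""
    mapping: dict[str, str] = {}  # token -> original value
    seen: dict[str, str] = {}     # original value -> token

    def substitute(match: re.Match) -> str:
        value = match.group(0)
        if value not in seen:
            token = f"<Email Address_{len(seen) + 1}>"
            seen[value] = token
            mapping[token] = value
        return seen[value]

    return EMAIL_RE.sub(substitute, text), mapping


safe_text, mapping = tokenize_emails("Contact John Doe at john@example.com")
print(safe_text)  # Contact John Doe at <Email Address_1>
print(mapping)    # {'<Email Address_1>': 'john@example.com'}
```

Reusing the same token for repeated values keeps the text coherent for the model: every mention of one address maps to one placeholder.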

Detokenize

Restore original values from tokens. Runs client-side — no API call.

original = client.detokenize(
    text="AI response for <Person_1>",
    mapping=response.mapping
)
print(original.text)
# "AI response for John Doe"
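Because the mapping holds everything needed, restoring values is plain string substitution, which is why no API call is involved. A sketch of the idea, assuming a mapping shaped like the one shown under Tokenize (`restore_tokens` is a hypothetical name, not the SDK function):

```python
import re


def restore_tokens(text: str, mapping: dict[str, str]) -> str:
    """Replace every token in `text` with its original value from `mapping`."""
    if not mapping:
        return text
    # One pass over the text, so restored values are never re-scanned for tokens.
    tokens = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(token) for token in tokens))
    return pattern.sub(lambda match: mapping[match.group(0)], text)


mapping = {"<Person_1>": "John Doe", "<Email Address_1>": "john@example.com"}
print(restore_tokens("AI response for <Person_1>", mapping))
# AI response for John Doe
```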

Redact

Permanently remove sensitive data.

response = client.redact("My password is secret123")

Mask

Partially hide sensitive data (e.g., ****-****-****-1234).

response = client.mask(
    text="Credit card: 4532-7562-9102-3456",
    masking_char="*",
    chars_to_show=4,
    from_end=True
)
print(response.text)
# "Credit card: ***************3456"
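The three masking parameters interact as follows; the sketch mirrors the parameter names from the call above but is a standalone illustration rather than the SDK's code, and it masks a single already-detected value instead of a whole text:

```python
def mask_value(value: str, masking_char: str = "*", chars_to_show: int = 4,
               from_end: bool = True) -> str:
    """Hide all but `chars_to_show` characters at one end of `value`."""
    if chars_to_show <= 0:
        return masking_char * len(value)
    if chars_to_show >= len(value):
        return value
    hidden = masking_char * (len(value) - chars_to_show)
    if from_end:
        return hidden + value[-chars_to_show:]
    return value[:chars_to_show] + hidden


print(mask_value("4532-7562-9102-3456"))                  # ***************3456
print(mask_value("4532-7562-9102-3456", from_end=False))  # 4532***************
```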

Hash

Replace data with deterministic hashes (useful for analytics/matching).

response = client.hash(
    text="User ID: 12345",
    hash_type="sha256",
    hash_prefix="ID_"
)
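Deterministic means the same input always produces the same replacement, so records can still be joined after hashing. The sketch below shows that property with the standard library. The output format (prefix plus a truncated hex digest) is an assumption for illustration, not the SDK's documented format:

```python
import hashlib


def hash_value(value: str, hash_prefix: str = "ID_", length: int = 16) -> str:
    """Return a deterministic, prefixed SHA-256 digest of `value`.

    Truncating to `length` hex characters is an illustrative choice.
    """
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return f"{hash_prefix}{digest[:length]}"


a = hash_value("12345")
b = hash_value("12345")
print(a == b)                    # True: same input, same replacement
print(a == hash_value("12346"))  # False
```

Note that unsalted hashes of short or guessable values (such as numeric IDs) can be reversed by brute force, so treat them as pseudonymization rather than anonymization.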

Encrypt

Encrypt sensitive data using AES (reversible with key).

response = client.encrypt(
    text="Secret message",
    encryption_key="your-secure-key-min-16-chars"
)

Synthesize

Replace real data with realistic fake data. Works offline with format-preserving generation.

# Works offline — no API key required
client = Blindfold()
response = client.synthesize("Email john@acme.com, SSN 123-45-6789")
print(response.text)
# "Email user3a9f1b2c@example.com, SSN 847-29-3156"

# With an API key: NLP-powered synthesis (names, addresses, etc.)
client = Blindfold(api_key="sk-...")
response = client.synthesize("John lives in New York", language="en")
print(response.text)
# e.g. "Michael lives in Boston"

Batch Processing

Process multiple texts in a single request (max 100 texts):

result = client.tokenize_batch(
    ["Contact John Doe", "jane@example.com", "No PII here"],
    policy="gdpr_eu"
)

print(result.total)       # 3
print(result.succeeded)   # 3
print(result.failed)      # 0

for item in result.results:
    print(item["text"])

Every operation except detokenize has a batch variant: tokenize_batch, detect_batch, redact_batch, mask_batch, synthesize_batch, hash_batch, encrypt_batch.
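Given the limit of 100 texts per request, larger inputs need to be split before calling a batch method. A small, SDK-independent sketch (`chunked` and `MAX_BATCH_SIZE` are hypothetical names):

```python
from typing import Iterator, Sequence

MAX_BATCH_SIZE = 100  # per-request limit stated above


def chunked(texts: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `texts`, each at most `size` items long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


texts = [f"message {i}" for i in range(250)]
print([len(batch) for batch in chunked(texts)])  # [100, 100, 50]
```

Each slice can then be passed to a batch call such as `client.tokenize_batch(list(batch))`, collecting `result.results` across iterations.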

Async Usage

import asyncio
from blindfold import AsyncBlindfold

async def main():
    async with AsyncBlindfold(api_key="...") as client:
        response = await client.tokenize("Hello John")
        print(response.text)

        # detokenize is synchronous — no await needed
        original = client.detokenize(response.text, response.mapping)
        print(original.text)

asyncio.run(main())
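The async client pays off when many texts are processed concurrently. The sketch below bounds the number of in-flight calls with a semaphore. It assumes only an awaitable `tokenize(text)` method, as in the example above; `tokenize_concurrently` and `max_in_flight` are hypothetical names:

```python
import asyncio


async def tokenize_concurrently(client, texts, max_in_flight: int = 5):
    """Tokenize `texts` concurrently, keeping at most `max_in_flight` calls open.

    `client` is any object with an awaitable tokenize(text) method.
    Results are returned in the same order as `texts`.
    """
    semaphore = asyncio.Semaphore(max_in_flight)

    async def tokenize_one(text):
        async with semaphore:
            return await client.tokenize(text)

    return await asyncio.gather(*(tokenize_one(text) for text in texts))
```

Inside `async with AsyncBlindfold(...) as client:` this could be called as `responses = await tokenize_concurrently(client, texts)`.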

Local PII Scanner

The built-in regex scanner works offline with zero dependencies. Use it directly for fine-grained control:

from blindfold.regex import PIIScanner, EntityType

# Default: US locale
scanner = PIIScanner()
matches = scanner.detect("Call me at john@acme.com or 555-867-5309")

for match in matches:
    print(f"{match.entity_type}: {match.text} (score: {match.score})")

# Redact PII
redacted_text, matches = scanner.redact("SSN 123-45-6789, CC 4532015112830366")
print(redacted_text)
# "SSN, CC"

Multi-locale support

# US + EU entities
scanner = PIIScanner(locales=["us", "eu"])
matches = scanner.detect("SSN 123-45-6789, IBAN DE89370400440532013000")

# UK entities
scanner = PIIScanner(locales=["uk"])
matches = scanner.detect("NI number: AB 12 34 56 A")

# All locales
scanner = PIIScanner(locales=["us", "eu", "uk"])

Filter by entity type

# Only detect emails and credit cards
scanner = PIIScanner(entities=[EntityType.EMAIL, EntityType.CREDIT_CARD])

Error Handling

from blindfold.errors import AuthenticationError, APIError, NetworkError

try:
    client.tokenize("...")
except AuthenticationError:
    # Handle invalid API key
    pass
except APIError as e:
    # Handle API error (e.g. validation)
    print(e)
except NetworkError:
    # Handle network issues
    pass

Supported local entity types (80+)

Entity Type | Locale | Validation
Email Address | Universal | RFC 5322 pattern
Credit Card Number | Universal | Luhn checksum
Phone Number | Universal | Format + digit count
IP Address (v4/v6) | Universal | Octet range
URL | Universal | TLD validation
MAC Address | Universal | Pattern
Date of Birth | Universal | Context-required
CVV/CVC | Universal | Context-required
Social Security Number | US | Format rules + context
Driver's License | US | Multi-state formats + context
US Passport | US | Context-required
Tax ID / EIN | US | Prefix validation + context
ZIP Code | US | Context-required + validator
US ITIN | US | Format validation
IBAN | EU | ISO 7064 mod-97 checksum
Postal Code | EU | DE/FR/NL patterns
VAT ID | EU | Country prefix + format
UK NI Number | UK | Format validation
UK NHS Number | UK | Modulus-11 checksum
UK Postcode | UK | Pattern
UK Passport | UK | Context-required
UK UTR | UK | Mod-11 checksum
German Personal ID | DE | Context-required
German Tax ID | DE | Check digit
French National ID (NIR) | FR | Check digit
French SIREN | FR | Luhn checksum
Spanish DNI | ES | Letter validation
Spanish NIE | ES | Letter validation
Spanish NSS | ES | Mod-97 checksum
Spanish CIF | ES | Custom checksum
Italian Codice Fiscale | IT | Check digit
Italian Partita IVA | IT | Luhn-like checksum
Portuguese NIF | PT | Check digit
Dutch BSN | NL | Modulus-11 check
Belgian National Number | BE | Mod-97 checksum
Belgian Enterprise Number | BE | Mod-97 checksum
Austrian SVNR | AT | Mod-11 checksum
Swiss AHV | CH | EAN-13 checksum
Irish PPS Number | IE | Mod-23 checksum
Polish PESEL | PL | Check digit
Polish NIP | PL | Check digit
Polish REGON | PL | Mod-11 checksum
Czech Birth Number | CZ | Modulus validation
Czech ICO (Company ID) | CZ | Mod-11 weighted checksum
Czech DIC (Tax/VAT ID) | CZ | ICO checksum / mod-11
Czech Bank Account | CZ | Mod-11 weighted checksum
Slovak Birth Number | SK | Modulus validation
Slovak ICO | SK | Mod-11 weighted checksum
Slovak DIC | SK | Mod-11 divisibility
Romanian CNP | RO | Check digit
Romanian CUI | RO | Mod-11 checksum
Danish CPR | DK | Date validation
Danish CVR | DK | Mod-11 checksum
Swedish Personnummer | SE | Luhn algorithm
Swedish Organisationsnummer | SE | Luhn algorithm
Norwegian Birth Number | NO | Check digit
Norwegian Organisasjonsnummer | NO | Mod-11 checksum
Finnish HETU | FI | Mod-31 checksum
Finnish Y-tunnus | FI | Mod-11 checksum
Hungarian Tax ID | HU | Mod-11 checksum
Hungarian TAJ | HU | Mod-10 checksum
Bulgarian EGN | BG | Mod-11 checksum
Croatian OIB | HR | ISO 7064 MOD 11,2
Slovenian EMSO | SI | Mod-11 checksum
Slovenian Tax Number | SI | Mod-11 checksum
Lithuanian Personal Code | LT | Dual-pass mod-11
Latvian Personal Code | LV | Weighted checksum
Estonian Personal Code | EE | Dual-pass mod-11
Russian INN | RU | Check digit
Russian SNILS | RU | Check digit
Canadian SIN | CA | Luhn checksum
Australian TFN | AU | Mod-11 checksum
Australian Medicare | AU | Mod-10 checksum
New Zealand IRD | NZ | Dual-pass mod-11
Indian Aadhaar | IN | Verhoeff algorithm
Indian PAN | IN | Format validation
Japanese My Number | JP | Mod-11 checksum
Korean RRN | KR | Weighted checksum
South African ID | ZA | Luhn checksum
Turkish Kimlik | TR | Custom dual check
Israeli ID | IL | Luhn checksum
Brazilian CPF | BR | Check digit
Brazilian CNPJ | BR | Check digit
Argentine CUIT | AR | Mod-11 checksum
Chilean RUT | CL | Mod-11 with K
Colombian NIT | CO | Mod-11 prime weights

Add your API key to unlock names, addresses, organizations, and 60+ additional entity types with NLP-powered detection.
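As an example of what a Validation entry in the list above means in practice, here is the standard ISO 7064 mod-97 check used for IBANs, written in plain Python. It is a sketch of the published algorithm, not the SDK's code; `iban_valid` is a hypothetical name, and country-specific length rules are left out:

```python
def iban_valid(iban: str) -> bool:
    """Return True if `iban` passes the ISO 7064 mod-97 check."""
    compact = iban.replace(" ", "").upper()
    if len(compact) < 5 or not compact.isascii() or not compact.isalnum():
        return False
    # Move the country code and check digits to the end.
    rearranged = compact[4:] + compact[:4]
    # int(ch, 36) maps 0-9 to 0-9 and A-Z to 10-35, as the standard requires.
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1


print(iban_valid("DE89370400440532013000"))  # True
print(iban_valid("DE89370400440532013001"))  # False
```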
