Deterministic privacy-preserving logger for Python.
BlindLog v1.1
BlindLog is a zero-dependency, production-ready Privacy-Preserving Observability SDK for Python.
It addresses a fundamental conflict in backend engineering: developers need to see everything to fix bugs, but compliance constraints (GDPR/HIPAA/SOC2) dictate that they must not see anything personal.
By replacing raw Personally Identifiable Information (PII) with consistent, structure-preserving deterministic hashes, developers keep the ability to correlate events across the system without leaking actual identities into application logs.
💡 The "Why": Why Use BlindLog?
The Problem with Redaction
Legacy redaction tools look for emails or credit cards and replace them with static text like ***** or [REDACTED]. The fatal flaw here is context destruction. If your logs show [REDACTED] failed to purchase order [REDACTED], you cannot trace a specific user's journey through your microservices when every trace of their identity maps to the exact same generic string.
The BlindLog Solution: Deterministic Pseudonymization
BlindLog uses BLAKE2b in its native keyed mode to map data consistently:
- user1@gmail.com always logs as blnd_ref_8ax92bfac000...@masked.com.
- user2@gmail.com always logs as blnd_ref_1c89f81ba000...@masked.com.
You instantly know if the same user triggered 50 errors across 4 microservices over a week, while the raw identity never reaches the logs: only a keyed hash does, and it cannot be reversed without the secret.
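The consistent mapping described above can be illustrated with the standard library's keyed BLAKE2b. This is a minimal sketch, not BlindLog's actual implementation: the function name `pseudonymize_email`, the 12-character truncation, and the demo key are assumptions chosen for illustration.

```python
import hashlib


def pseudonymize_email(email: str, secret: bytes) -> str:
    """Map an email to a stable keyed pseudonym (illustrative format only)."""
    # hashlib.blake2b accepts a key of up to 64 bytes. Without the key,
    # an attacker cannot precompute digests for candidate addresses.
    digest = hashlib.blake2b(
        email.encode("utf-8"), key=secret, digest_size=16
    ).hexdigest()
    return f"blnd_ref_{digest[:12]}...@masked.com"


# Assumption: a real deployment would read the key from BLINDLOG_SECRET.
demo_key = b"demo-only-secret"

first = pseudonymize_email("user1@gmail.com", demo_key)
again = pseudonymize_email("user1@gmail.com", demo_key)
other = pseudonymize_email("user2@gmail.com", demo_key)

print(first == again)  # True: same input and key give the same pseudonym
print(first == other)  # False: different users stay distinguishable
```

Because the output depends on both the input and the key, rotating the key changes every pseudonym, which breaks correlation with older logs.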
🚀 Installation
BlindLog has zero external dependencies and runs natively on Python 3.8+.
pip install blindlog
🛠️ Exactly How to Use It
BlindLog is built on an extensible architecture that operates automatically once plugged into your existing application. It intercepts data, traverses JSON payloads recursively, and masks strings without breaking schema.
1. Mandatory Security Configuration
BlindLog operates on keyed hashes. To prevent rainbow-table reverse-engineering, you must supply a cryptographic secret.
Set the following environment variables on your production servers:
export BLINDLOG_SECRET="your-super-strong-random-secret-key"
export BLINDLOG_SALT="optional-additional-salt"
Warning: If BLINDLOG_SECRET is missing, BlindLog crashes on boot to protect your system from generating reversible, unkeyed hashes. For local development, you can set export BLINDLOG_DEBUG="true" to bypass this crash.
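The fail-fast behavior in the warning above might look roughly like the following. This is a sketch under assumptions: `load_secret` is a hypothetical helper, and returning `None` in debug mode is an illustrative choice, not a documented BlindLog behavior. Only the two environment variable names come from this README.

```python
import os
from typing import Mapping, Optional


def load_secret(env: Mapping[str, str] = os.environ) -> Optional[bytes]:
    """Return the hashing key, or refuse to start when it is missing."""
    secret = env.get("BLINDLOG_SECRET")
    if secret:
        return secret.encode("utf-8")
    if env.get("BLINDLOG_DEBUG", "").lower() == "true":
        # Debug bypass: no key is available, so the caller must not
        # treat any resulting hashes as safe for production.
        return None
    raise RuntimeError("BLINDLOG_SECRET is not set; refusing to start")


# Explicit dictionaries stand in for the process environment here.
print(load_secret({"BLINDLOG_SECRET": "abc"}))   # b'abc'
print(load_secret({"BLINDLOG_DEBUG": "true"}))   # None
```

A suitably random secret can be generated with the standard library, for example `python -c "import secrets; print(secrets.token_hex(32))"`.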
2. Standard Python Logging Interception
BlindLog ships with a logging.Formatter that hooks directly into Python's native logging module. It intercepts all string messages, dictionary args, and even Exception tracebacks to scrub PII before it hits your terminal or logging aggregator (like Datadog/Elasticsearch).
import logging
from blindlog.formatters import BlindLogFormatter
# 1. Initialize your logger
logger = logging.getLogger("my_application")
logger.setLevel(logging.INFO)
# 2. Create a handler
handler = logging.StreamHandler()
# 3. Attach the BlindLogFormatter!
handler.setFormatter(BlindLogFormatter())
logger.addHandler(handler)
# Usage A: Standard Strings (Slow Path - Regex Scanning)
logger.info("Failed login for akhand@gmail.com on card 4111-2222-3333-4444")
# Output: Failed login for blnd_ref_8a9df2c00000...@masked.com on card 4111-c918a2-f8b1c4-4444
# Usage B: Dictionary Arguments (Fast Path - Key Matching)
# BlindLog detects keys like 'password' or 'email' instantly.
logger.info("User created", {"email": "ceo@corp.com", "password": "super-secret"})
# Output: User created {'email': 'blnd_ref_9bf... masked', 'password': 'blind:838ab...'}
# Usage C: Safe Exceptions
try:
    raise ValueError("User akhand@gmail.com exhausted their API quota")
except ValueError:
    logger.exception("A system error occurred")
# Output: The stack trace is fully processed, and akhand@gmail.com is masked inside the Traceback!
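To see why a formatter is a natural place to scrub tracebacks, here is a standalone sketch using only the standard library. It is not BlindLogFormatter: the class name `MaskingFormatter`, the simplified email regex, and the static `[email]` placeholder are assumptions for illustration. The point is that `logging.Formatter.format` returns the message and the rendered traceback as one string, so a single pass can mask both.

```python
import io
import logging
import re

# Simplified email pattern, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class MaskingFormatter(logging.Formatter):
    """Render the record normally, then mask emails in the final text."""

    def format(self, record: logging.LogRecord) -> str:
        # super().format() appends the formatted traceback when
        # record.exc_info is set, so the traceback is masked too.
        rendered = super().format(record)
        return EMAIL_RE.sub("[email]", rendered)


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(MaskingFormatter("%(levelname)s %(message)s"))

demo_logger = logging.getLogger("masking_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.addHandler(handler)
demo_logger.propagate = False  # keep the demo output out of the root logger

try:
    raise ValueError("User akhand@gmail.com exhausted their API quota")
except ValueError:
    demo_logger.exception("A system error occurred")

output = stream.getvalue()
print("akhand@gmail.com" in output)  # False
```

A static placeholder is used here only to keep the sketch short; it has exactly the context-destruction problem described earlier, which a deterministic hash avoids.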
3. FastAPI & Starlette Middleware
BlindLog acts as a Pure ASGI Middleware. It intercepts raw HTTP traffic before it hits your application routers. It safely buffers HTTP payloads (up to 5MB to prevent OOM DOS) and logs masked Request Bodies, Response Bodies, and Headers.
from fastapi import FastAPI
from blindlog.integrations.fastapi import BlindLogFastAPIMiddleware
app = FastAPI()
# Attach the middleware
app.add_middleware(BlindLogFastAPIMiddleware)
@app.post("/checkout")
async def checkout(payload: dict):
    # If the user sends {"credit_card": "4111-...", "cookie": "session_123"},
    # the middleware automatically logs the sanitized payload to standard out.
    return {"status": "success"}
What the Middleware handles automatically:
- Request/Response Bodies: Deeply nested JSON is recursively traversed and masked.
- HTTP Headers: Sensitive context headers (like Authorization, Cookie, X-API-Key) are extracted and masked deterministically, so repeated values still map to the same output.
- Streaming Protections: Safe passage for WebSockets and SSE pipelines.
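The buffering idea behind a pure ASGI middleware can be sketched without any framework. The class below is hypothetical (`BodyCaptureMiddleware` and its `on_body` hook are not part of BlindLog); it wraps the ASGI `receive` callable, copies request body chunks up to a cap, and lets non-HTTP scopes pass through untouched.

```python
import asyncio

MAX_BODY_BYTES = 5 * 1024 * 1024  # the 5MB cap mentioned above


class BodyCaptureMiddleware:
    """Pure ASGI middleware that records request bodies up to a size cap."""

    def __init__(self, app, on_body, max_bytes: int = MAX_BODY_BYTES):
        self.app = app
        self.on_body = on_body  # hypothetical hook receiving captured bytes
        self.max_bytes = max_bytes

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # WebSocket and lifespan traffic passes through unmodified.
            await self.app(scope, receive, send)
            return

        captured = bytearray()

        async def capturing_receive():
            message = await receive()
            if message["type"] == "http.request":
                room = self.max_bytes - len(captured)
                if room > 0:
                    captured.extend(message.get("body", b"")[:room])
                if not message.get("more_body", False):
                    self.on_body(bytes(captured))
            return message  # the application still sees the full body

        await self.app(scope, capturing_receive, send)


async def _demo():
    seen = []
    chunks = [
        {"type": "http.request", "body": b"abcdef", "more_body": True},
        {"type": "http.request", "body": b"ghij", "more_body": False},
    ]

    async def app(scope, receive, send):
        # Minimal application: drain the request body and return.
        while (await receive()).get("more_body", False):
            pass

    async def receive():
        return chunks.pop(0)

    async def send(message):
        pass

    middleware = BodyCaptureMiddleware(app, seen.append, max_bytes=8)
    await middleware({"type": "http"}, receive, send)
    return seen


captured_bodies = asyncio.run(_demo())
print(captured_bodies)  # [b'abcdefgh']
```

Only the logged copy is capped in this sketch; the application receives every chunk, so the cap bounds the middleware's memory use without altering request handling.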
4. Customizing the Configuration (BlindLogConfig)
You can tune BlindLog's rules by defining a BlindLogConfig. Once created, the config is frozen (immutable) to prevent runtime tampering.
from blindlog.core import BlindLogger
from blindlog.config import BlindLogConfig
# 1. Define custom sensitive keys.
# Note: This overwrites the defaults, so add your specific database fields.
custom_keys = frozenset({"internal_db_id", "auth_token", "email"})
config = BlindLogConfig(
    secret_key="my-custom-key",  # Will fall back to ENV var if omitted
    sensitive_keys=custom_keys,
    debug_mode=False
)
logger = BlindLogger(config=config)
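One common way to get a frozen configuration object in Python is a dataclass declared with `frozen=True`. Whether BlindLogConfig is built this way is not stated here, so treat the following as an illustration of the concept only; `DemoConfig` and its default keys are hypothetical.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class DemoConfig:
    """Hypothetical stand-in showing how a frozen config behaves."""

    secret_key: str
    sensitive_keys: FrozenSet[str] = frozenset({"email", "password"})
    debug_mode: bool = False


cfg = DemoConfig(secret_key="demo-key")

blocked = False
try:
    cfg.debug_mode = True  # any attribute assignment is rejected
except FrozenInstanceError:
    blocked = True

print(blocked)         # True
print(cfg.debug_mode)  # False
```

Using a `frozenset` for the key collection keeps the contents immutable as well, not just the attribute binding.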
Key Matching Engine (Fast Path): BlindLog checks dictionary keys via:
- Exact Match: e.g., "email" == "email".
- Suffix Match: e.g., "user_password" ends with "_password".
- Normalization: Hyphens (x-api-key) and camelCase (APIKey) are normalized to snake_case before matching, ensuring maximum coverage across varying schemas.
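The three matching rules above can be sketched as follows. The helper names `normalize_key` and `is_sensitive`, the regex, and the example key set are assumptions for illustration; BlindLog's internal rules may differ in detail.

```python
import re

# Insert "_" at camelCase boundaries: "userEmail" -> "user_Email",
# and between an acronym and the next word: "APIKey" -> "API_Key".
_CAMEL_BOUNDARY = re.compile(
    r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])"
)


def normalize_key(key: str) -> str:
    """Convert hyphenated or camelCase keys to snake_case."""
    return _CAMEL_BOUNDARY.sub("_", key).replace("-", "_").lower()


def is_sensitive(key: str, sensitive_keys: frozenset) -> bool:
    """Exact match first, then suffix match on the normalized key."""
    normalized = normalize_key(key)
    if normalized in sensitive_keys:
        return True
    return any(normalized.endswith("_" + name) for name in sensitive_keys)


SENSITIVE = frozenset({"email", "password", "api_key"})

print(normalize_key("APIKey"))                   # api_key
print(normalize_key("x-api-key"))                # x_api_key
print(is_sensitive("user_password", SENSITIVE))  # True
print(is_sensitive("username", SENSITIVE))       # False
```

Requiring the underscore in the suffix check keeps a key such as `username` from matching a shorter sensitive name by accident.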
5. Custom Format Registration (Adding Regex)
BlindLog implements a Registry Pattern. If you have custom internal tokens (e.g., specific AWS KMS IDs or internal Employee IDs) you can teach BlindLog to find and format them dynamically in free-text.
import re
from blindlog.core import BlindLogger
logger = BlindLogger()
# 1. Define a regex pattern
aws_pattern = re.compile(r"AWS-KMS-\d{6}")
# 2. Define a callback function that takes the matched string and returns a safe string
# Note: You can also hash it dynamically inside the callback if you wish!
def mask_aws(matched_string: str) -> str:
    return "blnd_aws_TOKEN_REDACTED"
# 3. Register the rule
logger.registry.register(aws_pattern, mask_aws)
masked = logger.mask("Exception: AWS-KMS-123456 failed to load.")
# Output: "Exception: blnd_aws_TOKEN_REDACTED failed to load."
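As the note in the example says, a callback can hash the match instead of returning a fixed string, which keeps distinct tokens distinguishable. The sketch below uses `re.sub` as a stand-in for the registry so that it runs without BlindLog installed; `mask_aws_hashed`, `apply_rule`, and the demo key are hypothetical names.

```python
import hashlib
import re

AWS_PATTERN = re.compile(r"AWS-KMS-\d{6}")
DEMO_KEY = b"demo-only-secret"  # assumption: a real key comes from config


def mask_aws_hashed(matched_string: str) -> str:
    """Return a stable keyed pseudonym for the matched token."""
    digest = hashlib.blake2b(
        matched_string.encode("utf-8"), key=DEMO_KEY, digest_size=8
    ).hexdigest()
    return f"blnd_aws_{digest}"


def apply_rule(text: str) -> str:
    # Stand-in for a registered rule: call the callback on every match.
    return AWS_PATTERN.sub(lambda m: mask_aws_hashed(m.group(0)), text)


line_a = apply_rule("Exception: AWS-KMS-123456 failed to load.")
line_b = apply_rule("Retry: AWS-KMS-123456 loaded.")

# The same token yields the same pseudonym in both log lines.
print(line_a.split()[1] == line_b.split()[1])  # True
```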
🛡️ Default Out-Of-The-Box Protections
BlindLog actively protects the following data types automatically via RegEx and Key-discovery:
- Emails: Truncated and hashed (blnd_ref_HASH...@masked.com)
- Credit Cards: Preserves major industry format (4111-HASH-HASH-1234)
- API Keys & Tokens: Covers OpenAI, Stripe, AWS, Slack, and GitHub PATs natively.
- Phone Numbers: International and NANP routing.
- Social Security Numbers (SSN): US Formats.
- IPv4 Addresses: Validated octet arrays.
- HTTP/Web Standard Keys: cookie, set_cookie, authorization, password, secret, private_key, credentials.
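Structure-preserving masking, as in the credit card format listed above, can be sketched like this: the first and last four digits stay visible and the middle groups are replaced by hash fragments. The regex (dash-separated groups only), the six-character fragments, and the name `mask_card` are assumptions for illustration and do not describe BlindLog's actual rules.

```python
import hashlib
import re

# Assumption: only the dash-separated 4-4-4-4 layout is handled here.
CARD_RE = re.compile(r"\b(\d{4})-(\d{4})-(\d{4})-(\d{4})\b")


def mask_card(text: str, secret: bytes) -> str:
    """Replace the middle digit groups of card numbers with keyed hash parts."""

    def _replace(match: "re.Match") -> str:
        # Hash the full number so the fragments depend on every digit.
        digest = hashlib.blake2b(
            match.group(0).encode("ascii"), key=secret, digest_size=6
        ).hexdigest()  # 6 bytes give 12 hex characters
        return f"{match.group(1)}-{digest[:6]}-{digest[6:]}-{match.group(4)}"

    return CARD_RE.sub(_replace, text)


masked_line = mask_card("Failed on card 4111-2222-3333-4444", b"demo-only-secret")
print(masked_line.startswith("Failed on card 4111-"))  # True
print(masked_line.endswith("-4444"))                   # True
```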
🧮 Idempotency & System Guarantees
- Idempotent Masking Guarantees: BlindLog uses strict structural RegEx evaluations (MASKED_PATTERN). If you pass already-masked data into the engine multiple times, it skips it instantly. You will never double-hash a log.
- ReDoS Mitigation: Slow-path Regex execution cuts off after 10,000 characters, averting CPU exhaustion DOS attacks.
- OOM Prevention: Middleware buffers are strictly capped at 5MB per payload.
- Type Sabotage Checks: Gracefully handles None, True, and circular nested dictionary references without crashing or returning unmasked memory addresses.
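The guarantees above can be combined in one small recursive traversal. This is a sketch under stated assumptions, not BlindLog's engine: the `MASKED_PATTERN` regex here is a guess at what an "already masked" check could look like, `mask_tree` is a hypothetical name, and truncating input past 10,000 characters is one possible reading of "cuts off".

```python
import re
from typing import Any, Callable, Optional, Set

# Hypothetical pattern for values that were already masked.
MASKED_PATTERN = re.compile(r"^blnd_[a-z]+_[0-9a-f]+")
MAX_SCAN_CHARS = 10_000  # scan limit taken from the list above


def mask_tree(
    value: Any,
    mask_text: Callable[[str], str],
    _seen: Optional[Set[int]] = None,
) -> Any:
    """Recursively mask strings inside dicts and lists."""
    seen = set() if _seen is None else _seen
    if isinstance(value, str):
        if MASKED_PATTERN.match(value):
            return value  # idempotency: never hash a masked value again
        # Assumption: text past the limit is dropped, not passed through.
        return mask_text(value[:MAX_SCAN_CHARS])
    if isinstance(value, (dict, list)):
        if id(value) in seen:
            return "<circular reference>"  # break the cycle safely
        seen.add(id(value))
        try:
            if isinstance(value, dict):
                return {k: mask_tree(v, mask_text, seen) for k, v in value.items()}
            return [mask_tree(v, mask_text, seen) for v in value]
        finally:
            # Only ancestors count as cycles; shared siblings are fine.
            seen.discard(id(value))
    return value  # None, bools, and numbers pass through unchanged


def demo_mask(text: str) -> str:
    # Fixed replacement keeps the demo deterministic without a key.
    return re.sub(r"[\w.+-]+@[\w.-]+\.\w+", "blnd_ref_0a1b2c", text)


payload = {"email": "ceo@corp.com", "active": True, "note": None}
payload["self"] = payload  # circular reference

once = mask_tree(payload, demo_mask)
twice = mask_tree(once, demo_mask)
print(once["email"])  # blnd_ref_0a1b2c
print(once == twice)  # True
```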
For further exploration, please review our Architecture Guide and the CHANGELOG.md!
File details
Details for the file blindlog-1.1.0.tar.gz.
File metadata
- Download URL: blindlog-1.1.0.tar.gz
- Upload date:
- Size: 21.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 05f82e29e6218387aee76fbf47ccaf3241f4b6de0e97dc8cf29e83ecd806e65f |
| MD5 | 774dea2b53cebd6187c71de9e8b13099 |
| BLAKE2b-256 | 5a03dd0f37cbbe5ee5a01f62d48c97afaebdadf0c0a58034afef4df7521943a0 |
File details
Details for the file blindlog-1.1.0-py3-none-any.whl.
File metadata
- Download URL: blindlog-1.1.0-py3-none-any.whl
- Upload date:
- Size: 16.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4dac24df63d3d0c9dae55037636b1e6b49fb5f3485e276d1840be45970fcfe09 |
| MD5 | ec5edbf64f006c5a6105f27088c4f9b8 |
| BLAKE2b-256 | 2550fc767517dfa598d7ee643f68c31b62f2b4a1280d7381087f80ef02a0a784 |