Automatic PII redaction for Pydantic v2 — masks sensitive data in logs and print statements. GDPR/HIPAA-friendly.
GhostPII 👻
Automatic PII redaction for Pydantic v2 — zero-config, GDPR/HIPAA-friendly.
Note: This project is published on PyPI as `ghost-pii-pydantic`.
GhostPII solves the "Logged Secret" problem: sensitive fields (emails, SSNs, credit card numbers, API keys) leaking into logs and tracebacks. It provides a smart string proxy that automatically redacts itself in unsafe contexts (logging, print, tracebacks) while remaining fully functional for business logic, databases, and APIs.
- Drop-in Pydantic v2 `Annotated` type: no middleware, no post-processing
- Tainted memory propagation: concatenated strings stay redacted
- Strict mode for FinTech / HealthTech / high-compliance environments
- Works with sync and async Python services
## Features
| Feature | Description |
|---|---|
| Auto-Magical Redaction | Automatically detects print(), logging, structlog, loguru, and more. |
| Partial Masking | Show jo***@ex***.com instead of [REDACTED] — ideal for UIs and audit logs. |
| Pydantic Native | First-class support for Pydantic v2 Annotated types. |
| Strict Mode | Opt-in for 100% redaction everywhere unless explicitly unmasked. |
| Tainted Memory | Operations on PII (like concatenation) stay PII. No accidental leaks. |
| Context Aware | unmask_pii() context manager with optional audit callback. |
| asyncio Safe | Uses contextvars.ContextVar — isolated per thread and per async task. |
| pytest Plugin | Built-in ghost_pii_strict fixture and --ghost-pii-strict CLI flag. |
| Extensible | Register custom unsafe modules (OpenTelemetry, Datadog, etc.) at runtime. |
## Installation

    pip install ghost-pii-pydantic
## Quick Start

    from pydantic import BaseModel, EmailStr
    from ghost_pii import PII, unmask_pii

    class User(BaseModel):
        name: PII[str]
        email: PII[EmailStr]  # Validates as email (via Pydantic), redacts in logs

    user = User(name="John Doe", email="john@example.com")

    # 1. Safe by default: redacts in logs/prints
    print(user)
    # Output: name=GhostString('[REDACTED]') email=GhostString('[REDACTED]')

    # 2. Functional: works in business logic/DBs
    # (String conversion or attribute access outside unsafe contexts reveals the real string)
    # `db` stands for your application's existing database connection.
    db.execute("INSERT INTO users VALUES (?)", [user.email])
    # Successfully inserts "john@example.com"

    # 3. Explicit: use the context manager for sensitive tasks
    with unmask_pii():
        print(user)
        # Output: name=GhostString('John Doe') email=GhostString('john@example.com')

## Advanced Scenarios
### Nested Models and Collections
GhostPII seamlessly handles nested Pydantic models and lists of PII.
```python
from typing import List

from pydantic import BaseModel, EmailStr
from ghost_pii import PII


class Address(BaseModel):
    street: PII[str]
    city: str


class Organization(BaseModel):
    name: str
    admin_emails: List[PII[EmailStr]]
    headquarters: Address


org = Organization(
    name="Acme Corp",
    admin_emails=["admin@acme.com", "sec@acme.com"],
    headquarters=Address(street="123 Secret Lane", city="New York"),
)

print(org.model_dump())
# Output: {
#     'name': 'Acme Corp',
#     'admin_emails': ['[REDACTED]', '[REDACTED]'],
#     'headquarters': {'street': '[REDACTED]', 'city': 'New York'}
# }
```

### Tainted Memory (Concatenation)

PII "infects" any string it touches. If you combine a PII field with a normal string, the result is a new `GhostString` that is also redacted by default.

    labeled_name = "User: " + user.name
    print(labeled_name)  # Output: [REDACTED]

    with unmask_pii():
        print(labeled_name)  # Output: User: John Doe

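To make the propagation idea concrete, here is a minimal standard-library sketch of how a `str` subclass can keep concatenation results tainted. `TaintedStr` and `reveal()` are hypothetical names used only for illustration; this is not GhostPII's `GhostString` and makes no claim about how the library is implemented.

```python
class TaintedStr(str):
    """Hypothetical redacting string; a stand-in, not GhostPII's GhostString."""

    def __str__(self) -> str:
        return "[REDACTED]"

    def __repr__(self) -> str:
        return "TaintedStr('[REDACTED]')"

    def __format__(self, spec: str) -> str:
        # Covers f-strings and format(), which do not go through __str__.
        return format("[REDACTED]", spec)

    def reveal(self) -> str:
        # Slicing a str subclass returns a plain str holding the raw characters.
        return self[:]

    def __add__(self, other):
        if not isinstance(other, str):
            return NotImplemented
        # str.__add__ works on the raw characters; re-wrap so the result stays tainted.
        return TaintedStr(str.__add__(self, other))

    def __radd__(self, other):
        if not isinstance(other, str):
            return NotImplemented
        # Reached for plain_str + TaintedStr, so the taint survives on either side.
        return TaintedStr(str.__add__(other, self))


name = TaintedStr("John Doe")
labeled = "User: " + name   # plain + tainted goes through __radd__
suffixed = name + "!"       # tainted + plain goes through __add__
```

Both results are `TaintedStr` instances, so printing them shows `[REDACTED]` while `reveal()` still returns the full concatenated text.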
### Partial Masking

Use `masked_pii()` when you need identifiable-but-safe values, for example in customer service UIs, audit logs, and support dashboards.

Supported strategies in `MaskStrategy`:

- `FULL`: Always shows `[REDACTED]`. (Default)
- `EMAIL`: Partially masks local-part and domain, e.g. `jo***@ex***.com`.
- `LAST4`: Keeps the last four digits, e.g. `****6789`.
- `PHONE`: Keeps country prefix and last three digits, e.g. `+44*****456`.
- `SSN`: Shows only the last four digits in SSN format, e.g. `***-**-6789`.

    from pydantic import BaseModel, EmailStr
    from ghost_pii import masked_pii, MaskStrategy, unmask_pii

    class User(BaseModel):
        email: masked_pii(EmailStr, MaskStrategy.EMAIL)  # jo***@ex***.com
        ssn: masked_pii(str, MaskStrategy.SSN)           # ***-**-6789
        card: masked_pii(str, MaskStrategy.LAST4)        # ****1111
        phone: masked_pii(str, MaskStrategy.PHONE)       # +44*****456

    user = User(email="john@example.com", ssn="123-45-6789",
                card="4111111111111111", phone="+447911123456")

    print(user.email)  # jo***@ex***.com
    print(user.ssn)    # ***-**-6789

    with unmask_pii():
        print(user.email)  # john@example.com

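The masking formats listed above can be approximated with a few plain string functions. The sketch below is only an illustration: the function names are hypothetical, and the rules (two visible characters, a fixed run of asterisks, a three-character phone prefix) are inferred from the example outputs rather than taken from GhostPII's code.

```python
def mask_email(value: str) -> str:
    """Keep two characters of the local part and of the domain name, plus the TLD."""
    local, _, domain = value.partition("@")
    name, _, tld = domain.rpartition(".")
    return f"{local[:2]}***@{name[:2]}***.{tld}"


def mask_last4(value: str) -> str:
    """Keep only the last four characters."""
    return "****" + value[-4:]


def mask_ssn(value: str) -> str:
    """Keep the last four digits in SSN layout."""
    digits = "".join(ch for ch in value if ch.isdigit())
    return "***-**-" + digits[-4:]


def mask_phone(value: str, prefix_len: int = 3) -> str:
    """Keep an assumed fixed-length country prefix and the last three characters."""
    return value[:prefix_len] + "*****" + value[-3:]


masked = {
    "email": mask_email("john@example.com"),
    "card": mask_last4("4111111111111111"),
    "ssn": mask_ssn("123-45-6789"),
    "phone": mask_phone("+447911123456"),
}
```

A real implementation would need to handle malformed input and variable-length country codes, which this sketch ignores.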
### Audit Hook

Pass `on_access` to `unmask_pii()` to emit a compliance trail whenever PII is deliberately exposed, which SOC2 / GDPR audit logs require.

    import logging
    from ghost_pii import unmask_pii

    audit = logging.getLogger("audit")

    # The callback is triggered exactly once when entering the context manager.
    # `send_email` and `user` stand for your own mailer function and model instance.
    with unmask_pii(on_access=lambda: audit.info("PII accessed by service X")):
        send_email(to=str(user.email))

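For readers curious how such a hook can work, the following is a minimal sketch of a context manager that fires a callback once on entry and restores the previous state on exit. `unmask_sketch` and `_unmasked` are hypothetical names; the sketch assumes a `ContextVar`-backed flag and is not the library's actual implementation.

```python
import contextlib
from contextvars import ContextVar
from typing import Callable, Iterator, Optional

# Hypothetical flag that a redacting string would consult before revealing itself.
_unmasked: ContextVar[bool] = ContextVar("_unmasked", default=False)


@contextlib.contextmanager
def unmask_sketch(on_access: Optional[Callable[[], None]] = None) -> Iterator[None]:
    if on_access is not None:
        on_access()  # fired exactly once, when the block is entered
    token = _unmasked.set(True)
    try:
        yield
    finally:
        # Restore the previous value even if the body raises.
        _unmasked.reset(token)


events = []
with unmask_sketch(on_access=lambda: events.append("PII accessed")):
    inside = _unmasked.get()
after = _unmasked.get()
```

Resetting with the token, rather than setting the flag back to `False`, keeps nested blocks correct: leaving an inner block does not re-mask an outer one.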
### Extending Unsafe Modules

GhostPII covers `logging`, `structlog`, `loguru`, `rich`, `print`, and test runners out of the box. Add your own:

    from ghost_pii import add_unsafe_module

    add_unsafe_module("opentelemetry")
    add_unsafe_module("datadog")

### Async Support

GhostPII works transparently in async services. `unmask_pii()` is a regular synchronous context manager, so it is used with a plain `with` statement inside async functions:

    import logging

    from pydantic import BaseModel
    from ghost_pii import PII, unmask_pii

    logger = logging.getLogger(__name__)

    class UserEvent(BaseModel):
        user_id: str
        email: PII[str]

    async def send_confirmation(event: UserEvent):
        # Logging is safe: email is auto-redacted
        logger.info("Sending confirmation to %s", event.email)
        # `smtp_client` stands for your application's async mail client.
        with unmask_pii():
            await smtp_client.send(to=str(event.email), subject="Confirm your account")

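The per-task isolation that `contextvars.ContextVar` provides can be seen with the standard library alone. In this sketch the `unmasked` variable and both coroutines are hypothetical stand-ins: one task flips the flag while a second task, running concurrently, checks what it can see.

```python
import asyncio
from contextvars import ContextVar

# Hypothetical stand-in for an "unmasked" flag.
unmasked: ContextVar[bool] = ContextVar("unmasked", default=False)


async def privileged(started: asyncio.Event, done: asyncio.Event) -> bool:
    token = unmasked.set(True)
    try:
        started.set()
        await done.wait()      # stay unmasked while the other task runs
        return unmasked.get()
    finally:
        unmasked.reset(token)


async def bystander(started: asyncio.Event, done: asyncio.Event) -> bool:
    await started.wait()       # the privileged task is unmasked at this point
    seen = unmasked.get()
    done.set()
    return seen


async def main() -> list:
    started, done = asyncio.Event(), asyncio.Event()
    # gather() wraps each coroutine in a task with its own copy of the context.
    return await asyncio.gather(privileged(started, done), bystander(started, done))


results = asyncio.run(main())
```

`results` is `[True, False]`: the bystander never observes the other task's unmasked state, even though both run on the same thread.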
## Enterprise Strategy
GhostPII is designed to adapt to different compliance levels:
| Mode | Recommended For | Mechanism |
|---|---|---|
| Auto-Magical | General microservices, high developer velocity. | Uses stack inspection to detect logging, print, etc. |
| Strict Mode | FinTech, HealthTech, High-Compliance environments. | Redacts everywhere. Requires explicit unmask_pii() to access data. |
### Enabling Strict Mode

    from ghost_pii import set_strict_mode

    set_strict_mode(True)  # Best practice for production PII handling

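Read together with the mode table, the redaction decision can be summarised as a small truth function. This is one reading of the documented behaviour, written as a hypothetical helper (`should_redact` is not part of the GhostPII API): an explicit unmask wins, strict mode redacts everything else, and the default mode redacts only for unsafe callers.

```python
def should_redact(*, strict: bool, unmasked: bool, unsafe_caller: bool) -> bool:
    """Hypothetical decision rule inferred from the mode table."""
    if unmasked:
        return False       # inside an explicit unmask block: always reveal
    if strict:
        return True        # strict mode: redact everywhere else
    return unsafe_caller   # default mode: redact only for logging, print, etc.


# Default mode: database code sees the value, logging does not.
auto_db = should_redact(strict=False, unmasked=False, unsafe_caller=False)
auto_log = should_redact(strict=False, unmasked=False, unsafe_caller=True)
# Strict mode: even database code sees a redacted value unless unmasked.
strict_db = should_redact(strict=True, unmasked=False, unsafe_caller=False)
strict_unmasked = should_redact(strict=True, unmasked=True, unsafe_caller=True)
```

The practical consequence is that enabling strict mode in an existing service requires wrapping every legitimate read of PII, such as database writes, in an unmask block.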
## pytest Plugin
GhostPII ships a built-in pytest plugin (auto-registered via pytest11 entry point).
Per-test strict mode:

    from ghost_pii import unmask_pii

    # `User` is the model defined in Quick Start.
    def test_no_pii_in_logs(ghost_pii_strict):
        user = User(name="John Doe", email="john@example.com")
        assert str(user.email) == "[REDACTED]"  # strict: always redacted
        with unmask_pii():
            assert str(user.email) == "john@example.com"

Session-wide (CI enforcement):

    pytest --ghost-pii-strict

Disable the plugin:

    pytest -p no:ghost-pii

## Why GhostPII vs Alternatives

| | GhostPII | presidio | scrubadub | Manual field redaction |
|---|---|---|---|---|
| Integration model | Pydantic `Annotated` type | NLP pipeline / scrubber | String scrubber | Ad-hoc |
| Auto-redacts in logs | Yes (zero config) | No | No | No |
| Preserves value for DB/API | Yes | No (destructive) | No (destructive) | Depends |
| Tainted memory propagation | Yes | No | No | No |
| Strict / audit mode | Yes | No | No | Manual |
| Setup overhead | `pip install` + type annotation | NER models, language packs | Pattern config | High |
| Best for | Pydantic services, FastAPI, microservices | Bulk text anonymisation | Legacy string scrubbing | Simple one-off cases |

TL;DR: presidio and scrubadub are great for scrubbing free-text blobs. GhostPII is purpose-built for Pydantic models where you need the real value to flow through your app but never appear in logs.
## Contributing

We follow strict engineering standards. Please run the linters and tests before submitting PRs.

    pip install -e ".[dev]"
    pytest                    # run test suite
    ruff check src/ghost_pii  # lint
    mypy src/ghost_pii        # type-check

## License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Copyright (c) 2026 Sthitaprajna Sahoo and contributors.