
Metadata-driven sensitive-data masking mixin for Python frozen dataclasses

Reason this release was yanked:

Renamed to sensitivity-mixin. Install sensitivity-mixin instead.

Project description

pii-aware-mixin

Decorator-based PHI/PII masking for Python frozen dataclasses.

Accidentally logging sensitive data (API tokens, passwords, session IDs, PII, or protected health information, PHI) is a common source of security incidents and compliance violations. pii-aware-mixin addresses this with a lightweight @phi_aware decorator that masks sensitive fields in reprs, plus module-level helpers: mask_for_logging for masked log output and to_dict for full, unmasked serialization.

Why?

When you log a dataclass instance or its repr, sensitive fields leak unless you explicitly redact them everywhere:

import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)

@dataclass(frozen=True)
class APICredentials:
    user_id: int
    api_token: str

creds = APICredentials(user_id=1, api_token="sk-abc123xyz")
logger.info("Creds: %s", creds)  # logs: "Creds: APICredentials(user_id=1, api_token='sk-abc123xyz')"
                                  # OOPS! Token is exposed.

This library reduces that to one line per field: mark sensitive fields with metadata, and let the decorator handle the rest.

from dataclasses import dataclass, field
from pii_aware_mixin import phi_aware, mask_for_logging, to_dict

@phi_aware
@dataclass(frozen=True, slots=True)
class APICredentials:
    user_id: int
    api_token: str = field(metadata={"phi": True})

creds = APICredentials(user_id=1, api_token="sk-abc123xyz")

# Safe for logging:
logger.info("Creds: %s", mask_for_logging(creds))
# → {"user_id": 1, "api_token": "<phi:redacted>"}

# Safe for reprs / tracebacks:
logger.error("Error: %s", repr(creds))
# → "APICredentials(user_id=1, api_token=<phi:redacted>)"

# Full data when needed (serialization, storage):
api_payload = to_dict(creds)
# → {"user_id": 1, "api_token": "sk-abc123xyz"}

Installation

pip install pii-aware-mixin

Requires Python 3.10+.

Quick Start

1. Import the decorator and helpers

from dataclasses import dataclass, field
from pii_aware_mixin import phi_aware, mask_for_logging, to_dict

2. Decorate and mark sensitive fields

Use the @phi_aware decorator on a frozen dataclass with field(metadata={"phi": True}) on sensitive fields:

@phi_aware
@dataclass(frozen=True, slots=True)
class User:
    id: int
    api_token: str = field(metadata={"phi": True})
    email: str = field(metadata={"phi": True})
    name: str

Key difference from v0.1.0: No repr=False required. The decorator works on plain @dataclass(frozen=True, slots=True).

3. Use in your code

user = User(
    id=1,
    api_token="sk-123456",
    email="alice@example.com",
    name="Alice"
)

# Masked for logging:
print(mask_for_logging(user))
# → {"id": 1, "api_token": "<phi:redacted>", "email": "<phi:redacted>", "name": "Alice"}

# Masked for repr (safe in tracebacks, error messages):
print(repr(user))
# → User(id=1, api_token=<phi:redacted>, email=<phi:redacted>, name='Alice')

# Unredacted for storage/transmission:
print(to_dict(user))
# → {"id": 1, "api_token": "sk-123456", "email": "alice@example.com", "name": "Alice"}

API Reference

@phi_aware decorator

Adds a PHI-masking __repr__() to a frozen dataclass. Fields marked with metadata={"phi": True} are redacted in repr output.

Usage:

@phi_aware
@dataclass(frozen=True, slots=True)
class Patient:
    name: str
    ssn: str = field(metadata={"phi": True})

repr(Patient(name="Alice", ssn="123"))
# → "Patient(name='Alice', ssn=<phi:redacted>)"
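
To make the behaviour described above concrete, here is a minimal standard-library sketch of how a decorator of this kind could work. It is not the library's actual implementation: the names phi_aware_sketch and REDACTED are hypothetical, and the only thing borrowed from this page is the "phi" metadata key and the placeholder text.

```python
from dataclasses import dataclass, field, fields, is_dataclass

REDACTED = "<phi:redacted>"  # placeholder text copied from the examples on this page


def phi_aware_sketch(cls):
    """Hypothetical stand-in: install a __repr__ that hides fields flagged as PHI."""
    if not is_dataclass(cls):
        raise TypeError(f"{cls.__name__} must be a dataclass")

    def __repr__(self):
        parts = []
        for f in fields(self):
            if not f.repr:  # honour field(repr=False), as the default repr does
                continue
            if f.metadata.get("phi", False):
                parts.append(f"{f.name}={REDACTED}")
            else:
                parts.append(f"{f.name}={getattr(self, f.name)!r}")
        return f"{type(self).__name__}({', '.join(parts)})"

    # Assigning on the class is allowed even when instances are frozen.
    cls.__repr__ = __repr__
    return cls


@phi_aware_sketch
@dataclass(frozen=True)
class Patient:
    name: str
    ssn: str = field(metadata={"phi": True})


patient = Patient(name="Alice", ssn="123")
print(repr(patient))
# → Patient(name='Alice', ssn=<phi:redacted>)
```

Because str() on a dataclass falls back to __repr__, a repr replaced this way also covers "%s" formatting and f-strings.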

mask_for_logging(instance) → dict

Returns a dict representation with PHI fields masked as <phi:redacted>.

Use case: Structured logging (JSON, CloudWatch, logging frameworks)

@phi_aware
@dataclass(frozen=True, slots=True)
class Credentials:
    username: str
    password: str = field(metadata={"phi": True})

creds = Credentials(username="alice", password="secret")
print(mask_for_logging(creds))
# → {"username": "alice", "password": "<phi:redacted>"}
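
For JSON-based structured logging, the masked dict can be passed to json.dumps. The sketch below is self-contained, so it uses a hypothetical stand-in (mask_for_logging_sketch) in place of the real helper; it assumes the helper returns a flat dict keyed by field name, as the examples on this page show.

```python
import json
from dataclasses import dataclass, field, fields

REDACTED = "<phi:redacted>"


def mask_for_logging_sketch(instance):
    """Hypothetical stand-in: shallow dict of field values with PHI replaced."""
    return {
        f.name: REDACTED if f.metadata.get("phi", False) else getattr(instance, f.name)
        for f in fields(instance)
    }


@dataclass(frozen=True)
class Credentials:
    username: str
    password: str = field(metadata={"phi": True})


creds = Credentials(username="alice", password="secret")
masked = mask_for_logging_sketch(creds)

# default=str keeps json.dumps from failing on values such as datetimes.
line = json.dumps(masked, sort_keys=True, default=str)
print(line)
# → {"password": "<phi:redacted>", "username": "alice"}
```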

to_dict(instance) → dict

Returns a full dict representation with all values unmasked. Alias for dataclasses.asdict().

Use case: API serialization, storage, and transmission, where the real values are required.

@phi_aware
@dataclass(frozen=True, slots=True)
class APIKey:
    id: int
    key: str = field(metadata={"phi": True})

key = APIKey(id=1, key="sk-abc123")
print(to_dict(key))
# → {"id": 1, "key": "sk-abc123"}
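
Since to_dict is described as an alias for dataclasses.asdict(), the standard-library behaviour applies: asdict recurses into nested dataclasses and ignores field metadata entirely. The sketch below uses asdict directly to show that nested PHI comes back unmasked too; the Address and Patient classes are illustrative only.

```python
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class Address:
    city: str
    street: str = field(metadata={"phi": True})


@dataclass(frozen=True)
class Patient:
    name: str
    address: Address


patient = Patient(name="Alice", address=Address(city="Springfield", street="1 Main St"))
full = asdict(patient)  # nested dataclass becomes a nested dict
print(full)
# → {'name': 'Alice', 'address': {'city': 'Springfield', 'street': '1 Main St'}}
```

The result should therefore be treated as sensitive and kept out of log calls.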

Field Metadata

Mark a field sensitive by adding metadata={"phi": True} to field():

from dataclasses import dataclass, field
from pii_aware_mixin import phi_aware

@phi_aware
@dataclass(frozen=True, slots=True)
class Credentials:
    username: str = field(metadata={"phi": True})
    password: str = field(metadata={"phi": True})
    created_at: str  # not PHI — no metadata needed

Any field without metadata or with metadata={"phi": False} is treated as non-sensitive and passes through unmasked.
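
The rule above can be checked with dataclasses.fields(), which exposes each field's metadata as a read-only mapping. The helper below (phi_field_names) is hypothetical and not part of the library; it assumes any truthy "phi" value counts as sensitive, which this page does not state explicitly.

```python
from dataclasses import dataclass, field, fields


def phi_field_names(cls):
    """Hypothetical helper: names of fields whose metadata sets "phi" to a truthy value."""
    return [f.name for f in fields(cls) if f.metadata.get("phi", False)]


@dataclass(frozen=True)
class Credentials:
    username: str = field(metadata={"phi": True})
    password: str = field(metadata={"phi": True})
    created_at: str                               # no metadata: empty mapping
    region: str = field(metadata={"phi": False})  # explicitly non-sensitive


print(phi_field_names(Credentials))
# → ['username', 'password']
```

A check like this fits well in a unit test that pins down which fields of a DTO are expected to be masked.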

Logging Integration

Pair with standard library logging for clean, safe logs:

import logging
from dataclasses import dataclass, field
from pii_aware_mixin import phi_aware, mask_for_logging

logger = logging.getLogger(__name__)

@phi_aware
@dataclass(frozen=True, slots=True)
class LoginAttempt:
    username: str
    password: str = field(metadata={"phi": True})
    ip_address: str

def handle_login(username, password, ip):
    attempt = LoginAttempt(username=username, password=password, ip_address=ip)
    logger.info("Login attempt: %s", mask_for_logging(attempt))
    # Logs: {"username": "alice", "password": "<phi:redacted>", "ip_address": "192.168.1.1"}
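
If calling the masking helper at every log site is easy to forget, one option is a logging.Filter that masks dataclass arguments centrally. This is a sketch under assumptions, not something the library is documented to provide: PhiMaskingFilter and _masked are hypothetical names, and _masked stands in for mask_for_logging so the example runs with the standard library alone.

```python
import io
import logging
from dataclasses import dataclass, field, fields, is_dataclass

REDACTED = "<phi:redacted>"


def _masked(instance):
    """Stand-in for mask_for_logging: PHI-flagged fields replaced by a placeholder."""
    return {
        f.name: REDACTED if f.metadata.get("phi", False) else getattr(instance, f.name)
        for f in fields(instance)
    }


class PhiMaskingFilter(logging.Filter):
    """Hypothetical filter: mask dataclass instances passed as positional log args."""

    def filter(self, record):
        if isinstance(record.args, tuple):
            record.args = tuple(
                _masked(a) if is_dataclass(a) and not isinstance(a, type) else a
                for a in record.args
            )
        return True  # never drop the record, only rewrite its args


@dataclass(frozen=True)
class LoginAttempt:
    username: str
    password: str = field(metadata={"phi": True})


stream = io.StringIO()
logger = logging.getLogger("phi_filter_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))
logger.addFilter(PhiMaskingFilter())

logger.info("Login attempt: %s", LoginAttempt(username="alice", password="hunter2"))
output = stream.getvalue()
print(output, end="")
```

A filter attached to a logger only sees records logged directly on that logger; attaching it to a handler instead also covers records that propagate up from child loggers.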

Mask Strategies

Currently, pii-aware-mixin uses a simple full-mask strategy: all sensitive fields become <phi:redacted>.

This is by design — it's safe, readable, and appropriate for most use cases. Custom partial-masking (e.g., "last 4 digits") can be layered on top if needed:

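# Example: custom partial masking before logging.
# Assumes `payment` is a @phi_aware dataclass instance with a
# `credit_card: str` field marked {"phi": True}.
masked = mask_for_logging(payment)
if payment.credit_card:
    masked["credit_card"] = f"****-****-****-{payment.credit_card[-4:]}"
logger.info("Payment: %s", masked)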
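
One detail to watch with partial masking: a [-4:] slice returns the whole string when the value has four characters or fewer, so short values would be logged in full. The helper below is a hypothetical sketch (mask_keep_last is not part of the library) that masks short values entirely.

```python
def mask_keep_last(value: str, visible: int = 4, fill: str = "*") -> str:
    """Hypothetical helper: hide everything except the last `visible` characters."""
    if visible <= 0 or len(value) <= visible:
        # Too short to reveal anything safely: mask the whole value.
        return fill * len(value)
    return fill * (len(value) - visible) + value[-visible:]


print(mask_keep_last("4111111111111111"))
# → ************1111
print(mask_keep_last("123"))
# → ***
```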

Migration from v0.1.0

v0.2.0 is a breaking release: the mixin-inheritance API has been replaced with a decorator API, and the field metadata key has changed from "sensitive" to "phi".

v0.1.0 (deprecated)

from dataclasses import dataclass, field
from pii_aware_mixin import PiiAwareMixin, ReprMixin, ToDictMixin

@dataclass(frozen=True, slots=True, repr=False)
class User(PiiAwareMixin, ReprMixin, ToDictMixin):
    id: int
    api_token: str = field(metadata={"sensitive": True})

user = User(id=1, api_token="sk-123456")

user.mask_for_logging()    # instance method
repr(user)                 # mixin-provided __repr__
user.to_dict()             # instance method

v0.2.0 (current)

from dataclasses import dataclass, field
from pii_aware_mixin import phi_aware, mask_for_logging, to_dict

@phi_aware
@dataclass(frozen=True, slots=True)
class User:
    id: int
    api_token: str = field(metadata={"phi": True})

user = User(id=1, api_token="sk-123456")

mask_for_logging(user)  # module-level helper
repr(user)              # decorator-provided __repr__
to_dict(user)           # module-level helper

Key benefits:

  • No mixin inheritance (compatible with the "no mixins on DTOs" convention)
  • No repr=False required
  • Simpler, more explicit API
  • Healthcare-aligned metadata key: "phi" instead of "sensitive"
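
When migrating, fields still tagged with the old "sensitive" key are easy to miss. The check below is a hypothetical helper, not part of the library; it assumes v0.2.0 does not honour the old key, which this page implies but does not state outright.

```python
from dataclasses import dataclass, field, fields


def legacy_sensitive_fields(cls):
    """Hypothetical check: fields using the v0.1.0 key "sensitive" without the v0.2.0 key "phi"."""
    return [
        f.name
        for f in fields(cls)
        if f.metadata.get("sensitive") and "phi" not in f.metadata
    ]


@dataclass(frozen=True)
class User:
    id: int
    api_token: str = field(metadata={"sensitive": True})  # old key, not yet migrated
    email: str = field(metadata={"phi": True})             # already migrated


print(legacy_sensitive_fields(User))
# → ['api_token']
```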

Design Principles

  • Decorator-based: Simple, non-intrusive. Works on plain frozen dataclasses.
  • Metadata-driven: No boilerplate. Mark fields, the decorator handles the rest.
  • Type-safe: Works with frozen dataclasses, slots, type hints.
  • Low overhead: masking relies only on dataclass field metadata.
  • Canonical: compatible with the "no mixin inheritance on DTOs" pattern.

License

Apache 2.0 — see LICENSE file.

Contributing

This library is extracted from real-world healthcare compliance work and maintained by James Ekhator. Contributions welcome via pull requests.
