Sensitive-data classification and masking for Python frozen dataclasses
sensitivity-mixin
Decorator-based sensitivity classification and masking for Python frozen dataclasses.
Accidentally logging sensitive data—API tokens, passwords, session IDs, PII, healthcare data (PHI), credit card numbers, secrets—is a common source of security incidents and compliance violations. sensitivity-mixin solves this by providing a lightweight @sensitive decorator and taxonomy-based classification that automatically masks sensitive fields in logs and reprs.
Why?
When you log a dataclass instance or its repr, sensitive fields leak unless you explicitly redact them everywhere:
import logging
from dataclasses import dataclass
logger = logging.getLogger(__name__)

@dataclass(frozen=True)
class APICredentials:
    user_id: int
    api_token: str

creds = APICredentials(user_id=1, api_token="sk-abc123xyz")
logger.info("Creds: %s", creds)  # logs: "Creds: APICredentials(user_id=1, api_token='sk-abc123xyz')"
# OOPS! Token is exposed.
This library reduces that to one line per field: tag each sensitive field in its field() metadata, decorate the class once, and let the classifier introspect and mask the tagged fields automatically.
from dataclasses import dataclass, field
from sensitivity_mixin import sensitive, classify

@sensitive
@dataclass(frozen=True, slots=True)
class APICredentials:
    user_id: int
    api_token: str = field(metadata={"sensitivity": "SECRET"})

creds = APICredentials(user_id=1, api_token="sk-abc123xyz")

# Introspect sensitivity:
profile = classify(creds)
# → SensitivityProfile(classes=(('api_token', Sensitivity.SECRET),))

# Safe for reprs / tracebacks:
logger.error("Error: %s", repr(creds))
# → "APICredentials(user_id=1, api_token=***)"
Installation
pip install sensitivity-mixin
or with uv:
uv add sensitivity-mixin
Requires Python 3.11+.
Quick Start
1. Import the decorator and classifier
from dataclasses import dataclass, field
from sensitivity_mixin import sensitive, classify, Sensitivity
2. Decorate and mark sensitive fields
Use the @sensitive decorator on a frozen dataclass and tag fields with a sensitivity taxonomy:
@sensitive
@dataclass(frozen=True, slots=True)
class User:
    id: int
    api_token: str = field(metadata={"sensitivity": "SECRET"})
    email: str = field(metadata={"sensitivity": "PII"})
    ssn: str = field(metadata={"sensitivity": "PHI"})
    name: str
Supported sensitivity tags: "PHI" (healthcare data), "PII" (personal info), "PCI" (payment card data), "SECRET" (credentials/tokens), or omit for non-sensitive.
3. Use in your code
user = User(
    id=1,
    api_token="sk-123456",
    email="alice@example.com",
    ssn="123-45-6789",
    name="Alice",
)

# Introspect sensitivity:
profile = classify(user)
print(profile.has(Sensitivity.SECRET))     # → True
print(profile.fields_of(Sensitivity.PII))  # → ('email',)

# Masked for repr (safe in tracebacks, error messages):
print(repr(user))
# → User(id=1, api_token=***, email=***, ssn=***, name='Alice')
4. Use policy-driven masking (optional)
Wire per-class policies to customize masking placeholders:
from sensitivity_mixin import SensitiveDecorator
from sensitivity_mixin.decorators.classes.secret_aware import SecretPolicyAware
from sensitivity_mixin.decorators.classes.compliance import Compliance

secret_policy = SecretPolicyAware(
    compliance=Compliance.NONE,
    detection_hints=("api_token", "secret", "token", "password"),
    placeholder="[REDACTED]",
)
decorator = SensitiveDecorator(policies=((Sensitivity.SECRET, secret_policy),))

@decorator
@dataclass(frozen=True, slots=True)
class ApiClient:
    client_id: str
    api_token: str = field(metadata={"sensitivity": "SECRET"})

client = ApiClient(client_id="c1", api_token="sk-secret")
print(repr(client))
# → ApiClient(client_id='c1', api_token=[REDACTED])
API Reference
@sensitive decorator
Adds a sensitivity-aware __repr__() to a frozen dataclass. Fields marked with a sensitivity tag in metadata are redacted in repr output using a default placeholder (***).
Usage:
@sensitive
@dataclass(frozen=True, slots=True)
class Patient:
    name: str
    ssn: str = field(metadata={"sensitivity": "PHI"})

repr(Patient(name="Alice", ssn="123"))
# → "Patient(name='Alice', ssn=***)"
With policies:
from sensitivity_mixin import Sensitivity, SensitiveDecorator
from sensitivity_mixin.decorators.classes.phi_aware import PhiPolicyAware
from sensitivity_mixin.decorators.classes.compliance import Compliance

phi_policy = PhiPolicyAware(
    compliance=Compliance.HIPAA,
    detection_hints=("ssn", "name"),
    placeholder="[REDACTED]",
)
decorator = SensitiveDecorator(policies=((Sensitivity.PHI, phi_policy),))

@decorator
@dataclass(frozen=True, slots=True)
class Patient:
    name: str
    ssn: str = field(metadata={"sensitivity": "PHI"})

repr(Patient(name="Alice", ssn="123"))
# → "Patient(name=[REDACTED], ssn=[REDACTED])"
classify(instance) → SensitivityProfile
Introspects a dataclass and returns a SensitivityProfile documenting all sensitivity-tagged fields.
Use case: Compliance auditing, field-level sensitivity introspection
@sensitive
@dataclass(frozen=True, slots=True)
class Credentials:
    username: str
    password: str = field(metadata={"sensitivity": "SECRET"})
    api_key: str = field(metadata={"sensitivity": "SECRET"})

creds = Credentials(username="alice", password="secret", api_key="sk-123")
profile = classify(creds)
# → SensitivityProfile(classes=(('password', Sensitivity.SECRET), ('api_key', Sensitivity.SECRET)))

# Query the profile:
print(profile.has(Sensitivity.SECRET))        # → True
print(profile.fields_of(Sensitivity.SECRET))  # → ('password', 'api_key')
print(profile.sensitivity_of('username'))     # → None (unclassified)
SensitivityProfile provides:
- classes: tuple[tuple[str, Sensitivity], ...] (field name → sensitivity mapping)
- has(kind: Sensitivity) → bool (check for a sensitivity class)
- fields_of(kind: Sensitivity) → tuple[str, ...] (get the field names of a class)
- sensitivity_of(name: str) → Sensitivity | None (get the class of a field)
- is_empty → bool (True when no fields are tagged)
Field Metadata
Mark a field sensitive by adding metadata={"sensitivity": "<TAG>"} to field():
from dataclasses import dataclass, field
from sensitivity_mixin import sensitive

@sensitive
@dataclass(frozen=True, slots=True)
class Credentials:
    username: str
    password: str = field(metadata={"sensitivity": "SECRET"})
    email: str = field(metadata={"sensitivity": "PII"})
    created_at: str  # not sensitive; no metadata needed
Supported tags:
- "PHI": Protected Health Information (healthcare/medical records)
- "PII": Personally Identifiable Information (names, emails, SSNs)
- "PCI": Payment Card Industry data (credit card numbers)
- "SECRET": API tokens, passwords, secrets
- Omitted: non-sensitive (passes through unmasked)
Any field without metadata or with metadata={"sensitivity": None} is treated as non-sensitive and passes through unmasked.
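Because tags are plain strings in metadata, a typo such as "SECRT" is easy to miss. The following sketch shows one way to check tags against the supported set with a standard-library Enum. Tag, tag_of, and validate_tags are hypothetical helpers written for illustration; whether the library itself rejects unknown tags is not stated here.

```python
from dataclasses import dataclass, field, fields
from enum import Enum


class Tag(Enum):
    """Hypothetical mirror of the four supported tags."""

    PHI = "PHI"
    PII = "PII"
    PCI = "PCI"
    SECRET = "SECRET"


def tag_of(f):
    """Return the Tag for a dataclass field, or None when it is untagged."""
    raw = f.metadata.get("sensitivity")
    if raw is None:  # missing key and explicit None both mean non-sensitive
        return None
    return Tag(raw)  # raises ValueError for an unknown tag


def validate_tags(cls):
    """Map every field name to its Tag (or None), failing fast on typos."""
    return {f.name: tag_of(f) for f in fields(cls)}


@dataclass(frozen=True)
class Credentials:
    username: str
    password: str = field(metadata={"sensitivity": "SECRET"})
    email: str = field(metadata={"sensitivity": "PII"})
    note: str = field(metadata={"sensitivity": None})


@dataclass(frozen=True)
class Typo:
    token: str = field(metadata={"sensitivity": "SECRT"})


checked = validate_tags(Credentials)
print(checked["password"] is Tag.SECRET)  # → True

try:
    validate_tags(Typo)
except ValueError as exc:
    print(f"rejected: {exc}")
```

A check like this could run in a unit test so that mistyped tags fail the build instead of silently leaving a field unmasked.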
Security Boundary: What This Does and Does NOT Protect
@sensitive is a repr-layer masking tool, not a complete confidentiality boundary. It masks sensitive fields when you log or print the object itself, but does not protect against direct field access or serialization bypass.
Protected (Repr Layer Only)
- ✓ repr(obj): sensitive fields masked
- ✓ str(obj) / print(obj): uses masked repr
- ✓ Logging the object: logger.info("Object: %s", obj) is masked
- ✓ F-string with object: f"Object: {obj}" is masked
NOT Protected (Bypass Methods)
- ✗ Direct field access: obj.api_token returns the full unmasked value
- ✗ dataclasses.asdict(obj) returns a dict with full unmasked values
- ✗ json.dumps(asdict(obj)) contains full unmasked values in JSON
- ✗ Logging a field directly: logger.info(f"Token: {obj.api_token}") exposes the full value
- ✗ Attribute introspection: getattr(obj, 'api_token') returns the full unmasked value
- ✗ Untagged fields are not masked; classification is explicit/opt-in
Example: Correct and Incorrect Usage
import logging
from dataclasses import asdict, dataclass, field
from sensitivity_mixin import sensitive

logger = logging.getLogger(__name__)

@sensitive
@dataclass(frozen=True, slots=True)
class APIKey:
    name: str
    secret: str = field(metadata={"sensitivity": "SECRET"})

key = APIKey(name="prod-key", secret="sk-abc123xyz")

# ✓ SAFE: logging the object uses masked repr
logger.info("API Key: %s", key)
# Output: "API Key: APIKey(name='prod-key', secret=***)"

# ✗ UNSAFE: logging a field directly bypasses the decorator
logger.warning("Secret: %s", key.secret)
# Output: "Secret: sk-abc123xyz" ← FULL VALUE EXPOSED!

# ✗ UNSAFE: serializing with asdict() bypasses the decorator
logger.debug("Data: %s", asdict(key))
# Output: "Data: {'name': 'prod-key', 'secret': 'sk-abc123xyz'}" ← FULL VALUES EXPOSED!
Use case: @sensitive is ideal for DTOs at the logging boundary. Keep sensitive fields wrapped in the dataclass; avoid field-level logging. For applications requiring stronger confidentiality guarantees, apply field-level masking at the serialization boundary or use dedicated encryption libraries.
Logging Integration
Pair with standard library logging for clean, safe logs:
import logging
from dataclasses import dataclass, field
from sensitivity_mixin import sensitive

logger = logging.getLogger(__name__)

@sensitive
@dataclass(frozen=True, slots=True)
class LoginAttempt:
    username: str
    password: str = field(metadata={"sensitivity": "SECRET"})
    ip_address: str

def handle_login(username, password, ip):
    attempt = LoginAttempt(username=username, password=password, ip_address=ip)
    logger.info("Login attempt: %s", repr(attempt))
    # Logs: "Login attempt: LoginAttempt(username='alice', password=***, ip_address='192.168.1.1')"
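A regression test can confirm that a secret never reaches the log stream. The sketch below captures logger output in memory and checks it; to stay self-contained it uses a hand-written masked __repr__ as a stand-in for what @sensitive would provide, and the logger name login_sketch is arbitrary.

```python
import io
import logging
from dataclasses import dataclass, field


@dataclass(frozen=True, repr=False)
class LoginAttempt:
    username: str
    password: str = field(metadata={"sensitivity": "SECRET"})
    ip_address: str

    def __repr__(self):
        # Stand-in for the masked repr a decorator would generate.
        return (
            f"LoginAttempt(username={self.username!r}, "
            f"password=***, ip_address={self.ip_address!r})"
        )


# Route this logger to an in-memory buffer so the output can be inspected.
stream = io.StringIO()
logger = logging.getLogger("login_sketch")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False

attempt = LoginAttempt(username="alice", password="hunter2", ip_address="192.168.1.1")
logger.info("Login attempt: %s", attempt)

output = stream.getvalue()
print("hunter2" in output)  # → False
```

Passing the object as a %s argument defers formatting to the logging module, which calls str() on it; for a dataclass without a custom __str__, that falls back to __repr__.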
Mask Strategies
By default, @sensitive masks all sensitive fields with *** (DEFAULT_PLACEHOLDER).
For customized masking, instantiate policy value objects and wire them into SensitiveDecorator:
from sensitivity_mixin import Sensitivity, SensitiveDecorator
from sensitivity_mixin.decorators.classes.secret_aware import SecretPolicyAware
from sensitivity_mixin.decorators.classes.compliance import Compliance

secret_policy = SecretPolicyAware(
    compliance=Compliance.NONE,
    detection_hints=("api_key", "secret", "token"),
    placeholder="***REDACTED***",
)
decorator = SensitiveDecorator(policies=((Sensitivity.SECRET, secret_policy),))

@decorator
@dataclass(frozen=True, slots=True)
class Config:
    api_key: str = field(metadata={"sensitivity": "SECRET"})

repr(Config(api_key="sk-123"))
# → "Config(api_key=***REDACTED***)"
See docs/apps/decorators/policies.md for policy customization details.
Migration from Earlier Versions
v0.3.0 introduces a taxonomy-driven architecture with broadened sensitivity classification.
Earlier versions (v0.1, v0.2)
from pii_aware_mixin import phi_aware

@phi_aware
@dataclass(frozen=True, slots=True)
class User:
    id: int
    api_token: str = field(metadata={"phi": True})
v0.3.0 (current)
from sensitivity_mixin import sensitive, classify

@sensitive
@dataclass(frozen=True, slots=True)
class User:
    id: int
    api_token: str = field(metadata={"sensitivity": "SECRET"})
    email: str = field(metadata={"sensitivity": "PII"})

user = User(id=1, api_token="sk-123456", email="alice@example.com")
profile = classify(user)  # introspect sensitivity
Key improvements:
- Broadened taxonomy: PHI, PII, PCI, SECRET (not just phi)
- Classification introspection: classify() returns a SensitivityProfile
- Per-class policy value objects for specialized masking customization
- Foundation for compliance-aware field governance
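During migration it can help to find fields that still carry the old {"phi": True} metadata and have no "sensitivity" tag yet. The helper below is a hypothetical standard-library sketch for that audit; it is not a tool shipped with the library.

```python
from dataclasses import dataclass, field, fields


def legacy_tagged_fields(cls):
    """Names of fields using the old 'phi' key without a 'sensitivity' tag (hypothetical)."""
    return tuple(
        f.name
        for f in fields(cls)
        if f.metadata.get("phi") and "sensitivity" not in f.metadata
    )


@dataclass(frozen=True)
class HalfMigratedUser:
    id: int
    api_token: str = field(metadata={"phi": True})          # still legacy
    email: str = field(metadata={"sensitivity": "PII"})     # already migrated


print(legacy_tagged_fields(HalfMigratedUser))
# → ('api_token',)
```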
Design Principles
- Decorator-based: Simple, non-intrusive. Works on plain frozen dataclasses.
- Taxonomy-driven: Classify sensitivity at the field level: PHI, PII, PCI, or SECRET.
- Introspectable: classify() exposes field-level sensitivity for compliance audits.
- Type-safe: Works with frozen dataclasses, slots, type hints.
- Zero-cost: Minimal introspection overhead at decoration time.
- Canonical: Compatible with "no mixin inheritance on data DTOs" pattern.
License
Apache 2.0 — see LICENSE file.
Contributing
This library is maintained by James Ekhator. Contributions welcome via pull requests.
See Also
- dataclasses — Python standard library
- frozen dataclasses — immutable, hashable
- slots — memory-efficient (Python 3.10+)