Sensitive-data classification and masking for Python frozen dataclasses

sensitivity-mixin

Decorator-based sensitivity classification and masking for Python frozen dataclasses.

Accidentally logging sensitive data such as API tokens, passwords, session IDs, PII, healthcare data (PHI), and credit card numbers is a common source of security incidents and compliance violations. sensitivity-mixin addresses this with a lightweight @sensitive decorator and a taxonomy-based classifier that automatically mask tagged fields in repr output, and therefore in any log line that formats the object itself.

Why?

When you log a dataclass instance or its repr, sensitive fields leak unless you explicitly redact them everywhere:

import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)

@dataclass(frozen=True)
class APICredentials:
    user_id: int
    api_token: str

creds = APICredentials(user_id=1, api_token="sk-abc123xyz")
logger.info("Creds: %s", creds)  # logs: "Creds: APICredentials(user_id=1, api_token='sk-abc123xyz')"
                                  # OOPS! Token is exposed.

This library reduces that to one line per field: tag each sensitive field in its field() metadata, apply the @sensitive decorator to the class, and let the classifier introspect and mask the tagged fields automatically.

from dataclasses import dataclass, field
from sensitivity_mixin import sensitive, classify

@sensitive
@dataclass(frozen=True, slots=True)
class APICredentials:
    user_id: int
    api_token: str = field(metadata={"sensitivity": "SECRET"})

creds = APICredentials(user_id=1, api_token="sk-abc123xyz")

# Introspect sensitivity:
profile = classify(creds)
# → SensitivityProfile(classes=(('api_token', Sensitivity.SECRET),))

# Safe for reprs / tracebacks:
logger.error("Error: %s", repr(creds))
# → "APICredentials(user_id=1, api_token=***)"

Installation

pip install sensitivity-mixin

or with uv:

uv add sensitivity-mixin

Requires Python 3.11+.

Quick Start

1. Import the decorator and classifier

from dataclasses import dataclass, field
from sensitivity_mixin import sensitive, classify, Sensitivity

2. Decorate and mark sensitive fields

Apply the @sensitive decorator to a frozen dataclass and tag each sensitive field with a class from the sensitivity taxonomy:

@sensitive
@dataclass(frozen=True, slots=True)
class User:
    id: int
    api_token: str = field(metadata={"sensitivity": "SECRET"})
    email: str = field(metadata={"sensitivity": "PII"})
    ssn: str = field(metadata={"sensitivity": "PHI"})
    name: str

Supported sensitivity tags: "PHI" (healthcare data), "PII" (personal info), "PCI" (payment card data), "SECRET" (credentials/tokens), or omit for non-sensitive.

3. Use in your code

user = User(
    id=1,
    api_token="sk-123456",
    email="alice@example.com",
    ssn="123-45-6789",
    name="Alice"
)

# Introspect sensitivity:
profile = classify(user)
print(profile.has(Sensitivity.SECRET))  # → True
print(profile.fields_of(Sensitivity.PII))  # → ('email',)

# Masked for repr (safe in tracebacks, error messages):
print(repr(user))
# → User(id=1, api_token=***, email=***, ssn=***, name='Alice')

4. Use policy-driven masking (optional)

Wire per-class policies to customize masking placeholders:

from sensitivity_mixin import SensitiveDecorator
from sensitivity_mixin.decorators.classes.secret_aware import SecretPolicyAware
from sensitivity_mixin.decorators.classes.compliance import Compliance

secret_policy = SecretPolicyAware(
    compliance=Compliance.NONE,
    detection_hints=("api_token", "secret", "token", "password"),
    placeholder="[REDACTED]"
)

decorator = SensitiveDecorator(policies=((Sensitivity.SECRET, secret_policy),))

@decorator
@dataclass(frozen=True, slots=True)
class ApiClient:
    client_id: str
    api_token: str = field(metadata={"sensitivity": "SECRET"})

client = ApiClient(client_id="c1", api_token="sk-secret")
print(repr(client))
# → ApiClient(client_id='c1', api_token=[REDACTED])

API Reference

@sensitive decorator

Adds a sensitivity-aware __repr__() to a frozen dataclass. Fields marked with a sensitivity tag in metadata are redacted in repr output using a default placeholder (***).

Usage:

@sensitive
@dataclass(frozen=True, slots=True)
class Patient:
    name: str
    ssn: str = field(metadata={"sensitivity": "PHI"})

repr(Patient(name="Alice", ssn="123"))
# → "Patient(name='Alice', ssn=***)"

With policies:

from dataclasses import dataclass, field
from sensitivity_mixin import Sensitivity, SensitiveDecorator
from sensitivity_mixin.decorators.classes.phi_aware import PhiPolicyAware
from sensitivity_mixin.decorators.classes.compliance import Compliance

phi_policy = PhiPolicyAware(
    compliance=Compliance.HIPAA,
    detection_hints=("ssn", "name"),
    placeholder="[REDACTED]"
)

decorator = SensitiveDecorator(policies=((Sensitivity.PHI, phi_policy),))

@decorator
@dataclass(frozen=True, slots=True)
class Patient:
    name: str = field(metadata={"sensitivity": "PHI"})
    ssn: str = field(metadata={"sensitivity": "PHI"})

repr(Patient(name="Alice", ssn="123"))
# → "Patient(name=[REDACTED], ssn=[REDACTED])"

classify(instance) → SensitivityProfile

Introspects a dataclass and returns a SensitivityProfile documenting all sensitivity-tagged fields.

Use case: Compliance auditing, field-level sensitivity introspection

@sensitive
@dataclass(frozen=True, slots=True)
class Credentials:
    username: str
    password: str = field(metadata={"sensitivity": "SECRET"})
    api_key: str = field(metadata={"sensitivity": "SECRET"})

creds = Credentials(username="alice", password="secret", api_key="sk-123")
profile = classify(creds)
# → SensitivityProfile(classes=(('password', Sensitivity.SECRET), ('api_key', Sensitivity.SECRET)))

# Query the profile:
print(profile.has(Sensitivity.SECRET))  # → True
print(profile.fields_of(Sensitivity.SECRET))  # → ('password', 'api_key')
print(profile.sensitivity_of('username'))  # → None (unclassified)

SensitivityProfile provides:

  • classes: tuple[tuple[str, Sensitivity], ...] — field name → sensitivity mapping
  • has(kind: Sensitivity) → bool — check for a sensitivity class
  • fields_of(kind: Sensitivity) → tuple[str, ...] — get field names of a class
  • sensitivity_of(name: str) → Sensitivity | None — get the class of a field
  • is_empty → bool — True when no fields are tagged
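The query methods listed above can be pictured with a small standard-library sketch. Tag, ProfileSketch, and classify_sketch are hypothetical stand-ins for Sensitivity, SensitivityProfile, and classify(); the library's real classes may be structured differently.

```python
from dataclasses import dataclass, field, fields
from enum import Enum
from typing import Optional


class Tag(Enum):  # hypothetical stand-in for the library's Sensitivity enum
    PHI = "PHI"
    PII = "PII"
    PCI = "PCI"
    SECRET = "SECRET"


@dataclass(frozen=True)
class ProfileSketch:  # hypothetical stand-in for SensitivityProfile
    classes: tuple[tuple[str, Tag], ...]

    def has(self, kind: Tag) -> bool:
        return any(k is kind for _, k in self.classes)

    def fields_of(self, kind: Tag) -> tuple[str, ...]:
        return tuple(name for name, k in self.classes if k is kind)

    def sensitivity_of(self, name: str) -> Optional[Tag]:
        for field_name, kind in self.classes:
            if field_name == name:
                return kind
        return None

    @property
    def is_empty(self) -> bool:
        return not self.classes


def classify_sketch(instance) -> ProfileSketch:
    """Collect (field name, tag) pairs from dataclass field metadata."""
    pairs = []
    for f in fields(instance):
        tag = f.metadata.get("sensitivity")
        if tag is not None:
            pairs.append((f.name, Tag(tag)))  # look the enum member up by value
    return ProfileSketch(classes=tuple(pairs))


@dataclass(frozen=True)
class Credentials:
    username: str
    password: str = field(metadata={"sensitivity": "SECRET"})
    api_key: str = field(metadata={"sensitivity": "SECRET"})


profile = classify_sketch(Credentials("alice", "secret", "sk-123"))
print(profile.fields_of(Tag.SECRET))  # ('password', 'api_key')
```

Storing the pairs as a tuple keeps the profile hashable and preserves field declaration order.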

Field Metadata

Mark a field sensitive by adding metadata={"sensitivity": "<TAG>"} to field():

from dataclasses import dataclass, field
from sensitivity_mixin import sensitive

@sensitive
@dataclass(frozen=True, slots=True)
class Credentials:
    username: str
    password: str = field(metadata={"sensitivity": "SECRET"})
    email: str = field(metadata={"sensitivity": "PII"})
    created_at: str  # not sensitive — no metadata needed

Supported tags:

  • "PHI" — Protected Health Information (healthcare/medical records)
  • "PII" — Personally Identifiable Information (names, emails, SSNs)
  • "PCI" — Payment Card Industry data (credit card numbers)
  • "SECRET" — API tokens, passwords, secrets
  • Omitted — non-sensitive (passes through unmasked)

Any field without metadata or with metadata={"sensitivity": None} is treated as non-sensitive and passes through unmasked.
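Field metadata is plain standard-library dataclasses machinery, so the tags can be inspected without the library. The sketch below uses only dataclasses.fields(); the note field is a hypothetical addition that illustrates the explicit None case.

```python
from dataclasses import dataclass, field, fields


@dataclass(frozen=True)
class Credentials:
    username: str
    password: str = field(metadata={"sensitivity": "SECRET"})
    # Hypothetical field: an explicit None tag reads the same as no tag.
    note: str = field(default="", metadata={"sensitivity": None})


# fields() accepts the class or an instance; metadata is a read-only mapping.
tags = {f.name: f.metadata.get("sensitivity") for f in fields(Credentials)}
print(tags)  # {'username': None, 'password': 'SECRET', 'note': None}
```

Using .get() rather than indexing means a field with no metadata and a field tagged None are handled identically.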

Security Boundary: What This Does and Does NOT Protect

@sensitive is a repr-layer masking tool, not a complete confidentiality boundary. It masks sensitive fields when you log or print the object itself, but does not protect against direct field access or serialization bypass.

Protected (Repr Layer Only)

  • ✓ repr(obj) — sensitive fields masked
  • ✓ str(obj) / print(obj) — uses masked repr
  • ✓ Logging the object: logger.info("Object: %s", obj) — masked
  • ✓ F-string with object: f"Object: {obj}" — masked

NOT Protected (Bypass Methods)

  • ✗ Direct field access: obj.api_token returns the full unmasked value
  • ✗ dataclasses.asdict(obj) returns a dict with full unmasked values
  • ✗ json.dumps(asdict(obj)) contains full unmasked values in JSON
  • ✗ Logging a field directly: logger.info(f"Token: {obj.api_token}") exposes the full value
  • ✗ Attribute introspection: getattr(obj, 'api_token') returns full unmasked value
  • ✗ Untagged fields are not masked — classification is explicit/opt-in

Example: Correct and Incorrect Usage

from dataclasses import dataclass, field
from sensitivity_mixin import sensitive
import logging

logger = logging.getLogger(__name__)

@sensitive
@dataclass(frozen=True, slots=True)
class APIKey:
    name: str
    secret: str = field(metadata={"sensitivity": "SECRET"})

key = APIKey(name="prod-key", secret="sk-abc123xyz")

# ✓ SAFE: logging the object uses masked repr
logger.info("API Key: %s", key)
# Output: "API Key: APIKey(name='prod-key', secret=***)"

# ✗ UNSAFE: logging a field directly bypasses the decorator
logger.warning("Secret: %s", key.secret)
# Output: "Secret: sk-abc123xyz"  ← FULL VALUE EXPOSED!

# ✗ UNSAFE: serializing with asdict() bypasses the decorator
from dataclasses import asdict
logger.debug("Data: %s", asdict(key))
# Output: "Data: {'name': 'prod-key', 'secret': 'sk-abc123xyz'}"  ← FULL VALUES EXPOSED!

Use case: @sensitive is ideal for DTOs at the logging boundary. Keep sensitive fields wrapped in the dataclass; avoid field-level logging. For applications requiring stronger confidentiality guarantees, apply field-level masking at the serialization boundary or use dedicated encryption libraries.

Logging Integration

Pair with standard library logging for clean, safe logs:

import logging
from dataclasses import dataclass, field
from sensitivity_mixin import sensitive

logger = logging.getLogger(__name__)

@sensitive
@dataclass(frozen=True, slots=True)
class LoginAttempt:
    username: str
    password: str = field(metadata={"sensitivity": "SECRET"})
    ip_address: str

def handle_login(username, password, ip):
    attempt = LoginAttempt(username=username, password=password, ip_address=ip)
    logger.info("Login attempt: %s", repr(attempt))
    # Logs: "Login attempt: LoginAttempt(username='alice', password=***, ip_address='192.168.1.1')"
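Passing the object itself to a %s placeholder works as well as passing repr(obj), because logging formats the argument with str(), which falls back to __repr__ when no __str__ is defined. The sketch below demonstrates that with a hand-written stand-in class (MaskedAttempt is hypothetical and only imitates a decorated dataclass) and captures the output in memory.

```python
import io
import logging


class MaskedAttempt:
    """Hypothetical stand-in for a decorated dataclass: only __repr__ is defined."""

    def __repr__(self):
        return "LoginAttempt(username='alice', password=***)"


# Route a dedicated logger to an in-memory stream so the output can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(message)s"))

demo_logger = logging.getLogger("masking_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.addHandler(handler)
demo_logger.propagate = False  # keep the demo output out of the root logger

# %s calls str(obj); with no __str__ defined, that returns the masked repr.
demo_logger.info("Login attempt: %s", MaskedAttempt())
logged = stream.getvalue()
print(logged, end="")  # Login attempt: LoginAttempt(username='alice', password=***)
```

Lazy %-style arguments also mean the repr is only computed when the log level is enabled.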

Mask Strategies

By default, @sensitive masks all sensitive fields with *** (DEFAULT_PLACEHOLDER).

For customized masking, instantiate policy value objects and wire them into SensitiveDecorator:

from dataclasses import dataclass, field
from sensitivity_mixin import Sensitivity, SensitiveDecorator
from sensitivity_mixin.decorators.classes.secret_aware import SecretPolicyAware
from sensitivity_mixin.decorators.classes.compliance import Compliance

secret_policy = SecretPolicyAware(
    compliance=Compliance.NONE,
    detection_hints=("api_key", "secret", "token"),
    placeholder="***REDACTED***"
)

decorator = SensitiveDecorator(policies=((Sensitivity.SECRET, secret_policy),))

@decorator
@dataclass(frozen=True, slots=True)
class Config:
    api_key: str = field(metadata={"sensitivity": "SECRET"})

repr(Config(api_key="sk-123"))
# → "Config(api_key=***REDACTED***)"

See docs/apps/decorators/policies.md for policy customization details.

Migration from Earlier Versions

v0.3.0 introduces a taxonomy-driven architecture with broadened sensitivity classification.

Earlier versions (v0.1, v0.2)

from pii_aware_mixin import phi_aware

@phi_aware
@dataclass(frozen=True, slots=True)
class User:
    id: int
    api_token: str = field(metadata={"phi": True})

v0.3.0 (current)

from sensitivity_mixin import sensitive, classify

@sensitive
@dataclass(frozen=True, slots=True)
class User:
    id: int
    api_token: str = field(metadata={"sensitivity": "SECRET"})
    email: str = field(metadata={"sensitivity": "PII"})

user = User(id=1, api_token="sk-123456", email="alice@example.com")
profile = classify(user)  # introspect sensitivity

Key improvements:

  • Broadened taxonomy: PHI, PII, PCI, SECRET (not just phi)
  • Classification introspection: classify() returns a SensitivityProfile
  • Per-class policy value objects for specialized masking customization
  • Foundation for compliance-aware field governance

Design Principles

  • Decorator-based: Simple, non-intrusive. Works on plain frozen dataclasses.
  • Taxonomy-driven: Classify sensitivity at the field level: PHI, PII, PCI, or SECRET.
  • Introspectable: classify() exposes field-level sensitivity for compliance audits.
  • Type-safe: Works with frozen dataclasses, slots, type hints.
  • Low overhead: Minimal introspection, performed at decoration time.
  • Canonical: Compatible with the "no mixin inheritance on DTOs" pattern.

License

Apache 2.0 — see LICENSE file.

Contributing

This library is maintained by James Ekhator. Contributions welcome via pull requests.

Download files

Source Distribution

sensitivity_mixin-0.3.0.tar.gz (77.2 kB)

Uploaded Source

Built Distribution

sensitivity_mixin-0.3.0-py3-none-any.whl (20.8 kB)

Uploaded Python 3

File details

Details for the file sensitivity_mixin-0.3.0.tar.gz.

File metadata

  • Download URL: sensitivity_mixin-0.3.0.tar.gz
  • Upload date:
  • Size: 77.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.3 (uv publish, on Amazon Linux 2023)

File hashes

Hashes for sensitivity_mixin-0.3.0.tar.gz
Algorithm Hash digest
SHA256 82e61ac98c323f7184f4cf712dd2721bc5b4d42ea3cf241864b4d497fb89c09f
MD5 241858da2b153069e75d36842b26b3f8
BLAKE2b-256 5e0b2dea1ebfd5b92847aac8e192b00cd8e4e15e20e6a3035c7721e75501b7c7

File details

Details for the file sensitivity_mixin-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: sensitivity_mixin-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 20.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.11.3 (uv publish, on Amazon Linux 2023)

File hashes

Hashes for sensitivity_mixin-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 fe1bd8f80477f2e33c653d2045f615634f056dbb7d19d462b2e96ff0e5457006
MD5 07a43a3a55d8bdf40c290627bee303f0
BLAKE2b-256 fd206d49945faee67f3b1fe25f9dde3b6440135e8c2bb5876963af0e84ff3ae5
