Zero-dependency privacy-first log redaction for Python.

LogPrivacy

Status: alpha | Typing: typed | Dependencies: zero | License: MIT

Simple by default, powerful by composition, safe by guidance.

LogPrivacy is a zero-dependency Python library that helps prevent accidental leaks of sensitive data in logs, debug output, strings, dictionaries, files, and standard Python logging records.

What it protects against

LogPrivacy detects and masks:

  • email addresses
  • passwords and API keys
  • bearer tokens and JWTs
  • generic secrets and access tokens
  • sensitive URL query parameters
  • credit card-like values (Luhn-validated)
  • IP addresses (strict mode)
  • phone-like values (strict mode)
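The "Luhn-validated" note on credit card-like values refers to the Luhn checksum, which filters out most random digit runs that merely look like card numbers. The following is a minimal standalone sketch of that checksum. The function name `luhn_valid` is hypothetical and this is not LogPrivacy's own implementation.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # True (a widely used test number)
print(luhn_valid("4111 1111 1111 1112"))  # False
```

A checksum like this lowers false positives, but a 16-digit order ID can still pass it by chance.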

Why LogPrivacy?

Most log leaks are not attacks. They happen because someone prints a payload, logs an exception, debugs a request, or passes a dictionary to a logger.

LogPrivacy gives you small, memorable tools that fit into your existing code without replacing your logging setup:

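from logprivacy import clean, safe_print, get_safe_logger, audit, assert_clean

clean("email=john@example.com password=123456")  # clean a string or structured value
safe_print("token=abc123456789")                 # print safely while debugging
logger = get_safe_logger(__name__)               # use the logging module safely
audit("Authorization: Bearer secret-token")      # check whether sensitive data is present
assert_clean("safe message")                     # raise if sensitive data is found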

Installation

pip install logprivacy

Which API should I use?

| I want to… | Use |
| Clean a string or structured value | clean() |
| Print safely while debugging | safe_print() |
| Use Python's logging module safely | get_safe_logger() |
| Check whether a value contains sensitive data | audit() |
| Fail a test when a log message leaks a secret | assert_clean() |
| Sanitize a URL while keeping safe query params | clean_url() |
| Scan or clean an old log file | scan_file() / clean_file() |

See docs/which-api.md for a longer guide.

Quick start

from logprivacy import clean

message = "Login failed for john@example.com with password=123456"
print(clean(message))
# Login failed for [EMAIL] with password=[SECRET]
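To give a feel for what regex-based masking involves, here is a small standalone sketch using only the standard library. The patterns and the `mask_text` helper are assumptions made for illustration. They are far narrower than a real rule set and do not reflect how `clean()` is implemented.

```python
import re

# Illustrative patterns only; real-world detection needs broader rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
KEY_VALUE_RE = re.compile(r"\b(password|token|secret)=(\S+)", re.IGNORECASE)


def mask_text(text: str) -> str:
    """Replace emails and key=value credentials with placeholders."""
    # Mask credentials first so a secret containing "@" is not seen as an email.
    text = KEY_VALUE_RE.sub(lambda m: f"{m.group(1)}=[SECRET]", text)
    return EMAIL_RE.sub("[EMAIL]", text)


print(mask_text("Login failed for john@example.com with password=123456"))
# Login failed for [EMAIL] with password=[SECRET]
```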

Safe print

from logprivacy import safe_print

safe_print("User john@example.com used token=abc123456789")
# User [EMAIL] used token=[SECRET]

Safe logger

import logging
from logprivacy import get_safe_logger

logging.basicConfig(level=logging.INFO)
logger = get_safe_logger(__name__)

logger.warning("User john@example.com used password=123456")
# WARNING User [EMAIL] used password=[SECRET]
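One common way to redact records without replacing an existing logging setup is a `logging.Filter` that rewrites each record before handlers see it. The sketch below shows that general technique with the standard library only. `RedactingFilter` and `SECRET_RE` are hypothetical names, and this is not necessarily how `get_safe_logger()` works internally.

```python
import io
import logging
import re

SECRET_RE = re.compile(r"\b(password|token)=(\S+)", re.IGNORECASE)


class RedactingFilter(logging.Filter):
    """Hypothetical filter that masks key=value secrets in log records."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the final message first so %-style arguments are covered too.
        message = record.getMessage()
        record.msg = SECRET_RE.sub(lambda m: f"{m.group(1)}=[SECRET]", message)
        record.args = None  # arguments are already merged into the message
        return True  # keep the record, now redacted


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

demo_logger = logging.getLogger("redaction_demo")
demo_logger.addHandler(handler)
demo_logger.addFilter(RedactingFilter())
demo_logger.propagate = False  # keep the demo output in our own stream

demo_logger.warning("User %s used password=%s", "john", "123456")
print(stream.getvalue(), end="")
# WARNING User john used password=[SECRET]
```

A filter attached to a logger only sees records logged directly on that logger, so attaching it to handlers instead is often the safer choice when child loggers are involved.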

Audit before logging

from logprivacy import audit

report = audit({"password": "123456", "email": "john@example.com"})
print(report.safe)        # False
print(report.risk_level)  # "high"
print(report.categories)  # ("credential", "email")
print(report.describe())

Fail tests when logs are unsafe

from logprivacy import assert_clean

def test_log_message_has_no_sensitive_data():
    assert_clean("operation finished successfully")

def test_response_dict_is_safe():
    assert_clean({"username": "john", "status": "active"})

If sensitive data is found, assert_clean() raises LogPrivacyAssertionError.

Clean structured data

from logprivacy import clean

payload = {
    "email": "john@example.com",
    "password": "123456",
    "status": "failed",
}

print(clean(payload))
# {'email': '[EMAIL]', 'password': '[SECRET]', 'status': 'failed'}
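Structured payloads are usually nested, so masking them tends to mean walking dictionaries and lists recursively. The following sketch shows one way to do that by key name. `SENSITIVE_KEYS` and `mask_structure` are assumptions for illustration, not LogPrivacy's rule set or API.

```python
from typing import Any

# Assumed key names for this sketch; not the library's actual rule set.
SENSITIVE_KEYS = {"password", "token", "api_key", "secret"}


def mask_structure(value: Any) -> Any:
    """Return a copy of `value` with sensitive dictionary entries masked."""
    if isinstance(value, dict):
        return {
            key: "[SECRET]"
            if isinstance(key, str) and key.lower() in SENSITIVE_KEYS
            else mask_structure(item)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return type(value)(mask_structure(item) for item in value)
    return value  # scalars pass through unchanged in this sketch


payload = {"user": {"name": "john", "password": "123456"}, "tags": ["a", "b"]}
print(mask_structure(payload))
# {'user': {'name': 'john', 'password': '[SECRET]'}, 'tags': ['a', 'b']}
```

Returning a copy rather than mutating the input keeps the original payload usable by the rest of the program.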

Clean URLs without losing useful context

from logprivacy import clean_url

url = "https://api.example.com/users?page=1&token=abc123&email=john@example.com"
print(clean_url(url))
# https://api.example.com/users?page=1&token=[SECRET]&email=[EMAIL]
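Keeping harmless parameters such as `page` while masking sensitive ones can be approximated with `urllib.parse`. This is a rough sketch under stated assumptions: `SENSITIVE_PARAMS` and `mask_query` are hypothetical names, only parameter names are inspected, and it is not the implementation of `clean_url()`.

```python
from urllib.parse import parse_qsl, urlsplit, urlunsplit

SENSITIVE_PARAMS = {"token", "password", "api_key"}  # assumed names


def mask_query(url: str) -> str:
    """Mask values of sensitive query parameters, keep everything else."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Values are decoded and not re-encoded: fine for display in a log line,
    # but the result should not be reused as a request URL.
    masked = [
        f"{key}=[SECRET]" if key.lower() in SENSITIVE_PARAMS else f"{key}={value}"
        for key, value in pairs
    ]
    return urlunsplit(parts._replace(query="&".join(masked)))


print(mask_query("https://api.example.com/users?page=1&token=abc123"))
# https://api.example.com/users?page=1&token=[SECRET]
```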

Masking styles

from logprivacy import Cleaner, CleanerPolicy

Cleaner(CleanerPolicy.default(masking="placeholder"))  # [EMAIL], [SECRET]
Cleaner(CleanerPolicy.default(masking="partial"))      # j***@example.com
Cleaner(CleanerPolicy.default(masking="hash"))         # [EMAIL:855f96e9]

| Input | Placeholder | Partial | Hash |
| john@example.com | [EMAIL] | j***@example.com | [EMAIL:855f96e9] |
| sk_live_abcdef123456 | [SECRET] | sk_l********3456 | [SECRET:3c6e0b8a] |
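The three styles trade readability against correlation: a placeholder hides everything, a partial mask keeps a recognizable edge, and a hash lets equal inputs be matched across log lines. The sketch below illustrates the idea with hypothetical helpers. The star count and digest scheme are assumptions, so its output will not necessarily match the values in the table above.

```python
import hashlib


def mask_placeholder(label: str) -> str:
    """Replace the value entirely with a category label."""
    return f"[{label}]"


def mask_partial(value: str, keep: int = 4) -> str:
    """Keep the first and last `keep` characters, star out the middle."""
    if len(value) <= keep * 2:
        return "*" * len(value)  # too short to reveal any edge safely
    return value[:keep] + "*" * (len(value) - keep * 2) + value[-keep:]


def mask_hash(label: str, value: str) -> str:
    """Append a short SHA-256 prefix so equal inputs get equal masks."""
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()[:8]
    return f"[{label}:{digest}]"


print(mask_placeholder("EMAIL"))             # [EMAIL]
print(mask_partial("sk_live_abcdef123456"))  # sk_l************3456
```

An unsalted hash of a low-entropy value, such as a short password, can be guessed by brute force, so hash masking is best reserved for values that are hard to enumerate.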

Policies

| Policy | What it detects | When to use |
| CleanerPolicy.default() | Email, credentials, tokens, secrets, URLs, credit cards | General-purpose log cleaning |
| CleanerPolicy.strict() | Everything above + IP addresses + phone numbers | Sensitive environments |
| CleanerPolicy.web() | URLs, credentials, tokens, secrets | HTTP access log cleaning |
| CleanerPolicy.production() | Strict + raises on high-risk categories | CI / production safety gates |

See docs/policies.md for details.

Clean log files

from logprivacy import scan_file, clean_file

report = scan_file("app.log")
print(report.describe())

clean_file("app.log", output="app.clean.log")
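Old log files can be large, so cleaning them is typically done as a streaming, line-by-line copy rather than by loading the whole file. The following standalone sketch shows that pattern. `mask_log_file` and `SECRET_RE` are hypothetical, and nothing here describes how `clean_file()` is actually built.

```python
import re
import tempfile
from pathlib import Path

SECRET_RE = re.compile(r"\b(password|token)=(\S+)", re.IGNORECASE)


def mask_log_file(source: Path, destination: Path) -> int:
    """Copy `source` to `destination`, masking secrets; return lines changed."""
    changed = 0
    with source.open(encoding="utf-8", errors="replace") as src, \
            destination.open("w", encoding="utf-8") as dst:
        for line in src:  # stream line by line to keep memory use flat
            cleaned = SECRET_RE.sub(lambda m: f"{m.group(1)}=[SECRET]", line)
            changed += cleaned != line
            dst.write(cleaned)
    return changed


# Demo on a throwaway file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    src_path = Path(tmp) / "app.log"
    dst_path = Path(tmp) / "app.clean.log"
    src_path.write_text("login ok\nretry with password=123456\n", encoding="utf-8")
    changed_lines = mask_log_file(src_path, dst_path)
    cleaned_text = dst_path.read_text(encoding="utf-8")

print(changed_lines)  # 1
```

Writing to a separate output file, as in the `clean_file("app.log", output="app.clean.log")` call above, leaves the original available until the cleaned copy has been checked.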

CLI

python -m logprivacy scan app.log
python -m logprivacy clean app.log --output app.clean.log
python -m logprivacy text "email=john@example.com password=123"

Security disclaimer

LogPrivacy reduces accidental sensitive-data exposure in logs. It is a safety net, not a DLP system. Regex-based detection can have false positives and false negatives. You should avoid logging sensitive data in the first place. LogPrivacy does not replace secret management, encryption, access control, or legal privacy review.

See docs/security-model.md for the full security model.
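The point about false positives and false negatives is easy to reproduce. The snippet below uses a deliberately naive phone-like pattern, invented for this demonstration and unrelated to LogPrivacy's rules, to show one miss in each direction.

```python
import re

# A deliberately naive phone-like pattern, used only for this demonstration.
PHONE_LIKE_RE = re.compile(r"\b\d{3}[-. ]?\d{3}[-. ]?\d{4}\b")

false_positive = "order id 555-123-4567 shipped"       # not a phone number
false_negative = "call me at five five five 123 4567"  # a phone number, partly spelled out

print(bool(PHONE_LIKE_RE.search(false_positive)))  # True: flagged, but harmless
print(bool(PHONE_LIKE_RE.search(false_negative)))  # False: sensitive, but missed
```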

Development

python3 -m venv .venv
source .venv/bin/activate

python3 -m pip install --upgrade pip
python3 -m pip install -e ".[dev]"

Run all checks:

./scripts/ci.sh

Or individually:

python3 -m ruff format .
python3 -m ruff check .
python3 -m mypy src
python3 -m pytest
python3 -m build

Design goals

  1. Simple things should be simple.
  2. Advanced usage should be composable.
  3. Logs should be safe by default.
  4. Rules should be modular and easy to test.
  5. Output should be predictable and explainable.
  6. Runtime dependencies should stay at zero.
  7. Users should not need to replace their whole logging setup.
  8. Security guidance should be honest: this reduces risk, it does not replace DLP.

Status

Early development. Public API may still evolve. See CHANGELOG.md.

