ShareClean

Local-first Python CLI for sanitizing logs, stack traces, config snippets, and terminal output before you paste them into GitHub issues, support tickets, Slack, or AI chats.

ShareClean detects common sensitive values, replaces only the risky portion, and reports safe metadata without storing or printing the original secret. It makes no network calls and sends no telemetry.

Try the interactive browser playground to see the redaction rules before installing.

Figure: ShareClean browser playground demo. The playground is shown for illustration; real workflows run locally through the CLI.

Install

With pipx:

pipx install shareclean

From a local checkout:

python -m pip install -e .

Run without installing from the repository root:

python -m shareclean --help

Quick Start

shareclean app.log
shareclean app.log --output app.cleaned.log
shareclean app.log --report
shareclean app.log --report --report-format json
shareclean app.log --check
shareclean app.log --check --fail-on severity:high
shareclean app.log --check --fail-on category:token,rule:SC004
shareclean app.log --check --ignore-for-check category:pii_email
shareclean app.log --private-ip
shareclean app.log --phone
shareclean app.log --custom-pattern "EMP-[0-9]{6}"

--check exits 1 only for findings selected by the check policy and never writes sanitized text to stdout.

Configured fail_on and ignore_for_check policies from config files, profiles, or environment variables apply only in --check mode. Normal sanitization still redacts and reports findings, but those policies do not change the exit decision unless --check is present.
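A rough sketch of how a check policy could turn findings into an exit decision. The selector kinds (severity, category, rule) come from the commands above. The helper names, the exact-match semantics, and the rule that any finding fails when no fail_on selector is given are assumptions made for illustration, not ShareClean's actual implementation.

```python
# Illustrative only: helper names and matching semantics are assumptions.
FIELD_FOR_KIND = {"severity": "severity", "category": "category", "rule": "rule_id"}


def parse_selectors(spec):
    """Split 'category:token,rule:SC004' into (kind, value) pairs."""
    selectors = []
    for item in spec.split(","):
        kind, sep, value = item.strip().partition(":")
        if kind not in FIELD_FOR_KIND or not sep or not value:
            raise ValueError(f"invalid selector: {item!r}")
        selectors.append((kind, value))
    return selectors


def matches(finding, selectors):
    """True if any selector equals the matching finding field (exact match assumed)."""
    return any(finding[FIELD_FOR_KIND[kind]] == value for kind, value in selectors)


def check_exit_code(findings, fail_on="", ignore_for_check=""):
    """Return 1 if the policy selects at least one finding, else 0."""
    ignored = parse_selectors(ignore_for_check) if ignore_for_check else []
    failing = parse_selectors(fail_on) if fail_on else []
    # Assumption: ignored findings are dropped before fail_on is applied.
    selected = [f for f in findings if not matches(f, ignored)]
    if failing:
        selected = [f for f in selected if matches(f, failing)]
    return 1 if selected else 0


findings = [
    {"rule_id": "SC005", "category": "pii_email", "severity": "medium"},
    {"rule_id": "SC002", "category": "token", "severity": "high"},
]
print(check_exit_code(findings, fail_on="severity:high"))
print(check_exit_code(findings[:1], fail_on="severity:high"))
```

The real tool may interpret severity selectors as thresholds rather than exact matches; check the CLI behavior before relying on this model.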

Configuration

ShareClean supports committed project policy in either pyproject.toml or .shareclean.toml.

[tool.shareclean]
redact_email = true
redact_private_ip = false
redact_phone = false
redact_mac_address = false
redaction_label = "[REDACTED]"
profile = "default"
custom_patterns = [
  { name = "Employee ID", pattern = "EMP-[0-9]{6}" },
  { name = "Tenant", pattern = "tenant=(?P<value>[a-z0-9-]+)" },
]

[tool.shareclean.profiles.ci]
redact_email = true
redact_private_ip = true
fail_on = ["severity:high"]

For .shareclean.toml, omit the tool.shareclean prefix:

redact_email = true
redact_private_ip = false
redact_phone = false
redact_mac_address = false
custom_patterns = [
  { name = "Employee ID", pattern = "EMP-[0-9]{6}" },
]

[profiles.ci]
redact_private_ip = true
fail_on = ["severity:high"]

Config location, in lookup order:

  1. --config PATH
  2. Nearest project directory containing .shareclean.toml or a pyproject.toml with [tool.shareclean]
  3. Defaults

Auto-discovery walks upward from the current directory until it reaches the Git root or the filesystem root. It uses only the nearest directory that contains a config and never merges parent configs. If both .shareclean.toml and a ShareClean section in pyproject.toml exist in the selected directory, ShareClean exits with code 2 instead of choosing one.
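The discovery walk described above can be sketched roughly as follows. This is not ShareClean's code: discover_config, has_tool_table, and ConfigConflict are hypothetical names, and the sketch assumes that the Git root is the first directory containing a .git entry and that the Git root itself is still searched.

```python
from pathlib import Path

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # older interpreters: fall back to a text check
    tomllib = None


class ConfigConflict(Exception):
    """Hypothetical error for two config sources in one directory (exit code 2)."""


def has_tool_table(pyproject):
    """True if pyproject.toml defines a [tool.shareclean] table."""
    text = pyproject.read_text(encoding="utf-8")
    if tomllib is None:
        return "[tool.shareclean" in text
    return "shareclean" in tomllib.loads(text).get("tool", {})


def discover_config(start):
    """Return the nearest config file at or above `start`, or None for defaults."""
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        dotfile = directory / ".shareclean.toml"
        pyproject = directory / "pyproject.toml"
        in_pyproject = pyproject.is_file() and has_tool_table(pyproject)
        if dotfile.is_file() and in_pyproject:
            raise ConfigConflict(f"two ShareClean configs in {directory}")
        if dotfile.is_file():
            return dotfile
        if in_pyproject:
            return pyproject
        if (directory / ".git").exists():  # Git root reached: stop walking
            break
    return None
```

Because the loop returns at the first directory with a config, a parent config is never consulted once a nearer one exists, which matches the "never merges" rule.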

Config precedence, highest first:

  1. CLI flags
  2. Environment variables
  3. Selected profile values
  4. Base project config
  5. Defaults
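One way to picture the precedence order is as a stack of dictionaries where the first layer that defines a key wins. The sketch below uses collections.ChainMap from the standard library; the resolve function and the sample values are illustrative assumptions, not ShareClean internals.

```python
from collections import ChainMap

# Sample values only; the tool's real defaults may differ.
DEFAULTS = {
    "redact_email": True,
    "redact_private_ip": False,
    "redaction_label": "[REDACTED]",
}


def resolve(cli=None, env=None, profile=None, base=None, defaults=None):
    """Layer settings so that earlier sources shadow later ones."""
    layers = [cli, env, profile, base, defaults]
    return dict(ChainMap(*[layer or {} for layer in layers]))


base = {"redact_email": True, "redact_private_ip": False}
profile_ci = {"redact_private_ip": True, "fail_on": ["severity:high"]}
env = {"redaction_label": "<hidden>"}
cli = {"redact_private_ip": False}  # e.g. --no-redact-private-ip

settings = resolve(cli, env, profile_ci, base, DEFAULTS)
print(settings["redact_private_ip"], settings["redaction_label"], settings["fail_on"])
```

For this to work, each layer must contain only the keys that were explicitly set in that source; a CLI flag that was not passed has to be absent rather than False.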

Environment variables:

  • SHARECLEAN_REDACT_EMAIL
  • SHARECLEAN_REDACT_PRIVATE_IP
  • SHARECLEAN_REDACT_PHONE
  • SHARECLEAN_REDACT_MAC_ADDRESS
  • SHARECLEAN_REDACTION_LABEL
  • SHARECLEAN_PROFILE
  • SHARECLEAN_FAIL_ON
  • SHARECLEAN_IGNORE_FOR_CHECK

Boolean environment values accept true, 1, yes, on, false, 0, no, and off.
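A minimal sketch of parsing those boolean values. The function names are hypothetical, and case-insensitive matching with surrounding whitespace stripped is an assumption; the accepted words are the eight listed above.

```python
TRUE_VALUES = {"true", "1", "yes", "on"}
FALSE_VALUES = {"false", "0", "no", "off"}


def parse_bool(raw):
    """Map an environment string to a bool, rejecting anything unrecognized."""
    value = raw.strip().lower()  # assumption: matching ignores case and padding
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"not a boolean value: {raw!r}")


def env_flag(name, environ, default):
    """Read a boolean setting from a mapping such as os.environ."""
    raw = environ.get(name)
    return default if raw is None else parse_bool(raw)


print(env_flag("SHARECLEAN_REDACT_PHONE", {"SHARECLEAN_REDACT_PHONE": "yes"}, False))
```

Raising on unknown words, instead of silently treating them as false, keeps a typo such as "ture" from disabling a redaction rule unnoticed.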

Inspect effective configuration without reading input:

shareclean config show

Detection Rules

Rule ID  Detector                     Category           Severity
SC001    Key-value secret             credential         high
SC002    Bearer token                 token              high
SC003    JWT-like token               token              high
SC004    Connection-string password   connection_string  critical
SC005    Email address                pii_email          medium
SC006    Local user path              pii_path           medium
SC007    Private IP address           internal_network   medium
SC008    PEM private-key block        private_key        critical
SC009    Known provider API token     token              high
SC010    Webhook URL                  token              critical
SC011    URL query secret             credential         high
SC012    Cookie secret                token              high
SC013    CLI secret argument          credential         high
SC014    XML secret element           credential         high
SC015    SAML assertion               token              critical
SC016    Docker or Kubernetes secret  credential         high
SC017    Phone number                 pii_phone          medium
SC018    MAC address                  pii_hardware       low
SC019    Generic sensitive key-value  credential         high

Private IP detection is off by default; enable it with --redact-private-ip, --private-ip, or config. When enabled, it covers private IPv4 and IPv6 addresses.

Provider-aware token detection covers high-confidence shapes such as OpenAI, Anthropic, GitHub, GitLab, Hugging Face, Stripe, Slack, Telegram, SendGrid, AWS access keys, Google API keys, npm, PyPI, Docker Hub, Netlify, DigitalOcean, and Terraform Cloud tokens. Structured key-value detection preserves JSON, YAML, TOML, INI, and environment-file formatting where possible. A generic sensitive-key fallback redacts assigned values when keys contain secret-bearing segments such as password, token, api_key, or client_secret.

Phone and MAC address detection are off by default to avoid noisy matches; enable them with --redact-phone, --phone, --redact-mac-address, or config.

Custom regex rules are reported as CUSTOM001, CUSTOM002, and so on. If a custom regex defines a named group called value, only that group is redacted; otherwise the whole match is replaced.
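The named-group behavior can be illustrated with the standard re module alone. This is a sketch of the technique, not ShareClean's redactor; redact_custom is a hypothetical helper, and the sample inputs are fake.

```python
import re


def redact_custom(text, pattern, label="[REDACTED]"):
    """Replace only the 'value' group when the pattern defines one, else the whole match."""
    regex = re.compile(pattern)
    has_value_group = "value" in regex.groupindex

    def replace(match):
        if has_value_group and match.group("value") is not None:
            start, end = match.span("value")
            whole = match.group(0)
            offset = match.start()  # convert absolute positions to match-relative ones
            return whole[: start - offset] + label + whole[end - offset :]
        return label

    return regex.sub(replace, text)


print(redact_custom("tenant=acme-prod region=eu", r"tenant=(?P<value>[a-z0-9-]+)"))
print(redact_custom("badge EMP-123456 issued", r"EMP-[0-9]{6}"))
```

Keeping the text around the group (here the "tenant=" key) leaves the sanitized output readable while still hiding the sensitive part.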

Programmatic use:

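from shareclean import add_custom_regex
from shareclean.detectors import get_rules
from shareclean.redactor import sanitize

# Extend the built-in rules with a custom pattern; only the named group "value" is redacted.
rules = add_custom_regex(get_rules(), r"employee=(?P<value>EMP-[0-9]{6})")
# Sanitize a string with the combined rule set.
result = sanitize("employee=EMP-123456", rules)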

When detectors overlap on the same text range, ShareClean emits one finding using the highest-severity rule. If severities match, it uses the most specific detector.
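The tie-breaking rule above might be modeled like this. The sketch assumes candidates cover identical spans and that each carries a numeric specificity score; the specificity field and the resolve_overlaps name are hypothetical, since the source does not say how "most specific" is measured.

```python
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def _rank(candidate):
    # Severity first, then the assumed specificity score as the tie-breaker.
    return (SEVERITY_RANK[candidate["severity"]], candidate["specificity"])


def resolve_overlaps(candidates):
    """Keep one candidate per (start, end) span: the highest-ranked one."""
    best = {}
    for cand in candidates:
        span = (cand["start"], cand["end"])
        current = best.get(span)
        if current is None or _rank(cand) > _rank(current):
            best[span] = cand
    return sorted(best.values(), key=lambda c: (c["start"], c["end"]))


candidates = [
    {"rule_id": "SC019", "severity": "high", "specificity": 1, "start": 9, "end": 26},
    {"rule_id": "SC001", "severity": "high", "specificity": 2, "start": 9, "end": 26},
    {"rule_id": "SC004", "severity": "critical", "specificity": 2, "start": 40, "end": 52},
    {"rule_id": "SC001", "severity": "high", "specificity": 2, "start": 40, "end": 52},
]
print([f["rule_id"] for f in resolve_overlaps(candidates)])
```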

JSON Reports

JSON reports use schema version 1.0 and do not include filenames, paths, matched values, hashes, source snippets, or masked previews.

{
  "schema_version": "1.0",
  "source": "file",
  "summary": {
    "findings": 1,
    "by_category": {
      "credential": 1
    },
    "by_severity": {
      "high": 1
    }
  },
  "findings": [
    {
      "rule_id": "SC001",
      "category": "credential",
      "severity": "high",
      "location": {
        "start": {
          "line": 1,
          "column": 10
        },
        "end": {
          "line": 1,
          "column": 27
        }
      },
      "replacement": "[REDACTED]"
    }
  ]
}

Locations are 1-based. End positions are exclusive. Columns count Unicode code points after treating CRLF as one LF newline for location purposes.
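To make the location rules concrete, here is a small sketch that converts a report location back into string offsets. location_to_span is a hypothetical helper and the sample text is fake; the arithmetic follows the rules stated above (1-based positions, exclusive end, CRLF counted as one newline), and Python string indices already count Unicode code points.

```python
def location_to_span(text, location):
    """Return (normalized_text, start_offset, end_offset) for a report location."""
    normalized = text.replace("\r\n", "\n")  # CRLF counts as a single LF
    lines = normalized.split("\n")

    def offset(point):
        # Characters in all earlier lines, plus one newline per line,
        # plus the zero-based column within the target line.
        before = sum(len(line) + 1 for line in lines[: point["line"] - 1])
        return before + point["column"] - 1

    return normalized, offset(location["start"]), offset(location["end"])


location = {
    "start": {"line": 1, "column": 10},
    "end": {"line": 1, "column": 27},
}
text = "api_key: FAKE-VALUE-000000\n"
normalized, start, end = location_to_span(text, location)
print(normalized[start:end])
```

Because the end position is exclusive, the span length is simply end minus start, which is 17 characters for the sample location.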

CLI Reference

usage: shareclean [-h] [--version] [--check] [--output FILE] [--report]
                  [--report-format {text,json}] [--config FILE]
                  [--profile NAME] [--redact-email] [--no-redact-email]
                  [--redact-private-ip] [--private-ip]
                  [--no-redact-private-ip] [--no-private-ip]
                  [--redact-phone] [--phone] [--no-redact-phone]
                  [--no-phone]
                  [--redact-mac-address] [--no-redact-mac-address]
                  [--redaction-label TEXT] [--fail-on SELECTORS]
                  [--ignore-for-check SELECTORS]
                  [--custom-pattern REGEX]
                  [FILE]

--no-email remains as a deprecated alias for --no-redact-email.

Exit codes:

Code  Meaning
0     Completed successfully
1     Selected findings detected in --check mode
2     User, I/O, config, or selector error
3     Unexpected internal error
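A script that wraps the CLI could map these exit codes to named values. The sketch below assumes shareclean is installed and on PATH; ExitCode, run_check, and should_block_share are hypothetical names, and only the --check flag shown earlier is used.

```python
import subprocess
from enum import IntEnum


class ExitCode(IntEnum):
    OK = 0              # completed successfully
    FINDINGS = 1        # selected findings detected in --check mode
    USAGE_ERROR = 2     # user, I/O, config, or selector error
    INTERNAL_ERROR = 3  # unexpected internal error


def run_check(path):
    """Run `shareclean PATH --check` and return its status as an ExitCode.

    Raises ValueError if the process ends with a code outside the table.
    """
    completed = subprocess.run(["shareclean", path, "--check"], check=False)
    return ExitCode(completed.returncode)


def should_block_share(code):
    """Treat anything other than a clean pass as a reason not to share the file."""
    return code is not ExitCode.OK


print(should_block_share(ExitCode.FINDINGS))
```

Blocking on codes 2 and 3 as well as 1 is a deliberate choice in this sketch: if the check itself failed, the file has not been verified as safe to share.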

Safety Model

ShareClean is intentionally local and transparent:

  • No network calls
  • No cloud processing
  • No telemetry
  • No account or API key required
  • Original matched secret values are not stored in findings or reports
  • Input files are never modified in place

Coverage And Limitations

ShareClean is pattern-based. It can miss unusual formats and can redact benign text that resembles a secret. It is not a replacement for repository secret scanners, source-history scanning, or DLP systems.

The test corpus under tests/fixtures/ uses only fake values and is split into generic, cloud, database, CI/CD, SaaS, log, YAML/JSON/env, provider-token, cookie/URL/CLI, and false-positive packs. Bug reports that change detection should add a regression fixture using clearly fake data.

Development

Run the test suite:

python -m unittest discover -s tests -v

Run packaging checks:

python -m compileall -q src tests
python -m build
python -m twine check dist/*

License

ShareClean is released under the MIT License.

Download files

Source Distribution

shareclean-0.3.3.tar.gz (50.3 kB)

Uploaded Source

Built Distribution

shareclean-0.3.3-py3-none-any.whl (25.6 kB)

Uploaded Python 3

File details

Details for the file shareclean-0.3.3.tar.gz.

File metadata

  • Download URL: shareclean-0.3.3.tar.gz
  • Size: 50.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for shareclean-0.3.3.tar.gz
Algorithm    Hash digest
SHA256       63469732cfc09bc1e5abdad1ad899310ad65705386a566722ad854c51633b973
MD5          803d38afb63ef5b8b812c79c20c21ba0
BLAKE2b-256  f663b8c971625c4f6cf73aefa632d5a3f0b783953c60520e4f5cfc31cd055abc

Provenance

The following attestation bundles were made for shareclean-0.3.3.tar.gz:

Publisher: release.yml on OmarH-creator/ShareClean

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file shareclean-0.3.3-py3-none-any.whl.

File metadata

  • Download URL: shareclean-0.3.3-py3-none-any.whl
  • Size: 25.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for shareclean-0.3.3-py3-none-any.whl
Algorithm    Hash digest
SHA256       4fed44b0f0964b24ce3de5618b6c93a02ea1cea0a654a64f2ad4919b83499478
MD5          f0b954a30c9e7e3846e612e3f476bee8
BLAKE2b-256  6c60745bd4aecb2b2384e47b06d4093a2f02772ae3a14ed830e2146c55bfa0d9

Provenance

The following attestation bundles were made for shareclean-0.3.3-py3-none-any.whl:

Publisher: release.yml on OmarH-creator/ShareClean

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
