Unified DLP scanner for SaaS sources — secret detection (trufflehog, gitleaks, native regex) plus PII detection (pleno-anonymize). API-driven content collection from GitHub, GitLab, Bitbucket, Slack, Notion, Confluence, Jira.

pleno-dlp (Python)

Unified DLP scanner for SaaS content — secrets (trufflehog / gitleaks / native regex) and PII (delegating to pleno-anonymize). One plugin model: every source (github, slack, jira, …) and every detector (native, trufflehog, gitleaks, pii) is a connector registered under pleno_dlp.connectors.*, distinguished by ConnectorSpec.role (source or detector). pip install pleno-dlp pulls one wheel exposing one console script (pleno-dlp).

The Go binary in this repo (cmd/pleno-dlp) remains for filesystem-only scans; the Python package is the path forward for SaaS.

Install

uv tool install pleno-dlp
# or
pipx install pleno-dlp

# Add the PII backend (pulls pleno-anonymize):
uv tool install 'pleno-dlp[pii]'

Usage

The CLI is connector-agnostic: knobs flow through the generic --option key=value flag (sources) and --detector-option / -D key=value (detectors). Run pleno-dlp describe <connector> for the accepted keys, types, defaults, and which ones are secrets.

# Discover what's registered
pleno-dlp list                       # everything
pleno-dlp list --role source         # SaaS sources only
pleno-dlp list --role detector       # detection backends only
pleno-dlp describe github
pleno-dlp describe trufflehog

# Secret scan over an entire GitHub org with the default native detector
GITHUB_TOKEN=ghp_... pleno-dlp scan github --option owner=plenoai

# Scan a single repo, only code, with trufflehog verification
pleno-dlp scan github \
    --option owner=plenoai --option repo=pleno-dlp \
    --option resources=code --detector trufflehog

# Issue + PR conversations only, PII detection (requires pleno-anonymize)
pleno-dlp scan github --option owner=plenoai \
    --option resources=issues,prs --detector pii \
    --pii-base-url http://localhost:8000

# SARIF output for GitHub code-scanning ingestion
pleno-dlp scan github --option owner=plenoai \
    --format sarif > findings.sarif

# Slack workspace — same shape, different source connector
pleno-dlp scan slack --token xoxb-... --option include_threads=false
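A file produced with --format sarif can be post-processed with any SARIF reader. The sketch below counts results per rule, assuming only the standard SARIF 2.1.0 layout (runs[].results[].ruleId). The sample log and its rule IDs are made up for illustration and are not actual pleno-dlp output.

```python
import json
from collections import Counter


def summarize_sarif(sarif_text: str) -> Counter:
    """Count results per ruleId across all runs in a SARIF 2.1.0 log."""
    log = json.loads(sarif_text)
    counts: Counter = Counter()
    for run in log.get("runs", []):
        for result in run.get("results", []):
            # ruleId is optional in SARIF, so fall back to a placeholder.
            counts[result.get("ruleId", "<no ruleId>")] += 1
    return counts


# Hypothetical sample log; real findings.sarif content will differ.
SAMPLE = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "example-tool"}},
        "results": [
            {"ruleId": "example-rule-a", "message": {"text": "finding 1"}},
            {"ruleId": "example-rule-a", "message": {"text": "finding 2"}},
            {"ruleId": "example-rule-b", "message": {"text": "finding 3"}},
        ],
    }],
})

if __name__ == "__main__":
    for rule_id, n in summarize_sarif(SAMPLE).most_common():
        print(f"{rule_id}: {n}")
```

To run it against a real scan, replace SAMPLE with the contents of findings.sarif.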

Auth resolution for github: --token → GITHUB_TOKEN env var → gh auth token. Anonymous access works for public content but is rate-limited to 60 req/h. Other source connectors take their token via --token (shorthand for --option token=…) or via --option api_token=… / --option access_token=…, depending on the auth mode (see describe).
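The github fallback chain can be pictured roughly as follows. This is a minimal sketch, not the package's code: resolve_github_token and its parameters are hypothetical names, and the only external command it relies on is the GitHub CLI's gh auth token.

```python
import os
import subprocess
from typing import Callable, Mapping, Optional


def _gh_auth_token() -> Optional[str]:
    """Ask the GitHub CLI for its stored token; None if gh is missing or fails."""
    try:
        proc = subprocess.run(
            ["gh", "auth", "token"],
            capture_output=True, text=True, timeout=10, check=False,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    token = proc.stdout.strip()
    return token if proc.returncode == 0 and token else None


def resolve_github_token(
    cli_token: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    gh_lookup: Callable[[], Optional[str]] = _gh_auth_token,
) -> Optional[str]:
    """Return the first available token: explicit flag, then env var, then gh."""
    env = os.environ if env is None else env
    if cli_token:
        return cli_token
    if env.get("GITHUB_TOKEN"):
        return env["GITHUB_TOKEN"]
    # None here means anonymous access (public content only, rate-limited).
    return gh_lookup()
```

Passing env and gh_lookup as parameters keeps the chain testable without touching the real environment or spawning gh.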

Detectors

Detector    Class   Verifies            System dependency
trufflehog  secret  yes (per-detector)  trufflehog CLI on PATH
gitleaks    secret  no                  gitleaks CLI on PATH
native      secret  no                  none; bundled regex (AWS, GitHub PAT, Slack bot, OpenAI, Anthropic)
pii         PII     n/a                 pleno-anonymize HTTP API (installed via pleno-dlp[pii] extra)
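To give a feel for what regex-only detection without verification looks like, here is a small sketch. The two patterns are commonly cited shapes for AWS access key IDs and classic GitHub personal access tokens. They are assumptions for illustration and are not necessarily the patterns bundled with the native detector. Match and scan_text are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Iterator

# Illustrative patterns only (assumption): not taken from pleno-dlp itself.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github-pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


@dataclass(frozen=True)
class Match:
    rule: str
    start: int
    redacted: str  # first four characters kept, the rest masked


def scan_text(text: str) -> Iterator[Match]:
    """Yield one Match per pattern hit, without verifying the secret is live."""
    for rule, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            secret = m.group(0)
            yield Match(rule, m.start(), secret[:4] + "*" * (len(secret) - 4))


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
    sample = "key=AKIAIOSFODNN7EXAMPLE token=" + "ghp_" + "a" * 36
    for hit in scan_text(sample):
        print(hit)
```

Because nothing is verified against the provider, a hit only means the text has the right shape, which is why the table lists "no" under Verifies for regex-based backends.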

Source connectors

Each source self-describes via a ConnectorSpec (auth modes, resources, options, runtime capabilities). Today: github, gitlab, bitbucket (cloud + server), slack (xoxb / xoxp), notion, confluence (cloud + datacenter), jira (cloud + datacenter). Run pleno-dlp list --role source for the live list and pleno-dlp describe <name> for the option sheet.

Adding a new connector (source or detector)

  1. Create python/src/pleno_dlp/connectors/<name>.py.
  2. Implement the right Protocol:
    • Source: discover, fetch, discover_and_fetch, capabilities, close. Keep one httpx.AsyncClient per instance.
    • Detector: a single async def scan(self, doc: Document) -> AsyncIterator[Finding].
  3. Declare a spec: ClassVar[ConnectorSpec] = ConnectorSpec(...) with the right role (ConnectorRole.SOURCE or ConnectorRole.DETECTOR), name, kind, summary, auth_modes, resources (sources only), options (every __init__ kwarg you want operators to set), and capabilities (sources only).
  4. End the module with registry.register("<name>", <Class>).
  5. Wire the import in pleno_dlp/connectors/__init__.py.
  6. Add fixtures + tests under python/tests/connectors/test_<name>.py using httpx.MockTransport (or stdlib mocks for offline detectors).

Once the spec lands, pleno-dlp scan <source> --detector <det>, pleno-dlp list, and pleno-dlp describe all work without touching the CLI.
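The steps above might look roughly like this for a detector. Everything in the "stand-ins" section is an assumption made so the sketch runs on its own: the real ConnectorSpec, ConnectorRole, Document, Finding, and registry come from pleno_dlp and will have different fields. ToyPasswordDetector and collect are hypothetical names.

```python
import asyncio
import re
from dataclasses import dataclass
from enum import Enum
from typing import AsyncIterator, ClassVar, Dict, List


# --- Stand-ins (assumptions) for types the real package provides ---
class ConnectorRole(Enum):
    SOURCE = "source"
    DETECTOR = "detector"


@dataclass(frozen=True)
class ConnectorSpec:
    role: ConnectorRole
    name: str
    kind: str
    summary: str


@dataclass(frozen=True)
class Document:
    uri: str
    text: str


@dataclass(frozen=True)
class Finding:
    rule: str
    uri: str
    offset: int


class Registry:
    def __init__(self) -> None:
        self._classes: Dict[str, type] = {}

    def register(self, name: str, cls: type) -> None:
        self._classes[name] = cls

    def get(self, name: str) -> type:
        return self._classes[name]


registry = Registry()


# --- The connector itself: spec, scan(), and registration (steps 2 to 4) ---
class ToyPasswordDetector:
    spec: ClassVar[ConnectorSpec] = ConnectorSpec(
        role=ConnectorRole.DETECTOR,
        name="toy_password",
        kind="secret",
        summary="Flags hard-coded password assignments (toy rule).",
    )
    _pattern = re.compile(r"password\s*=\s*\S+", re.IGNORECASE)

    async def scan(self, doc: Document) -> AsyncIterator[Finding]:
        # An async generator satisfies the AsyncIterator[Finding] return type.
        for m in self._pattern.finditer(doc.text):
            yield Finding("hardcoded-password", doc.uri, m.start())


registry.register("toy_password", ToyPasswordDetector)


async def collect(detector: ToyPasswordDetector, doc: Document) -> List[Finding]:
    return [f async for f in detector.scan(doc)]


if __name__ == "__main__":
    det = registry.get("toy_password")()
    print(asyncio.run(collect(det, Document("mem://example", "a\npassword = hunter2"))))
```

The point of the shape is that the class carries its own description (spec) and the registry only maps a name to a class, so a generic CLI can list, describe, and instantiate connectors without connector-specific code.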

Release

A py-vX.Y.Z tag triggers PyPI trusted publishing via GitHub Actions.

Download files

Source Distribution

pleno_dlp-0.10.0.tar.gz (78.4 kB)

Uploaded Source

Built Distribution

pleno_dlp-0.10.0-py3-none-any.whl (98.2 kB)

Uploaded Python 3

File details

Details for the file pleno_dlp-0.10.0.tar.gz.

File metadata

  • Download URL: pleno_dlp-0.10.0.tar.gz
  • Size: 78.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for pleno_dlp-0.10.0.tar.gz
Algorithm Hash digest
SHA256 0bf5db8ac7957c9eaf3aa92033ba752961193e6f2bcdad00748f7db18d1609fb
MD5 c7e68a93d2399b6ad46162a576fac19b
BLAKE2b-256 765f5869e8cd3f77c21d2e5d5efb2baadd425451f90143b40076ae05d14c8818

Provenance

The following attestation bundles were made for pleno_dlp-0.10.0.tar.gz:

Publisher: release-py.yml on plenoai/pleno-dlp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file pleno_dlp-0.10.0-py3-none-any.whl.

File metadata

  • Download URL: pleno_dlp-0.10.0-py3-none-any.whl
  • Size: 98.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for pleno_dlp-0.10.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e608dcdac173806aa9a064b4811a90c6a812e4106c0fae95c0072218c2428860
MD5 4a1c74c761374742b0a4007937f28c5f
BLAKE2b-256 33b30da8a15e849927da3ca948cdcd9332739cdd184276699067e7832c09eae9

Provenance

The following attestation bundles were made for pleno_dlp-0.10.0-py3-none-any.whl:

Publisher: release-py.yml on plenoai/pleno-dlp
