Unified DLP scanner for SaaS sources — secret detection (trufflehog, gitleaks, native regex) plus PII detection (pleno-anonymize). API-driven content collection from GitHub, GitLab, Bitbucket, Slack, Notion, Confluence, Jira.

Project description

pleno-dlp (Python)

Unified DLP scanner for SaaS content — secrets (trufflehog / gitleaks / native regex) and PII (delegating to pleno-anonymize).

A connector models a SaaS provider — github, gitlab, bitbucket, slack, notion, confluence, jira — and walks its content through the provider's API. A detection engine turns text into findings; the four built-ins (native, trufflehog, gitleaks, pii) live under pleno_dlp.engines and apply equally to any connector's output. Every connector self-describes via ConnectorSpec.capabilities (SOURCE, optional VERIFY / REVOKE for secret-lifecycle ops).

pip install pleno-dlp pulls one wheel exposing one console script (pleno-dlp). The Go binary in this repo (cmd/pleno-dlp) remains for filesystem-only scans; the Python package is the path forward for SaaS.

Install

uv tool install pleno-dlp
# or
pipx install pleno-dlp

# Add the PII backend (pulls pleno-anonymize):
uv tool install 'pleno-dlp[pii]'

Usage

The CLI is connector-agnostic: knobs flow through the generic --option key=value flag, and the detection engine is picked with --engine. Run pleno-dlp describe <connector> for the accepted keys, types, defaults, and which ones are secrets.

# Discover what's registered
pleno-dlp list                              # connectors + engines
pleno-dlp list --capability verify          # connectors with VERIFY
pleno-dlp describe github

# Secret scan over an entire GitHub org with the default native engine
GITHUB_TOKEN=ghp_... pleno-dlp scan github --option owner=plenoai

# Scan a single repo, only code, with trufflehog verification
pleno-dlp scan github \
    --option owner=plenoai --option repo=pleno-dlp \
    --option resources=code --engine trufflehog

# Issue + PR conversations only, PII detection (requires pleno-anonymize)
pleno-dlp scan github --option owner=plenoai \
    --option resources=issues,prs --engine pii \
    --pii-base-url http://localhost:8000

# SARIF output for GitHub code-scanning ingestion
pleno-dlp scan github --option owner=plenoai \
    --format sarif > findings.sarif

# Slack workspace — same shape, different source connector
pleno-dlp scan slack --token xoxb-... --option include_threads=false

# Confirm a leaked github PAT is still live
pleno-dlp verify github --token ghp_…
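A SARIF file is plain JSON, so the findings.sarif produced above can be summarised before it is uploaded. The sketch below assumes only the standard SARIF 2.1.0 layout (runs[].results[].ruleId). The rule identifiers pleno-dlp actually emits are not documented here, so the inline sample uses made-up IDs, and count_results_by_rule is a hypothetical helper, not part of the package.

```python
import json
from collections import Counter
from pathlib import Path


def count_results_by_rule(sarif: dict) -> Counter:
    """Count SARIF results per ruleId across every run in the log."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # ruleId is optional in SARIF, so fall back to a placeholder.
            counts[result.get("ruleId", "<no ruleId>")] += 1
    return counts


def load_sarif(path: str) -> dict:
    """Read a SARIF log from disk, e.g. load_sarif("findings.sarif")."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Inline sample in the SARIF 2.1.0 shape; tool name and rule IDs are illustrative.
sample = {
    "version": "2.1.0",
    "runs": [
        {
            "tool": {"driver": {"name": "example-scanner"}},
            "results": [
                {"ruleId": "example-rule-a", "message": {"text": "first hit"}},
                {"ruleId": "example-rule-b", "message": {"text": "second hit"}},
                {"ruleId": "example-rule-a", "message": {"text": "third hit"}},
            ],
        }
    ],
}

summary = count_results_by_rule(sample)
for rule_id, n in summary.most_common():
    print(f"{rule_id}: {n}")
```

A summary like this can serve as a quick CI gate (fail the job when any rule has hits) without parsing the scanner's text output.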

Auth resolution for github: --token → GITHUB_TOKEN env var → gh auth token. Anonymous access works for public content but is rate-limited to 60 requests per hour. Other source connectors take their token via --token (shorthand for --option token=…) or via --option api_token=… / --option access_token=…, depending on the auth mode (see pleno-dlp describe <connector>).
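The resolution order for github can be pictured as a simple fallback chain. The following is a minimal sketch of that idea, not the package's implementation: resolve_github_token and its parameters are hypothetical names, and the only external command used is the GitHub CLI's gh auth token.

```python
import os
import shutil
import subprocess
from typing import Mapping, Optional


def resolve_github_token(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    use_gh_cli: bool = True,
) -> Optional[str]:
    """Return the first token found: explicit value, GITHUB_TOKEN, then `gh auth token`.

    Returns None when nothing is found, which corresponds to anonymous access.
    """
    env = os.environ if env is None else env

    # 1. An explicit --token value wins.
    if explicit:
        return explicit

    # 2. Fall back to the GITHUB_TOKEN environment variable.
    from_env = env.get("GITHUB_TOKEN")
    if from_env:
        return from_env

    # 3. Finally, ask the GitHub CLI if it is installed and logged in.
    if use_gh_cli and shutil.which("gh"):
        try:
            proc = subprocess.run(
                ["gh", "auth", "token"],
                capture_output=True,
                text=True,
                timeout=10,
                check=False,
            )
        except (OSError, subprocess.TimeoutExpired):
            return None
        token = proc.stdout.strip()
        if proc.returncode == 0 and token:
            return token

    return None
```

Passing env and use_gh_cli explicitly keeps the function testable without touching the real environment or spawning a subprocess.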

Detection engines

Engines are not connectors: they are stateless utilities that turn a Document.text into Findings. Pick one with --engine.

| Engine | Class | Verifies | System dependency |
| --- | --- | --- | --- |
| trufflehog | secret | yes (per-detector) | trufflehog CLI on PATH |
| gitleaks | secret | no | gitleaks CLI on PATH |
| native | secret | no | none (bundled regex: AWS, GitHub PAT, Slack bot, OpenAI, Anthropic) |
| pii | PII | n/a | pleno-anonymize HTTP API (installed via pleno-dlp[pii] extra) |
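To illustrate what a regex-based engine does with a document's text, here is a minimal, self-contained sketch. It does not use pleno_dlp: DemoFinding, RULES, and scan_text are hypothetical stand-ins, and the two patterns are the commonly cited formats for AWS access key IDs and classic GitHub PATs, not necessarily the rules bundled with the native engine.

```python
import re
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class DemoFinding:
    """Stand-in for a finding; the real Finding type may carry other fields."""
    rule: str
    line: int
    redacted: str


# Illustrative patterns only (assumed, not copied from the package).
RULES = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github-pat-classic": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def scan_text(text: str) -> Iterator[DemoFinding]:
    """Yield one finding per regex match, with the secret mostly masked."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in RULES.items():
            for match in pattern.finditer(line):
                secret = match.group(0)
                # Keep the 4-character prefix so the finding is recognisable
                # without reproducing the full secret in reports.
                masked = secret[:4] + "*" * (len(secret) - 4)
                yield DemoFinding(rule=rule, line=lineno, redacted=masked)


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
sample = "region = us-east-1\naws_key = AKIAIOSFODNN7EXAMPLE\n"
findings = list(scan_text(sample))
for f in findings:
    print(f.rule, f.line, f.redacted)
```

Because the function depends only on the text it is given, the same scan applies to content from any connector, which is the property the engine/connector split relies on.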

Source connectors

Each connector self-describes via a ConnectorSpec (auth modes, resources, options, runtime capabilities). Today: github, gitlab, bitbucket (cloud + server), slack (xoxb / xoxp), notion, confluence (cloud + datacenter), jira (cloud + datacenter). Run pleno-dlp list for the live list and pleno-dlp describe <name> for the option sheet.

Capabilities

A connector advertises one or more capabilities:

  • Capability.SOURCE — implements the Connector Protocol (discover / fetch / capabilities). Every shipped connector has this.
  • Capability.VERIFY — implements the Verifier Protocol (verify(secret) -> VerifyResult). Today: github (probes GET /user).
  • Capability.REVOKE — implements the Revoker Protocol (revoke(secret) -> RevokeResult). Reserved; no built-in connector has this yet — providers without a programmatic revoke endpoint should leave it unset and document the manual rotation flow.

pleno-dlp verify <connector> --token … exercises VERIFY. Exit codes: 0 = LIVE, 1 = REVOKED, 2 = UNKNOWN/unsupported.
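The exit codes above make the verify command easy to consume from scripts. As a rough sketch, the mapping can be modelled as below. TokenStatus, EXIT_CODES, and classify_user_probe are hypothetical names, and treating HTTP 200 from GET /user as LIVE, 401 as REVOKED, and everything else as UNKNOWN is an assumption about how such a probe could be interpreted, not a description of the package's logic.

```python
from enum import Enum


class TokenStatus(Enum):
    LIVE = "live"
    REVOKED = "revoked"
    UNKNOWN = "unknown"


# Exit codes as documented: 0 = LIVE, 1 = REVOKED, 2 = UNKNOWN/unsupported.
EXIT_CODES = {
    TokenStatus.LIVE: 0,
    TokenStatus.REVOKED: 1,
    TokenStatus.UNKNOWN: 2,
}


def classify_user_probe(http_status: int) -> TokenStatus:
    """Interpret the status code of an authenticated GET /user request (assumed mapping)."""
    if http_status == 200:
        return TokenStatus.LIVE
    if http_status == 401:
        return TokenStatus.REVOKED
    # Rate limits, server errors, and network problems say nothing about the token.
    return TokenStatus.UNKNOWN


def status_from_exit_code(code: int) -> TokenStatus:
    """Reverse lookup for callers that shell out to the CLI."""
    for status, exit_code in EXIT_CODES.items():
        if exit_code == code:
            return status
    return TokenStatus.UNKNOWN
```

Keeping UNKNOWN distinct from REVOKED matters for triage: a token that could not be checked should still be treated as potentially live.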

Adding a new connector

  1. Create python/src/pleno_dlp/connectors/<name>.py.
  2. Implement at least the Connector Protocol (discover, fetch, discover_and_fetch, capabilities, close). Keep one httpx.AsyncClient per instance. Optionally add verify(secret) / revoke(secret) for lifecycle support.
  3. Declare a spec: ClassVar[ConnectorSpec] = ConnectorSpec(...) with name, kind, summary, capabilities (frozenset of the Capability values you implement; defaults to {SOURCE}), auth_modes, resources, options (every __init__ kwarg you want operators to set), and runtime (a Capabilities describing incremental / streaming / concurrency).
  4. End the module with registry.register("<name>", <Class>).
  5. Wire the import in pleno_dlp/connectors/__init__.py.
  6. Add fixtures + tests under python/tests/connectors/test_<name>.py using httpx.MockTransport.

Once the spec lands, pleno-dlp scan <name> --engine <engine>, pleno-dlp verify <name>, pleno-dlp list, and pleno-dlp describe all work without touching the CLI.
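The steps above can be sketched end to end with local stand-ins. Nothing below imports pleno_dlp: Capability, ConnectorSpec, Document, and register are simplified, hypothetical re-creations of the names mentioned in this section, the method signatures are assumptions, and an in-memory dict replaces the httpx.AsyncClient calls a real connector would make. Treat it as a shape to compare against the real Protocols, not as the actual API.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import AsyncIterator, ClassVar, Dict, FrozenSet, Tuple


class Capability(Enum):
    SOURCE = "source"
    VERIFY = "verify"
    REVOKE = "revoke"


@dataclass(frozen=True)
class ConnectorSpec:
    """Simplified stand-in; the real spec also has kind, auth_modes, resources, runtime."""
    name: str
    summary: str
    capabilities: FrozenSet[Capability] = frozenset({Capability.SOURCE})
    options: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Document:
    ref: str
    text: str


REGISTRY: Dict[str, type] = {}


def register(name: str, cls: type) -> None:
    """Stand-in for registry.register(name, cls)."""
    REGISTRY[name] = cls


class ExampleWikiConnector:
    # Step 3: declare the spec as a class-level constant.
    spec: ClassVar[ConnectorSpec] = ConnectorSpec(
        name="examplewiki",
        summary="Hypothetical wiki connector used for illustration",
        options=("base_url", "token"),
    )

    def __init__(self, base_url: str, token: str = "") -> None:
        self.base_url = base_url
        self.token = token
        # In-memory pages replace real API calls in this sketch.
        self._pages = {"page/1": "hello", "page/2": "world"}

    def capabilities(self) -> FrozenSet[Capability]:
        return self.spec.capabilities

    async def discover(self) -> AsyncIterator[str]:
        for ref in self._pages:
            yield ref

    async def fetch(self, ref: str) -> Document:
        return Document(ref=ref, text=self._pages[ref])

    async def discover_and_fetch(self) -> AsyncIterator[Document]:
        async for ref in self.discover():
            yield await self.fetch(ref)

    async def close(self) -> None:
        # A real connector would close its HTTP client here.
        return None


# Step 4: register at module import time.
register("examplewiki", ExampleWikiConnector)


async def collect(name: str) -> list:
    connector = REGISTRY[name](base_url="https://wiki.example.invalid")
    try:
        return [doc async for doc in connector.discover_and_fetch()]
    finally:
        await connector.close()


docs = asyncio.run(collect("examplewiki"))
print([d.ref for d in docs])
```

Looking the class up by name through the registry is what lets a generic CLI drive a new connector without any connector-specific code paths.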

Release

A py-vX.Y.Z tag triggers PyPI trusted publishing via GitHub Actions.

Download files

Source Distribution

pleno_dlp-0.11.0.tar.gz (81.7 kB view details)

Uploaded Source

Built Distribution

pleno_dlp-0.11.0-py3-none-any.whl (102.4 kB view details)

Uploaded Python 3

File details

Details for the file pleno_dlp-0.11.0.tar.gz.

File metadata

  • Download URL: pleno_dlp-0.11.0.tar.gz
  • Size: 81.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for pleno_dlp-0.11.0.tar.gz
Algorithm Hash digest
SHA256 7af78fca7687b23033834dc61336ccec0ba012e6179d98f331094a0a3dd00f31
MD5 22fc43ee9f82c0d326ee89a395d775ed
BLAKE2b-256 f1686465a16a84b7b3b22316bf5e72eedc8ed4170ed5cbaca0325b0c14fbe42c

Provenance

The following attestation bundles were made for pleno_dlp-0.11.0.tar.gz:

Publisher: release-py.yml on plenoai/pleno-dlp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file pleno_dlp-0.11.0-py3-none-any.whl.

File metadata

  • Download URL: pleno_dlp-0.11.0-py3-none-any.whl
  • Size: 102.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for pleno_dlp-0.11.0-py3-none-any.whl
Algorithm Hash digest
SHA256 655fa43fae919dc5145c2a2a1625f72212c248d64303acb1e54b9d98b42db4f8
MD5 a04e56bc62f81a373ff06d1b34b0b676
BLAKE2b-256 629fd0c48ed85bd224e024aefec61cb6920d4394824d43772ba53f56f7767d77

Provenance

The following attestation bundles were made for pleno_dlp-0.11.0-py3-none-any.whl:

Publisher: release-py.yml on plenoai/pleno-dlp

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
