Unified DLP scanner for SaaS sources — secret detection (trufflehog, gitleaks, native regex) plus PII detection (pleno-anonymize). Bundles saas-retriever for API-driven content collection: GitHub, GitLab, Bitbucket, Slack, Notion, Confluence, Jira.

pleno-dlp (Python)

Unified DLP scanner for SaaS content: secrets (trufflehog / gitleaks / native regex) and PII (delegated to pleno-anonymize). The SaaS source layer, formerly the standalone saas-retriever package, has been vendored in-tree since 0.7.0. pip install pleno-dlp pulls a single wheel that exposes both the pleno-dlp and saas-retriever console scripts and lets you write from saas_retriever import … without any extra dependency.

The Go binary in this repo (cmd/pleno-dlp) remains for filesystem-only scans; the Python package is the path forward for SaaS.

Install

uv tool install pleno-dlp
# or
pipx install pleno-dlp

# Add the PII backend (pulls pleno-anonymize):
uv tool install 'pleno-dlp[pii]'

Usage

The CLI is connector-agnostic: connector knobs flow through the generic --option key=value flag. Run pleno-dlp describe <connector> to see the accepted keys, types, defaults, and which ones are secrets.
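To make the key=value convention concrete, here is a minimal sketch of how repeated --option flags could be collected into a dictionary. It is not pleno-dlp's actual parser: parse_options and the argparse wiring are hypothetical and only illustrate the shape of the input.

```python
import argparse


def parse_options(pairs: list) -> dict:
    """Turn repeated key=value strings into a dict (hypothetical helper)."""
    options = {}
    for pair in pairs:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        options[key.strip()] = value
    return options


# Assumed wiring: a repeatable flag that accumulates raw strings.
parser = argparse.ArgumentParser(prog="example-scan")
parser.add_argument("--option", action="append", default=[])

args = parser.parse_args(
    ["--option", "owner=plenoai", "--option", "resources=issues,prs"]
)
opts = parse_options(args.option)

# Comma-separated values are split by whoever consumes the option.
resources = opts["resources"].split(",")
```

Keeping the values as plain strings leaves type coercion to the connector, which fits a CLI that does not know each connector's options in advance.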

# Discover what each connector takes
pleno-dlp list-connectors
pleno-dlp describe github

# Secret scan over an entire GitHub org (code + issues + PRs across every repo)
GITHUB_TOKEN=ghp_... pleno-dlp scan github --option owner=plenoai

# Scan a single repo, only code, with trufflehog verification
pleno-dlp scan github \
    --option owner=plenoai --option repo=pleno-dlp \
    --option resources=code --backend trufflehog

# Issue + PR conversations only, PII detection (requires pleno-anonymize)
pleno-dlp scan github --option owner=plenoai \
    --option resources=issues,prs --backend pii

# SARIF output for GitHub code-scanning ingestion
pleno-dlp scan github --option owner=plenoai \
    --format sarif > findings.sarif

# Slack workspace — the same shape, different connector
pleno-dlp scan slack --token xoxb-... --option include_threads=false
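Once a scan has written findings.sarif, the report can be post-processed with a few lines of Python. The sketch below assumes the output follows the standard SARIF 2.1.0 layout (runs, then results, each with a ruleId); the rule names in the sample are invented for illustration and are not pleno-dlp's real rule IDs.

```python
import json
from collections import Counter


def summarize_sarif(text: str) -> Counter:
    """Count findings per ruleId in a SARIF document given as a JSON string."""
    doc = json.loads(text)
    counts = Counter()
    for run in doc.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "<no ruleId>")] += 1
    return counts


# A tiny hand-written SARIF document standing in for real scanner output.
sample = json.dumps(
    {
        "version": "2.1.0",
        "runs": [
            {
                "tool": {"driver": {"name": "example-scanner"}},
                "results": [
                    {"ruleId": "example-aws-key", "message": {"text": "found"}},
                    {"ruleId": "example-aws-key", "message": {"text": "found"}},
                    {"ruleId": "example-email", "message": {"text": "found"}},
                ],
            }
        ],
    }
)

summary = summarize_sarif(sample)

# With a real report, read the file instead:
#   summary = summarize_sarif(open("findings.sarif", encoding="utf-8").read())
```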

Auth resolution for github: --token → GITHUB_TOKEN env var → gh auth token. Anonymous access works for public content but is rate-limited to 60 req/h. Other connectors take their token via --token (shorthand for --option token=…) or via --option api_token=… / --option access_token=…, depending on the auth mode (see describe).
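The fallback order for github can be sketched as a small resolver. This is an illustration of the chain described above, not the package's own code: resolve_github_token and gh_auth_token are hypothetical names, and the only external call is the GitHub CLI's gh auth token command.

```python
from __future__ import annotations

import os
import subprocess
from collections.abc import Callable, Mapping


def gh_auth_token() -> str | None:
    """Ask the GitHub CLI for its stored token; None if gh is absent or logged out."""
    try:
        proc = subprocess.run(
            ["gh", "auth", "token"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    token = proc.stdout.strip()
    return token if proc.returncode == 0 and token else None


def resolve_github_token(
    cli_token: str | None = None,
    env: Mapping[str, str] = os.environ,
    gh: Callable[[], str | None] = gh_auth_token,
) -> str | None:
    """Return the first available token, or None for anonymous access."""
    if cli_token:  # 1. explicit --token wins
        return cli_token
    env_token = env.get("GITHUB_TOKEN")
    if env_token:  # 2. then the environment variable
        return env_token
    return gh()  # 3. finally the GitHub CLI, which may also yield None
```

Passing env and gh as parameters keeps the resolver testable without touching the real environment or spawning a subprocess. A None result corresponds to the anonymous, rate-limited mode.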

Backends

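| Backend    | Class  | Verifies           | System dep                                                        |
|------------|--------|--------------------|-------------------------------------------------------------------|
| trufflehog | secret | yes (per-detector) | trufflehog CLI on PATH                                            |
| gitleaks   | secret | no                 | gitleaks CLI on PATH                                              |
| native     | secret | no                 | none (bundled regex: AWS, GitHub PAT, Slack bot, OpenAI, Anthropic) |
| pii        | PII    | n/a                | pleno-anonymize (installed via pleno-dlp[pii] extra)              |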
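To give a feel for what a regex-only backend like native does, here is a minimal sketch of pattern-based secret detection. The patterns are assumptions based on commonly documented token prefixes, not the rules bundled with pleno-dlp, and as the table notes for this backend class, a match is never verified against the issuing service.

```python
import re

# Illustrative patterns only; the real bundled rules may differ.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github-pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "slack-bot-token": re.compile(r"\bxoxb-[0-9A-Za-z-]{10,}"),
}


def scan_text(text: str) -> list:
    """Return (rule, line_number, match) tuples for every pattern hit."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append((rule, lineno, match.group(0)))
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
sample = "key = AKIAIOSFODNN7EXAMPLE\nnothing here\ntoken: ghp_" + "a" * 36
findings = scan_text(sample)
```

Because nothing is verified, a scanner of this kind trades precision for zero system dependencies. A verifying backend such as trufflehog is the better choice when false positives are costly.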

Connectors

Each connector self-describes via a ConnectorSpec (auth modes, resources, options, runtime capabilities). Today: github, gitlab, bitbucket (cloud + server), slack (xoxb / xoxp), notion, confluence (cloud + datacenter), jira (cloud + datacenter). Run pleno-dlp list-connectors for the live list and pleno-dlp describe <name> for the option sheet.

Adding a new SaaS connector

  1. Create python/src/saas_retriever/connectors/<name>.py.
  2. Implement the Connector protocol (discover, fetch, discover_and_fetch, capabilities, close). Keep one httpx.AsyncClient per instance.
  3. Declare a spec: ClassVar[ConnectorSpec] = ConnectorSpec(...) with name, kind, summary, auth_modes, resources, options (every __init__ kwarg you want operators to set), and capabilities. The registry rejects registration without a matching spec.
  4. End the module with registry.register("<name>", <Class>).
  5. Wire the import in connectors/__init__.py so import saas_retriever populates the registry.
  6. Add fixtures + tests under python/tests/saas_retriever/test_<name>.py using httpx.MockTransport.

Once the spec lands, pleno-dlp scan <name> and pleno-dlp describe <name> work without touching the CLI.
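The spec-and-registry steps above can be sketched with stand-in types. Everything below is hypothetical: ConnectorSpec, Registry, and ExampleWikiConnector are simplified stand-ins written for this example, not the real saas_retriever classes, and the field types are guesses based only on the field names listed in step 3.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import ClassVar


@dataclass(frozen=True)
class ConnectorSpec:  # stand-in, not the real saas_retriever class
    name: str
    kind: str
    summary: str
    auth_modes: tuple = ()
    resources: tuple = ()
    options: tuple = ()
    capabilities: tuple = ()


class Registry:  # stand-in registry that enforces "a matching spec"
    def __init__(self) -> None:
        self._connectors: dict = {}

    def register(self, name: str, cls: type) -> None:
        spec = getattr(cls, "spec", None)
        if not isinstance(spec, ConnectorSpec) or spec.name != name:
            raise ValueError(f"connector {name!r} has no matching spec")
        self._connectors[name] = cls

    def get(self, name: str) -> type:
        return self._connectors[name]


registry = Registry()


class ExampleWikiConnector:
    # Protocol methods (discover, fetch, ...) are omitted from this sketch.
    spec: ClassVar[ConnectorSpec] = ConnectorSpec(
        name="examplewiki",
        kind="wiki",  # assumed value; real kinds are not listed in this README
        summary="Pages from a hypothetical wiki service",
        auth_modes=("token",),
        resources=("pages",),
        options=("base_url", "token"),
    )

    def __init__(self, base_url: str, token: str | None = None) -> None:
        self.base_url = base_url
        self.token = token


# Mirrors step 4: registration happens at the end of the module.
registry.register("examplewiki", ExampleWikiConnector)
```

Validating the spec at registration time means a connector with missing or mismatched metadata fails at import rather than in the middle of a scan.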

Release

Pushing a py-vX.Y.Z tag triggers PyPI trusted publishing via GitHub Actions.

Download files

Source Distribution

pleno_dlp-0.8.0.tar.gz (79.4 kB)

Built Distribution

pleno_dlp-0.8.0-py3-none-any.whl (100.8 kB)
