Unified DLP scanner for SaaS sources — secret detection (trufflehog, gitleaks, native regex) plus PII detection (pleno-anonymize). API-driven content collection from GitHub, GitLab, Bitbucket, Slack, Notion, Confluence, Jira.
pleno-dlp (Python)
Unified DLP scanner for SaaS content — secrets (trufflehog / gitleaks / native regex) and PII (delegating to pleno-anonymize).
A connector models a SaaS provider — github, gitlab, bitbucket,
slack, notion, confluence, jira — and walks its content through the
provider's API. A detection engine turns text into findings; the
four built-ins (native, trufflehog, gitleaks, pii) live under
pleno_dlp.engines and apply equally to any connector's output.
Every connector self-describes via ConnectorSpec.capabilities
(SOURCE, optional VERIFY / REVOKE for secret-lifecycle ops).
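The split between connectors and engines can be pictured with a small sketch. Everything below is hypothetical: the `Document` and `Finding` field names, the `Engine` protocol, and the toy `KeywordEngine` are illustrative stand-ins, not the actual classes in pleno_dlp. The point is only that an engine sees text and never needs to know which connector produced it.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Protocol


@dataclass(frozen=True)
class Document:
    """One unit of fetched content (hypothetical shape)."""
    source: str   # connector name, e.g. "github"
    locator: str  # where the text came from, e.g. a file path or message id
    text: str


@dataclass(frozen=True)
class Finding:
    """One detection result (hypothetical shape)."""
    engine: str
    locator: str
    snippet: str


class Engine(Protocol):
    """Anything with a name and a scan(doc) method counts as an engine here."""
    name: str

    def scan(self, doc: Document) -> list[Finding]: ...


class KeywordEngine:
    """Toy engine: flags every line that contains the word 'password'."""
    name = "keyword"

    def scan(self, doc: Document) -> list[Finding]:
        return [
            Finding(self.name, doc.locator, line.strip())
            for line in doc.text.splitlines()
            if "password" in line.lower()
        ]


def scan_all(docs: Iterable[Document], engine: Engine) -> Iterator[Finding]:
    """Apply one engine to documents from any mix of connectors."""
    for doc in docs:
        yield from engine.scan(doc)


docs = [
    Document("github", "plenoai/pleno-dlp:README.md", "hello\npassword = hunter2"),
    Document("slack", "C123/1700000000.0001", "nothing to see"),
]
findings = list(scan_all(docs, KeywordEngine()))
for f in findings:
    print(f.engine, f.locator, f.snippet)
```

Because `scan_all` depends only on the `Document` shape, swapping the source or the engine does not change the loop.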
pip install pleno-dlp pulls one wheel exposing one console script
(pleno-dlp). The Go binary in this repo (cmd/pleno-dlp) remains
for filesystem-only scans; the Python package is the path forward
for SaaS.
Install
uv tool install pleno-dlp
# or
pipx install pleno-dlp
# Add the PII backend (pulls pleno-anonymize):
uv tool install 'pleno-dlp[pii]'
Usage
The CLI is connector-agnostic: knobs flow through the generic
--option key=value flag, and the detection engine is picked with
--engine. Run pleno-dlp describe <connector> for the accepted
keys, types, defaults, and which ones are secrets.
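As a rough illustration of how a generic, repeatable --option key=value flag can feed connector settings, here is a minimal argparse sketch. The `parse_options` helper and the `demo-scan` parser are assumptions made for this example and do not reproduce the real CLI's internals; only the flag names and the native default come from the text above.

```python
import argparse


def parse_options(pairs: list[str]) -> dict[str, str]:
    """Turn ["owner=plenoai", ...] into a dict, splitting on the first '='."""
    options: dict[str, str] = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        options[key] = value
    return options


# Hypothetical parser: one positional connector name plus generic knobs.
parser = argparse.ArgumentParser(prog="demo-scan")
parser.add_argument("connector")
parser.add_argument("--option", action="append", default=[], metavar="KEY=VALUE")
parser.add_argument("--engine", default="native")

args = parser.parse_args(
    ["github", "--option", "owner=plenoai",
     "--option", "resources=issues,prs", "--engine", "pii"]
)
options = parse_options(args.option)
print(args.connector, args.engine, options)
```

Keeping the values as raw strings at this stage leaves type coercion to whatever the connector's option sheet declares.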
# Discover what's registered
pleno-dlp list # connectors + engines
pleno-dlp list --capability verify # connectors with VERIFY
pleno-dlp describe github
# Secret scan over an entire GitHub org with the default native engine
GITHUB_TOKEN=ghp_... pleno-dlp scan github --option owner=plenoai
# Scan a single repo, only code, with trufflehog verification
pleno-dlp scan github \
--option owner=plenoai --option repo=pleno-dlp \
--option resources=code --engine trufflehog
# Issue + PR conversations only, PII detection (requires pleno-anonymize)
pleno-dlp scan github --option owner=plenoai \
--option resources=issues,prs --engine pii \
--pii-base-url http://localhost:8000
# SARIF output for GitHub code-scanning ingestion
pleno-dlp scan github --option owner=plenoai \
--format sarif > findings.sarif
# Slack workspace — same shape, different source connector
pleno-dlp scan slack --token xoxb-... --option include_threads=false
# Confirm a leaked github PAT is still live
pleno-dlp verify github --token ghp_…
Auth resolution for github: --token → GITHUB_TOKEN env var →
gh auth token. Anonymous access works for public content but is
rate-limited to 60 req/h. Other source connectors take their token via
--token (shorthand for --option token=…) or via --option api_token=… /
--option access_token=… depending on the auth mode (see
pleno-dlp describe <connector>).
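The github fallback chain described above could be sketched as follows. The function names are hypothetical and the package's real resolution code may differ; the only external call is the GitHub CLI's `gh auth token` command, which is treated as optional.

```python
import os
import subprocess
from typing import Callable, Mapping, Optional


def gh_auth_token() -> Optional[str]:
    """Ask the GitHub CLI for its stored token; None if gh is missing or logged out."""
    try:
        proc = subprocess.run(
            ["gh", "auth", "token"],
            capture_output=True, text=True, timeout=10, check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    token = proc.stdout.strip()
    return token if proc.returncode == 0 and token else None


def resolve_github_token(
    cli_token: Optional[str],
    env: Mapping[str, str] = os.environ,
    gh_fallback: Callable[[], Optional[str]] = gh_auth_token,
) -> Optional[str]:
    """Return the first available token: --token, then GITHUB_TOKEN, then gh.

    A None result means the caller falls back to anonymous access.
    """
    if cli_token:
        return cli_token
    env_token = env.get("GITHUB_TOKEN")
    if env_token:
        return env_token
    return gh_fallback()


# The environment and the gh lookup are injected here so the order is easy to see.
print(resolve_github_token(None, {"GITHUB_TOKEN": "from-env"}, lambda: "from-gh"))
```

Passing the environment and the CLI lookup as parameters keeps the precedence logic testable without touching real credentials.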
Detection engines
Engines are not connectors; they are stateless utilities that turn a
Document.text into Findings. Pick one with --engine.
| Engine | Class | Verifies | System dep |
|---|---|---|---|
| trufflehog | secret | yes (per-detector) | trufflehog CLI on PATH |
| gitleaks | secret | no | gitleaks CLI on PATH |
| native | secret | no | none — bundled regex (AWS, GitHub PAT, Slack bot, OpenAI, Anthropic) |
| pii | PII | n/a | pleno-anonymize HTTP API (installed via pleno-dlp[pii] extra) |
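To give a feel for what a bundled-regex engine does, here is a minimal sketch. The three patterns are commonly used approximations chosen for this example; they are assumptions, not the rules shipped in the native engine, and the `Match` shape and `redact` helper are likewise hypothetical.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only; real rule sets are usually stricter and larger.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github-pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "slack-bot-token": re.compile(r"\bxoxb-[0-9A-Za-z-]{10,}"),
}


@dataclass(frozen=True)
class Match:
    rule: str
    line: int      # 1-based line number within the scanned text
    redacted: str  # secret with everything after the prefix masked


def redact(secret: str, keep: int = 4) -> str:
    """Keep the first few characters so the finding stays recognisable."""
    return secret[:keep] + "*" * (len(secret) - keep)


def scan_text(text: str) -> list[Match]:
    """Run every pattern over every line and collect redacted matches."""
    matches = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            for m in pattern.finditer(line):
                matches.append(Match(rule, lineno, redact(m.group())))
    return matches


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
sample = "aws_key = AKIAIOSFODNN7EXAMPLE\ntoken: ghp_" + "a" * 36 + "\nno secrets here"
results = scan_text(sample)
for r in results:
    print(r.rule, r.line, r.redacted)
```

A pattern match alone says nothing about whether a credential is still valid, which is why the table distinguishes engines that verify from those that do not.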
Source connectors
Each connector self-describes via a ConnectorSpec (auth modes,
resources, options, runtime capabilities). Today: github, gitlab,
bitbucket (cloud + server), slack (xoxb / xoxp), notion,
confluence (cloud + datacenter), jira (cloud + datacenter).
Run pleno-dlp list for the live list and pleno-dlp describe <name>
for the option sheet.
Capabilities
A connector advertises one or more capabilities:
- Capability.SOURCE: implements the Connector Protocol (discover / fetch / capabilities). Every shipped connector has this.
- Capability.VERIFY: implements the Verifier Protocol (verify(secret) -> VerifyResult). Today: github (probes GET /user).
- Capability.REVOKE: implements the Revoker Protocol (revoke(secret) -> RevokeResult). Reserved; no built-in connector has this yet. Providers without a programmatic revoke endpoint should leave it unset and document the manual rotation flow.
pleno-dlp verify <connector> --token … exercises VERIFY. Exit
codes: 0 = LIVE, 1 = REVOKED, 2 = UNKNOWN/unsupported.
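The exit-code contract lends itself to a small lookup. In the sketch below the `VerifyStatus` enum and the HTTP-status mapping are assumptions made for illustration: treating 200 as LIVE and 401 as REVOKED is one plausible reading of a GET /user probe, and the real verifier may classify responses differently.

```python
from enum import Enum


class VerifyStatus(Enum):
    """Hypothetical status enum mirroring the three documented outcomes."""
    LIVE = "live"
    REVOKED = "revoked"
    UNKNOWN = "unknown"


# Exit codes as documented: 0 = LIVE, 1 = REVOKED, 2 = UNKNOWN/unsupported.
EXIT_CODES = {
    VerifyStatus.LIVE: 0,
    VerifyStatus.REVOKED: 1,
    VerifyStatus.UNKNOWN: 2,
}


def classify_user_probe(http_status: int) -> VerifyStatus:
    """Assumed mapping from the probe's HTTP status to a verification outcome."""
    if http_status == 200:
        return VerifyStatus.LIVE
    if http_status == 401:
        return VerifyStatus.REVOKED
    # Rate limits, server errors, and anything else are left undecided.
    return VerifyStatus.UNKNOWN


def exit_code_for(http_status: int) -> int:
    return EXIT_CODES[classify_user_probe(http_status)]


for status in (200, 401, 403, 500):
    print(status, classify_user_probe(status).name, exit_code_for(status))
```

Keeping UNKNOWN distinct from REVOKED matters in automation: a script should not treat a rate-limited probe as proof that a leaked token is dead.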
Adding a new connector
- Create python/src/pleno_dlp/connectors/<name>.py.
- Implement at least the Connector Protocol (discover, fetch, discover_and_fetch, capabilities, close). Keep one httpx.AsyncClient per instance. Optionally add verify(secret) / revoke(secret) for lifecycle support.
- Declare a spec: ClassVar[ConnectorSpec] = ConnectorSpec(...) with name, kind, summary, capabilities (frozenset of the Capability values you implement; defaults to {SOURCE}), auth_modes, resources, options (every __init__ kwarg you want operators to set), and runtime (a Capabilities describing incremental / streaming / concurrency).
- End the module with registry.register("<name>", <Class>).
- Wire the import in pleno_dlp/connectors/__init__.py.
- Add fixtures + tests under python/tests/connectors/test_<name>.py using httpx.MockTransport.
Once the spec lands, pleno-dlp scan <name> --engine <engine>,
pleno-dlp verify <name>, pleno-dlp list, and
pleno-dlp describe all work without touching the CLI.
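The steps above can be sketched in a self-contained form. To stay runnable without the package installed, the example defines its own stand-ins: `Capability`, `ConnectorSpec`, and `Registry` here are simplified, hypothetical versions with far fewer fields than the real ones, the async method signatures are assumed, and the `wiki` connector reads from an in-memory dict where a real connector would hold an httpx.AsyncClient.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import AsyncIterator, ClassVar


class Capability(Enum):
    SOURCE = "source"
    VERIFY = "verify"
    REVOKE = "revoke"


@dataclass(frozen=True)
class ConnectorSpec:
    """Trimmed stand-in: the real spec also carries kind, auth_modes, resources, runtime."""
    name: str
    summary: str
    capabilities: frozenset = frozenset({Capability.SOURCE})
    options: tuple = ()


class Registry:
    """Stand-in registry keyed by connector name."""

    def __init__(self) -> None:
        self._connectors: dict = {}

    def register(self, name: str, cls: type) -> None:
        if name in self._connectors:
            raise ValueError(f"connector {name!r} already registered")
        self._connectors[name] = cls

    def with_capability(self, cap: Capability) -> list:
        return sorted(n for n, c in self._connectors.items() if cap in c.spec.capabilities)


registry = Registry()


class WikiConnector:
    """Hypothetical connector; an in-memory dict stands in for the provider's API."""

    spec: ClassVar[ConnectorSpec] = ConnectorSpec(
        name="wiki", summary="Example wiki source", options=("base_url",),
    )

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url
        self._pages = {"Home": "welcome", "Ops": "token rotation notes"}

    async def discover(self) -> AsyncIterator[str]:
        for title in self._pages:
            yield title

    async def fetch(self, ref: str) -> str:
        return self._pages[ref]

    async def close(self) -> None:
        pass  # a real connector would close its HTTP client here


registry.register("wiki", WikiConnector)


async def collect() -> dict:
    conn = WikiConnector("https://wiki.example.invalid")
    try:
        return {ref: await conn.fetch(ref) async for ref in conn.discover()}
    finally:
        await conn.close()


pages = asyncio.run(collect())
print(pages, registry.with_capability(Capability.SOURCE))
```

Because the spec is a class attribute, a registry can answer capability queries without instantiating a connector or needing credentials.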
Release
Pushing a py-vX.Y.Z tag triggers PyPI trusted publishing via GitHub Actions.