Unified DLP scanner for SaaS sources — secret detection (trufflehog, gitleaks, native regex) plus PII detection (pleno-anonymize). API-driven content collection from GitHub, GitLab, Bitbucket, Slack, Notion, Confluence, Jira.
pleno-dlp (Python)
Unified DLP scanner for SaaS content — secrets (trufflehog / gitleaks / native regex) and PII (delegating to pleno-anonymize).
A connector models a SaaS provider — github, gitlab, bitbucket,
slack, notion, confluence, jira — and owns the full lifecycle: walks
content through the provider's API, detects leaks in that content,
and (optionally) verifies / revokes credentials. Detection happens
inside the connector; the engine choice (native, trufflehog,
gitleaks, pii) is a per-connector option, not a separate plugin.
Every connector self-describes via ConnectorSpec.capabilities
(SOURCE + DETECT baseline, optional VERIFY / REVOKE).
pip install pleno-dlp pulls one wheel exposing one console script
(pleno-dlp). The Go binary in this repo (cmd/pleno-dlp) remains
for filesystem-only scans; the Python package is the path forward
for SaaS.
Install
uv tool install pleno-dlp
# or
pipx install pleno-dlp
# Add the PII backend (pulls pleno-anonymize):
uv tool install 'pleno-dlp[pii]'
Usage
The CLI is connector-agnostic: knobs flow through the generic
--option key=value flag, and the detection engine is picked with
--engine. Run pleno-dlp describe <connector> for the accepted
keys, types, defaults, and which ones are secrets.
# Discover what's registered
pleno-dlp list # connectors + engines
pleno-dlp list --capability verify # connectors with VERIFY
pleno-dlp describe github
# Secret scan over an entire GitHub org with the default native engine
GITHUB_TOKEN=ghp_... pleno-dlp scan github --option owner=plenoai
# Scan a single repo, only code, with trufflehog verification
pleno-dlp scan github \
--option owner=plenoai --option repo=pleno-dlp \
--option resources=code --engine trufflehog
# Issue + PR conversations only, PII detection (requires pleno-anonymize)
pleno-dlp scan github --option owner=plenoai \
--option resources=issues,prs --engine pii
# SARIF output for GitHub code-scanning ingestion
pleno-dlp scan github --option owner=plenoai \
--format sarif > findings.sarif
# Slack workspace — same shape, different source connector
pleno-dlp scan slack --token xoxb-... --option include_threads=false
# Confirm a leaked github PAT is still live
pleno-dlp verify github --token ghp_…
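The SARIF command above writes a findings.sarif file that can also be summarized locally. The following is a minimal sketch, assuming the output follows the standard SARIF 2.1.0 layout (runs → results → ruleId). The rule IDs in the sample document are hypothetical and not taken from pleno-dlp.

```python
import json
from collections import Counter
from pathlib import Path


def load_sarif(path: str) -> dict:
    """Read a SARIF file such as findings.sarif into a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def count_by_rule(sarif: dict) -> Counter:
    """Count results per ruleId across every run in the document."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "<no ruleId>")] += 1
    return counts


# Tiny hand-written document so the sketch runs without a real scan.
# The rule IDs here are made up for illustration.
SAMPLE = {
    "version": "2.1.0",
    "runs": [
        {
            "tool": {"driver": {"name": "example-scanner"}},
            "results": [
                {"ruleId": "github-pat", "message": {"text": "possible token"}},
                {"ruleId": "github-pat", "message": {"text": "possible token"}},
                {"ruleId": "aws-access-key-id", "message": {"text": "possible key"}},
            ],
        }
    ],
}

counts = count_by_rule(SAMPLE)
for rule_id, n in counts.most_common():
    print(f"{rule_id}: {n}")
```

With a real scan, `count_by_rule(load_sarif("findings.sarif"))` would produce the same kind of summary.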
Auth resolution for github: --token → GITHUB_TOKEN env var →
gh auth token. Anonymous works for public content but is rate-limited
to 60 req/h. Other source connectors take their token via --token
(shorthand for --option token=…) or via --option api_token=… /
--option access_token=… depending on the auth mode (see
describe).
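The github resolution order described above can be sketched as a simple fallback chain. This is an illustration of the documented order only, not the package's actual code; the function names are hypothetical. The only external command used is `gh auth token`, which the GitHub CLI provides.

```python
import os
import shutil
import subprocess
from typing import Callable, Mapping, Optional


def gh_auth_token() -> Optional[str]:
    """Ask the GitHub CLI for its stored token, if gh is installed."""
    if shutil.which("gh") is None:
        return None
    proc = subprocess.run(
        ["gh", "auth", "token"], capture_output=True, text=True, check=False
    )
    token = proc.stdout.strip()
    return token if proc.returncode == 0 and token else None


def resolve_github_token(
    cli_token: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
    gh_lookup: Callable[[], Optional[str]] = gh_auth_token,
) -> Optional[str]:
    """Return the first token found: --token, then GITHUB_TOKEN, then gh.

    None means no credential was found, i.e. anonymous access.
    """
    env = os.environ if environ is None else environ
    if cli_token:
        return cli_token
    if env.get("GITHUB_TOKEN"):
        return env["GITHUB_TOKEN"]
    return gh_lookup()


# Injected environment and gh lookup keep the example deterministic.
token = resolve_github_token(None, {"GITHUB_TOKEN": "from-env"}, lambda: "from-gh")
print(token)
```

Passing the environment and the gh lookup as parameters is a design choice for testability; a real implementation may read them directly.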
Detection engines
Engines are the internal scanners connectors compose with. They are
stateless utilities that turn a Document.text into Findings.
Operators do not address them directly — instead pick one with
--engine (or --option engine=…); the connector hands its own
Documents to the chosen engine. Default for every connector: native.
| Engine | Class | Verifies | System dep |
|---|---|---|---|
| trufflehog | secret | yes (per-detector) | trufflehog CLI on PATH |
| gitleaks | secret | no | gitleaks CLI on PATH |
| native | secret | no | none — bundled regex (AWS, GitHub PAT, Slack bot, OpenAI, Anthropic) |
| pii | PII | n/a | pleno-anonymize HTTP API (installed via pleno-dlp[pii] extra) |
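To make the idea of a stateless regex engine concrete, here is a minimal sketch of turning text into findings. The `Finding` shape, the rule names, and the two patterns are assumptions for illustration (they use the commonly cited AWS access key ID and classic GitHub PAT formats) and are not the rules bundled with the native engine.

```python
import re
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record; the real Finding type may differ."""
    rule: str
    start: int
    end: int
    redacted: str


# Illustrative patterns only, not the bundled native rule set.
RULES = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github-pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def scan_text(text: str) -> Iterator[Finding]:
    """Yield one Finding per regex match; no state is kept between calls."""
    for rule, pattern in RULES.items():
        for match in pattern.finditer(text):
            secret = match.group(0)
            # Keep only a short prefix so the finding never carries the secret.
            yield Finding(rule, match.start(), match.end(), secret[:4] + "***")


fake_pat = "ghp_" + "a" * 36  # synthetic value, not a real token
text = f"aws=AKIAIOSFODNN7EXAMPLE token {fake_pat} done"
findings = list(scan_text(text))
for finding in findings:
    print(finding.rule, finding.start, finding.end, finding.redacted)
```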
Source connectors
Each connector self-describes via a ConnectorSpec (auth modes,
resources, options, runtime capabilities). Today: github, gitlab,
bitbucket (cloud + server), slack (xoxb / xoxp), notion,
confluence (cloud + datacenter), jira (cloud + datacenter).
Run pleno-dlp list for the live list and pleno-dlp describe <name>
for the option sheet.
Capabilities
A connector advertises one or more capabilities:
- Capability.SOURCE: implements the Connector Protocol (discover / fetch / capabilities). Every shipped connector has this.
- Capability.DETECT: implements the Detector Protocol (detect(doc) -> AsyncIterator[Finding]). Every shipped connector has this; the engine choice is configured via --option engine=….
- Capability.VERIFY: implements the Verifier Protocol (verify(secret) -> VerifyResult). Today: github (probes GET /user).
- Capability.REVOKE: implements the Revoker Protocol (revoke(secret) -> RevokeResult). Reserved; no built-in connector has this yet. Providers without a programmatic revoke endpoint should leave it unset and document the manual rotation flow.
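The capability model above, and the `pleno-dlp list --capability verify` filter from the Usage section, can be sketched as a set membership check. The classes below are stand-ins written for this example; the real `Capability` and `ConnectorSpec` types in pleno_dlp may be shaped differently.

```python
import enum
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


class Capability(enum.Enum):
    """Stand-in for the capability names listed above."""
    SOURCE = "source"
    DETECT = "detect"
    VERIFY = "verify"
    REVOKE = "revoke"


# Every shipped connector has at least SOURCE and DETECT.
BASELINE: FrozenSet[Capability] = frozenset({Capability.SOURCE, Capability.DETECT})


@dataclass(frozen=True)
class ConnectorSpec:
    """Stand-in spec carrying only the fields this example needs."""
    name: str
    capabilities: FrozenSet[Capability] = BASELINE


SPECS = [
    ConnectorSpec("github", BASELINE | {Capability.VERIFY}),
    ConnectorSpec("slack"),
]


def with_capability(specs: Iterable[ConnectorSpec], cap: Capability) -> List[str]:
    """Return the names of connectors advertising the given capability."""
    return [spec.name for spec in specs if cap in spec.capabilities]


print(with_capability(SPECS, Capability.VERIFY))
print(with_capability(SPECS, Capability.DETECT))
```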
pleno-dlp verify <connector> --token … exercises VERIFY. Exit
codes: 0 = LIVE, 1 = REVOKED, 2 = UNKNOWN/unsupported.
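The exit codes above map naturally onto a small status enum. The sketch below also shows one plausible way a GET /user probe could be classified: 200 as LIVE, 401 as REVOKED, anything else as UNKNOWN. That classification is an assumption about the connector's behavior, and the names `VerifyStatus` and `classify_user_probe` are hypothetical.

```python
import enum


class VerifyStatus(enum.Enum):
    LIVE = "live"
    REVOKED = "revoked"
    UNKNOWN = "unknown"


# Exit codes as documented: 0 = LIVE, 1 = REVOKED, 2 = UNKNOWN/unsupported.
EXIT_CODES = {
    VerifyStatus.LIVE: 0,
    VerifyStatus.REVOKED: 1,
    VerifyStatus.UNKNOWN: 2,
}


def classify_user_probe(http_status: int) -> VerifyStatus:
    """Assumed mapping from the GET /user response status to a verdict."""
    if http_status == 200:
        return VerifyStatus.LIVE
    if http_status == 401:
        return VerifyStatus.REVOKED
    # Rate limits, server errors, and anything unexpected stay inconclusive.
    return VerifyStatus.UNKNOWN


def exit_code_for(http_status: int) -> int:
    """Translate a probe status into the process exit code."""
    return EXIT_CODES[classify_user_probe(http_status)]


for status in (200, 401, 503):
    print(status, classify_user_probe(status).name, exit_code_for(status))
```

Treating every non-200, non-401 response as UNKNOWN avoids reporting a credential as revoked when the probe merely failed.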
Adding a new connector
- Create python/src/pleno_dlp/connectors/<name>.py. Subclass DetectViaEngineMixin from pleno_dlp.connectors._detect so detect() and the engine kwarg come for free.
- Implement the Connector Protocol (discover, fetch, discover_and_fetch, capabilities, close). Keep one httpx.AsyncClient per instance. Call self._init_engine(engine) from your __init__ and await self._close_engine() from your close(). Optionally add verify(secret) / revoke(secret) for lifecycle support.
- Declare a spec: ClassVar[ConnectorSpec] = ConnectorSpec(...) with name, kind, summary, capabilities (defaults to {SOURCE, DETECT}; extend with VERIFY / REVOKE as you implement them), auth_modes, resources, options (every __init__ kwarg, including DETECT_ENGINE_OPTION from _detect), and runtime (a Capabilities describing incremental / streaming / concurrency).
- End the module with registry.register("<name>", <Class>).
- Wire the import in pleno_dlp/connectors/__init__.py.
- Add fixtures + tests under python/tests/connectors/test_<name>.py using httpx.MockTransport.
Once the spec lands, pleno-dlp scan <name> --engine <engine>,
pleno-dlp verify <name>, pleno-dlp list, and
pleno-dlp describe all work without touching the CLI.
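The overall shape of a connector (spec, discover / fetch, close, registration) can be sketched without the real package. Everything below is a self-contained stand-in: `ConnectorSpec`, `Document`, `REGISTRY`, and `register` only mimic the names used in the steps above, an in-memory dict replaces the HTTP client, and the engine mixin is omitted. The actual pleno_dlp signatures may differ.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, ClassVar, Dict, FrozenSet, List


@dataclass(frozen=True)
class ConnectorSpec:
    """Stand-in for the real spec; only a few fields are modelled."""
    name: str
    kind: str
    summary: str
    capabilities: FrozenSet[str] = frozenset({"source", "detect"})


@dataclass(frozen=True)
class Document:
    id: str
    text: str


REGISTRY: Dict[str, type] = {}


def register(name: str, cls: type) -> None:
    """Stand-in for registry.register(name, cls)."""
    REGISTRY[name] = cls


class ExampleConnector:
    spec: ClassVar[ConnectorSpec] = ConnectorSpec(
        name="example", kind="wiki", summary="Hypothetical in-memory source"
    )

    def __init__(self, pages: Dict[str, str]) -> None:
        self._pages = dict(pages)  # stands in for one HTTP client per instance

    async def discover(self) -> AsyncIterator[str]:
        for page_id in self._pages:
            yield page_id

    async def fetch(self, page_id: str) -> Document:
        return Document(page_id, self._pages[page_id])

    async def discover_and_fetch(self) -> AsyncIterator[Document]:
        async for page_id in self.discover():
            yield await self.fetch(page_id)

    async def close(self) -> None:
        self._pages.clear()  # a real connector would close its client here


register("example", ExampleConnector)


async def collect(name: str, **kwargs) -> List[Document]:
    """Look the connector up by name, drain it, and always close it."""
    connector = REGISTRY[name](**kwargs)
    try:
        return [doc async for doc in connector.discover_and_fetch()]
    finally:
        await connector.close()


docs = asyncio.run(collect("example", pages={"p1": "hello", "p2": "world"}))
print(docs)
```

Looking the class up by name is what lets a generic CLI drive any registered connector without connector-specific code.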
Release
Pushing a py-vX.Y.Z tag triggers PyPI trusted publishing via GitHub Actions.
Download files
File details
Details for the file pleno_dlp-0.12.0.tar.gz.
File metadata
- Download URL: pleno_dlp-0.12.0.tar.gz
- Upload date:
- Size: 83.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fc9181d0a63a3aac3f010ca5e624c8f1bc22495a5d0bbf232daa3f49dcd55a0e |
| MD5 | dd791b1654b423931253e929cc8e1114 |
| BLAKE2b-256 | 14107d67f5ce0e05b2daf53a41c7639a72a5d2e703d033341efef113f0da105e |
Provenance
The following attestation bundles were made for pleno_dlp-0.12.0.tar.gz:
- Publisher: release-py.yml on plenoai/pleno-dlp
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pleno_dlp-0.12.0.tar.gz
- Subject digest: fc9181d0a63a3aac3f010ca5e624c8f1bc22495a5d0bbf232daa3f49dcd55a0e
- Sigstore transparency entry: 1475395111
- Permalink: plenoai/pleno-dlp@e4e0c1457023e38b00e80db552237bdcae8b3e5e
- Branch / Tag: refs/tags/py-v0.12.0
- Owner: https://github.com/plenoai
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-py.yml@e4e0c1457023e38b00e80db552237bdcae8b3e5e
- Trigger Event: push
File details
Details for the file pleno_dlp-0.12.0-py3-none-any.whl.
File metadata
- Download URL: pleno_dlp-0.12.0-py3-none-any.whl
- Upload date:
- Size: 104.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d511f4a87ba2ee0a602aef715a27f49c5b453a81e2a37ceae9c4d8a5d6e668c4 |
| MD5 | 3922528c834a5d99d99849b8be3092fe |
| BLAKE2b-256 | 664c7af66b36879c54e89d81e7e2ec5eb50f3684f4f3807e66c9cf6150bb93cb |
Provenance
The following attestation bundles were made for pleno_dlp-0.12.0-py3-none-any.whl:
- Publisher: release-py.yml on plenoai/pleno-dlp
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pleno_dlp-0.12.0-py3-none-any.whl
- Subject digest: d511f4a87ba2ee0a602aef715a27f49c5b453a81e2a37ceae9c4d8a5d6e668c4
- Sigstore transparency entry: 1475395164
- Permalink: plenoai/pleno-dlp@e4e0c1457023e38b00e80db552237bdcae8b3e5e
- Branch / Tag: refs/tags/py-v0.12.0
- Owner: https://github.com/plenoai
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-py.yml@e4e0c1457023e38b00e80db552237bdcae8b3e5e
- Trigger Event: push