Unified DLP scanner for SaaS sources — secret detection (trufflehog, gitleaks, native regex) plus PII detection (pleno-anonymize). API-driven content collection from GitHub, GitLab, Bitbucket, Slack, Notion, Confluence, Jira.
pleno-dlp (Python)
Unified DLP scanner for SaaS content: secrets (trufflehog / gitleaks / native regex) and PII (delegating to pleno-anonymize).
One plugin model: every source (github, slack, jira, …) and every detector (native, trufflehog, gitleaks, pii) is a connector registered under pleno_dlp.connectors.*, distinguished by ConnectorSpec.role (source or detector).
pip install pleno-dlp pulls one wheel exposing one console script (pleno-dlp).
The Go binary in this repo (cmd/pleno-dlp) remains for filesystem-only
scans; the Python package is the path forward for SaaS.
Install
uv tool install pleno-dlp
# or
pipx install pleno-dlp
# Add the PII backend (pulls pleno-anonymize):
uv tool install 'pleno-dlp[pii]'
Usage
The CLI is connector-agnostic: connector-specific settings are passed through the generic
--option key=value flag (for sources) and the --detector-option /
-D key=value flag (for detectors). Run
pleno-dlp describe <connector> to see the accepted keys, their types and
defaults, and which of them are secrets.
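To make the key=value convention concrete, here is a minimal sketch of how repeatable key=value flags can be collected with argparse. It is not pleno-dlp's actual parser: the program name `scan-sketch` and the `verify` key in the usage note are hypothetical, and only the flag names are taken from the text above.

```python
import argparse


def key_value(raw: str) -> tuple[str, str]:
    """Split 'key=value' on the first '=' so values may contain '=' themselves."""
    key, sep, value = raw.partition("=")
    if not sep or not key:
        raise argparse.ArgumentTypeError(f"expected key=value, got {raw!r}")
    return key, value


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the real CLI; only the flag names mirror the README.
    parser = argparse.ArgumentParser(prog="scan-sketch")
    parser.add_argument("source")
    parser.add_argument(
        "--option", action="append", default=[], type=key_value, metavar="KEY=VALUE"
    )
    parser.add_argument(
        "-D", "--detector-option", dest="detector_option",
        action="append", default=[], type=key_value, metavar="KEY=VALUE",
    )
    return parser


def parse(argv: list[str]) -> tuple[str, dict[str, str], dict[str, str]]:
    """Return (source, source options, detector options) for an argv list."""
    ns = build_parser().parse_args(argv)
    return ns.source, dict(ns.option), dict(ns.detector_option)
```

Calling `parse(["github", "--option", "owner=plenoai", "-D", "verify=true"])` yields the source name plus two plain dictionaries, which a generic CLI could then hand to whichever connector was selected.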
# Discover what's registered
pleno-dlp list # everything
pleno-dlp list --role source # SaaS sources only
pleno-dlp list --role detector # detection backends only
pleno-dlp describe github
pleno-dlp describe trufflehog
# Secret scan over an entire GitHub org with the default native detector
GITHUB_TOKEN=ghp_... pleno-dlp scan github --option owner=plenoai
# Scan a single repo, only code, with trufflehog verification
pleno-dlp scan github \
--option owner=plenoai --option repo=pleno-dlp \
--option resources=code --detector trufflehog
# Issue + PR conversations only, PII detection (requires pleno-anonymize)
pleno-dlp scan github --option owner=plenoai \
--option resources=issues,prs --detector pii \
--pii-base-url http://localhost:8000
# SARIF output for GitHub code-scanning ingestion
pleno-dlp scan github --option owner=plenoai \
--format sarif > findings.sarif
# Slack workspace — same shape, different source connector
pleno-dlp scan slack --token xoxb-... --option include_threads=false
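A SARIF file such as the findings.sarif written above can be post-processed with a few lines of Python. The sketch below assumes only the standard SARIF 2.1.0 layout (`runs[].results[]` with `ruleId` and `locations`). The rule IDs and paths in the sample are made up, since the identifiers pleno-dlp emits are not listed here.

```python
import json
from collections import Counter
from typing import Iterator, Optional, Tuple


def count_by_rule(sarif: dict) -> Counter:
    """Count results per ruleId across all runs of a SARIF log."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "<unknown>")] += 1
    return counts


def iter_locations(sarif: dict) -> Iterator[Tuple[str, Optional[str], Optional[int]]]:
    """Yield (ruleId, artifact URI, start line) for every reported location."""
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            rule = result.get("ruleId", "<unknown>")
            for loc in result.get("locations", []):
                phys = loc.get("physicalLocation", {})
                uri = phys.get("artifactLocation", {}).get("uri")
                line = phys.get("region", {}).get("startLine")
                yield rule, uri, line


def _result(rule: str, uri: str, line: int) -> dict:
    """Build one SARIF result object for the hypothetical sample below."""
    return {
        "ruleId": rule,
        "message": {"text": "example finding"},
        "locations": [{
            "physicalLocation": {
                "artifactLocation": {"uri": uri},
                "region": {"startLine": line},
            }
        }],
    }


# Hypothetical sample data; real input would be json.load(open("findings.sarif")).
SAMPLE = {
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "example-scanner"}},
        "results": [
            _result("example/aws-key", "src/a.py", 3),
            _result("example/aws-key", "docs/b.md", 10),
            _result("example/github-pat", "c.txt", 1),
        ],
    }],
}
```

Running `count_by_rule` over a real report gives a quick per-rule tally before the file is uploaded to code scanning.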
Auth resolution for github tries, in order: --token, the GITHUB_TOKEN
environment variable, then gh auth token. Anonymous access works for public
content but is rate-limited to 60 requests per hour. Other source connectors
take their token via --token (shorthand for --option token=…) or via
--option api_token=… / --option access_token=…, depending on the auth mode
(see pleno-dlp describe <connector>).
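The github fallback order can be sketched as a small function. This is an illustration of the chain described above, not the package's own code: the function name, its parameters, and the 10-second timeout are assumptions. The only external command used is `gh auth token`, which the GitHub CLI provides.

```python
import os
import shutil
import subprocess
from typing import Mapping, Optional


def resolve_github_token(
    cli_token: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    use_gh_cli: bool = True,
) -> Optional[str]:
    """Return the first token found, or None to fall back to anonymous access."""
    env = os.environ if env is None else env

    # 1. An explicit --token value wins.
    if cli_token:
        return cli_token

    # 2. Next, the GITHUB_TOKEN environment variable.
    env_token = env.get("GITHUB_TOKEN")
    if env_token:
        return env_token

    # 3. Finally, ask the GitHub CLI, if it is installed and logged in.
    if use_gh_cli and shutil.which("gh"):
        try:
            proc = subprocess.run(
                ["gh", "auth", "token"],
                capture_output=True, text=True, timeout=10, check=False,
            )
        except (OSError, subprocess.TimeoutExpired):
            return None
        token = proc.stdout.strip()
        if proc.returncode == 0 and token:
            return token

    # 4. Nothing found: the caller proceeds anonymously.
    return None
```

Passing `env` explicitly keeps the function testable without touching the real process environment.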
Detectors
| Detector | Class | Verifies | System dep |
|---|---|---|---|
| trufflehog | secret | yes (per-detector) | trufflehog CLI on PATH |
| gitleaks | secret | no | gitleaks CLI on PATH |
| native | secret | no | none — bundled regex (AWS, GitHub PAT, Slack bot, OpenAI, Anthropic) |
| pii | PII | n/a | pleno-anonymize HTTP API (installed via pleno-dlp[pii] extra) |
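For a sense of what an unverified regex detector like native does, here is a minimal sketch. The patterns are commonly cited shapes for three of the listed token families and are assumptions, not the rules bundled with pleno-dlp. OpenAI and Anthropic keys are left out because their formats are not stated here. The `Match` type and rule names are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Iterator

# Illustrative patterns only; the bundled rule set may differ.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "github-pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "slack-bot-token": re.compile(r"\bxoxb-[0-9]+-[0-9]+-[A-Za-z0-9]+\b"),
}


@dataclass(frozen=True)
class Match:
    rule: str      # key from PATTERNS
    line: int      # 1-based line number
    redacted: str  # first four characters, rest masked


def scan_text(text: str) -> Iterator[Match]:
    """Yield one Match per pattern hit, line by line, without verifying the secret."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            for hit in pattern.finditer(line):
                secret = hit.group(0)
                yield Match(rule, lineno, secret[:4] + "***")
```

Because nothing is checked against the issuing service, matches from a detector like this are candidates only, which is why the table marks native and gitleaks as non-verifying.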
Source connectors
Each source self-describes via a ConnectorSpec (auth modes,
resources, options, runtime capabilities). Today: github, gitlab,
bitbucket (cloud + server), slack (xoxb / xoxp), notion,
confluence (cloud + datacenter), jira (cloud + datacenter).
Run pleno-dlp list --role source for the live list and
pleno-dlp describe <name> for the option sheet.
Adding a new connector (source or detector)
- Create python/src/pleno_dlp/connectors/<name>.py.
- Implement the right Protocol:
  - Source: discover, fetch, discover_and_fetch, capabilities, close. Keep one httpx.AsyncClient per instance.
  - Detector: a single async def scan(self, doc: Document) -> AsyncIterator[Finding].
- Declare a spec: ClassVar[ConnectorSpec] = ConnectorSpec(...) with the right role (ConnectorRole.SOURCE or ConnectorRole.DETECTOR), name, kind, summary, auth_modes, resources (sources only), options (every __init__ kwarg you want operators to set), and capabilities (sources only).
- End the module with registry.register("<name>", <Class>).
- Wire the import in pleno_dlp/connectors/__init__.py.
- Add fixtures + tests under python/tests/connectors/test_<name>.py using httpx.MockTransport (or stdlib mocks for offline detectors).
Once the spec lands, pleno-dlp scan <source> --detector <det>,
pleno-dlp list, and pleno-dlp describe all work without touching
the CLI.
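The shape of a detector connector can be sketched as follows. To keep the example runnable without the package installed, it defines stand-ins for ConnectorRole, ConnectorSpec, Document, Finding, and registry. Their fields are guesses based on the names mentioned above, not the real pleno_dlp definitions, and the `toy-password` detector is purely illustrative.

```python
import asyncio
import re
from dataclasses import dataclass
from enum import Enum
from typing import AsyncIterator, ClassVar, Dict, List


# --- Stand-ins for pleno_dlp types (hypothetical shapes, not the real API) ---
class ConnectorRole(Enum):
    SOURCE = "source"
    DETECTOR = "detector"


@dataclass(frozen=True)
class ConnectorSpec:
    role: ConnectorRole
    name: str
    kind: str
    summary: str
    auth_modes: tuple = ()
    options: tuple = ()


@dataclass(frozen=True)
class Document:
    uri: str
    text: str


@dataclass(frozen=True)
class Finding:
    detector: str
    uri: str
    line: int


class Registry:
    """Minimal name -> class registry that rejects duplicate names."""

    def __init__(self) -> None:
        self._classes: Dict[str, type] = {}

    def register(self, name: str, cls: type) -> None:
        if name in self._classes:
            raise ValueError(f"connector {name!r} already registered")
        self._classes[name] = cls

    def get(self, name: str) -> type:
        return self._classes[name]


registry = Registry()


# --- The connector itself ---
class ToyPasswordDetector:
    """Toy detector: flags lines that look like 'password = ...'."""

    spec: ClassVar[ConnectorSpec] = ConnectorSpec(
        role=ConnectorRole.DETECTOR,
        name="toy-password",
        kind="secret",
        summary="Flags password assignments (illustration only)",
    )
    _pattern = re.compile(r"password\s*=", re.IGNORECASE)

    async def scan(self, doc: Document) -> AsyncIterator[Finding]:
        # An async generator satisfies the AsyncIterator[Finding] return type.
        for lineno, line in enumerate(doc.text.splitlines(), start=1):
            if self._pattern.search(line):
                yield Finding(self.spec.name, doc.uri, lineno)


# Ending the module with the registration call, as the steps describe.
registry.register("toy-password", ToyPasswordDetector)


async def collect(detector: ToyPasswordDetector, doc: Document) -> List[Finding]:
    """Drain a detector's async iterator into a list."""
    return [finding async for finding in detector.scan(doc)]
```

In a test, `asyncio.run(collect(registry.get("toy-password")(), doc))` is enough to exercise an offline detector like this. A source connector would additionally need an HTTP mock such as httpx.MockTransport.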
Release
Pushing a py-vX.Y.Z tag triggers PyPI trusted publishing via GitHub Actions.
File details
Details for the file pleno_dlp-0.10.0.tar.gz.
File metadata
- Download URL: pleno_dlp-0.10.0.tar.gz
- Upload date:
- Size: 78.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0bf5db8ac7957c9eaf3aa92033ba752961193e6f2bcdad00748f7db18d1609fb |
| MD5 | c7e68a93d2399b6ad46162a576fac19b |
| BLAKE2b-256 | 765f5869e8cd3f77c21d2e5d5efb2baadd425451f90143b40076ae05d14c8818 |
Provenance
The following attestation bundles were made for pleno_dlp-0.10.0.tar.gz:
Publisher: release-py.yml on plenoai/pleno-dlp
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pleno_dlp-0.10.0.tar.gz
- Subject digest: 0bf5db8ac7957c9eaf3aa92033ba752961193e6f2bcdad00748f7db18d1609fb
- Sigstore transparency entry: 1474103499
- Sigstore integration time:
- Permalink: plenoai/pleno-dlp@e4473e1b17c1d6ec4b542c5ed51937786255148c
- Branch / Tag: refs/tags/py-v0.10.0
- Owner: https://github.com/plenoai
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-py.yml@e4473e1b17c1d6ec4b542c5ed51937786255148c
- Trigger Event: push
File details
Details for the file pleno_dlp-0.10.0-py3-none-any.whl.
File metadata
- Download URL: pleno_dlp-0.10.0-py3-none-any.whl
- Upload date:
- Size: 98.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e608dcdac173806aa9a064b4811a90c6a812e4106c0fae95c0072218c2428860 |
| MD5 | 4a1c74c761374742b0a4007937f28c5f |
| BLAKE2b-256 | 33b30da8a15e849927da3ca948cdcd9332739cdd184276699067e7832c09eae9 |
Provenance
The following attestation bundles were made for pleno_dlp-0.10.0-py3-none-any.whl:
Publisher: release-py.yml on plenoai/pleno-dlp
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: pleno_dlp-0.10.0-py3-none-any.whl
- Subject digest: e608dcdac173806aa9a064b4811a90c6a812e4106c0fae95c0072218c2428860
- Sigstore transparency entry: 1474103555
- Sigstore integration time:
- Permalink: plenoai/pleno-dlp@e4473e1b17c1d6ec4b542c5ed51937786255148c
- Branch / Tag: refs/tags/py-v0.10.0
- Owner: https://github.com/plenoai
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-py.yml@e4473e1b17c1d6ec4b542c5ed51937786255148c
- Trigger Event: push