Local-only CLI for batch PII redaction in PDFs
Redactron — Local-only PII redaction for PDFs
Redactron redacts PII from PDFs entirely on your machine. No cloud. No subscription. No telemetry.
Define your PII once in a profile, run it against any number of documents, and get a verified redacted output. The PDF never leaves your machine.
Encrypted multi-client vault. Touch ID gated on macOS. Audit log. OCR fallback for scanned documents. AGPL-3.0.
Why Redactron?
Free online redactors are usually ad-supported, and many run analytics on the documents you upload. For medical records, legal documents, or anything covered by HIPAA, GDPR, or attorney-client privilege, that is a serious concern. Adobe Acrobat's online tools upload files to Adobe's servers. iLovePDF and SmallPDF are cloud services with freemium models. Once a file is uploaded, you have no visibility into what happens to it.
Redactron runs entirely on your machine. The codebase has no HTTP client dependency and no outbound socket calls. You can verify this with a packet capture while running a redaction.
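Alongside a packet capture, a coarser in-process check is possible: patch Python's socket layer so that any connection attempt raises. The sketch below is an illustration only and is not part of Redactron. The names no_network and NetworkBlocked are hypothetical, and the guard only catches connections made through Python's socket module in the same process.

```python
import socket
from contextlib import contextmanager


class NetworkBlocked(RuntimeError):
    """Raised when code under test tries to open an outbound connection."""


@contextmanager
def no_network():
    """Temporarily make every socket connect / connect_ex call raise."""
    original_connect = socket.socket.connect
    original_connect_ex = socket.socket.connect_ex

    def blocked(self, address, *args, **kwargs):
        raise NetworkBlocked(f"outbound connection attempted to {address!r}")

    socket.socket.connect = blocked
    socket.socket.connect_ex = blocked
    try:
        yield
    finally:
        # Restore the real methods even if the guarded code raised.
        socket.socket.connect = original_connect
        socket.socket.connect_ex = original_connect_ex
```

Any Python code executed inside a with no_network(): block fails loudly if it tries to connect anywhere. This complements a packet capture rather than replacing it, since it cannot see traffic from subprocesses or native libraries.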
The source is AGPL-3.0 and available for inspection. There is no black-box model deciding what to redact. You define exactly what gets removed, and the tool re-scans the output to confirm it worked.
For professional use, the vault stores multiple client profiles encrypted with AES-256-GCM. The master key lives in the macOS login keychain, gated by Touch ID. Every run is logged to a local SQLite database.
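To make the audit log concrete, here is a minimal sketch of what a per-run SQLite record could look like. The table name, column names, and helper functions are assumptions made for illustration; the columns mirror the fields this README mentions (filename, detections, verification status), and the real schema is documented in docs/PRIVACY.md.

```python
import sqlite3
from datetime import datetime, timezone


def open_audit_log(path=":memory:"):
    """Open (or create) a hypothetical audit database with one row per run."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS runs (
            id          INTEGER PRIMARY KEY AUTOINCREMENT,
            started_at  TEXT NOT NULL,     -- ISO 8601 timestamp, UTC
            client      TEXT NOT NULL,     -- client/profile id used for the run
            filename    TEXT NOT NULL,     -- input file name only, never contents
            detections  INTEGER NOT NULL,  -- number of PII hits redacted
            verified    INTEGER NOT NULL   -- 1 if the re-scan found no survivors
        )
        """
    )
    return conn


def record_run(conn, client, filename, detections, verified):
    """Append one run to the log and return its row id."""
    cur = conn.execute(
        "INSERT INTO runs (started_at, client, filename, detections, verified) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            datetime.now(timezone.utc).isoformat(),
            client,
            filename,
            detections,
            int(verified),
        ),
    )
    conn.commit()
    return cur.lastrowid
```

Storing only metadata about a run, and none of the matched text, keeps the log itself from becoming a second copy of the PII.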
Quickstart
pip install redactron
redactron init
redactron vault init
# Get the profile template — use this for every new profile
redactron profile template --output /tmp/me.yaml
# edit /tmp/me.yaml with your name, addresses, account numbers, etc.
redactron profile add --client me --from /tmp/me.yaml
# Redact a single file
redactron run document.pdf --client me
# Redact multiple files in a folder — outputs go to ./documents/redacted/
redactron run ./documents/ --client me
For a single file, document_redacted.pdf is written to the same directory as the input. For a folder, all redacted files go to documents/redacted/ and a consolidated summary report is written to documents/redacted-reports/.
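The output locations described above can be sketched with pathlib. The function names are hypothetical and this is not Redactron's own code; it only mirrors the naming conventions stated in this README, including the YYYY-MM-DD-HHMM_batch-summary.md report name.

```python
from datetime import datetime
from pathlib import Path


def redacted_path_for_file(input_pdf: Path) -> Path:
    """document.pdf -> document_redacted.pdf, next to the input."""
    return input_pdf.with_name(f"{input_pdf.stem}_redacted{input_pdf.suffix}")


def batch_output_dirs(folder: Path) -> tuple[Path, Path]:
    """Return (redacted output dir, reports dir) for a folder run."""
    return folder / "redacted", folder / "redacted-reports"


def batch_summary_name(when: datetime) -> str:
    """Build a YYYY-MM-DD-HHMM_batch-summary.md style file name."""
    return when.strftime("%Y-%m-%d-%H%M") + "_batch-summary.md"
```

A script that post-processes a batch can use helpers like these to locate the outputs without parsing the CLI's console messages.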
Features
- Profile-driven. Define your PII once (names, aliases, addresses, phones, emails, SSNs, account numbers, custom regex) and redact any number of PDFs.
- Encrypted vault. AES-256-GCM encrypted multi-client profile store. Master key in macOS Keychain.
- Touch ID gate. LocalAuthentication soft-gate before every vault access on macOS.
- OCR fallback. Auto-triggers on image-only pages via pytesseract. No flag needed.
- Layout-aware. Column-aware address bridging prevents cross-column false positives in two-column PDFs.
- Verification. Re-scans the redacted output and reports any PII survivors.
- Audit log. SQLite record of every run (filename, detections, verification status).
- Batch mode. redactron run ./docs/ redacts an entire directory. Outputs go to a redacted/ subdir.
- Consolidated report. Single YYYY-MM-DD-HHMM_batch-summary.md per batch run.
- Dry run. Preview detections without writing output.
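The verification feature can be pictured as a second pass over text extracted from the redacted output. The sketch below assumes the text has already been extracted (by a PDF library or OCR) and only shows the comparison step. find_survivors is a hypothetical name, and Redactron's actual matching may differ.

```python
import re


def find_survivors(extracted_text: str, pii_terms: list[str]) -> list[str]:
    """Return the PII terms that still appear in text extracted from the output."""
    # Collapse whitespace and ignore case so a line wrap cannot hide a match.
    haystack = re.sub(r"\s+", " ", extracted_text).casefold()
    survivors = []
    for term in pii_terms:
        needle = re.sub(r"\s+", " ", term).casefold()
        if needle and needle in haystack:
            survivors.append(term)
    return survivors


if __name__ == "__main__":
    text = "Account holder: [REDACTED]\nContact: jane@example.com"
    print(find_survivors(text, ["Jane Smith", "jane@example.com"]))
```

An empty result means none of the listed terms were found in the extracted text. It does not prove that a term is absent from images or from metadata the extraction step did not cover.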
Profile example
version: 1
subject:
display_name: "Jane Smith"
aliases: ["Jane", "J. Smith"]
addresses: ["123 Main Street, Springfield, IL 62701"]
phones: ["+1-555-867-5309"]
emails: ["jane@example.com"]
account_numbers:
- value: "0021305789Q834"
preserve_last: 4
detection:
fuzzy_match: true
match_threshold: 0.85
Copy docs/examples/profile-template.yaml for the full annotated schema.
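Two fields in the profile above are easier to reason about with a small example: preserve_last and match_threshold. The sketch below shows one plausible reading of each, using difflib from the standard library as a stand-in similarity measure. The function names and the mask character are assumptions, and Redactron's own matcher and redaction rendering may work differently.

```python
from difflib import SequenceMatcher


def mask_account_number(value: str, preserve_last: int, mask_char: str = "X") -> str:
    """Mask every character except the last `preserve_last`."""
    if preserve_last <= 0:
        return mask_char * len(value)
    keep = value[-preserve_last:]
    return mask_char * (len(value) - len(keep)) + keep


def is_fuzzy_match(candidate: str, target: str, threshold: float = 0.85) -> bool:
    """True when the similarity ratio reaches the threshold (case-insensitive)."""
    ratio = SequenceMatcher(None, candidate.casefold(), target.casefold()).ratio()
    return ratio >= threshold


if __name__ == "__main__":
    print(mask_account_number("0021305789Q834", 4))
    print(is_fuzzy_match("Jane Smyth", "Jane Smith"))
    print(is_fuzzy_match("John Smith", "Jane Smith"))
```

With this measure, a one-letter misspelling such as "Jane Smyth" scores 0.9 and passes a 0.85 threshold, while "John Smith" scores 0.8 and does not. Raising the threshold trades missed variants for fewer false positives.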
Multi-client vault
redactron vault init
redactron profile add --client alice --from alice.yaml
redactron profile add --client bob --from bob.yaml
redactron run statement.pdf --client alice
redactron profile list
Security model
The vault is AES-256-GCM encrypted at rest. On macOS, the master key is stored in the login keychain and access is gated by a Touch ID prompt via LocalAuthentication.
Touch ID is soft enforcement. It gates redactron's code path, not the keychain item itself. An unsigned Python package cannot use kSecAttrAccessControl (requires Apple code-signing entitlements). See docs/SECURITY.md for the full threat model.
Performance targets
| Scenario | Target |
|---|---|
| 10-page text PDF | < 3 seconds end-to-end |
| 10-page image PDF (OCR) | < 30 seconds |
| Peak memory per document | < 500 MB |
Platform support
| Platform | Status |
|---|---|
| macOS | First-class (Touch ID vault) |
| Linux | Planned for v1.1 (keyring via libsecret) |
| Windows | Planned for v1.1 (DPAPI) |
CLI reference
redactron run <path> [--client <id>] [--no-ocr] [--force-ocr] [--no-verify]
[--json] [--output <path>] [--quiet] [--per-file-reports]
redactron dry-run <path> [--json]
redactron verify <path>
redactron init
redactron vault init
redactron profile add --client <id> [--name <name>] [--from <yaml>]
redactron profile template [--output <path>] [--client <id>]
redactron profile list
redactron profile show <id> [--reveal]
redactron profile edit <id>
redactron profile delete <id>
redactron profile import <yaml> [--client <id>]
redactron log [--subject <id>] [--limit N]
redactron report <run-id>
redactron --version
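For scripting, the CLI can be driven from Python with subprocess. The sketch below builds an argument list using only flags from the reference above. The wrapper names are hypothetical, and the exit-code behavior and the shape of the --json output are not specified in this README, so the caller should inspect both before relying on them.

```python
import subprocess
from pathlib import Path
from typing import Optional


def build_run_command(
    path: Path,
    client: Optional[str] = None,
    *,
    json_output: bool = False,
    no_ocr: bool = False,
) -> list[str]:
    """Assemble argv for `redactron run` from documented flags only."""
    cmd = ["redactron", "run", str(path)]
    if client is not None:
        cmd += ["--client", client]
    if no_ocr:
        cmd.append("--no-ocr")
    if json_output:
        cmd.append("--json")
    return cmd


def run_redaction(path: Path, client: Optional[str] = None) -> subprocess.CompletedProcess:
    """Run the CLI and capture stdout/stderr as text.

    Assumption: a non-zero exit status signals failure, so check=True is used.
    """
    cmd = build_run_command(path, client, json_output=True)
    return subprocess.run(cmd, capture_output=True, text=True, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.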
Documentation
- docs/PROFILE.md — full profile schema reference
- docs/SECURITY.md — threat model, crypto choices, Touch ID implementation
- docs/PRIVACY.md — local-only guarantee, audit DB schema, AGPL licensing
- docs/RELEASING.md — how to cut a release
- CONTRIBUTING.md — dev setup, conventions, PR process
- CHANGELOG.md — version history
License
AGPL-3.0. See LICENSE.
Redactron depends on PyMuPDF, which is also licensed under AGPL-3.0. If you distribute Redactron as part of a proprietary product, the AGPL requires you to release your source. See docs/PRIVACY.md for details.
File details
Details for the file redactron-1.0.0.tar.gz.
File metadata
- Download URL: redactron-1.0.0.tar.gz
- Upload date:
- Size: 277.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ef6fa0d696874af8159efcc2b0d424f9c7bdcb1d1bc157affac96db126c24b53 |
| MD5 | 722d71497be69573ca4522342f012616 |
| BLAKE2b-256 | f2d4d4912506ea4cb6c196231e40a2ea8ba83886cdef485a32b1b7f6b454e272 |
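A downloaded file can be checked against the published SHA256 digest with the standard library. This is a generic sketch with hypothetical function names; the path to the downloaded file is whatever you saved it as.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with the published value, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_published_digest(Path("redactron-1.0.0.tar.gz"), "ef6fa0d696874af8159efcc2b0d424f9c7bdcb1d1bc157affac96db126c24b53") should return True for an intact copy of the source distribution.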
Provenance
The following attestation bundles were made for redactron-1.0.0.tar.gz:
- Publisher: release.yml on tjndr/redactron
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: redactron-1.0.0.tar.gz
- Subject digest: ef6fa0d696874af8159efcc2b0d424f9c7bdcb1d1bc157affac96db126c24b53
- Sigstore transparency entry: 1436996006
- Permalink: tjndr/redactron@ec3020008ff1d4e776aeb0192082ebe04dea4a86
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/tjndr
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@ec3020008ff1d4e776aeb0192082ebe04dea4a86
- Trigger Event: push
File details
Details for the file redactron-1.0.0-py3-none-any.whl.
File metadata
- Download URL: redactron-1.0.0-py3-none-any.whl
- Upload date:
- Size: 67.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cdde9bcbd394ddcb0d34bb39e6ea49176195bde7a755bcb517cb6cfb2f111923 |
| MD5 | dab2389ea1077390b0afd68f429f4ae5 |
| BLAKE2b-256 | 24d4f3214493b902b6b07ab7bc19a338a4496b6f734be13dca66f100501f5dc9 |
Provenance
The following attestation bundles were made for redactron-1.0.0-py3-none-any.whl:
- Publisher: release.yml on tjndr/redactron
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: redactron-1.0.0-py3-none-any.whl
- Subject digest: cdde9bcbd394ddcb0d34bb39e6ea49176195bde7a755bcb517cb6cfb2f111923
- Sigstore transparency entry: 1436996009
- Permalink: tjndr/redactron@ec3020008ff1d4e776aeb0192082ebe04dea4a86
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/tjndr
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@ec3020008ff1d4e776aeb0192082ebe04dea4a86
- Trigger Event: push