Real find and replace on the actual text in a PDF. No overlay, metadata preserved.
pdfblah
Real find and replace on the actual text in a PDF, from the command line.
Most tools "edit" a PDF by painting a box over the old text and drawing new text
on top, which leaves the original underneath (copy and paste still reveals it) and
often adds a watermark. pdfblah rewrites the real text in the content stream, so:
- the old text is genuinely gone (pdftotext, Ctrl-F, and copy show only the new value)
- no overlay, no watermark
- the original metadata (dates, Producer, XMP) is preserved byte for byte
- alignment is auto-detected and kept, so right-aligned numbers stay flush
- fonts it cannot reproduce are refused instead of garbled
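To make the alignment bullet concrete, here is a minimal sketch of the arithmetic involved in keeping a replaced string anchored to the same edge. It is an illustration only, not pdfblah's actual code: the function name `anchored_x`, the alignment labels, and the example widths are all hypothetical.

```python
def anchored_x(old_x: float, old_width: float, new_width: float,
               align: str = "left") -> float:
    """Return the start x for replacement text so its anchor edge stays put.

    Hypothetical helper for illustration; all values are in PDF points.
    """
    if align == "left":
        # The left edge is the anchor, so the start position does not move.
        return old_x
    if align == "right":
        # The right edge is the anchor: shift the start by the width difference.
        return old_x + old_width - new_width
    if align == "center":
        # The midpoint is the anchor: shift the start by half the difference.
        return old_x + (old_width - new_width) / 2
    raise ValueError(f"unknown alignment: {align!r}")


# Old text is 36 pt wide and starts at x=500, so it ends at x=536.
# A 30 pt replacement must start at x=506 to end in the same place.
new_x = anchored_x(500.0, 36.0, 30.0, align="right")
```

With right alignment the start position moves while the right edge stays at 536, which is what keeps a column of numbers flush.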
Pure Python (pdfplumber + pikepdf). No system dependencies.
Install
pipx install pdfblah # or: pip install pdfblah
Use
Replace the first match:
pdfblah in.pdf out.pdf --find "Old Name" --replace "New Name"
Options:
--scope all     change every match (default: first)
--scope 3       change the 3rd match
--ci            ignore case
--word          whole word only ("cat" will not match "category")
--page 2        only page 2
--replace ""    delete the text
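The way these flags combine can be illustrated on a plain string. The sketch below mirrors the documented meaning of --scope, --ci and --word with Python's `re` module. It is not how pdfblah works internally (which operates on the PDF content stream), and `select_matches` is a hypothetical name.

```python
import re


def select_matches(text, find, scope="first", ci=False, word=False):
    """Return (start, end) spans of `find` in `text` chosen by the flags.

    Illustrative only: works on a plain string, not on a PDF.
    """
    if not find:
        raise ValueError("find must not be empty")
    pattern = re.escape(find)  # treat the search text literally
    if word:
        # Lookarounds instead of \b so the rule still behaves when `find`
        # starts or ends with punctuation.
        pattern = r"(?<!\w)" + pattern + r"(?!\w)"
    flags = re.IGNORECASE if ci else 0
    spans = [m.span() for m in re.finditer(pattern, text, flags)]
    if scope == "all":
        return spans
    # "first" is the 1st match; a number such as "3" is the 3rd match.
    index = 0 if scope == "first" else int(scope) - 1
    return spans[index:index + 1]  # empty list if there is no such match


text = "cat category Cat cat"
whole_words = select_matches(text, "cat", scope="all", word=True)
```

Here `whole_words` skips the "cat" inside "category", and adding `ci=True` would also pick up "Cat".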
Many rules from a file (FIND | REPLACE | FLAGS per line):
pdfblah in.pdf out.pdf --rules rules.txt
# rules.txt
Old Company Name | New Company Name | all
CONFIDENTIAL DRAFT | FINAL | ci
Jane Doe | John Smith | all word
Total | Sum | 2
delete this phrase |
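For readers who want to generate or validate a rules file, the line format can be sketched as a small parser. This is a hypothetical reading of the format shown above, not pdfblah's own `parse_rules_file`. In particular, treating `#` lines as comments and defaulting the scope to "first" are assumptions, and `parse_rule_line` is an invented name.

```python
def parse_rule_line(line):
    """Split one 'FIND | REPLACE | FLAGS' line into a dict, or None to skip.

    Hypothetical sketch; the real parser may treat comments, escaping,
    or a literal '|' differently.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None  # assumption: blank lines and '#' comments are ignored
    parts = [p.strip() for p in stripped.split("|")]
    rule = {
        "find": parts[0],
        "replace": parts[1] if len(parts) > 1 else "",  # empty means delete
        "scope": "first",
        "ci": False,
        "word": False,
    }
    for flag in (parts[2].split() if len(parts) > 2 else []):
        if flag in ("ci", "word"):
            rule[flag] = True
        elif flag == "all" or flag.isdigit():
            rule["scope"] = flag
        else:
            raise ValueError(f"unknown flag: {flag!r}")
    return rule


rules_text = """\
# rules.txt
Old Company Name | New Company Name | all
CONFIDENTIAL DRAFT | FINAL | ci
Jane Doe | John Smith | all word
Total | Sum | 2
delete this phrase |
"""
rules = [r for r in map(parse_rule_line, rules_text.splitlines()) if r]
```

The last rule ends up with an empty `replace`, matching the `--replace ""` convention for deleting text.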
Library
from pdfblah import process, apply_rules, parse_rules_file
# Replace every match of "999.00" with "42.00", ignoring case.
report = process("in.pdf", "out.pdf", "999.00", "42.00", scope="all", ci=True)
Each call returns a report dict (ok, count, refused, reason, ...).
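A caller will usually want to branch on that report. The sketch below shows one way to do so, using only the keys named above. The value types (truthy `ok` and `refused`, integer `count`, string `reason`) are assumptions, `describe_report` is a hypothetical helper, and the sample dicts are made-up stand-ins rather than real pdfblah output.

```python
def describe_report(report: dict) -> str:
    """Turn a report dict into a one-line status message.

    Assumes 'ok' and 'refused' are truthy/falsy, 'count' is an int and
    'reason' is a string; missing keys fall back to neutral defaults.
    """
    if report.get("refused"):
        return f"refused: {report.get('reason') or 'no reason given'}"
    if not report.get("ok"):
        return f"failed: {report.get('reason') or 'no reason given'}"
    count = report.get("count", 0)
    return f"ok: {count} replacement{'s' if count != 1 else ''} made"


# Stand-in dicts shaped like the documented report (values are invented).
done = describe_report({"ok": True, "count": 3, "refused": False, "reason": None})
refused = describe_report({"ok": False, "count": 0, "refused": True,
                           "reason": "font not embedded"})
```

Checking `refused` before `ok` keeps a deliberate refusal distinct from any other failure.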
What it does not do
Scanned PDFs (image only, no text layer) cannot be edited. Fonts that are neither embedded nor standard, or that use a custom encoding, are refused rather than rendered wrong. This is by design: a wrong-looking edit is worse than a clear "no".
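One reading of the font rule can be written as a small predicate. This is a sketch under stated assumptions, not pdfblah's actual check: `FontInfo` and `should_refuse` are hypothetical names, and what counts as a "standard" font or a "custom" encoding is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class FontInfo:
    """Hypothetical summary of a font's properties, for illustration only."""
    embedded: bool
    standard: bool          # assumption: decided elsewhere by the caller
    custom_encoding: bool


def should_refuse(font: FontInfo) -> bool:
    """Refuse when the replacement text could not be rendered faithfully."""
    # No glyph data in the file and no standard font to fall back on.
    unavailable = not font.embedded and not font.standard
    # A custom encoding is refused regardless of where the glyphs come from.
    return unavailable or font.custom_encoding


embedded_plain = FontInfo(embedded=True, standard=False, custom_encoding=False)
missing_font = FontInfo(embedded=False, standard=False, custom_encoding=False)
```

Under this reading `embedded_plain` is editable and `missing_font` is refused.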
Hosted version
Want it without installing anything, or for a non-technical colleague? The hosted version at pdfblah.com does the same edit in the browser: upload, preview for free, download.
License
MIT, (c) 2026 Kuvop LLC.