Real find and replace on the actual text in a PDF. No overlay, metadata preserved.

pdfblah

Real find and replace on the actual text in a PDF, from the command line.

Most tools "edit" a PDF by painting a box over the old text and drawing new text on top, which leaves the original underneath (copy and paste still reveals it) and often adds a watermark. pdfblah rewrites the real text in the content stream, so:

  • the old text is genuinely gone (pdftotext, Ctrl-F, and copy show only the new value)
  • no overlay, no watermark
  • the original metadata (dates, Producer, XMP) is preserved byte for byte
  • alignment is auto-detected and kept, so right-aligned numbers stay flush
  • fonts it cannot reproduce are refused instead of garbled
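
The alignment point comes down to a little arithmetic. The sketch below is an illustration, not pdfblah's code: shifted_origin and its parameters are hypothetical names, and it assumes the widths of the old and new strings (in PDF points) are already known. For right-aligned text the right edge stays fixed, so the start position moves by the width difference.

```python
def shifted_origin(x_start: float, old_width: float, new_width: float, align: str) -> float:
    """Return the x origin that keeps the text's anchor edge in place.

    Hypothetical helper for illustration; not part of pdfblah.
    """
    if align == "left":
        # Left edge is the anchor: the start position does not move.
        return x_start
    if align == "right":
        # Right edge is the anchor: shift by the change in width.
        return x_start + (old_width - new_width)
    if align == "center":
        # Midpoint is the anchor: shift by half the change in width.
        return x_start + (old_width - new_width) / 2
    raise ValueError(f"unknown alignment: {align!r}")
```

For example, replacing a 40 pt wide number that starts at x = 100 with a 30 pt wide one moves the start to x = 110, so both end at x = 140.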

Pure Python (pdfplumber + pikepdf). No system dependencies.

Install

pipx install pdfblah      # recommended, isolated; or:  pip install pdfblah

On a Mac with Homebrew, use Homebrew's pipx:

brew install pipx && pipx install pdfblah

Also works with uv: uv tool install pdfblah.

Use

Replace the first match:

pdfblah in.pdf out.pdf --find "Old Name" --replace "New Name"

Options:

--scope all         change every match           (default: first)
--scope 3           change the 3rd match
--ci                ignore case
--word              whole word only ("cat" will not match "category")
--page 2            only page 2
--replace ""        delete the text
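
To make the option semantics concrete, here is a minimal sketch of how --ci, --word, and --scope could map onto ordinary regular-expression matching over a plain string. It is an illustration only: select_matches is a hypothetical helper, not part of pdfblah, and the real tool works on PDF content streams rather than Python strings.

```python
import re


def select_matches(text, find, scope="first", ci=False, word=False):
    """Return the (start, end) spans that a rule would change.

    Hypothetical helper: scope is "first", "all", or a 1-based
    match number given as a string, mirroring the CLI options.
    """
    pattern = re.escape(find)  # treat the search text literally
    if word:
        pattern = r"\b" + pattern + r"\b"  # whole word only
    flags = re.IGNORECASE if ci else 0
    spans = [m.span() for m in re.finditer(pattern, text, flags)]
    if scope == "all":
        return spans
    index = 0 if scope == "first" else int(scope) - 1
    # An out-of-range match number selects nothing.
    return spans[index:index + 1] if 0 <= index < len(spans) else []
```

With word=True, searching "cat category cat" for "cat" skips the occurrence inside "category", which matches the --word description above.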

Apply many rules from a file, one rule per line in the form FIND | REPLACE | FLAGS:

pdfblah in.pdf out.pdf --rules rules.txt
# rules.txt
Old Company Name | New Company Name | all
CONFIDENTIAL DRAFT | FINAL | ci
Jane Doe | John Smith | all word
Total | Sum | 2
delete this phrase |
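
The rule format is simple enough to read by hand, and the sketch below shows one way a single line could be interpreted. pdfblah ships its own parse_rules_file, whose behavior may differ. Here parse_rule_line is a hypothetical helper, and skipping blank lines and lines starting with "#" is an assumption, as are the defaults (empty replacement, scope "first").

```python
def parse_rule_line(line):
    """Parse one 'FIND | REPLACE | FLAGS' line into a dict, or None to skip.

    Hypothetical helper for illustration; not pdfblah's parser.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None  # assumption: blank lines and comments are ignored
    parts = [p.strip() for p in stripped.split("|")]
    rule = {
        "find": parts[0],
        "replace": parts[1] if len(parts) > 1 else "",
        "scope": "first",
        "ci": False,
        "word": False,
    }
    flags = parts[2].split() if len(parts) > 2 else []
    for flag in flags:
        if flag in ("ci", "word"):
            rule[flag] = True
        elif flag == "all" or flag.isdigit():
            rule["scope"] = flag  # "all" or a 1-based match number
        else:
            raise ValueError(f"unknown flag: {flag!r}")
    return rule
```

Note that this sketch cannot express search text containing a literal "|"; whether the real format supports that is not stated here.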

Library

from pdfblah import process, apply_rules, parse_rules_file

process("in.pdf", "out.pdf", "999.00", "42.00", scope="all", ci=True)

Each call returns a report dict (ok, count, refused, reason, ...).
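
A caller will usually want to branch on the report. The sketch below uses only the keys named above (ok, count, refused, reason). Their types are assumptions (booleans for ok and refused, an integer for count, a string or None for reason), and describe_report is a hypothetical helper, not part of the library.

```python
def describe_report(report):
    """Turn a report dict into a one-line, human-readable summary.

    Hypothetical helper; the key types are assumed, not documented.
    """
    if report.get("refused"):
        # The edit was declined, for example because of an unsupported font.
        return "refused: " + str(report.get("reason") or "no reason given")
    if not report.get("ok"):
        return "failed: " + str(report.get("reason") or "no reason given")
    return f"replaced {report.get('count', 0)} match(es)"
```

Passing the return value of process(...) to describe_report gives a line suitable for logging or for a batch script that needs to flag refused files.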

What it does not do

Scanned PDFs (image only, no text layer) cannot be edited. Fonts that are neither embedded nor standard, or that use a custom encoding, are refused rather than rendered wrong. This is by design: a wrong-looking edit is worse than a clear "no".
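
The refuse-rather-than-garble policy can be pictured as a pre-check that runs before anything is written. The sketch below shows one such check under stated assumptions: that the set of characters a font can draw is known, and that any missing character means refusal. How pdfblah actually inspects fonts is not described here; missing_glyphs, check_replacement, and the available argument are hypothetical names.

```python
def missing_glyphs(replacement, available):
    """Return the characters of replacement that the font cannot draw."""
    return sorted(set(replacement) - set(available))


def check_replacement(replacement, available):
    """Return a report-like dict: refuse if any character is unavailable.

    Hypothetical pre-check for illustration; not pdfblah's logic.
    """
    missing = missing_glyphs(replacement, available)
    if missing:
        return {
            "ok": False,
            "refused": True,
            "reason": "font lacks glyphs: " + "".join(missing),
        }
    return {"ok": True, "refused": False, "reason": None}
```

The design choice is that a refusal carries a reason the caller can show, instead of producing output with blank or substituted characters.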

Hosted version

Want it without installing anything, or for a non-technical colleague? The hosted version at pdfblah.com does the same edit in the browser: upload, preview for free, download.

License

MIT, (c) 2026 Kuvop LLC.
