Real find and replace on the actual text in a PDF. No overlay, metadata preserved.

pdfblah

Real find and replace on the actual text in a PDF, from the command line.

Most tools "edit" a PDF by painting a box over the old text and drawing new text on top, which leaves the original underneath (copy and paste still reveals it) and often adds a watermark. pdfblah rewrites the real text in the content stream, so:

  • the old text is genuinely gone (pdftotext, Ctrl-F, and copy show only the new value)
  • no overlay, no watermark
  • the original metadata (dates, Producer, XMP) is preserved byte for byte
  • alignment is auto-detected and kept, so right-aligned numbers stay flush
  • fonts it cannot reproduce are refused instead of garbled
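To make the difference from an overlay concrete, here is a minimal sketch of an in-place rewrite on a toy content-stream fragment. It is not pdfblah's implementation: the helper names (`text_width`, `replace_right_aligned`) are hypothetical, the text is assumed to be plain ASCII in Courier (a monospaced font whose glyphs are 600/1000 em wide), and the text is assumed to be shown with a single `Td` + `Tj` pair. Real content streams are considerably messier.

```python
import re

# Toy content-stream fragment: 12 pt text positioned at x=400, y=700.
STREAM = b"BT /F1 12 Tf 400 700 Td (999.00) Tj ET"

COURIER_ADVANCE = 0.6  # assumption: Courier, every glyph is 600/1000 em wide


def text_width(text: str, font_size: float) -> float:
    """Width of `text` in points for a monospaced font."""
    return len(text) * COURIER_ADVANCE * font_size


def replace_right_aligned(stream: bytes, old: str, new: str) -> bytes:
    """Swap `old` for `new` inside the Tj operand and shift x so the
    right edge of the text stays where it was (hypothetical helper)."""
    src = stream.decode("latin-1")
    pattern = re.compile(
        r"(?P<size>\d+(?:\.\d+)?) Tf (?P<x>\d+(?:\.\d+)?) (?P<y>\d+(?:\.\d+)?) Td "
        r"\(" + re.escape(old) + r"\) Tj"
    )
    m = pattern.search(src)
    if m is None:
        return stream  # nothing to change
    size = float(m["size"])
    right_edge = float(m["x"]) + text_width(old, size)
    new_x = right_edge - text_width(new, size)
    # Note: `new` is assumed to contain no parentheses or backslashes,
    # which would need escaping inside a PDF literal string.
    rewritten = f"{m['size']} Tf {new_x:g} {m['y']} Td ({new}) Tj"
    return (src[: m.start()] + rewritten + src[m.end():]).encode("latin-1")


result = replace_right_aligned(STREAM, "999.00", "42.00")
print(result)
```

Because the operand itself is rewritten, the old string no longer exists anywhere in the stream, and the shorter replacement starts further right so that its last digit ends at the same x position.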

Pure Python (pdfplumber + pikepdf). No system dependencies.

Install

pipx install pdfblah      # or:  pip install pdfblah

Use

Replace the first match:

pdfblah in.pdf out.pdf --find "Old Name" --replace "New Name"

Options:

--scope all         change every match           (default: first)
--scope 3           change the 3rd match
--ci                ignore case
--word              whole word only ("cat" will not match "category")
--page 2            only page 2
--replace ""        delete the text
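The sketch below shows one way the matching options could combine. It is an illustration only, not pdfblah's code: `select_matches` is a hypothetical helper, and the whole-word rule (no word character directly before or after the match) is an assumption about what `--word` means.

```python
import re


def select_matches(text, find, scope="first", ci=False, word=False):
    """Return the (start, end) spans selected by the options.

    scope: "first", "all", or a 1-based match number given as a string.
    """
    pattern = re.escape(find)  # the search text is literal, not a regex
    if word:
        # Assumed whole-word rule: no word character on either side.
        pattern = r"(?<!\w)" + pattern + r"(?!\w)"
    flags = re.IGNORECASE if ci else 0
    spans = [m.span() for m in re.finditer(pattern, text, flags)]
    if scope == "all":
        return spans
    n = 1 if scope == "first" else int(scope)
    return spans[n - 1 : n]  # empty list if there is no n-th match


sample = "cat category Cat cat"
print(select_matches(sample, "cat"))                         # first match only
print(select_matches(sample, "cat", scope="all", word=True)) # skips "category"
print(select_matches(sample, "cat", scope="3", ci=True, word=True))
```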

Many rules from a file (FIND | REPLACE | FLAGS per line):

pdfblah in.pdf out.pdf --rules rules.txt

# rules.txt
Old Company Name | New Company Name | all
CONFIDENTIAL DRAFT | FINAL | ci
Jane Doe | John Smith | all word
Total | Sum | 2
delete this phrase |
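To show how a line in this format breaks down, here is a small stand-in parser. It is not the library's `parse_rules_file`; the `Rule` class and `parse_rule_line` are hypothetical, and it assumes that blank lines and lines starting with `#` are skipped, that the default scope is `first` (as on the command line), and that `|` never appears inside the find or replace text.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    find: str
    replace: str = ""
    scope: str = "first"  # "first", "all", or a match number such as "2"
    ci: bool = False
    word: bool = False


def parse_rule_line(line: str):
    """Parse one 'FIND | REPLACE | FLAGS' line; return None for skipped lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    parts = [p.strip() for p in line.split("|")]
    rule = Rule(find=parts[0], replace=parts[1] if len(parts) > 1 else "")
    flags = parts[2].split() if len(parts) > 2 else []
    for flag in flags:
        if flag == "ci":
            rule.ci = True
        elif flag == "word":
            rule.word = True
        elif flag == "all" or flag.isdigit():
            rule.scope = flag
        else:
            raise ValueError(f"unknown flag: {flag!r}")
    return rule


for text in ["Jane Doe | John Smith | all word", "Total | Sum | 2", "delete this phrase |"]:
    print(parse_rule_line(text))
```

An empty REPLACE field, as in the last rule, parses to an empty string, which corresponds to deleting the matched text.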

Library

from pdfblah import process, apply_rules, parse_rules_file

process("in.pdf", "out.pdf", "999.00", "42.00", scope="all", ci=True)

Each call returns a report dict (ok, count, refused, reason, ...).
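A caller will usually want to branch on that report. The helper below is a sketch of one way to do so; `describe_report` is hypothetical, and the value types (booleans for `ok` and `refused`, an integer `count`, a string `reason`) are assumptions, since only the key names are documented here.

```python
def describe_report(report: dict) -> str:
    """Turn a report dict into a one-line summary (hypothetical helper)."""
    if report.get("refused"):
        # Assumed: `reason` explains why the edit was refused.
        return f"refused: {report.get('reason') or 'no reason given'}"
    if not report.get("ok"):
        return f"failed: {report.get('reason') or 'unknown error'}"
    return f"replaced {report.get('count', 0)} match(es)"


# Hand-written dicts standing in for real reports:
print(describe_report({"ok": True, "count": 3, "refused": False, "reason": None}))
print(describe_report({"ok": False, "count": 0, "refused": True, "reason": "custom encoding"}))
```

In real use the argument would be the return value of `process(...)` rather than a hand-written dict.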

What it does not do

Scanned PDFs (image only, no text layer) cannot be edited. Fonts that are neither embedded nor standard, or that use a custom encoding, are refused rather than rendered incorrectly. This is by design: a wrong-looking edit is worse than a clear "no".

Hosted version

Want it without installing anything, or for a non-technical colleague? The hosted version at pdfblah.com does the same edit in the browser: upload, preview for free, download.

License

MIT, (c) 2026 Kuvop LLC.
