A tool for comparing PDF files

DiffPDF

CLI tool for detecting structural, textual, and visual differences between PDF files, intended for automated regression tests.

Installation

pip install diffpdf

Usage

diffpdf <baseline.pdf> <actual.pdf> [OPTIONS]

How It Works

DiffPDF uses a fail-fast sequential pipeline to compare PDFs:

  1. Hash Check - SHA-256 comparison. If identical, exit immediately with pass.
  2. Page Count - Verify both PDFs have the same number of pages.
  3. Text Content - Extract and compare text from all pages.
  4. Visual Check - Render pages to images and compare using pixelmatch.

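If the hashes match, the run ends with a pass. Otherwise each later stage runs only if the preceding checks found no difference, and the first stage that fails ends the run.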
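
The fail-fast ordering described above can be sketched in a few lines of Python. This is a minimal illustration, not DiffPDF's actual implementation: the names `sha256_of`, `compare`, and the stage callables are hypothetical, and the page count, text, and visual stages are left as pluggable functions rather than real PyMuPDF or pixelmatch calls.

```python
import hashlib
from pathlib import Path
from typing import Callable, Iterable, Tuple

# A stage takes both paths and returns True when it finds no difference.
# (Hypothetical type, used only for this sketch.)
Stage = Callable[[Path, Path], bool]


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare(
    baseline: Path, actual: Path, stages: Iterable[Tuple[str, Stage]]
) -> Tuple[bool, str]:
    """Run a fail-fast comparison; return (passed, name of the deciding stage)."""
    # Hash check: byte-identical files pass immediately, later stages are skipped.
    if sha256_of(baseline) == sha256_of(actual):
        return True, "hash"
    # Remaining stages run in order; the first failure stops the pipeline.
    for name, stage in stages:
        if not stage(baseline, actual):
            return False, name
    return True, "all stages"
```

Passing stages as an ordered list of `(name, function)` pairs keeps the cheap checks first and means the expensive visual comparison is reached only when everything before it found no difference.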

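⚠️ Performance Warning: The Python port of pixelmatch is extremely slow, so the visual check can dominate run time. Its cost grows with the number of pages and roughly with the square of --dpi.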
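
To get a feel for how much work the visual check involves, the arithmetic below estimates the number of pixels per rendered page. It is a rough sketch under stated assumptions: an A4 page of 595 x 842 points, and a renderer that scales by dpi / 72 and rounds to whole pixels (PDF points are 1/72 inch). The helper name `rendered_pixels` is made up for this example.

```python
POINTS_PER_INCH = 72  # PDF user-space unit: 1 point = 1/72 inch


def rendered_pixels(width_pt: float, height_pt: float, dpi: int) -> int:
    """Approximate pixel count of one page rendered at the given DPI."""
    width_px = round(width_pt * dpi / POINTS_PER_INCH)
    height_px = round(height_pt * dpi / POINTS_PER_INCH)
    return width_px * height_px


A4_POINTS = (595, 842)  # assumed example page size, in points

for dpi in (96, 192):
    print(dpi, rendered_pixels(*A4_POINTS, dpi))
```

Under these assumptions an A4 page is about 0.89 million pixels at the default 96 DPI and about 3.56 million at 192 DPI, so doubling --dpi roughly quadruples the number of pixels to compare per page.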

Options

Option         Default   Description
--threshold    0.1       Pixelmatch threshold (0.0-1.0)
--dpi          96        Render resolution
--output-dir   ./        Directory for diff images
--debug        -         Verbose logging
--save-log     -         Write log to log.txt
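
When calling DiffPDF from a test harness, it can help to build the argument list in one place. The helper below is a hypothetical sketch (the name `build_command` is not part of DiffPDF). It uses only the options and defaults listed above, and it assumes that value options accept their value as a separate argument and that --debug and --save-log are value-less flags.

```python
from typing import List


def build_command(
    baseline: str,
    actual: str,
    *,
    threshold: float = 0.1,
    dpi: int = 96,
    output_dir: str = "./",
    debug: bool = False,
    save_log: bool = False,
) -> List[str]:
    """Build an argv list for `diffpdf <baseline.pdf> <actual.pdf> [OPTIONS]`."""
    command = [
        "diffpdf", baseline, actual,
        "--threshold", str(threshold),
        "--dpi", str(dpi),
        "--output-dir", output_dir,
    ]
    # Assumed to be plain on/off flags (their default is shown as "-").
    if debug:
        command.append("--debug")
    if save_log:
        command.append("--save-log")
    return command


print(build_command("baseline.pdf", "actual.pdf", dpi=150, debug=True))
```

Returning a list rather than a single string avoids shell quoting problems when file names contain spaces.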

Exit Codes

  • 0 — Pass (PDFs are equivalent)
  • 1 — Fail (differences detected)
  • 2 — Error (invalid input or processing error)
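
The exit codes make it straightforward to drive DiffPDF from a script. The following is a minimal sketch, assuming the `diffpdf` command is on the PATH; `DiffResult` and `run_diffpdf` are illustrative names, not part of the package.

```python
import subprocess
from enum import IntEnum


class DiffResult(IntEnum):
    """Exit codes documented above."""
    PASS = 0   # PDFs are equivalent
    FAIL = 1   # differences detected
    ERROR = 2  # invalid input or processing error


def run_diffpdf(baseline: str, actual: str) -> DiffResult:
    """Run diffpdf on two files and map its exit code to a DiffResult."""
    # check=False: a non-zero exit code is an expected outcome, not an exception.
    completed = subprocess.run(["diffpdf", baseline, actual], check=False)
    # Raises ValueError for any exit code other than 0, 1, or 2.
    return DiffResult(completed.returncode)
```

A regression test would typically treat FAIL as a test failure and ERROR as a problem with the inputs or the environment, so keeping the two apart is worth the extra branch.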

Development

pip install -e ".[dev]"
pytest tests/ -v
ruff check .

Acknowledgements

Built with PyMuPDF for PDF parsing and pixelmatch-py (Python port of pixelmatch) for visual comparison.

Source Distribution

diffpdf-0.1.2.tar.gz (45.0 kB)

Uploaded Source

Built Distribution

diffpdf-0.1.2-py3-none-any.whl (7.1 kB)

Uploaded Python 3

File details

Details for the file diffpdf-0.1.2.tar.gz.

File metadata

  • Download URL: diffpdf-0.1.2.tar.gz
  • Upload date:
  • Size: 45.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for diffpdf-0.1.2.tar.gz
Algorithm     Hash digest
SHA256        ba26d83783940e782aa868fea427c68ecde1d6999a6fc56878d4dfa658b7afb6
MD5           0ab3840377cadbf24a08e2ccd7b85c9a
BLAKE2b-256   6de06633853f3d33678ae752d6642eb4a6e39e32b477774cfd11d972f1fce508

Provenance

The following attestation bundles were made for diffpdf-0.1.2.tar.gz:

Publisher: pypi-publish.yml on JustusRijke/DiffPDF

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file diffpdf-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: diffpdf-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 7.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for diffpdf-0.1.2-py3-none-any.whl
Algorithm     Hash digest
SHA256        afe282a4d762e65a775a35e820d86e4e23f3debef0af674e369f480f2cd8771f
MD5           e3e95559fba7e74d9cbacecfc93822e6
BLAKE2b-256   212909c39257d65c5b1e40b0b0555729800857646d7035242c2d1fd7bfb9964b

Provenance

The following attestation bundles were made for diffpdf-0.1.2-py3-none-any.whl:

Publisher: pypi-publish.yml on JustusRijke/DiffPDF
