Convert image files to PDF.

Image to PDF

Convert images to PDF files using Pillow, while attempting to handle alpha channels without silently discarding transparency data.

Please note that, at least in my opinion, you only need this package if you really care about alpha channels and cannot rely on a common set of image properties. Otherwise, most cases can probably be covered well enough by using Pillow directly instead of this wrapper package.
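For comparison, the direct Pillow route can look like the following minimal sketch. It assumes the alpha channel can simply be dropped, which is exactly the case this package is not aimed at. The helper name `save_pdf_with_pillow` is illustrative and not part of any library.

```python
import io

from PIL import Image


def save_pdf_with_pillow(image, target):
    """Write one image as a single-page PDF using only Pillow."""
    # Converting to RGB drops any alpha channel without compositing it.
    image.convert("RGB").save(target, format="PDF")


# Example: a half-transparent red square written to an in-memory buffer.
buffer = io.BytesIO()
save_pdf_with_pillow(Image.new("RGBA", (4, 4), (255, 0, 0, 128)), buffer)
pdf_bytes = buffer.getvalue()
```

The transparency information is gone after the conversion, so this is only suitable when all input images are known to be opaque or when losing the alpha channel is acceptable.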

About

I have long been looking for a good and reliable way to convert images to PDF files that supports images with alpha channels and avoids undesired data loss.

My usual approach for converting images to PDF has been the implementation from Pillow, as it would at least warn about data loss when handling alpha channels. Version 9.5.0 introduced the ability to convert RGBA images to PDF files as well, by using JPEG2000/JPX images: #6925. So why do we need another library when Pillow apparently solved this problem some time ago?

During some tests, it turned out that neither pdf.js nor Poppler/Cairo would render these PDF files correctly: #8074. Since pdf.js is my default PDF viewer in the web browser and pdftocairo is my usual tool for converting PDF files to thumbnails etc., this did not look right. Although pdf.js has fixed the bug in the meantime, it seems that JPX files with alpha channels are not handled correctly by all viewers, which restricts the possible user base.

While I was reporting these issues to Pillow, a pull request came up which would replace the alpha handling with SMasks: #8097. Unfortunately, it was closed once pdf.js had fixed the bug. As this approach proved to be the most stable in my opinion, I decided to create my own package for it, incorporating some basic feedback from user sl2c to use /FlateDecode instead of /DCTDecode for RGBA images. During testing, it turned out that pypdf is able to recover the original image representation, even when the PDF contains a white image on a white background. The goal is that ideally no data gets lost. Most of my usual RGBA images are regular screenshots where removing the transparency altogether would not hurt, but detecting such cases reliably generally requires more complex approaches.
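The idea behind the SMask approach can be sketched as follows: the colour data and the alpha channel are stored as two separate streams, each compressed losslessly with zlib/deflate, which is what /FlateDecode refers to. This is only an illustration and not the actual code of this package. The function name `split_rgba_streams` is hypothetical, and writing the PDF objects that reference both streams is omitted entirely.

```python
import zlib

from PIL import Image


def split_rgba_streams(image):
    """Return (colour, alpha) as deflate-compressed raw byte streams."""
    rgba = image.convert("RGBA")
    # RGBA -> RGB keeps the colour values as-is and only drops the alpha band.
    color = rgba.convert("RGB").tobytes()
    # The alpha band alone is what a soft mask would carry.
    alpha = rgba.getchannel("A").tobytes()
    return zlib.compress(color), zlib.compress(alpha)


# Example: a fully transparent white image, i.e. "white on white" once rendered.
sample = Image.new("RGBA", (2, 2), (255, 255, 255, 0))
color_stream, alpha_stream = split_rgba_streams(sample)
```

Because both streams are compressed losslessly, the original pixel values and the transparency can in principle be reconstructed, which a lossy JPEG-based filter such as /DCTDecode would not guarantee.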

Further alternatives I considered:

  • Using image.paste through Pillow to copy the possibly transparent image onto a white background. This has the side effect of making all white areas of the original image unreadable if its background was transparent.
  • Using pdfrwx: I have not been able to recover the original image with its transparency in all cases. Additionally, this repository is not packaged and has no unit or integration tests. It depends on both NumPy and SciPy as well as potrace (GPL-2.0-or-later), which makes it quite heavy and possibly introduces viral licensing effects. pdfrw, another dependency, has been unmaintained for quite some time.
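The side effect of the first alternative can be reproduced with a small sketch. It assumes the alpha channel is used as the paste mask; the helper name `flatten_onto_white` is illustrative.

```python
from PIL import Image


def flatten_onto_white(image):
    """Composite an image onto an opaque white background."""
    rgba = image.convert("RGBA")
    background = Image.new("RGB", rgba.size, (255, 255, 255))
    # The alpha band decides how much of each source pixel is copied.
    background.paste(rgba, mask=rgba.getchannel("A"))
    return background


# Example: one opaque white pixel on an otherwise transparent canvas.
original = Image.new("RGBA", (2, 2), (0, 0, 0, 0))
original.putpixel((0, 0), (255, 255, 255, 255))
flattened = flatten_onto_white(original)
```

After flattening, the white pixel can no longer be distinguished from the white background, so the content of the original image is effectively lost.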

Installation

You can install this package from PyPI:

python -m pip install image-to-pdf

Usage

The main entry point is image_to_pdf.save(), which converts one image to a PDF file by default. For all other functionality and some examples, I recommend having a look at the tests.

Additionally, the general API is compatible with the original PIL.PdfImagePlugin. Thus, you can configure this package to override the original PdfImagePlugin functionality. See IntegrationTestCase for an example.
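One way such an override can work in general is Pillow's save-handler registry. The following is a minimal sketch of that mechanism only, under the assumption that the override goes through Image.register_save; it is not taken from IntegrationTestCase. The handler `tracing_pdf_save` is a hypothetical stand-in that records the call and delegates to Pillow's own handler, where a real setup would register the replacement implementation instead.

```python
import io

from PIL import Image, PdfImagePlugin  # noqa: F401 (import registers the PDF handler)

calls = []
# Keep a reference to Pillow's handler so it can be restored afterwards.
original_handler = Image.SAVE["PDF"]


def tracing_pdf_save(im, fp, filename):
    """Hypothetical stand-in: record the image mode, then delegate to Pillow."""
    calls.append(im.mode)
    original_handler(im, fp, filename)


Image.register_save("PDF", tracing_pdf_save)
try:
    buffer = io.BytesIO()
    # Image.save() now dispatches PDF output to the registered handler.
    Image.new("RGB", (2, 2)).save(buffer, format="PDF")
finally:
    # Restore the original handler to avoid affecting other code.
    Image.register_save("PDF", original_handler)
```

Registering a handler changes global state for the whole process, so restoring the original handler afterwards, as done here, is advisable in tests.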

License

Like Pillow, which most of the code is based upon, this package is subject to the terms of the MIT-CMU license.

Source Distribution

image_to_pdf-0.2.0.tar.gz (117.8 kB view details)

Uploaded Source

Built Distribution

image_to_pdf-0.2.0-py3-none-any.whl (8.5 kB view details)

Uploaded Python 3

File details

Details for the file image_to_pdf-0.2.0.tar.gz.

File metadata

  • Download URL: image_to_pdf-0.2.0.tar.gz
  • Upload date:
  • Size: 117.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for image_to_pdf-0.2.0.tar.gz
Algorithm Hash digest
SHA256 7fb711ed2fc0d7853371c5bc2d377838ef0b6afb6b740ddeb085f135ad927eac
MD5 ac67f2758a0d3a4ac9251bd8835017bf
BLAKE2b-256 0f04d41fb1f1a5934da513cd6e2c16269b15d0563746d4244cb2734660df76d6

Provenance

The following attestation bundles were made for image_to_pdf-0.2.0.tar.gz:

Publisher: release.yml on stefan6419846/image-to-pdf

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file image_to_pdf-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: image_to_pdf-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 8.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for image_to_pdf-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 749a7da0e174a3240a2661050bd14a4b60893a597c9b43d5f3b81dbb869befd9
MD5 d80ff521ce970bfb71570f60759ba70a
BLAKE2b-256 4d6c4bf639be5deee561ce9957351fcf8c8945533535ff206f2bbe774b5b46fb

Provenance

The following attestation bundles were made for image_to_pdf-0.2.0-py3-none-any.whl:

Publisher: release.yml on stefan6419846/image-to-pdf
