llm-pdf-to-images

LLM fragment plugin to load a PDF as a sequence of images

Installation

Install this plugin in the same environment as LLM.

llm install llm-pdf-to-images

The llm-pdf-to-images plugin provides a fragment loader that converts each page of a PDF document into an image attachment.

Use the pdf-to-images: fragment prefix followed by the path to a PDF file; the resulting image attachments can then be sent to a model.

Example usage:

llm -f pdf-to-images:path/to/document.pdf 'Summarize this document'

Fragment syntax

pdf-to-images:<path>?dpi=N&format=jpg|png&quality=Q
  • <path>: Path to the PDF file accessible to the environment where LLM runs.
  • dpi=N: (optional) Dots per inch to use when rendering the PDF pages, which affects the resolution of the output images. Defaults to 300 if omitted.
  • format=jpg|png: (optional) Image format to use for the output. Can be either jpg (default) or png.
  • quality=Q: (optional) JPEG quality factor between 1 and 100. Applies only when the format is jpg. Defaults to 30 if omitted. Higher values produce better quality but larger files.
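
The part after the pdf-to-images: prefix has the same shape as a URL query string, so the following sketch may help picture how such an argument breaks down. It is not the plugin's own code: parse_fragment_argument and PdfToImagesOptions are hypothetical names, the defaults are the ones listed above, and the validation shown is an assumption.

```python
from dataclasses import dataclass
from urllib.parse import parse_qs


@dataclass
class PdfToImagesOptions:
    # Defaults mirror the documented ones: 300 DPI, JPG, quality 30.
    path: str
    dpi: int = 300
    format: str = "jpg"
    quality: int = 30


def parse_fragment_argument(argument: str) -> PdfToImagesOptions:
    """Split '<path>?dpi=N&format=jpg|png&quality=Q' into its parts.

    Hypothetical helper for illustration; the plugin may parse differently.
    """
    # Everything before the first '?' is the path, the rest is the query.
    path, _, query = argument.partition("?")
    params = parse_qs(query)  # e.g. {'dpi': ['600'], 'format': ['png']}

    options = PdfToImagesOptions(path=path)
    if "dpi" in params:
        options.dpi = int(params["dpi"][0])
    if "format" in params:
        options.format = params["format"][0].lower()
    if "quality" in params:
        options.quality = int(params["quality"][0])

    # Assumed validation, based on the documented value ranges.
    if options.format not in ("jpg", "png"):
        raise ValueError(f"format must be jpg or png, got {options.format!r}")
    if not 1 <= options.quality <= 100:
        raise ValueError("quality must be between 1 and 100")
    return options
```

Under these assumptions, parse_fragment_argument("document.pdf?dpi=600") keeps the default format and quality and changes only the DPI.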

More examples

Convert a PDF file to images with default settings (300 DPI, JPG format, quality 30):

llm -f pdf-to-images:document.pdf 'summarize this document'

Convert a PDF with higher resolution (600 DPI):

llm -f 'pdf-to-images:document.pdf?dpi=600' 'summarize'
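
To get a feel for what a DPI setting means in pixels: PDF page sizes are measured in points, with 72 points to the inch, so rendering at a given DPI scales the page by dpi / 72. The sketch below works through that arithmetic for a US Letter page (612 × 792 points). The function name is hypothetical, and the plugin's exact rounding may differ.

```python
POINTS_PER_INCH = 72  # PDF user-space units per inch


def rendered_size(width_pt: float, height_pt: float, dpi: int) -> tuple[int, int]:
    """Approximate pixel dimensions of a page rendered at the given DPI."""
    width_px = round(width_pt * dpi / POINTS_PER_INCH)
    height_px = round(height_pt * dpi / POINTS_PER_INCH)
    return width_px, height_px


letter = (612, 792)  # US Letter: 8.5 x 11 inches
print(rendered_size(*letter, dpi=300))  # (2550, 3300)
print(rendered_size(*letter, dpi=600))  # (5100, 6600)
```

Doubling the DPI doubles each dimension and so quadruples the pixel count per page, which is worth keeping in mind for long documents.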

Convert a PDF to PNG format:

llm -f 'pdf-to-images:document.pdf?format=png' 'describe all figures'

Convert a PDF with high-quality JPG images:

llm -f 'pdf-to-images:document.pdf?quality=90' 'extract all visible text'

Combine multiple parameters:

llm -f 'pdf-to-images:document.pdf?dpi=450&format=jpg&quality=75' 'OCR'
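
The shell examples quote the fragment because ? and & are special characters in most shells. When the command is assembled from a script instead, the fragment string can be built programmatically. The sketch below is one way to do that; build_fragment is a hypothetical helper, not part of the plugin or of LLM.

```python
from urllib.parse import urlencode


def build_fragment(path, dpi=None, image_format=None, quality=None):
    """Build a 'pdf-to-images:<path>?...' fragment string.

    Hypothetical helper: only options that are explicitly set are included,
    so omitted ones fall back to the plugin's documented defaults.
    """
    candidates = (("dpi", dpi), ("format", image_format), ("quality", quality))
    params = {name: value for name, value in candidates if value is not None}
    fragment = f"pdf-to-images:{path}"
    if params:
        fragment += "?" + urlencode(params)
    return fragment


# An argument list like this could be passed to subprocess.run() without
# any shell quoting, since no shell is involved in interpreting it.
command = [
    "llm",
    "-f",
    build_fragment("document.pdf", dpi=450, image_format="jpg", quality=75),
    "OCR",
]
print(command[2])  # pdf-to-images:document.pdf?dpi=450&format=jpg&quality=75
```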

Development

To set up this plugin locally, first check out the code. Then create a new virtual environment:

cd llm-pdf-to-images
python -m venv venv
source venv/bin/activate

Now install the dependencies and test dependencies:

python -m pip install -e '.[test]'

To run the tests:

python -m pytest
