A family of LLM-enhanced PDF utilities

pdf-llm-tools

pdf-llm-tools is a family of LLM-enhanced PDF utilities:

  • pdfllm-titler renames a PDF using metadata parsed from its filename and contents. The new name has the form YEAR-AUTHOR-TITLE.pdf.
  • (todo) pdfllm-toccer adds a bookmark structure parsed from the PDF's detected table of contents.
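
To illustrate the YEAR-AUTHOR-TITLE.pdf naming scheme, here is a minimal sketch of how such a name could be assembled once the year, author, and title are known. The function name `build_filename`, the lowercasing, and the hyphen-separated slug format are assumptions made for illustration; the exact formatting pdfllm-titler produces may differ.

```python
import re


def build_filename(year, author, title):
    """Return a YEAR-AUTHOR-TITLE.pdf style name (hypothetical helper)."""

    def slug(text):
        # Assumption: lowercase, and collapse every run of characters other
        # than a-z and 0-9 into a single hyphen. Non-ASCII letters are
        # dropped by this simple rule.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return f"{year}-{slug(author)}-{slug(title)}.pdf"


print(build_filename(2017, "Vaswani", "Attention Is All You Need"))
```

With these assumptions, the call above prints 2017-vaswani-attention-is-all-you-need.pdf.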

Currently, OpenAI's gpt-3.5-turbo-1106 is hardcoded as the LLM backend. The tools require an OpenAI API key, supplied via a command-line option, an environment variable, or manual input.
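
The three ways of supplying the key can be thought of as a fallback chain. The sketch below is one possible way to resolve it, not the package's actual code: the function name `resolve_api_key`, the precedence order (option, then environment variable, then prompt), and the variable name `OPENAI_API_KEY` are assumptions. Check --help for the real option and variable names.

```python
import getpass
import os


def resolve_api_key(cli_value=None, env=None, prompt=getpass.getpass):
    """Return an API key from an option, the environment, or a prompt.

    Hypothetical helper: the precedence order and the OPENAI_API_KEY
    variable name are assumptions, not taken from pdf-llm-tools.
    """
    env = os.environ if env is None else env

    # 1. An explicitly passed command-line value wins.
    if cli_value:
        return cli_value

    # 2. Otherwise fall back to the environment variable, if set and non-empty.
    env_value = env.get("OPENAI_API_KEY")
    if env_value:
        return env_value

    # 3. As a last resort, ask the user without echoing the input.
    return prompt("OpenAI API key: ")
```

Passing `env` and `prompt` as parameters keeps the function easy to test without touching the real environment or terminal.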

Installation

pip install pdf-llm-tools

Usage

These utilities require every PDF to have a correct OCR (text) layer. If a PDF lacks one, run it through a tool such as OCRmyPDF first.
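
A quick way to tell whether a PDF needs OCR is to extract its text and see whether anything comes back. The sketch below is a rough heuristic under stated assumptions: `has_text_layer` is a hypothetical helper, the 50-character threshold is arbitrary, and the per-page strings are assumed to come from whatever PDF text extractor you already use. It cannot tell whether an existing text layer is accurate, only whether one appears to exist.

```python
def has_text_layer(page_texts, min_chars=50):
    """Heuristic check for a text layer (hypothetical helper).

    page_texts: one extracted-text string per page.
    Returns True if the pages together contain at least min_chars
    non-whitespace characters.
    """
    # Count characters after removing all whitespace from each page.
    total = sum(len("".join(text.split())) for text in page_texts)
    return total >= min_chars


# A scanned PDF with no OCR typically yields empty strings.
print(has_text_layer(["", "  \n"]))
```

If the check fails for the first few pages, the PDF most likely needs an OCR pass before running the utilities.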

pdfllm-titler

pdfllm-titler a.pdf b.pdf c.pdf
pdfllm-titler --last-page 8 d.pdf

Run pdfllm-titler --help for full details.

Download files

Source Distribution

pdf_llm_tools-0.0.2.tar.gz (3.3 kB)

Built Distribution

pdf_llm_tools-0.0.2-py3-none-any.whl (4.6 kB, Python 3)
