A family of LLM-enhanced PDF utilities
pdf-llm-tools
pdf-llm-tools is a family of AI PDF utilities:
- pdfllm-titler renames a PDF using metadata parsed from its filename and contents. Specifically, it renames the file to YEAR-AUTHOR-TITLE.pdf.
- (todo) pdfllm-toccer adds a bookmark structure parsed from the PDF's detected table of contents.
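As a rough illustration of the YEAR-AUTHOR-TITLE.pdf naming scheme, the sketch below builds such a filename from metadata that has already been extracted. The function name `build_filename` and the lowercase, hyphen-separated slug style are assumptions made for this example; the tool's actual normalization rules may differ.

```python
import re


def build_filename(year: int, author: str, title: str) -> str:
    """Build a YEAR-AUTHOR-TITLE.pdf name (hypothetical helper, not the tool's code)."""

    def slug(text: str) -> str:
        # Assumption: lowercase the text and collapse every run of
        # characters outside a-z and 0-9 into a single hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return f"{year}-{slug(author)}-{slug(title)}.pdf"


print(build_filename(1948, "Shannon", "A Mathematical Theory of Communication"))
```

Note that this slug rule drops non-ASCII letters, so a real implementation would probably want to transliterate them first.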
Currently OpenAI's gpt-3.5-turbo-1106 is hardcoded as the LLM backend. The program requires an API key, supplied via command-line option, environment variable, or manual input.
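The three ways of supplying the key could be resolved along the following lines. This is a minimal sketch, not the project's code: the function name `resolve_api_key`, the environment variable name `OPENAI_API_KEY`, and the precedence order (option, then environment, then prompt) are all assumptions.

```python
import os
from getpass import getpass
from typing import Callable, Mapping, Optional


def resolve_api_key(
    cli_value: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
    prompt: Callable[[str], str] = getpass,
) -> str:
    """Return an API key from the option, the environment, or a prompt (hypothetical helper)."""
    # 1. An explicit command-line value is assumed to take priority.
    if cli_value:
        return cli_value
    # 2. Fall back to the environment; the variable name is an assumption.
    env_value = env.get("OPENAI_API_KEY")
    if env_value:
        return env_value
    # 3. Otherwise ask interactively without echoing the key to the terminal.
    return prompt("OpenAI API key: ")
```

Passing `env` and `prompt` as parameters keeps the function easy to exercise without touching the real environment or terminal.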
Installation
pip install pdf-llm-tools
Usage
These utilities require all PDFs to have a correct OCR layer. Run something like OCRmyPDF if needed.
pdfllm-titler
usage: pdfllm-titler [-h] [--openai-api-key OPENAI_API_KEY] [--first-page FIRST_PAGE] [--last-page LAST_PAGE] fpath [fpath ...]

Rename PDF documents according to their contents.

positional arguments:
  fpath                 PDF to rename

options:
  -h, --help            show this help message and exit
  --openai-api-key OPENAI_API_KEY
                        OpenAI API key
  --first-page FIRST_PAGE, -f FIRST_PAGE
                        First page of pdf to read (default: 1)
  --last-page LAST_PAGE, -l LAST_PAGE
                        Last page of pdf to read (default: 5)
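To make the options above concrete, here is an `argparse` parser reconstructed from the help text alone. It is a sketch rather than the project's actual source: the function name `build_parser` and the use of `type=int` for the page options are assumptions, and the file names are placeholders.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented pdfllm-titler options (reconstruction)."""
    parser = argparse.ArgumentParser(
        prog="pdfllm-titler",
        description="Rename PDF documents according to their contents.",
    )
    # One or more PDF paths, matching "fpath [fpath ...]" in the usage line.
    parser.add_argument("fpath", nargs="+", help="PDF to rename")
    parser.add_argument("--openai-api-key", help="OpenAI API key")
    # Assumption: page numbers are parsed as integers.
    parser.add_argument("--first-page", "-f", type=int, default=1,
                        help="First page of pdf to read (default: %(default)s)")
    parser.add_argument("--last-page", "-l", type=int, default=5,
                        help="Last page of pdf to read (default: %(default)s)")
    return parser


# Example: read only pages 1 to 3 of two placeholder files.
args = build_parser().parse_args(["-l", "3", "paper-a.pdf", "paper-b.pdf"])
print(args.fpath, args.first_page, args.last_page)
```

The equivalent shell invocation would be `pdfllm-titler -l 3 paper-a.pdf paper-b.pdf`.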
Download files
Source Distribution
pdf_llm_tools-0.0.1.tar.gz (3.4 kB)
Built Distribution
Hashes for pdf_llm_tools-0.0.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 39f8957905e5bb6cfed02bd0870f894533ceac0025b707fff7542ccea07604f5
MD5 | 5c2d39284e9f18b79b27be419370f279
BLAKE2b-256 | 5b9a885fc415356020701873d78753dc9e5aba43426a5a43fadb26ee40f053dc