A Python library and CLI tool that uses LLMs to enhance PDF files
A Python library and CLI toolkit that brings PDF files alive with the power of LLMs.
Highlights
| Feature | Details |
|---|---|
| Automatic TOC generation | Generate clickable Table of Contents (PDF bookmarks) using LLM inference with intelligent batching for arbitrarily large documents |
| Smart OCR detection | Automatically detects scanned PDFs and performs OCR via Tesseract when needed |
| Intelligent file renaming | Batch rename files using natural language instructions with LLM-powered inference and confidence scoring |
| Multi-provider LLM support | Use any LLM provider via LangChain: OpenAI, Anthropic, local models via Ollama, and more |
| TOC postprocessing | Optional second LLM pass cross-references against printed TOC pages to fix typos, remove duplicates, and correct hierarchy |
| TOML configuration | Set persistent defaults for any CLI option via pdfalive.toml config files with per-command sections |
| Built-in resilience | Automatic retry logic with exponential backoff for handling API rate limits |
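The retry behavior in the last row can be pictured with a minimal sketch. This is not pdfalive's actual implementation: `RateLimitError`, `call_with_backoff`, and every parameter below are hypothetical names chosen for illustration, and the doubling schedule with jitter is one common way to do exponential backoff.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit exception."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimitError with exponentially growing waits."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay doubles on each attempt (1s, 2s, 4s, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2**attempt)
            # A little random jitter keeps concurrent clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise AssertionError("loop always returns or raises")
```

Passing `sleep` as a parameter is a design choice for testability: a test can record the waits instead of actually sleeping.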
Installation
Tesseract is required for OCR functionality. On macOS:
```bash
brew install tesseract
```
Install pdfalive via pip:
```bash
pip install pdfalive
```
Or run directly without installation using uvx:
```bash
uvx pdfalive generate-toc input.pdf output.pdf
```
Usage
Use --help on any command for detailed options:
```bash
pdfalive --help
pdfalive generate-toc --help
```
generate-toc
Generate a clickable Table of Contents using PDF bookmarks. The tool extracts font and text features from the PDF and uses an LLM to intelligently identify chapter and section headings.
```bash
pdfalive generate-toc input.pdf output.pdf

# Or modify the file in place
pdfalive generate-toc --inplace input.pdf
```
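To give a feel for how batching can keep arbitrarily large documents within an LLM's context limit, here is a minimal sketch. It is an illustration only, not pdfalive's code: `batch_pages`, the character budget, and the one-page overlap are all assumptions, and the per-page feature strings stand in for whatever font and text features the tool actually extracts.

```python
from typing import Iterator


def batch_pages(
    page_features: list[str],
    max_chars: int = 8000,
    overlap: int = 1,
) -> Iterator[list[tuple[int, str]]]:
    """Yield batches of (page_index, features) of roughly max_chars each.

    Consecutive batches share `overlap` trailing pages so a heading near a
    batch boundary is seen with some surrounding context. The budget is soft:
    carried-over pages or a single oversized page can push a batch past it.
    """
    batch: list[tuple[int, str]] = []
    size = 0
    for index, features in enumerate(page_features):
        if batch and size + len(features) > max_chars:
            yield batch
            # Carry the last `overlap` pages into the next batch.
            batch = batch[-overlap:] if overlap > 0 else []
            size = sum(len(text) for _, text in batch)
        batch.append((index, features))
        size += len(features)
    if batch:
        yield batch
```

Each batch would then be sent to the model in its own call, with the page indices letting the results be merged back into a single outline.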
Choosing an LLM:
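By default, pdfalive uses an OpenAI model (gpt-5.2 is the default listed for --model-identifier in the rename options). To use any other LangChain-supported model, pass --model-identifier: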
```bash
# Use Claude
pdfalive generate-toc --model-identifier 'claude-sonnet-4-5' input.pdf output.pdf

# Use a local model via Ollama
pdfalive generate-toc --model-identifier 'ollama/llama3' input.pdf output.pdf
```
Set the appropriate API key for your provider (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.).
Scanned PDFs:
OCR is enabled by default. Scanned documents without extractable text are automatically detected and processed:
```bash
# Default: OCR text layer discarded after TOC generation (preserves file size)
pdfalive generate-toc scanned.pdf output.pdf

# Include OCR text layer in output (makes PDF searchable)
pdfalive generate-toc --ocr-output scanned.pdf output.pdf

# Disable automatic OCR entirely
pdfalive generate-toc --no-ocr input.pdf output.pdf
```
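The README does not describe how scanned documents are detected. One plausible heuristic, sketched below under that caveat, is to check how many pages yield a meaningful amount of extractable text. The function name `looks_scanned` and both thresholds are hypothetical, and the input is assumed to be the text already extracted from each page.

```python
def looks_scanned(
    page_texts: list[str],
    min_chars_per_page: int = 20,
    max_text_page_ratio: float = 0.1,
) -> bool:
    """Return True when too few pages carry extractable text."""
    if not page_texts:
        return False  # an empty document has nothing to OCR
    # Count pages whose extracted text is long enough to be real content
    # rather than a stray page number or watermark.
    pages_with_text = sum(
        1 for text in page_texts if len(text.strip()) >= min_chars_per_page
    )
    return pages_with_text / len(page_texts) <= max_text_page_ratio
```

A ratio threshold rather than an all-or-nothing check tolerates scanned books that have a handful of born-digital pages, such as a publisher's cover sheet.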
Postprocessing:
For documents with a printed table of contents page, enable LLM postprocessing to refine results:
```bash
pdfalive generate-toc --postprocess input.pdf output.pdf
```
Postprocessing uses an additional LLM call to:
- Remove duplicate entries and fix typos
- Cross-reference against any printed TOC found in the document
- Add missing entries and correct page numbers
Other options:
| Option | Description |
|---|---|
| --inplace | Modify the input file in place instead of creating a new output file |
| --force | Overwrite existing TOC if the PDF already has bookmarks |
| --ocr-language | Set OCR language (default: eng). Use Tesseract language codes |
| --request-delay | Delay between LLM calls for rate limiting (default: 10s) |
extract-text
Extract text from scanned PDFs using OCR and save to a new PDF with an embedded text layer:
```bash
pdfalive extract-text input.pdf output.pdf

# Or modify the file in place
pdfalive extract-text --inplace input.pdf
```
This creates a searchable/selectable PDF without generating a TOC.
Options:
| Option | Description |
|---|---|
| --inplace | Modify the input file in place instead of creating a new output file |
| --force | Force OCR even if document already has text |
| --ocr-language | Set OCR language (default: eng) |
| --ocr-dpi | DPI resolution for OCR processing (default: 300) |
rename
Intelligently rename files using LLM inference. Analyzes filenames and applies renaming rules based on natural language instructions.
```bash
pdfalive rename -q "Add 'REVIEWED_' prefix" *.pdf
```
Custom naming formats:
Specify the exact format, including special characters. The LLM respects brackets, parentheses, dashes, and other formatting:
```bash
pdfalive rename -q "[Author Last Name] - Title (Year).pdf" paper1.pdf paper2.pdf
```
Reading paths from a file:
When dealing with many files or long filenames that exceed command-line limits, use the -f/--input-file option to read paths from a text file (one per line):
```bash
# Generate a list of files to rename
find /path/to/docs -name "*.pdf" > files.txt

# Rename using the file list
pdfalive rename -q "Standardize filenames" -f files.txt
```
In the input file, lines starting with # are treated as comments, and blank lines are ignored.
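The file format is simple enough to sketch a parser for. The helper below is hypothetical (`parse_path_lines` is not part of pdfalive's API) and assumes that surrounding whitespace is stripped before a line is checked, which the README does not specify.

```python
from pathlib import Path
from typing import Iterable


def parse_path_lines(lines: Iterable[str]) -> list[Path]:
    """Turn the lines of a path-list file into paths, one per non-comment line."""
    paths: list[Path] = []
    for line in lines:
        stripped = line.strip()  # assumption: leading/trailing whitespace is ignored
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        paths.append(Path(stripped))
    return paths
```

For a real file, something like `parse_path_lines(Path("files.txt").read_text(encoding="utf-8").splitlines())` would produce the list.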
Workflow:
- The tool analyzes each filename and generates rename suggestions
- A preview table shows original names, proposed names, confidence scores, and reasoning
- Confirm or cancel the operation (unless -y is used)
- Files are renamed in place
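The workflow above can be sketched as a preview-then-confirm loop. Everything here is illustrative rather than pdfalive's real code: `RenameSuggestion`, `apply_renames`, the 0.0 to 1.0 confidence scale, and the choice to skip existing targets are all assumptions.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass
class RenameSuggestion:
    original: Path
    proposed_name: str
    confidence: float  # assumed to be on a 0.0 to 1.0 scale
    reasoning: str


def apply_renames(
    suggestions: list[RenameSuggestion],
    confirm: Callable[[], bool],
    assume_yes: bool = False,
) -> list[Path]:
    """Preview the suggestions, ask for confirmation, then rename in place."""
    for s in suggestions:
        print(f"{s.original.name} -> {s.proposed_name} ({s.confidence:.0%}): {s.reasoning}")
    # assume_yes mirrors the -y flag: skip the confirmation prompt entirely.
    if not assume_yes and not confirm():
        return []
    renamed: list[Path] = []
    for s in suggestions:
        target = s.original.with_name(s.proposed_name)
        if target.exists():
            continue  # assumption: never overwrite an existing file
        renamed.append(s.original.rename(target))
    return renamed
```

In an interactive session, `confirm` could be as simple as `lambda: input("Apply? [y/N] ").lower() == "y"`.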
Automatic confirmation:
```bash
pdfalive rename -q "Add sequential numbering prefix" -y *.pdf
```
Options:
| Option | Description |
|---|---|
| -f, --input-file | Read input file paths from a text file (one per line) |
| --model-identifier | Choose which LLM to use (default: gpt-5.2) |
| -y, --yes | Automatically apply renames without confirmation |
| --show-token-usage | Display token usage statistics (default: enabled) |
Configuration
pdfalive supports TOML configuration files for setting default options. This is useful for frequently used settings such as the --query argument for rename.
Config file locations (searched in order):
- pdfalive.toml or .pdfalive.toml in the current directory
- pdfalive.toml or .pdfalive.toml in your home directory
- ~/.config/pdfalive/pdfalive.toml
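The search order above can be sketched as a first-match lookup. This is a hypothetical helper, not pdfalive's own function, and it assumes that within a directory `pdfalive.toml` is checked before `.pdfalive.toml`, which the list does not state explicitly.

```python
from pathlib import Path
from typing import Optional


def candidate_config_paths(cwd: Path, home: Path) -> list[Path]:
    """List config locations from highest to lowest search priority."""
    return [
        cwd / "pdfalive.toml",
        cwd / ".pdfalive.toml",
        home / "pdfalive.toml",
        home / ".pdfalive.toml",
        home / ".config" / "pdfalive" / "pdfalive.toml",
    ]


def find_config(cwd: Optional[Path] = None, home: Optional[Path] = None) -> Optional[Path]:
    """Return the first existing config file, or None if there is none."""
    candidates = candidate_config_paths(cwd or Path.cwd(), home or Path.home())
    return next((path for path in candidates if path.is_file()), None)
```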
Example pdfalive.toml:
```toml
# Global settings (shared across commands)
[global]
model-identifier = "gpt-5.2"
show-token-usage = true

# Settings for generate-toc command
[generate-toc]
force = false
request-delay = 10.0
ocr = true
ocr-language = "eng"
ocr-dpi = 300
postprocess = false

# Settings for extract-text command
[extract-text]
ocr-language = "eng"
ocr-dpi = 300
force = false

# Settings for rename command
[rename]
query = "Rename to \"[Author Last Name] Book Title, Edition (Year).pdf\""
yes = false
```
Using a specific config file:
```bash
pdfalive --config /path/to/config.toml rename document.pdf
```
Override hierarchy:
- Code defaults (lowest priority)
- Config file values
- CLI arguments (highest priority)
CLI arguments always override config file settings.
Development
We use uv to manage the project:
```bash
# Install dependencies
uv sync

# Install in editable mode
uv pip install -e .
```
Code quality tools:
| Tool | Purpose |
|---|---|
| ruff | Formatting and linting |
| mypy | Static type checking |
| pytest | Unit testing |
| pre-commit | Git hooks for quality checks |
```bash
# Run linting
uv run ruff check .
uv run ruff format .

# Run type checking
uv run mypy pdfalive

# Run tests
uv run pytest
```
License
pdfalive is distributed under the terms of the MIT License.