Library and CLI to convert PDF documents to clean, well-structured Markdown using LLM-assisted processing, leveraging Anthropic and OpenAI models for intelligent extraction of text, tables, and complex layouts.

Maintainers

densom

PDF to Markdown Converter

Features

Vision Mode: Enhanced extraction using multimodal AI for complex layouts, tables, charts, and diagrams
Multi-Provider Support: Use Anthropic (Claude) or OpenAI (GPT) models
Smart Conversion: Intelligently converts PDF content to clean markdown with proper formatting
Large File Support: Automatically chunks large PDFs for optimal processing
Batch Processing: Convert entire folders of PDFs with preserved directory structure
Table Preservation: Accurately converts tables to markdown format with vision-enhanced detection
Structure Detection: Automatically generates appropriate heading hierarchy
Dual Interface: Use as both a CLI tool and a Python library

Quick Start

# 1. Install with uv (recommended - faster)
uv tool install pdf-to-md-llm

# 2. Set your API key
export ANTHROPIC_API_KEY='your-api-key-here'

# 3. Convert a PDF
pdf-to-md-llm convert document.pdf --vision

Installation

Using uv (Recommended)

uv is a fast Python package installer:

# Install the package as a tool
uv tool install pdf-to-md-llm

# Or run directly without installing
uvx pdf-to-md-llm convert document.pdf

Using pip (Alternative)

pip install pdf-to-md-llm

Configuration

Set your API key for at least one provider:

# For Anthropic (Claude) - recommended
export ANTHROPIC_API_KEY='your-anthropic-api-key-here'

# For OpenAI (GPT)
export OPENAI_API_KEY='your-openai-api-key-here'

Or create a .env.local file:

ANTHROPIC_API_KEY=your-anthropic-api-key-here
OPENAI_API_KEY=your-openai-api-key-here
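As a rough illustration of the configuration above, the following sketch reports which providers have a key set. The helper name `configured_providers` and the `PROVIDER_ENV_VARS` mapping are hypothetical, written only for this example; they are not part of pdf-to-md-llm. Only the two environment variable names come from this README.

```python
import os
from typing import Dict, List, Mapping, Optional

# Environment variable names taken from this README. The mapping itself is
# a hypothetical helper structure, not something the package exposes.
PROVIDER_ENV_VARS: Dict[str, str] = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def configured_providers(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the providers whose API key variable is set and non-empty."""
    env = os.environ if env is None else env
    return [
        provider
        for provider, var in PROVIDER_ENV_VARS.items()
        if env.get(var, "").strip()
    ]
```

Calling `configured_providers()` with no argument inspects the current process environment, which is a quick way to confirm that an `export` took effect before running a conversion.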

Default Models (Optimized for Cost/Quality)

The tool uses cost-effective models by default:

Anthropic: claude-3-5-haiku-20241022 ($0.80 input / $4 output per million tokens)
OpenAI: gpt-4o-mini ($0.15 input / $0.60 output per million tokens)

These defaults handle most PDF conversion tasks well at a fraction of the cost of premium models. For complex documents that need maximum accuracy, you can override the default with a premium model:

# Use more powerful Anthropic model for complex documents
pdf-to-md-llm convert complex-doc.pdf --model claude-sonnet-4-20250514 --vision

# Use OpenAI's flagship model
pdf-to-md-llm convert complex-doc.pdf --provider openai --model gpt-4o --vision

To list the models available from your configured providers, see the List Available Models section.
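The per-million-token prices quoted for the default models can be turned into a rough cost estimate. The sketch below is only arithmetic on the figures listed in this README, which may be out of date; `estimate_cost` is a hypothetical helper, and actual token counts depend on the document and mode.

```python
# Prices in USD per million tokens, copied from the default-model list in
# this README. They are not fetched from any provider and may have changed.
PRICES_PER_MILLION = {
    "claude-3-5-haiku-20241022": {"input": 0.80, "output": 4.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Rough cost in USD for one conversion, given its token counts."""
    price = PRICES_PER_MILLION[model]
    return (
        input_tokens / 1_000_000 * price["input"]
        + output_tokens / 1_000_000 * price["output"]
    )
```

For a hypothetical document that consumes 200,000 input tokens and produces 50,000 output tokens, this gives about $0.36 with the Anthropic default and about $0.06 with the OpenAI default.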

Usage Examples

Basic Conversion

# Simple document conversion
pdf-to-md-llm convert document.pdf

# Specify output filename
pdf-to-md-llm convert document.pdf output.md

Scenario 1: Academic Papers with Tables

For research papers, technical documents, or any PDF with complex tables:

# Vision mode provides superior table extraction
pdf-to-md-llm convert research-paper.pdf --vision

Scenario 2: Large Documents (500+ pages)

For textbooks, manuals, or other large documents, use smaller chunks to keep memory use and request size down:

# Reduce chunk size for memory efficiency
pdf-to-md-llm convert textbook.pdf --vision --vision-pages-per-chunk 4

Scenario 3: Documents with Charts and Diagrams

For PDFs containing visual elements like charts, graphs, or diagrams:

# Vision mode analyzes images and describes visual content
pdf-to-md-llm convert annual-report.pdf --vision --vision-dpi 200

# Use vision-only mode to rely solely on image analysis (no extracted text)
# Useful for PDFs where text extraction is unreliable or when you want pure visual analysis
pdf-to-md-llm convert diagram-heavy.pdf --vision-only --vision-dpi 200

Scenario 4: Using OpenAI GPT Models

Switch to OpenAI for different model capabilities:

# Use GPT-4o for conversion
pdf-to-md-llm convert document.pdf --provider openai --model gpt-4o --vision

# Use GPT-4o-mini for cost savings
pdf-to-md-llm convert document.pdf --provider openai --model gpt-4o-mini

Scenario 5: Batch Processing Multiple Documents

Convert entire folders of PDFs:

# Convert all PDFs in a folder (single-threaded)
pdf-to-md-llm batch ./research-papers

# With custom output folder and vision mode
pdf-to-md-llm batch ./input-pdfs ./output-markdown --vision

# Skip files that already have .md output (useful for resuming interrupted batches)
pdf-to-md-llm batch ./pdfs --skip-existing --vision

# Batch with OpenAI
pdf-to-md-llm batch ./pdfs --provider openai --vision

# Use multithreading for faster batch conversion (2 threads)
pdf-to-md-llm batch ./pdfs --threads 2 --vision

# Use 4 threads for even faster processing
pdf-to-md-llm batch ./pdfs --threads 4 --vision

# Higher parallelism with 8 threads (be mindful of API rate limits)
pdf-to-md-llm batch ./large-batch --threads 8

# Combine skip-existing with multithreading for efficient resumption
pdf-to-md-llm batch ./large-batch --skip-existing --threads 4 --vision

Multithreading Benefits:

Reduces total conversion time for large batches by converting several files in parallel
Efficiently utilizes multi-core processors
Thread count can be adjusted based on system resources and API rate limits
Default is single-threaded (1 thread) to avoid rate limit issues
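The package's own threading code is not shown in this README, so the following is only a minimal sketch of the general pattern, assuming each file is converted independently of the others. `convert_all` and its `convert_one` callback are hypothetical names; in real use `convert_one` would wrap a call such as `convert_pdf_to_markdown`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def convert_all(
    pdf_paths: Iterable[str],
    convert_one: Callable[[str], str],
    threads: int = 1,
) -> Dict[str, str]:
    """Run convert_one over every path using a pool of worker threads.

    Returns a dict mapping each input path to convert_one's result.
    """
    paths = list(pdf_paths)
    with ThreadPoolExecutor(max_workers=max(1, threads)) as pool:
        # pool.map preserves input order, so results line up with paths.
        results = list(pool.map(convert_one, paths))
    return dict(zip(paths, results))
```

With `threads=1` the pool has a single worker and files are processed one after another, which matches the conservative default described above.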

Scenario 6: Simple Text Documents

For PDFs with simple text layout (no tables or complex formatting), standard mode is faster and more cost-effective:

# Standard mode (no vision) - faster and cheaper
pdf-to-md-llm convert simple-doc.pdf

# Adjust chunk size for standard mode
pdf-to-md-llm convert simple-doc.pdf --pages-per-chunk 10

Getting Help

# Check the installed version
pdf-to-md-llm --version

# Show all available options
pdf-to-md-llm --help

# Show help for specific commands
pdf-to-md-llm convert --help
pdf-to-md-llm batch --help
pdf-to-md-llm models --help

List Available Models

Check which AI models are available from your configured providers:

# List all available models from all configured providers
pdf-to-md-llm models

# List models from a specific provider
pdf-to-md-llm models --provider anthropic
pdf-to-md-llm models --provider openai

The models command will:

Show available models, querying only providers that have valid API keys in your environment
Display the default model for each provider

Using as a Python Library

First, add the package to your project:

# Using uv (recommended)
uv add pdf-to-md-llm

# Or using pip
pip install pdf-to-md-llm

Then import and use in your Python code:

from pdf_to_md_llm import convert_pdf_to_markdown, batch_convert

# Convert with vision mode (recommended for complex layouts)
markdown_content = convert_pdf_to_markdown(
    pdf_path="document.pdf",
    output_path="output.md",  # Optional
    provider="anthropic",  # 'anthropic' or 'openai'
    use_vision=True,  # Enable vision mode
    pages_per_chunk=8,  # Pages per chunk (vision default: 8)
    verbose=True  # Show progress
)

# Convert with vision-only mode (no extracted text, just images)
markdown_content = convert_pdf_to_markdown(
    pdf_path="scanned-document.pdf",
    provider="anthropic",
    vision_only=True,  # Only use images, skip extracted text
    vision_dpi=200,  # Higher DPI for better quality
    verbose=True
)

# Use OpenAI with custom model
markdown_content = convert_pdf_to_markdown(
    pdf_path="document.pdf",
    provider="openai",
    model="gpt-4o",
    use_vision=True,
    api_key="your-openai-key"  # Optional if env var set
)

# Batch convert all PDFs in a folder
batch_convert(
    input_folder="./pdfs",
    output_folder="./markdown",  # Optional
    provider="anthropic",
    use_vision=True,
    verbose=True
)

# Batch convert with multithreading for faster processing
batch_convert(
    input_folder="./pdfs",
    output_folder="./markdown",
    provider="anthropic",
    use_vision=True,
    threads=4,  # Use 4 threads for parallel processing
    verbose=True
)

# Batch convert with skip_existing to resume interrupted batches
batch_convert(
    input_folder="./pdfs",
    output_folder="./markdown",
    provider="anthropic",
    use_vision=True,
    skip_existing=True,  # Skip files that already have .md output
    threads=4,
    verbose=True
)

Advanced Library Usage

from pdf_to_md_llm import extract_text_from_pdf, extract_pages_with_vision, chunk_pages

# Extract text only (standard mode)
pages = extract_text_from_pdf("document.pdf")
print(f"Found {len(pages)} pages")

# Extract with vision data (text + images)
vision_pages = extract_pages_with_vision("document.pdf", dpi=150)
for page in vision_pages:
    print(f"Page {page['page_num']}: has_tables={page['has_tables']}, has_images={page['has_images']}")

# Create custom chunks
chunks = chunk_pages(pages, pages_per_chunk=5)
print(f"Created {len(chunks)} chunks")

How It Works

Standard Mode

Text Extraction: Extracts text from PDF using PyMuPDF
Chunking: Breaks content into manageable chunks (default: 5 pages per chunk)
LLM Processing: Sends each chunk to your chosen AI provider for intelligent markdown conversion
Reassembly: Combines all chunks into a single, formatted markdown document
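The chunk, convert, and reassemble steps above can be sketched as follows. This is not the package's implementation: `chunk_pages_sketch` and `convert_document` are hypothetical names, the blank-line separator between pages and chunks is an assumption, and `convert_chunk` stands in for the LLM call.

```python
from typing import Callable, List


def chunk_pages_sketch(pages: List[str], pages_per_chunk: int = 5) -> List[str]:
    """Join consecutive pages into chunks of at most pages_per_chunk pages."""
    if pages_per_chunk < 1:
        raise ValueError("pages_per_chunk must be at least 1")
    return [
        "\n\n".join(pages[i : i + pages_per_chunk])
        for i in range(0, len(pages), pages_per_chunk)
    ]


def convert_document(
    pages: List[str],
    convert_chunk: Callable[[str], str],
    pages_per_chunk: int = 5,
) -> str:
    """Chunk the pages, convert each chunk, and reassemble the results."""
    chunks = chunk_pages_sketch(pages, pages_per_chunk)
    return "\n\n".join(convert_chunk(chunk) for chunk in chunks)
```

A 12-page document with the default of 5 pages per chunk yields three chunks of 5, 5, and 2 pages, so the stand-in `convert_chunk` is called three times.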

Vision Mode (Recommended)

Multimodal Extraction: Extracts both text and renders page images from PDF
Smart Chunking: Groups pages into larger chunks (default: 8 pages) for better context
Visual Analysis: AI analyzes both text and images for superior layout understanding
Enhanced Accuracy: Better detection of tables, charts, diagrams, and complex layouts
Reassembly: Combines chunks with intelligent deduplication of headers/footers
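This README does not describe how header and footer deduplication is done. One simple heuristic, shown here purely as an illustration and not as the package's method, is to drop a chunk's first or last line when the same line sits at the edge of several chunks. The names `edge_lines` and `strip_running_lines` and the `min_repeats` threshold are all assumptions.

```python
from collections import Counter
from typing import List, Set


def edge_lines(chunk: str) -> Set[str]:
    """Return the first and last non-blank lines of a chunk, stripped."""
    lines = [line.strip() for line in chunk.splitlines() if line.strip()]
    return {lines[0], lines[-1]} if lines else set()


def strip_running_lines(chunks: List[str], min_repeats: int = 3) -> List[str]:
    """Drop a chunk's first or last line when it recurs across chunks."""
    counts: Counter = Counter()
    for chunk in chunks:
        counts.update(edge_lines(chunk))
    repeated = {line for line, n in counts.items() if n >= min_repeats}

    cleaned = []
    for chunk in chunks:
        lines = chunk.strip().splitlines()
        if lines and lines[0].strip() in repeated:
            lines = lines[1:]  # running header
        if lines and lines[-1].strip() in repeated:
            lines = lines[:-1]  # running footer
        cleaned.append("\n".join(lines))
    return cleaned
```

Only edge lines are considered, so content that legitimately repeats in the middle of a chunk, such as table separator rows, is left untouched.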

When to use Vision Mode:

Documents with tables or complex layouts
PDFs containing charts, diagrams, or visual elements
Academic papers or technical documentation
Any document where layout matters

Vision-Only Mode:

Use the --vision-only flag to send only page images to the AI, without extracted text. This mode:

Relies completely on visual analysis of page images
Useful when PDF text extraction produces garbled or unreliable text
Better for image-heavy documents, scanned PDFs, or when layout is critical
Still uses chunking (controlled by --vision-pages-per-chunk)
Automatically enables --vision mode

Performance Tips

Choosing Between Standard and Vision Mode

Use Vision Mode when:

PDF contains tables, charts, or diagrams
Layout and formatting are important
You need accurate table extraction
Document has complex multi-column layouts

Use Vision-Only Mode when:

Text extraction produces garbled or unreliable output
You are working with scanned PDFs or images embedded in PDFs
Visual layout is more important than extracted text
You want pure AI visual analysis without text hints

Use Standard Mode when:

The document is simple and text-only
Speed and cost are priorities
The document has a straightforward single-column layout

Chunk Size Optimization

Larger chunks (8-10 pages):

Better context for the AI model
More efficient API usage
Better for documents with consistent formatting
Default for vision mode

Smaller chunks (3-5 pages):

Better for very large documents (500+ pages)
Reduces memory usage
Helpful when hitting API token limits
Default for standard mode
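Since each chunk corresponds to one API call, the trade-off above can be made concrete with a little arithmetic. `api_calls_needed` is a hypothetical helper written for this example.

```python
import math


def api_calls_needed(total_pages: int, pages_per_chunk: int) -> int:
    """Number of chunks, and so API calls, needed to cover total_pages."""
    if pages_per_chunk < 1:
        raise ValueError("pages_per_chunk must be at least 1")
    return math.ceil(total_pages / pages_per_chunk)
```

For a 500-page document, 8 pages per chunk means 63 calls, 5 pages per chunk means 100 calls, and 4 pages per chunk means 125 calls, each one smaller than before.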

Vision Mode Settings

DPI Settings:

Default (150 DPI): Good balance of quality and performance
High quality (200-300 DPI): For small text or detailed diagrams
Lower (100 DPI): Faster processing, suitable for simple layouts
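To see why DPI affects memory and request size, it helps to compute the pixel dimensions of a rendered page. PDF page sizes are measured in points (1/72 inch), so the scale factor is the DPI divided by 72. The sketch below assumes a US Letter page of 612 x 792 points; `rendered_size` is a hypothetical helper, not part of the package.

```python
from typing import Tuple

POINTS_PER_INCH = 72  # PDF page sizes are expressed in points (1/72 inch)


def rendered_size(width_pt: float, height_pt: float, dpi: int) -> Tuple[int, int]:
    """Pixel width and height of a page rendered at the given DPI."""
    scale = dpi / POINTS_PER_INCH
    return round(width_pt * scale), round(height_pt * scale)
```

A Letter page comes out at 850 x 1100 pixels at 100 DPI, 1275 x 1650 at 150 DPI, and 2550 x 3300 at 300 DPI. Pixel count grows with the square of the DPI, so 300 DPI produces four times as many pixels per page as 150 DPI.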

Adjusting chunk size in vision mode:

# Smaller chunks for very large documents
pdf-to-md-llm convert large.pdf --vision --vision-pages-per-chunk 4

# Larger chunks for better context
pdf-to-md-llm convert doc.pdf --vision --vision-pages-per-chunk 12

# Vision-only mode with custom chunk size
pdf-to-md-llm convert scanned.pdf --vision-only --vision-pages-per-chunk 6

Troubleshooting

API Key Errors

Error: ValueError: API key not found

Solution:

Verify your API key is set in environment variables
Check the key name matches your provider (ANTHROPIC_API_KEY or OPENAI_API_KEY)
Ensure the key is valid and not expired

Rate Limiting

Error: API rate limit exceeded

Solution:

Reduce chunk size to make smaller API requests
Add delays between batch conversions
Upgrade your API plan for higher limits
Switch providers if one is experiencing issues
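Adding delays can be automated with a retry wrapper that waits longer after each failure. The following is a generic exponential backoff sketch, not something the package provides: `with_backoff` is a hypothetical helper, and `RateLimitError` is a stand-in class that you would replace with whatever exception the conversion call actually raises when a rate limit is hit.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit exception."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists so the wait can be replaced in tests; in normal use the default `time.sleep` is what you want.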

Large File Issues

Error: Memory errors or timeouts on large PDFs

Solution:

Use smaller chunk sizes: --vision-pages-per-chunk 3
Process in batches by splitting the PDF first
Use standard mode instead of vision for simple documents
Increase available system memory

Vision Mode Memory Issues

Error: Out of memory when using vision mode

Solution:

Reduce DPI: --vision-dpi 100
Use smaller chunks: --vision-pages-per-chunk 4
Process fewer pages at once
Close other applications to free memory

Poor Quality Output

Problem: Markdown output has formatting issues

Solution:

Try vision mode for better layout detection: --vision
Increase DPI for better image quality: --vision-dpi 200
Try vision-only mode if extracted text is garbled: --vision-only
Try different models: --provider openai --model gpt-4o
Adjust chunk size for better context

API Reference

Main Functions

`convert_pdf_to_markdown()`

def convert_pdf_to_markdown(
    pdf_path: str,
    output_path: Optional[str] = None,
    pages_per_chunk: int = 5,
    provider: str = "anthropic",
    api_key: Optional[str] = None,
    model: Optional[str] = None,
    max_tokens: int = 4000,
    verbose: bool = True,
    use_vision: bool = False,
    vision_dpi: int = 150,
    vision_only: bool = False
) -> str

Convert a single PDF to markdown.

Parameters:

pdf_path: Path to the PDF file
output_path: Optional output file path (defaults to PDF name with .md extension)
pages_per_chunk: Number of pages per API call (default: 5 for standard, 8 for vision)
provider: AI provider - 'anthropic' or 'openai' (default: 'anthropic')
api_key: API key (defaults to provider-specific environment variable)
model: Model to use (optional, uses provider defaults)
max_tokens: Maximum tokens per API call (default: 4000)
verbose: Print progress messages (default: True)
use_vision: Enable vision mode for better extraction (default: False)
vision_dpi: DPI for page images in vision mode (default: 150)
vision_only: Use only images without extracted text (default: False, automatically enables use_vision)

Returns: The complete markdown content as a string

Raises: ValueError if API key is missing or provider is invalid

`batch_convert()`

def batch_convert(
    input_folder: str,
    output_folder: Optional[str] = None,
    pages_per_chunk: int = 5,
    provider: str = "anthropic",
    api_key: Optional[str] = None,
    model: Optional[str] = None,
    max_tokens: int = 4000,
    verbose: bool = True,
    use_vision: bool = False,
    vision_dpi: int = 150,
    vision_only: bool = False,
    threads: int = 1,
    skip_existing: bool = False
) -> None

Convert all PDFs in a folder to markdown.

Parameters:

input_folder: Folder containing PDF files
output_folder: Optional output folder (defaults to input folder)
vision_only: Use only images without extracted text (default: False, automatically enables use_vision)
threads: Number of threads for parallel processing (default: 1, which is single-threaded)
skip_existing: Skip files that already have corresponding .md files in output directory (default: False)
All other parameters same as convert_pdf_to_markdown()
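To illustrate what `skip_existing` and the preserved directory structure imply, here is a minimal sketch that lists the PDFs still lacking a Markdown counterpart. It is an assumption about the behavior described above rather than the package's code, and `pending_conversions` is a hypothetical name.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def pending_conversions(
    input_folder: str, output_folder: Optional[str] = None
) -> List[Tuple[Path, Path]]:
    """List (pdf, markdown) path pairs whose markdown file does not exist yet."""
    src = Path(input_folder)
    dst = Path(output_folder) if output_folder else src
    pairs = []
    for pdf in sorted(src.rglob("*.pdf")):
        # Mirror the PDF's relative location under the output folder.
        md = (dst / pdf.relative_to(src)).with_suffix(".md")
        if not md.exists():
            pairs.append((pdf, md))
    return pairs
```

Running this before and after an interrupted batch shows which files a resumed run would still need to process.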

Note on Multithreading:

Single-threaded (threads=1): Default, sequential processing
Multithreaded (threads>1): Parallel processing for faster batch conversion
Be mindful of API rate limits when using higher thread counts
Progress output is simplified in multithreaded mode for clarity

`extract_text_from_pdf()`

def extract_text_from_pdf(pdf_path: str) -> List[str]

Extract raw text from PDF (standard mode).

Returns: List of strings, one per page

`extract_pages_with_vision()`

def extract_pages_with_vision(pdf_path: str, dpi: int = 150) -> List[Dict[str, Any]]

Extract text and images from PDF pages for vision processing.

Returns: List of dicts with keys: page_num, text, image_base64, has_images, has_tables
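The returned dicts can be given a type for editor support. The key names below come from the description above, but the field types are guesses, and both `VisionPage` and `pages_with_tables` are hypothetical names introduced for this sketch.

```python
from typing import List, TypedDict


class VisionPage(TypedDict):
    """Assumed shape of one page record; the field types are guesses."""

    page_num: int
    text: str
    image_base64: str
    has_images: bool
    has_tables: bool


def pages_with_tables(pages: List[VisionPage]) -> List[int]:
    """Return the page numbers of pages flagged as containing tables."""
    return [page["page_num"] for page in pages if page["has_tables"]]
```

A filter like this could be used to decide whether a document is worth sending through vision mode at all.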

`chunk_pages()`

def chunk_pages(pages: List[str], pages_per_chunk: int) -> List[str]

Combine pages into chunks for processing.

Returns: List of combined page chunks

Output Format

Converted markdown files include:

Document title header
Clean heading hierarchy
Properly formatted tables
Organized lists
Removed page numbers and PDF artifacts
Conversion metadata footer

Requirements

Python 3.9 or higher
API key for at least one provider:
- Anthropic API key (for Claude models)
- OpenAI API key (for GPT models)

Dependencies

All dependencies are automatically installed:

anthropic: Claude API client (for Anthropic provider)
openai: OpenAI API client (for OpenAI provider)
pymupdf: PDF text and image extraction
python-dotenv: Environment variable management
click: CLI framework

License

This project is open source and available under the MIT License.

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for development setup, testing, and contribution guidelines.

For bug reports and feature requests, please open an issue on GitHub.

Release history

Version	Release date
2.7.1 (this version)	Oct 21, 2025
2.7.0	Oct 9, 2025
2.6.0	Oct 8, 2025
2.5.0	Oct 8, 2025
2.4.0	Oct 8, 2025
2.3.0	Oct 7, 2025
2.2.1	Oct 7, 2025
2.2.0	Oct 7, 2025
2.1.0	Oct 6, 2025
2.0.2	Oct 6, 2025
2.0.1	Oct 6, 2025
2.0.0	Oct 6, 2025
1.1.0	Oct 6, 2025
1.0.5	Oct 5, 2025
1.0.4	Oct 4, 2025
1.0.3	Oct 4, 2025
1.0.2	Oct 4, 2025
1.0.1	Oct 4, 2025
1.0.0	Oct 4, 2025
0.3.0	Oct 4, 2025
0.2.0	Oct 3, 2025
0.1.2	Oct 3, 2025
0.1.1	Oct 3, 2025
0.1.0	Oct 3, 2025

Download files

Download the file for your platform.

Source Distribution

pdf_to_md_llm-2.7.1.tar.gz (46.8 MB)

Uploaded Oct 21, 2025 Source

Built Distribution

pdf_to_md_llm-2.7.1-py3-none-any.whl (26.9 kB)

Uploaded Oct 21, 2025 Python 3

File details

Details for the file pdf_to_md_llm-2.7.1.tar.gz.

File metadata

Download URL: pdf_to_md_llm-2.7.1.tar.gz
Upload date: Oct 21, 2025
Size: 46.8 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.9.4

File hashes

Hashes for pdf_to_md_llm-2.7.1.tar.gz
Algorithm	Hash digest
SHA256	`c8eda96631fa821eb74929622f3d8f3a318bbe0d773da45345924d41e2374f03`
MD5	`e702e6cfb3ab8a662ddf99f9c1d08055`
BLAKE2b-256	`60b3b79441d940a141e2f3dd06bae59196a8dcc3ffff4dc0a1445662326d8bf5`

File details

Details for the file pdf_to_md_llm-2.7.1-py3-none-any.whl.

File metadata

Download URL: pdf_to_md_llm-2.7.1-py3-none-any.whl
Upload date: Oct 21, 2025
Size: 26.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.9.4

File hashes

Hashes for pdf_to_md_llm-2.7.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6d23d5ae31a18579918dce5bded35208856da255a6c50c612347297c9f3f9c29`
MD5	`bbc5af060d4e0222a22a6be460b8ca39`
BLAKE2b-256	`1d094556012504fb9c29f310abb2338cad6f7e2410e43e56cdbd9664eab736aa`
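A downloaded file can be checked against the SHA256 digests listed above with the standard library. This is a generic sketch; `sha256_of` and `matches_published` are hypothetical helper names, and the file path passed in is whatever location you saved the download to.

```python
import hashlib
from pathlib import Path


def sha256_of(path: str, block_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in blocks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_published(path: str, published_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == published_hex.strip().lower()
```

Reading in blocks keeps memory use flat, which matters for the 46.8 MB source distribution.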
