A PDF processing toolkit that converts PDFs to Markdown, with AI-generated descriptions of images and tables. It supports local files and URLs, preserves document structure, extracts images, detects tables with Microsoft's Table Transformer, and generates content descriptions through six LLM providers: OpenAI, Google Gemini, Anthropic Claude, Groq, OpenRouter, and LiteLLM.

Markdrop

A Python package for converting PDFs to structured Markdown and interactive HTML, with AI-powered image and table descriptions across six major LLM providers. Available on PyPI.

Features

PDF → Markdown conversion with formatting preservation (via Docling)
Automatic image extraction using XRef IDs
Table detection using Microsoft's Table Transformer
PDF URL support
AI-powered image and table descriptions — 6 providers: Gemini, OpenAI, Anthropic Claude, Groq, OpenRouter, LiteLLM
Interactive HTML output with downloadable Excel tables
Customisable image resolution and UI elements
Structured logging (never pollutes your app's root logger)
Support for DOCX / PPTX input

Installation

Core install (PDF conversion + Gemini/OpenAI):

pip install markdrop

With Anthropic Claude:

pip install "markdrop[anthropic]"

With Groq:

pip install "markdrop[groq]"

With LiteLLM (routes to 100+ providers):

pip install "markdrop[litellm]"

Everything (including local HuggingFace models):

pip install "markdrop[all]"

OpenRouter is accessed through the openai package (already included in core), so no extra install is needed.
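
Before picking an `--ai_provider`, it can help to check which optional SDKs are present in the current environment. The following is a minimal sketch, not part of markdrop: the helper name `installed_provider_sdks` is hypothetical, and it assumes each extra installs a package whose import name matches the extra (`anthropic`, `groq`, `litellm`).

```python
from importlib.util import find_spec

# Assumed import names for the SDK behind each optional extra.
OPTIONAL_SDKS = {
    "anthropic": "anthropic",
    "groq": "groq",
    "litellm": "litellm",
}


def installed_provider_sdks(sdks=OPTIONAL_SDKS):
    """Return {extra_name: bool} telling whether each SDK can be imported."""
    return {extra: find_spec(module) is not None for extra, module in sdks.items()}


if __name__ == "__main__":
    for extra, present in installed_provider_sdks().items():
        hint = "installed" if present else f'missing, try: pip install "markdrop[{extra}]"'
        print(f"{extra}: {hint}")
```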

Supported AI Providers

Provider	`--ai_provider`	Default model	Vision
Google Gemini	`gemini`	`gemini-3.1-flash-lite`	✅
OpenAI	`openai`	`gpt-5.4`	✅
Anthropic Claude	`anthropic`	`claude-opus-4-6`	✅
Groq	`groq`	`meta-llama/llama-4-maverick-17b-128e-instruct`	✅
OpenRouter	`openrouter`	`google/gemini-3.1-flash-lite` (any model)	✅
LiteLLM	`litellm`	`openai/gpt-5.4` (configurable)	✅

All models are configurable — use --model to override for any provider, or set model_name_override in ProcessorConfig.
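
The override behaviour described above can be pictured as a simple lookup with a fallback. This is an illustrative sketch only: `resolve_model` and `DEFAULT_MODELS` are hypothetical names, not markdrop internals, and the defaults are copied from the table above, so they may differ between releases.

```python
# Defaults copied from the provider table above (illustrative only).
DEFAULT_MODELS = {
    "gemini": "gemini-3.1-flash-lite",
    "openai": "gpt-5.4",
    "anthropic": "claude-opus-4-6",
    "groq": "meta-llama/llama-4-maverick-17b-128e-instruct",
    "openrouter": "google/gemini-3.1-flash-lite",
    "litellm": "openai/gpt-5.4",
}


def resolve_model(provider, override=None):
    """Pick the override when one is given, otherwise the provider's default."""
    if provider not in DEFAULT_MODELS:
        raise ValueError(f"unknown provider: {provider!r}")
    return override or DEFAULT_MODELS[provider]
```

For example, `resolve_model("anthropic")` falls back to the table default, while `resolve_model("anthropic", "claude-sonnet-4-5")` returns the override unchanged.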

Quick Start

CLI Usage

1. Convert PDF → Markdown + HTML

markdrop convert <input_path> --output_dir <dir> [--add_tables]

# Example
markdrop convert report.pdf --output_dir out --add_tables
# Also works with URLs:
markdrop convert https://arxiv.org/pdf/1706.03762 --output_dir out
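
To convert several inputs in one go, the documented command can be driven from a short script. This is a sketch under the assumption that the `markdrop` executable is on `PATH`; `build_convert_cmd` and `convert_all` are hypothetical helpers that use only the arguments shown above.

```python
import subprocess


def build_convert_cmd(input_path, output_dir, add_tables=False):
    """Build the argument list for `markdrop convert` as documented above."""
    cmd = ["markdrop", "convert", str(input_path), "--output_dir", str(output_dir)]
    if add_tables:
        cmd.append("--add_tables")
    return cmd


def convert_all(inputs, output_dir, add_tables=False):
    """Run one conversion per input (local path or URL); stop on the first failure."""
    for item in inputs:
        subprocess.run(build_convert_cmd(item, output_dir, add_tables), check=True)
```

A call such as `convert_all(["report.pdf", "https://arxiv.org/pdf/1706.03762"], "out", add_tables=True)` would then run the two example conversions in sequence.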

2. Generate AI Descriptions for Images & Tables

markdrop describe <markdown_file> --ai_provider <provider> [--output_dir <dir>] [--remove_images] [--remove_tables]

Provider	`--ai_provider`
Google Gemini	`gemini`
OpenAI	`openai`
Anthropic Claude	`anthropic`
Groq	`groq`
OpenRouter	`openrouter`
LiteLLM	`litellm`

# Gemini (default)
markdrop describe doc.md --ai_provider gemini

# Anthropic Claude
markdrop describe doc.md --ai_provider anthropic --remove_images

# Groq (fastest inference)
markdrop describe doc.md --ai_provider groq

# OpenRouter (any model)
markdrop describe doc.md --ai_provider openrouter

# LiteLLM (unified gateway)
markdrop describe doc.md --ai_provider litellm

3. Set Up API Keys

markdrop setup <provider>

Keys are stored in <package-root>/.env with 0o600 permissions on POSIX systems.

markdrop setup gemini       # → GEMINI_API_KEY
markdrop setup openai       # → OPENAI_API_KEY
markdrop setup anthropic    # → ANTHROPIC_API_KEY
markdrop setup groq         # → GROQ_API_KEY
markdrop setup openrouter   # → OPENROUTER_API_KEY
markdrop setup litellm      # → LITELLM_API_KEY
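
For readers who manage keys themselves, the sketch below shows one way to write a `NAME=value` line to a `.env` file that only the owner can read or write. It is not markdrop's implementation: `save_key` is a hypothetical helper, the variable names are taken from the list above, and the sketch appends without updating an existing entry.

```python
import os
from pathlib import Path

# Environment variable names as listed above.
PROVIDER_ENV_VARS = {
    "gemini": "GEMINI_API_KEY",
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "groq": "GROQ_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
    "litellm": "LITELLM_API_KEY",
}


def save_key(env_file, provider, key):
    """Append NAME=value to env_file and restrict it to owner read/write."""
    name = PROVIDER_ENV_VARS[provider]
    path = Path(env_file)
    # Create the file with mode 0o600 if it does not exist yet.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    with os.fdopen(fd, "a", encoding="utf-8") as handle:
        handle.write(f"{name}={key}\n")
    if os.name == "posix":
        # Tighten permissions on a pre-existing file as well.
        os.chmod(path, 0o600)
    return name
```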

4. Analyze Images in a PDF

markdrop analyze report.pdf --output_dir pdf_analysis --save_images

5. Batch Image Description Generation

markdrop generate images/ --output_dir descriptions/ --prompt "Describe in detail." \
  --llm_client gemini openai

Available --llm_client values: qwen, gemini, openai, llama-vision, molmo, pixtral

Python API

PDF Conversion

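import logging

from markdrop import markdrop, MarkDropConfig, add_downloadable_tables

# Conversion settings; see the MarkDropConfig table for defaults.
config = MarkDropConfig(
    image_resolution_scale=2.0,          # scale factor for extracted images
    download_button_color='#444444',     # colour of the HTML download buttons
    log_level=logging.INFO,
    log_dir='logs',
    excel_dir='markdrop-excel-tables',   # directory for the exported Excel tables
)

# Convert the PDF; the returned path is then passed to add_downloadable_tables.
html_path = markdrop("path/to/input.pdf", "output", config)
downloadable_html = add_downloadable_tables(html_path, config)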

AI Descriptions

from markdrop import process_markdown, ProcessorConfig, AIProvider, setup_keys

# One-time key setup (writes to .env)
setup_keys('anthropic')

config = ProcessorConfig(
    input_path="doc.md",
    output_dir="output",
    ai_provider=AIProvider.ANTHROPIC,       # GEMINI | OPENAI | ANTHROPIC | GROQ | OPENROUTER | LITELLM
    remove_images=False,
    remove_tables=False,
    table_descriptions=True,
    image_descriptions=True,
    max_retries=3,
    retry_delay=2,
    # Override default models (all providers have matching config fields):
    anthropic_model_name="claude-sonnet-4-5",    # faster / cheaper
    anthropic_text_model_name="claude-sonnet-4-5",
)

output_path = process_markdown(config)
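
The `max_retries` and `retry_delay` fields suggest a retry loop around each provider call. How markdrop implements it is not documented here, so the following is only a sketch of one plausible reading: a fixed delay between attempts, with `max_retries` treated as the total number of attempts. `call_with_retries` is a hypothetical helper.

```python
import time


def call_with_retries(func, max_retries=3, retry_delay=2.0, sleep=time.sleep):
    """Call func(); on an exception wait retry_delay seconds and try again.

    max_retries is treated as the total number of attempts (an assumption).
    """
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return func()
        except Exception as error:  # a real client would catch narrower errors
            last_error = error
            if attempt < max_retries:
                sleep(retry_delay)
    raise last_error
```

The `sleep` parameter exists so the delay can be replaced in tests; in normal use the default `time.sleep` applies.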

Using OpenRouter to access any model

from markdrop import ProcessorConfig, AIProvider

config = ProcessorConfig(
    input_path="doc.md",
    output_dir="output",
    ai_provider=AIProvider.OPENROUTER,
    openrouter_model_name="meta-llama/llama-4-scout",   # any model on openrouter.ai/models
    openrouter_text_model_name="anthropic/claude-sonnet-4-5",
    openrouter_site_url="https://yoursite.com",
    openrouter_site_name="My App",
)

Using LiteLLM for any of 100+ providers

import os

from markdrop import ProcessorConfig, AIProvider

os.environ["ANTHROPIC_API_KEY"] = "..."   # set any provider's key

config = ProcessorConfig(
    input_path="doc.md",
    output_dir="output",
    ai_provider=AIProvider.LITELLM,
    litellm_model_name="anthropic/claude-opus-4-6",
    litellm_text_model_name="groq/llama-3.3-70b-versatile",
)
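
The LiteLLM model strings above follow a `provider/model` pattern. A small validator can catch a missing prefix before any request is made. This is a hypothetical helper, not part of markdrop or LiteLLM; it splits on the first slash only, so model names that themselves contain a slash stay intact.

```python
def split_model_id(model_id):
    """Split 'provider/model' on the first slash only."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")
    return provider, model
```

With this, `split_model_id("groq/llama-3.3-70b-versatile")` yields the provider and model as separate strings, while a bare name such as `"gpt-4o"` raises `ValueError`.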

Batch Image Description Generation

from markdrop import generate_descriptions

generate_descriptions(
    input_path='images/',
    output_dir='output/',
    prompt='Give a highly detailed description of this image.',
    llm_client=['gemini', 'llama-vision'],
)
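
Before starting a batch run, it can be useful to see which files in the input directory would count as images. The sketch below is a pre-flight check under stated assumptions: `list_images` is a hypothetical helper, and the extension set is a guess, so markdrop's own file filter may differ.

```python
from pathlib import Path

# Assumed set of extensions; markdrop's own filter may differ.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".gif", ".bmp"}


def list_images(input_path):
    """Return image files for a single file or a directory (non-recursive), sorted."""
    path = Path(input_path)
    if path.is_file():
        return [path] if path.suffix.lower() in IMAGE_SUFFIXES else []
    return sorted(
        p for p in path.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```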

API Reference

`ProcessorConfig` – AI Provider Fields

Field	Default	Notes
`gemini_model_name`	`gemini-2.0-flash`	Vision model
`gemini_text_model_name`	`gemini-2.0-flash`	Text model
`openai_model_name`	`gpt-4o`	Vision + text
`openai_text_model_name`	`gpt-4o`
`anthropic_model_name`	`claude-opus-4-6`	Vision
`anthropic_text_model_name`	`claude-sonnet-4-5`	Text (cheaper)
`groq_model_name`	`meta-llama/llama-4-scout-17b-16e-instruct`	Vision
`groq_text_model_name`	`llama-3.3-70b-versatile`	Text
`openrouter_model_name`	`google/gemini-2.0-flash-001`	Any model string from openrouter.ai/models
`openrouter_text_model_name`	`anthropic/claude-sonnet-4-5`
`litellm_model_name`	`openai/gpt-4o`	`provider/model` format
`litellm_text_model_name`	`openai/gpt-4o`

`MarkDropConfig`

Field	Default	Notes
`image_resolution_scale`	`2.0`	Scale factor for extracted images
`download_button_color`	`'#444444'`	HTML button colour
`log_level`	`logging.INFO`
`log_dir`	`'logs'`
`excel_dir`	`'markdrop_excel_tables'`
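
To get a feel for `image_resolution_scale`, the sketch below estimates output pixel dimensions for a page. It assumes the value is forwarded to Docling's image scale, where a scale of 1.0 corresponds to 72 DPI; that forwarding is an assumption about markdrop, and both function names are hypothetical.

```python
def rendered_size(width_pt, height_pt, scale=2.0):
    """Pixel size of a page rendered at `scale`, taking scale 1.0 as 72 DPI."""
    return round(width_pt * scale), round(height_pt * scale)


def effective_dpi(scale):
    """Approximate DPI for a given scale factor under the same assumption."""
    return 72 * scale
```

Under that assumption, a US Letter page (612 x 792 pt) at the default scale of 2.0 would render at 1224 x 1584 pixels, or roughly 144 DPI.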

Contributing

We welcome contributions! See CONTRIBUTING.md.

git clone https://github.com/shoryasethia/markdrop.git
cd markdrop
python -m venv venv && source venv/bin/activate   # Windows: venv\Scripts\activate
pip install -e ".[all]"

Project Structure

markdrop/
├── setup.py
├── requirements.txt
├── README.md
└── markdrop/
    ├── __init__.py
    ├── main.py          ← CLI entry-point
    ├── process.py       ← PDF conversion
    ├── parse.py         ← AI description engine (all 6 providers)
    ├── helper.py        ← PDF image analysis
    ├── utils.py         ← PDF download helpers
    ├── setup_keys.py    ← Interactive API key manager
    ├── ignore_warnings.py
    ├── src/
    │   └── markdrop-logo.png
    └── models/
        ├── img_descriptions.py
        ├── model_loader.py  ← Local HF model loader
        ├── responder.py
        └── logger.py

License

GPL-3.0 — see LICENSE.

Changelog

See CHANGELOG.md.

Support

Open an issue at https://github.com/shoryasethia/markdrop/issues.

Project details

Release history

Version	Released
4.0.2 (this version)	Mar 18, 2026
4.0.1	Mar 10, 2026
4.0.0	Mar 10, 2026
3.5.0	Jul 5, 2025
0.4.0	Mar 10, 2026
0.3.2.1	Apr 5, 2025
0.3.2.0	Feb 21, 2025
0.3.1.3	Jan 29, 2025
0.3.1.2	Jan 29, 2025
0.3.1.1	Jan 29, 2025
0.3.1	Jan 29, 2025
0.3.0	Jan 29, 2025
0.2.8	Dec 28, 2024

Download files

Source Distribution

markdrop-4.0.2.tar.gz (43.0 kB), uploaded Mar 18, 2026, Source

Built Distribution

markdrop-4.0.2-py3-none-any.whl (44.0 kB), uploaded Mar 18, 2026, Python 3

File details

Details for the file markdrop-4.0.2.tar.gz.

File metadata

Download URL: markdrop-4.0.2.tar.gz
Upload date: Mar 18, 2026
Size: 43.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.7

File hashes

Hashes for markdrop-4.0.2.tar.gz
Algorithm	Hash digest
SHA256	`ccaa48cfaf70a26c7848bfd59540852a94580d0fdf024df6736bd355c0e94c4c`
MD5	`7ee0c32fd77dca82747796f02b94c86d`
BLAKE2b-256	`ef964105b869d1ba3c477a9c1447fa6ee2675cbf85aa10aaf62f0a442487dd45`
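
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard `hashlib`. This is a minimal sketch; `sha256_of` and `matches` are hypothetical helper names, and the file path is whatever location the archive was saved to.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True when the file's digest equals the published one."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, `matches("markdrop-4.0.2.tar.gz", "<SHA256 from the table>")` should return `True` for an intact download.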


File details

Details for the file markdrop-4.0.2-py3-none-any.whl.

File metadata

Download URL: markdrop-4.0.2-py3-none-any.whl
Upload date: Mar 18, 2026
Size: 44.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.7

File hashes

Hashes for markdrop-4.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`abb62d881496eeacb1a5635ecb9090b60bae22af4bffba4f7241d9daea72b3ec`
MD5	`86047b1abce69b75b342c68c2242b1b5`
BLAKE2b-256	`f1ee25f4d98792aed6cd9cc072754a84d5844dbc17a0aaf87b3929c86b67bc5d`

