
Convert PDF files to high-quality Markdown using LLM vision models

Project description

pdfmark-ai

PDF to Markdown, powered by LLM vision.

Drop a PDF, get clean Markdown — tables, formulas, code, figures, all handled.


Demo · Installation · Quick Start · Configuration · CLI Reference

English | 简体中文

pdfmark-ai doesn't parse PDFs the traditional way. Instead, it renders each page as an image and lets multimodal LLMs (Claude, Kimi, Qwen, etc.) "read" it — just like a human would. The result? Clean, structured Markdown that handles what other tools simply can't: complex tables with merged cells, inline math formulas, source code blocks, embedded diagrams, and even blurry scans.
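The page-as-image flow can be sketched roughly as follows. This is an illustrative stdlib-only sketch, not pdfmark-ai's internals; the helper name and the OpenAI-style multimodal payload shape are assumptions, and the real request format varies per provider:

```python
import base64

def build_vision_message(page_png: bytes, lang: str = "auto") -> dict:
    """Wrap one rendered PDF page as a single vision-model user message.

    The payload follows the common OpenAI-style multimodal shape:
    one text instruction plus one base64-encoded page image.
    """
    data_url = "data:image/png;base64," + base64.b64encode(page_png).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text",
             "text": f"Transcribe this page to Markdown (language: {lang}). "
                     "Preserve tables, code blocks, and LaTeX math."},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }

# Placeholder bytes stand in for a real rendered page image.
msg = build_vision_message(b"\x89PNG...")
```

Because the model sees pixels rather than a parsed content stream, layout quirks that break structural PDF parsers (rotated tables, scanned text, decorative column layouts) degrade gracefully instead of failing outright.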

Demo

Real conversion results on academic papers and technical documents — no post-editing.

Image Extraction, Tables & Code

  • PDF original — mixed figures, tables & code → Converted Markdown — images extracted, tables formatted
  • PDF original — tables & code blocks → Converted Markdown — syntax-highlighted code
  • PDF original — charts & formulas → Converted Markdown — chart images referenced

Math Formulas & Blurred Content

  • PDF original — dense math formulas → Converted Markdown — LaTeX `$...$` and `$$...$$` wrapping
  • PDF original — blurred / low-quality scan → Converted Markdown — content correctly recognized

Features

  • 🖼️ Vision-based extraction — treats each page as an image, handles complex layouts that traditional parsers miss
  • 🧮 Math formulas — LaTeX rendering with automatic $...$ and $$...$$ wrapping
  • 📊 Complex tables — merged cells, multi-row headers, nested structures
  • 💻 Code blocks — syntax-appropriate formatting for source code
  • ✂️ Image extraction — --crop-images crops figures and diagrams into separate image files
  • 🔍 Blur tolerance — handles low-quality and blurred scans with high recognition accuracy
  • 🤖 Multi-provider — Claude, Kimi, Xiaomi, Qwen, and any OpenAI-compatible API
  • ♻️ Incremental caching — SHA-256 progressive cache avoids re-processing unchanged pages
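The incremental cache could work roughly like this sketch (the function name and key layout are illustrative assumptions, not pdfmark-ai's actual cache format): each page's key hashes the page content together with the settings that affect its output, so changing one page or one setting invalidates only the affected entries.

```python
import hashlib

def page_cache_key(page_bytes: bytes, dpi: int, model: str) -> str:
    """Derive a stable SHA-256 cache key for one rendered page.

    A change to the page content, the render DPI, or the model
    invalidates only that page's entry, not the whole document.
    """
    h = hashlib.sha256()
    h.update(page_bytes)
    h.update(f"|dpi={dpi}|model={model}".encode())
    return h.hexdigest()

k1 = page_cache_key(b"page-1-bytes", 150, "claude-sonnet-4-20250514")
k2 = page_cache_key(b"page-1-bytes", 150, "claude-sonnet-4-20250514")
k3 = page_cache_key(b"page-1-bytes", 300, "claude-sonnet-4-20250514")
```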

Installation

pip install pdfmark-ai

Requirements

  • Python >= 3.10
  • An LLM API key (Anthropic, Kimi, Qwen, or OpenAI-compatible)

Quick Start

# Step 1: Generate config templates in your current directory
pdfmark --init

# Step 2: Edit .env — uncomment ONE provider and fill in your API key
#   e.g. LLM_API_KEY=your-kimi-api-key

# Step 3: Run
pdfmark input.pdf -o output.md

The default provider is Kimi (kimi-for-coding). To use a different provider, either edit .env to set LLM_MODEL / LLM_BASE_URL, or change active_provider in pdfmark.toml.

💡 Tip: Configuration files (.env and pdfmark.toml) are always read from your current working directory — not from the package installation directory. Place them alongside your PDF files or in your project root.

Configuration

pdfmark-ai uses a 4-layer priority chain: CLI args > env vars > TOML config > defaults.
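A minimal sketch of how such a priority chain resolves a single setting (illustrative only; pdfmark-ai's internal resolver is not public API): each layer is consulted in order, and the first layer that defines the key wins.

```python
def resolve(cli: dict, env: dict, toml_cfg: dict, defaults: dict, key: str):
    """Return the first value found, scanning CLI args > env vars > TOML > defaults."""
    for layer in (cli, env, toml_cfg, defaults):
        if layer.get(key) is not None:
            return layer[key]
    return None

# TOML wins here because no CLI flag or env var sets dpi.
dpi = resolve(cli={"dpi": None}, env={}, toml_cfg={"dpi": 150},
              defaults={"dpi": 200}, key="dpi")
```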

Config files live in your working directory (where you run pdfmark):

| File | Purpose | Contains |
|------|---------|----------|
| .env | API keys & overrides | LLM_API_KEY, LLM_MODEL, LLM_BASE_URL |
| pdfmark.toml | Provider presets & settings | providers, DPI, chunking, caching |

You can generate both files with pdfmark --init, or create them manually.

.env (API keys)

# Uncomment ONE provider and add your key:
LLM_API_KEY=your-kimi-api-key
# LLM_API_KEY=your-anthropic-api-key
# LLM_API_KEY=your-qwen-api-key

# Optional: override model or base URL
# LLM_MODEL=claude-sonnet-4-20250514
# LLM_BASE_URL=https://api.your-provider.com/v1
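To see what a given .env resolves to without running the tool, a tiny stdlib-only parser like this (a sketch, not pdfmark-ai code) mirrors the basic KEY=VALUE semantics shown above, skipping blanks and commented-out lines:

```python
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values

cfg = parse_env("""
# LLM_API_KEY=your-anthropic-api-key
LLM_API_KEY=your-kimi-api-key
""")
```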

pdfmark.toml (settings)

active_provider = "anthropic"

[providers.anthropic]
base_url = "https://api.anthropic.com"
model = "claude-sonnet-4-20250514"

[render]
dpi = 150

[cache]
enabled = true
dir = "~/.cache/pdfmark"

Supported Providers

| Provider | active_provider | Notes |
|----------|-----------------|-------|
| Anthropic Claude | anthropic | Supports Opus 4.6, Sonnet 4.6, and other Claude models. Uses the Anthropic Messages API natively. |
| Kimi (Moonshot) | kimi | Anthropic-compatible API |
| Xiaomi (MiMo) | xiaomi | Auth token required |
| Qwen (Alibaba) | qwen | OpenAI-compatible SDK |
| Any OpenAI-compatible | set LLM_BASE_URL | Set LLM_SDK_TYPE=openai |
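For the last row, pointing pdfmark at any OpenAI-compatible endpoint is a matter of environment variables; the URL, key, and model below are placeholders you would replace with your provider's values:

```shell
# Route requests through an OpenAI-compatible API instead of a preset provider
export LLM_SDK_TYPE=openai
export LLM_BASE_URL=https://api.your-provider.com/v1
export LLM_API_KEY=your-api-key
export LLM_MODEL=your-model-name
```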

CLI Reference

Usage: pdfmark [OPTIONS] [INPUT]

Arguments:
  INPUT                   Path to the PDF file to convert

Options:
  --init                  Generate .env and pdfmark.toml config templates
  -f, --force             Overwrite existing config files (use with --init)
  -o, --output            Output markdown file path
  --lang                  Document language (e.g. 'en', 'zh', 'auto')
  --crop-images           Extract visual regions from pages as images
  --refine                Run optional LLM global refinement pass
  --no-cache              Disable caching of rendered pages and chunks
  --no-frontmatter        Omit YAML frontmatter from output
  --detect-only           Detect document structure and print sections
  --config                Path to a TOML configuration file
  --dpi                   Rendering DPI for PDF pages
  --model                 LLM model identifier
  --api-key               LLM API key (or set LLM_API_KEY env var)
  --base-url              LLM API base URL
  --max-concurrent        Maximum concurrent LLM requests

Image Extraction

Use --crop-images to extract figures and diagrams from the PDF as separate image files:

pdfmark input.pdf -o output.md --crop-images

Cropped images are saved alongside the output file (e.g., images/page_003_fig_001.png). Crop mode and plain mode use separate caches, so you can switch freely without needing --no-cache.

License

MIT

Download files

Download the file for your platform.

Source Distribution

pdfmark_ai-0.5.0.tar.gz (45.6 kB)


Built Distribution


pdfmark_ai-0.5.0-py3-none-any.whl (36.4 kB)


File details

Details for the file pdfmark_ai-0.5.0.tar.gz.

File metadata

  • Download URL: pdfmark_ai-0.5.0.tar.gz
  • Size: 45.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for pdfmark_ai-0.5.0.tar.gz
| Algorithm | Hash digest |
|-----------|-------------|
| SHA256 | b21770c6bd967be6f0972ecae1d1dca92dd304cf4ba91c71e8667c7193053d1f |
| MD5 | 13fca365c79866cedfeb5b48db1356fc |
| BLAKE2b-256 | 894019c2c161052baf4aa9a3197a2d523ac883ce9ce363e14639702608d49bba |


File details

Details for the file pdfmark_ai-0.5.0-py3-none-any.whl.

File metadata

  • Download URL: pdfmark_ai-0.5.0-py3-none-any.whl
  • Size: 36.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for pdfmark_ai-0.5.0-py3-none-any.whl
| Algorithm | Hash digest |
|-----------|-------------|
| SHA256 | 4d893b60846de4a303c93a447b4e69e8b2a36e760e508c57e3089a22eebc0c7e |
| MD5 | 5dc623e0fd90a3c32c2a2397deb62416 |
| BLAKE2b-256 | 9c599345880cf5a47f409694eef87d8ff6af345b8cdfe4976d44935e99c06964 |

