MCP server for converting PDF files to Markdown using AI sampling

PDF2MD MCP Server

An MCP (Model Context Protocol) server that converts PDF files to Markdown by requesting AI sampling from the connected MCP client.

Features

  • Convert PDF files to Markdown using AI content extraction
  • Support for both local file paths and URLs
  • Incremental conversion - resume from where you left off
  • Configurable output directory
  • Built with FastMCP for high performance
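To illustrate what "resume from where you left off" can mean in practice, here is a minimal sketch of page-level checkpointing. It is not taken from this package: the progress file, the last_page_done key, and the load_next_page and convert_incrementally helpers are all hypothetical names chosen for the example, and the real server may track progress differently.

```python
import json
from pathlib import Path
from typing import Callable


def load_next_page(state_file: Path) -> int:
    """Return the 1-based page to start from; 1 if no progress was saved."""
    if not state_file.exists():
        return 1
    return json.loads(state_file.read_text())["last_page_done"] + 1


def convert_incrementally(
    total_pages: int,
    state_file: Path,
    convert_page: Callable[[int], object],
) -> int:
    """Convert the remaining pages and return how many were processed.

    Hypothetical helper: progress is recorded after each page, so a run
    that is interrupted can continue with the first unfinished page.
    """
    processed = 0
    for page in range(load_next_page(state_file), total_pages + 1):
        convert_page(page)  # stand-in for the real per-page conversion
        state_file.write_text(json.dumps({"last_page_done": page}))
        processed += 1
    return processed
```

Writing the checkpoint only after a page succeeds means a failed page is retried on the next run rather than skipped.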

Installation

pip install pdf2md-mcp-fastmcp

Usage

As an MCP Server

Start the server:

pdf2md-mcp

The server exposes MCP tools for PDF-to-Markdown conversion.

Available Tools

convert_pdf_to_markdown

Converts a PDF file to Markdown format using AI sampling.

Parameters:

  • file_path (string): Local file path or URL to the PDF file
  • output_dir (string, optional): Output directory for the Markdown file. Defaults to the directory of the input file (for local files) or the current working directory (for URLs)

Returns:

  • output_file: Path to the generated markdown file
  • summary: Summary of the conversion task
  • pages_processed: Number of pages processed
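A client that consumes this result might model it as below. The field names come from the list above; the Python types (str, str, int) and the describe helper are assumptions made for the sketch, since the README does not state them.

```python
from typing import TypedDict


class ConversionResult(TypedDict):
    """Assumed shape of the tool result; the types are not confirmed by the README."""
    output_file: str
    summary: str
    pages_processed: int


def describe(result: ConversionResult) -> str:
    """Build a one-line, human-readable report from a tool result."""
    return (
        f"{result['pages_processed']} page(s) written to "
        f"{result['output_file']}: {result['summary']}"
    )
```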

Requirements

  • Python 3.10+
  • An MCP-compatible client with AI sampling capabilities
  • Network access for URL-based PDF files

Development

Setup

git clone https://github.com/shuminghuang/pdf2md-mcp.git
cd pdf2md-mcp
pip install -e ".[dev]"

Running Tests

pytest

Code Formatting

black .
isort .

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
