Pandoc MCP Server

License: MIT

A Python-based MCP (Model Context Protocol) server that provides powerful document conversion capabilities via Pandoc. This server allows AI agents (like Claude via LangChain/LangGraph) to request file conversions between various formats such as Markdown, DOCX, HTML, PDF, EPUB, and many more.

This project uses:

  • FastMCP: A Python library for easily creating MCP servers.
  • pypandoc: A Python wrapper around the Pandoc command-line tool.
  • Pandoc: The universal document converter.
  • (Optional) Docker: For containerized deployment, bundling all dependencies (Python, Pandoc, LaTeX).

Features

  • Exposes a single MCP tool: convert_document.
  • Supports a wide range of input and output formats handled by Pandoc.
  • Allows specifying input format (if auto-detection fails) and output format.
  • Supports passing extra command-line arguments to Pandoc for advanced control (e.g., Table of Contents, PDF margins, standalone files).
  • Includes Docker configuration (Dockerfile) for creating a self-contained server environment including Pandoc and necessary LaTeX components for PDF generation.
  • Designed for integration with MCP clients, particularly LangChain/LangGraph agents.

Exposed MCP Tool

convert_document

Converts a document from one format to another using Pandoc.

Arguments:

  • input_file_path (str, required): The path accessible by the server to the input document file. If running in Docker with a volume mount, this should be the path inside the container (e.g., /data/my_doc.docx).
  • output_file_path (str, required): The path, accessible by the server, where the converted output file should be saved. If running in Docker, this should be the path inside the container (e.g., /data/my_output.pdf). The output directory is created if it does not already exist.
  • to_format (str, required): The target format for the conversion (e.g., 'markdown', 'docx', 'pdf', 'html', 'rst', 'epub'). See Pandoc documentation for a full list (--list-output-formats).
  • from_format (str, optional): The format of the input file. If None, Pandoc tries to infer the format from the file extension. Specify it when the extension is ambiguous or missing (e.g., 'md', 'docx', 'html'). Defaults to None.
  • extra_args (List[str], optional): A list of additional command-line arguments to pass directly to pandoc (e.g., ['--toc'], ['-V', 'geometry:margin=1.5cm'], ['--standalone']). Defaults to None.
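To make the role of each argument concrete, here is a minimal sketch of how they line up with an equivalent pandoc command line. The helper name build_pandoc_command is hypothetical and not part of this project; the server itself goes through pypandoc, so treat this only as an illustration of the mapping.

```python
from typing import List, Optional


def build_pandoc_command(
    input_file_path: str,
    output_file_path: str,
    to_format: str,
    from_format: Optional[str] = None,
    extra_args: Optional[List[str]] = None,
) -> List[str]:
    """Assemble a pandoc command line from the tool's arguments (illustrative only)."""
    command = ["pandoc", input_file_path, "--to", to_format, "--output", output_file_path]
    if from_format is not None:
        # Pass --from only when the caller overrides format auto-detection.
        command += ["--from", from_format]
    # extra_args are appended unchanged, in the order given.
    command += list(extra_args or [])
    return command


command = build_pandoc_command(
    "/data/my_doc.docx",
    "/data/my_output.html",
    to_format="html",
    extra_args=["--toc", "--standalone"],
)
print(" ".join(command))
```

Because extra_args is passed through as a list, an option and its value stay separate items, as in ['-V', 'geometry:margin=1.5cm'].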

Returns:

  • (str): A message indicating success (e.g., "Successfully converted document to '/data/my_output.pdf'") or an error message (e.g., "Error: Input file not found...", "Error during conversion: Pandoc died...").
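Since the tool reports both outcomes as plain strings, a client may want to tell them apart before acting on the result. The sketch below assumes the message prefixes shown in the examples above ("Successfully converted" and "Error"); the function name classify_result is hypothetical, and the real server may word its messages differently.

```python
def classify_result(message: str) -> str:
    """Classify a convert_document return string by its prefix (assumed wording)."""
    text = message.strip()
    if text.startswith("Successfully converted"):
        return "success"
    if text.startswith("Error"):
        return "error"
    # Anything else is left for the caller to inspect.
    return "unknown"


print(classify_result("Successfully converted document to '/data/my_output.pdf'"))
print(classify_result("Error: Input file not found..."))
```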

Setup and Running

You can run this server either locally (requires manual installation of dependencies) or using the provided Docker configuration (recommended for ease of use and deployment).

Installing via Smithery

To install Pandoc Document Converter for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @MaitreyaM/file-converter-mcp --client claude

Option 1: Running with Docker (Recommended)

This method bundles Python, Pandoc, LaTeX, and required libraries into a container. You only need Docker Desktop installed locally.

  1. Install Docker: Download and install Docker Desktop for your operating system. Start Docker Desktop.
  2. Clone Repository: Get the project files:
    git clone https://github.com/your-username/pandoc-mcp-server.git # Replace with your repo URL
    cd pandoc-mcp-server
    
  3. Build the Docker Image: This command builds the image using the Dockerfile. It installs Pandoc, a TeX Live distribution (for PDF support), and the Python dependencies inside the image. The first build might take several minutes.
    docker build -t pandoc-converter-server .
    
  4. Run the Container: This starts the server inside the container.
    • Choose a directory on your host machine to share with the container for input/output files (e.g., the current project directory).
    • Run the container, mapping the host directory to /data inside the container and mapping port 8000. Replace /path/to/your/local/project with the actual absolute path to the project directory on your machine.
    # Example using the current directory (.) as the host path:
    docker run -it --rm -p 8000:8000 -v "$(pwd)":/data pandoc-converter-server
    
    # Or using an absolute path (replace):
    # docker run -it --rm -p 8000:8000 -v "/path/to/your/local/project":/data pandoc-converter-server
    
    • -it: Runs interactively (shows logs, allows Ctrl+C).
    • --rm: Removes the container when stopped.
    • -p 8000:8000: Maps port 8000 on your host to port 8000 in the container.
    • -v "$(pwd)":/data: Mounts the current working directory on your host to /data inside the container. Files placed in your local project directory will appear in /data inside the container, and files saved to /data by the server will appear in your local project directory.
    • pandoc-converter-server: The name of the image you built.
  5. Server is Running: You should see logs indicating that the server has started and is listening for SSE connections on http://0.0.0.0:8000. It is ready to accept connections from your MCP client (such as a LangChain agent).
  6. Connecting from Client: Configure your MCP client (e.g., MultiServerMCPClient) to connect to http://127.0.0.1:8000/sse with transport: "sse".
  7. Using the Tool: When interacting with your agent/client, refer to files using their path inside the container, prefixed with /data/. For example: convert /data/my_input.docx to pdf at /data/my_output.pdf. The output file will appear in your local project directory due to the volume mapping.
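Steps 6 and 7 can be sketched in a few lines of Python. The dictionary mirrors the URL and transport given in step 6, in the shape that MultiServerMCPClient-style clients commonly accept; the "pandoc" key is an arbitrary label. The helper to_container_path is hypothetical and assumes the host directory is mounted at /data, as in the docker run command above.

```python
from pathlib import Path, PurePosixPath

# Connection settings from step 6. The server label "pandoc" is an assumption.
SERVER_CONFIG = {
    "pandoc": {"url": "http://127.0.0.1:8000/sse", "transport": "sse"},
}


def to_container_path(host_path: str, host_root: str, container_root: str = "/data") -> str:
    """Translate a host path under host_root into its path inside the container."""
    # relative_to raises ValueError if host_path is not inside the mounted directory.
    relative = Path(host_path).resolve().relative_to(Path(host_root).resolve())
    return str(PurePosixPath(container_root).joinpath(*relative.parts))


print(to_container_path("/home/me/project/docs/report.md", "/home/me/project"))
```

A path outside the mounted directory raises ValueError, which matches the fact that the container cannot see such files.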

Option 2: Running Locally (Manual Dependency Installation)

This requires you to install Python, Pandoc, and a LaTeX distribution directly onto your host machine.

  1. Install Python: Ensure you have Python >= 3.10 installed.
  2. Install Pandoc: Install the Pandoc command-line tool for your OS. Follow instructions at pandoc.org/installing.html. Verify by running pandoc --version in a new terminal.
  3. Install LaTeX: For PDF generation, install a TeX distribution.
    • macOS: brew install --cask mactex-no-gui (Recommended via Homebrew)
    • Debian/Ubuntu: sudo apt-get update && sudo apt-get install texlive-latex-base texlive-fonts-recommended texlive-latex-extra texlive-fonts-extra (or texlive-full for everything, but large).
    • Windows: Install MiKTeX or TeX Live. Ensure the bin directory containing pdflatex.exe is added to your system's PATH.
    • Verify by running pdflatex --version in a new terminal.
  4. Clone Repository:
    git clone https://github.com/your-username/pandoc-mcp-server.git # Replace with your repo URL
    cd pandoc-mcp-server
    
  5. Create Virtual Environment (Recommended):
    python -m venv venv
    source venv/bin/activate # Linux/macOS
    # venv\Scripts\activate # Windows
    
    (Or use Conda: conda create --name pandoc-env python=3.11 && conda activate pandoc-env)
  6. Install Python Dependencies:
    pip install -r requirements.txt
    
  7. Run the Server:
    python pandoc_mcp_server.py
    
  8. Server is Running: It will listen on http://127.0.0.1:8000/sse.
  9. Connecting from Client: Configure your MCP client to connect to http://127.0.0.1:8000/sse.
  10. Using the Tool: Refer to files using their regular paths on your local machine, for example: convert my_input.docx to pdf at my_output.pdf. Relative paths like these assume the files sit in the directory the server runs from; otherwise, use absolute paths.
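If a local conversion fails, it can help to confirm that the tools from steps 2 and 3 are actually on your PATH. The following is a small sketch using only the standard library; the function name find_tools is hypothetical, and the check only locates the executables without verifying their versions.

```python
import shutil
from typing import Dict, Optional, Sequence


def find_tools(names: Sequence[str] = ("pandoc", "pdflatex")) -> Dict[str, Optional[str]]:
    """Return the resolved location of each executable, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


for name, location in find_tools().items():
    print(f"{name}: {location or 'not found on PATH'}")
```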

Example Agent Interaction (Running Server in Docker)

Assuming the server container is running with the volume mount:

You: convert /data/report.md to pdf

Agent: Thinking...
[Agent calls convert_document tool with input_file_path='/data/report.md', output_file_path='/data/report.pdf', to_format='pdf']
Agent: Successfully converted document to '/data/report.pdf'
[The agent may then attempt to upload report.pdf from the local project directory]
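The request in this exchange names only an input file and a target format, so the agent has to fill in the output path itself. Below is a minimal sketch of one way to do that, using the argument names documented for convert_document. The helper build_tool_arguments is hypothetical, and it assumes the format name can double as the file extension, which holds for 'pdf' but not for every Pandoc format.

```python
from pathlib import PurePosixPath
from typing import Dict


def build_tool_arguments(input_file_path: str, to_format: str) -> Dict[str, str]:
    """Derive convert_document arguments, placing the output next to the input."""
    # Assumption: the target format name is also a sensible file extension.
    output_file_path = PurePosixPath(input_file_path).with_suffix(f".{to_format}")
    return {
        "input_file_path": input_file_path,
        "output_file_path": str(output_file_path),
        "to_format": to_format,
    }


print(build_tool_arguments("/data/report.md", "pdf"))
```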

Files

  • pandoc_mcp_server.py: The main Python script for the MCP server.
  • Dockerfile: Instructions for building the Docker container image.
  • requirements.txt: Python dependencies needed inside the Docker container (or local venv).
  • .gitignore: Specifies intentionally untracked files for Git.
  • README.md: This file.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request or open an Issue.
