Running Docling as an agent using tools

Docling MCP: making docling agentic

A document processing service using the Docling-MCP library and MCP (Model Context Protocol) for tool integration.

Overview

Docling MCP is a service that provides tools for document conversion, processing and generation. It uses the Docling library to convert PDF documents into structured formats and provides a caching mechanism to improve performance. The service exposes functionality through a set of tools that can be called by client applications.

Features

  • Conversion tools:
    • PDF document conversion to structured JSON format (DoclingDocument)
  • Generation tools:
    • Document generation in DoclingDocument, which can be exported to multiple formats
  • Local document caching for improved performance
  • Support for local files and URLs as document sources
  • Memory management for handling large documents
  • Logging system for debugging and monitoring
  • RAG applications with Milvus upload and retrieval

Getting started

The easiest way to install Docling MCP and connect it to your client is to launch it via uvx.

Depending on the transport protocol required, specify the --transport argument, for example:

  • stdio used e.g. in Claude for Desktop and LM Studio

    uvx --from docling-mcp docling-mcp-server --transport stdio
    
  • sse used e.g. in Llama Stack

    uvx --from docling-mcp docling-mcp-server --transport sse
    
  • streamable-http used e.g. in container setups

    uvx --from docling-mcp docling-mcp-server --transport streamable-http
    

More options are available, e.g. the selection of which toolgroup to launch. Use the --help argument to inspect all the CLI options.
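
When the server is launched from a script rather than typed by hand, the three commands above can be assembled programmatically. The following is a minimal sketch, not part of Docling MCP: the helper name `build_server_command` is an assumption made for illustration, and it only knows the three transport values listed above.

```python
import shlex

# Transport names taken from the list above; anything else is rejected.
SUPPORTED_TRANSPORTS = ("stdio", "sse", "streamable-http")


def build_server_command(transport: str = "stdio") -> list[str]:
    """Return the uvx argument list that launches docling-mcp-server.

    Hypothetical helper for illustration; it only mirrors the commands
    shown in this README and does not inspect the real CLI.
    """
    if transport not in SUPPORTED_TRANSPORTS:
        raise ValueError(
            f"unsupported transport {transport!r}; "
            f"expected one of {', '.join(SUPPORTED_TRANSPORTS)}"
        )
    return [
        "uvx",
        "--from", "docling-mcp",
        "docling-mcp-server",
        "--transport", transport,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_server_command("sse")))
```

The returned list can be passed directly to `subprocess.run` or `subprocess.Popen`, which avoids shell quoting issues.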

For developing the MCP tools further, please refer to the docs/development.md page for instructions.

Integration with MCP clients

One of the easiest ways to experiment with the tools provided by Docling MCP is to use an AI desktop client with MCP support. Most of these clients share a common configuration format, so adding Docling MCP to your favorite client is usually as simple as adding the following entry to its configuration file.

{
  "mcpServers": {
    "docling": {
      "command": "uvx",
      "args": [
        "--from=docling-mcp",
        "docling-mcp-server"
      ]
    }
  }
} 
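
If the client already has a configuration file with other servers in it, the entry has to be merged rather than pasted over the whole file. Below is a minimal sketch of such a merge using only the Python standard library. The function name `add_docling_server` is an assumption for illustration, and the location of the configuration file depends on the client, so it is passed in as a parameter.

```python
import json
from pathlib import Path

# The server entry from the snippet above.
DOCLING_ENTRY = {
    "command": "uvx",
    "args": ["--from=docling-mcp", "docling-mcp-server"],
}


def add_docling_server(config_path: Path) -> dict:
    """Add the "docling" entry to a client config, keeping other servers.

    Hypothetical helper: it assumes the file, if present, contains a JSON
    object using the "mcpServers" layout shown above.
    """
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    # setdefault keeps any servers that are already configured.
    config.setdefault("mcpServers", {})["docling"] = DOCLING_ENTRY
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Back up the existing file before running something like this, since the sketch rewrites it in place and does not handle an empty or malformed file.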

When using Claude for Desktop, simply edit the config file claude_desktop_config.json with the snippet above or the example provided here.

In LM Studio, edit the mcp.json file with the appropriate section or simply click on the button below for a direct install.

Add MCP Server docling to LM Studio

Other integrations are described in ./docs/integrations/.

Examples

Converting documents

Example prompt for converting PDF documents:

Convert the PDF document at <provide file-path> into DoclingDocument and return its document-key.

Generating documents

Example prompt for generating new documents:

I want you to write a Docling document. To do this, you will create a document first by invoking `create_new_docling_document`. Next you can add a title (by invoking `add_title_to_docling_document`) and then iteratively add new section-headings and paragraphs. If you want to insert lists (or nested lists), you will first open a list (by invoking `open_list_in_docling_document`), next add the list_items (by invoking `add_listitem_to_list_in_docling_document`). After adding list-items, you must close the list (by invoking `close_list_in_docling_document`). Nested lists can be created in the same way, by opening and closing additional lists.

During the writing process, you can check what has been written already by calling the `export_docling_document_to_markdown` tool, which will return the currently written document. At the end of the writing, you must save the document and return me the filepath of the saved document.

The document should investigate the impact of tokenizers on the quality of LLMs.
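
The prompt above relies on one ordering rule: every list that is opened must be closed, and nested lists are opened and closed inside their parent. The sketch below plans that sequence of tool calls for a nested outline without contacting any server. The tool names come from the prompt; the function names and the argument dictionaries (`"title"`, `"text"`) are assumptions made for illustration, and any further arguments the real tools require are left out.

```python
from typing import Iterator, Union

# A list item is either its text or a nested list of items.
Item = Union[str, list]


def plan_list_calls(items: list) -> Iterator[tuple]:
    """Yield (tool_name, arguments) pairs for one possibly nested list."""
    yield ("open_list_in_docling_document", {})
    for item in items:
        if isinstance(item, list):
            # A nested list is opened and closed inside its parent list.
            yield from plan_list_calls(item)
        else:
            # "text" is a hypothetical argument name.
            yield ("add_listitem_to_list_in_docling_document", {"text": item})
    yield ("close_list_in_docling_document", {})


def plan_document(title: str, items: list) -> list:
    """Plan the calls for a document with a title and a single list."""
    calls = [
        ("create_new_docling_document", {}),
        # "title" is a hypothetical argument name.
        ("add_title_to_docling_document", {"title": title}),
    ]
    calls.extend(plan_list_calls(items))
    # Export at the end to check what has been written.
    calls.append(("export_docling_document_to_markdown", {}))
    return calls


if __name__ == "__main__":
    for name, arguments in plan_document("Tokenizers", ["BPE", ["merges"], "Unigram"]):
        print(name, arguments)
```

Planning the calls as plain data first makes it easy to check that opens and closes are balanced before anything is sent to the server.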

License

The Docling MCP codebase is under the MIT license. For individual model usage, please refer to the model licenses found in the original packages.

LF AI & Data

Docling and Docling MCP are hosted as projects in the LF AI & Data Foundation.

IBM ❤️ Open Source AI: The project was started by the AI for knowledge team at IBM Research Zurich.
