
Running Docling as an agent using tools


Docling MCP: making Docling agentic


A document processing service using the Docling-MCP library and MCP (Model Context Protocol) for tool integration.

Note: This is an unstable draft implementation that will evolve quickly.

Overview

Docling MCP is a service that provides tools for document conversion, processing and generation. It uses the Docling library to convert PDF documents into structured formats and provides a caching mechanism to improve performance. The service exposes functionality through a set of tools that can be called by client applications.

Features

  • Conversion tools:
    • PDF document conversion to structured JSON format (DoclingDocument)
  • Generation tools:
    • Document generation in DoclingDocument, which can be exported to multiple formats
  • Local document caching for improved performance
  • Support for local files and URLs as document sources
  • Memory management for handling large documents
  • Logging system for debugging and monitoring
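The caching idea can be illustrated with a minimal sketch. It is not the Docling-MCP implementation: the `DocumentCache` class, its hashing scheme, and the `converter` callable are all hypothetical, and only show how a document key can stand in for a converted document so that a repeated request skips the conversion.

```python
import hashlib


class DocumentCache:
    """Hypothetical in-memory cache; not the actual Docling-MCP cache."""

    def __init__(self, converter):
        # converter: any callable that turns a source (path or URL) into a document
        self._converter = converter
        self._store = {}

    @staticmethod
    def key_for(source: str) -> str:
        # Assumption: the key is derived from the source string alone.
        return hashlib.sha256(source.encode("utf-8")).hexdigest()[:16]

    def convert(self, source: str):
        """Return (document_key, cache_hit); convert only on a cache miss."""
        key = self.key_for(source)
        hit = key in self._store
        if not hit:
            self._store[key] = self._converter(source)
        return key, hit

    def get(self, key: str):
        """Look up a previously converted document by its key."""
        return self._store[key]
```

A client would hold on to the returned key and pass it to later calls instead of resending the document.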

Getting started

After installing the dependencies (uv sync), you can expose the Docling tools by running:

uv run python -m docling_mcp.server
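To check which tools the server exposes, a client can connect and list them. The sketch below uses the `mcp` Python SDK (`StdioServerParameters`, `stdio_client`, `ClientSession`) and assumes the server communicates over stdio when started with the command above; that transport is an assumption, and the helper names `server_command` and `list_tool_names` are made up for this example.

```python
import asyncio


def server_command() -> tuple[str, list[str]]:
    # The launch command from this README, split into executable and arguments.
    return "uv", ["run", "python", "-m", "docling_mcp.server"]


async def list_tool_names() -> list[str]:
    # Imported lazily so the helper above works without the mcp package.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    command, args = server_command()
    params = StdioServerParameters(command=command, args=args)
    # Assumption: the server speaks MCP over stdin/stdout.
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            return [tool.name for tool in result.tools]


def main() -> None:
    # Run from the project directory so that uv picks up the right environment.
    print(asyncio.run(list_tool_names()))
```

Calling `main()` starts the server as a subprocess, prints the tool names, and shuts it down again.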

Integration into Claude Desktop

One of the easiest ways to experiment with the tools provided by Docling-MCP is to use Claude Desktop. To do so, update your Claude Desktop config file (located at ~/Library/Application Support/Claude/claude_desktop_config.json on macOS) and add an entry for the Docling-MCP server under the mcpServers key.
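As a rough sketch of what such an entry might look like, the snippet below merges a server entry into an existing config dictionary without touching other servers. The entry name `docling`, the placeholder path `/path/to/docling-mcp`, and the helper `add_mcp_server` are assumptions for illustration; check the project's own example for the exact values.

```python
import json


def add_mcp_server(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of config with one more entry under "mcpServers"."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    updated["mcpServers"] = servers
    return updated


# Stand-in for the parsed contents of claude_desktop_config.json.
existing = {"mcpServers": {"other": {"command": "npx", "args": ["other-server"]}}}

updated = add_mcp_server(
    existing,
    name="docling",  # hypothetical entry name
    command="uv",
    # Placeholder path: point --directory at your local checkout.
    args=["--directory", "/path/to/docling-mcp", "run", "python", "-m", "docling_mcp.server"],
)

print(json.dumps(updated, indent=2))
```

Write the resulting JSON back to the config file and restart Claude Desktop so that it picks up the new server.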

Converting documents

Convert the PDF document at <provide file-path> into DoclingDocument and return me its document-key.

Generating documents

Example prompt for generation:

I want you to write a Docling document. To do this, you will create a document first by invoking `create_new_docling_document`. Next you can add a title (by invoking `add_title_to_docling_document`) and then iteratively add new section-headings and paragraphs. If you want to insert lists (or nested lists), you will first open a list (by invoking `open_list_in_docling_document`), then add the list-items (by invoking `add_listitem_to_list_in_docling_document`). After adding list-items, you must close the list (by invoking `close_list_in_docling_document`). Nested lists can be created in the same way, by opening and closing additional lists.

During the writing process, you can check what has been written already by calling the `export_docling_document_to_markdown` tool, which will return the currently written document. At the end of the writing, you must save the document and return me the filepath of the saved document.

The document should investigate the impact of tokenizers on the quality of LLMs.
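The call order described in the prompt (open a list, add items, close it, with nesting handled by further open and close calls) can be modelled with a small stand-in class. This is a toy sketch, not the Docling-MCP tools: `DocumentSketch` and its method names are hypothetical and only loosely mirror the tool names mentioned above.

```python
class DocumentSketch:
    """Toy model of the open/add/close list protocol from the prompt."""

    def __init__(self):
        self.lines = []
        self.list_depth = 0  # number of lists currently open

    def add_title(self, text: str) -> None:
        self.lines.append(f"# {text}")

    def add_paragraph(self, text: str) -> None:
        self.lines.append(text)

    def open_list(self) -> None:
        self.list_depth += 1

    def add_listitem(self, text: str) -> None:
        if self.list_depth == 0:
            raise ValueError("open a list before adding list items")
        indent = "  " * (self.list_depth - 1)
        self.lines.append(f"{indent}- {text}")

    def close_list(self) -> None:
        if self.list_depth == 0:
            raise ValueError("no open list to close")
        self.list_depth -= 1

    def export_to_markdown(self) -> str:
        return "\n".join(self.lines)


doc = DocumentSketch()
doc.add_title("Tokenizers")
doc.open_list()
doc.add_listitem("outer")
doc.open_list()  # nested list: open again before closing the outer one
doc.add_listitem("inner")
doc.close_list()
doc.close_list()
markdown = doc.export_to_markdown()
```

Tracking the depth of open lists is what lets nested lists reuse the same three calls, and it makes a missing close call easy to detect.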

License

The Docling-MCP codebase is released under the MIT license. For individual model usage, please refer to the model licenses found in the original packages.

LF AI & Data

Docling and Docling-MCP are hosted as projects in the LF AI & Data Foundation.

IBM ❤️ Open Source AI: The project was started by the AI for knowledge team at IBM Research Zurich.
