An MCP server for the 'docmind_parser' library.

DocMind Parser MCP

DocMind Parser MCP is a Python package and command-line utility that converts a variety of file types to Markdown, for use in indexing, text analysis, and similar scenarios.

Supported document formats

  • PDF
  • Word documents (doc, docx)
  • PowerPoint presentations (ppt, pptx)
  • Excel spreadsheets (xls, xlsx, xlsm)
  • Images (jpg, jpeg, png, bmp, gif)
  • Other formats (markdown, html, epub, mobi, rtf, txt)

Both local files and files referenced by URL are supported.
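The format list above can be turned into a quick pre-flight check before a document is handed to the server. The following is a minimal sketch: the helper names `is_url` and `is_supported` are hypothetical and not part of docmind-parser-mcp, and it assumes the format can be judged from the file extension alone, which the server itself may not require.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions copied from the supported-format list above.
SUPPORTED_EXTENSIONS = {
    "pdf", "doc", "docx", "ppt", "pptx", "xls", "xlsx", "xlsm",
    "jpg", "jpeg", "png", "bmp", "gif",
    "markdown", "html", "epub", "mobi", "rtf", "txt",
}


def is_url(uri: str) -> bool:
    """Return True for http(s) URLs; anything else is treated as a local path."""
    return urlparse(uri).scheme in ("http", "https")


def is_supported(uri: str) -> bool:
    """Check the extension of a local path or URL against the list above."""
    # For URLs, ignore the query string and fragment when reading the extension.
    path = urlparse(uri).path if is_url(uri) else uri
    suffix = PurePosixPath(path).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_EXTENSIONS
```

A caller could skip unsupported inputs early, for example `if not is_supported(uri): continue`, instead of waiting for the remote service to reject them.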

Installation

Run directly with uvx (recommended)

uvx docmind-parser-mcp

Install from source

cd docmind-parse-mcp
pip install .

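Quick start

Environment variables

This tool depends on the Alibaba Cloud Document Mind parsing service and requires the following environment variables: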

export ALIBABA_CLOUD_ACCESS_KEY_ID=YOUR_ALIBABA_CLOUD_ACCESS_KEY_ID
export ALIBABA_CLOUD_ACCESS_KEY_SECRET=YOUR_ALIBABA_CLOUD_ACCESS_KEY_SECRET
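Because the server cannot reach the parsing service without these credentials, it can help to verify them before launching. A minimal sketch, assuming only the two variable names shown above; the helper `missing_credentials` is hypothetical and not part of the package:

```python
import os

# Variable names as listed in the export commands above.
REQUIRED_VARS = (
    "ALIBABA_CLOUD_ACCESS_KEY_ID",
    "ALIBABA_CLOUD_ACCESS_KEY_SECRET",
)


def missing_credentials(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_credentials()` with no argument inspects the current process environment; a launcher script could abort with a clear message when the returned list is not empty.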

Startup modes

Two startup modes are supported, selected with the SERVER_PROTOCOL_MODE environment variable:

  1. stdio mode (default):

    export SERVER_PROTOCOL_MODE=stdio
    uvx docmind-parser-mcp
    
  2. SSE mode:

    export SERVER_PROTOCOL_MODE=sse
    uvx docmind-parser-mcp
    

    In SSE mode, the service listens at http://127.0.0.1:3001/sse.
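The same mode selection can be done from Python when the server is started as a child process. This is a sketch under stated assumptions: `server_env` and `start_server` are hypothetical helper names, and `uvx` is assumed to be on the PATH. Launching this way is mainly useful for SSE mode, since in stdio mode an MCP client normally starts the server itself.

```python
import os
import subprocess

VALID_MODES = ("stdio", "sse")  # the two modes described above


def server_env(mode="stdio", base_env=None):
    """Return a copy of the environment with SERVER_PROTOCOL_MODE set."""
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    env = dict(os.environ if base_env is None else base_env)
    env["SERVER_PROTOCOL_MODE"] = mode
    return env


def start_server(mode="sse"):
    """Start the server as a child process (requires uvx on PATH)."""
    return subprocess.Popen(["uvx", "docmind-parser-mcp"], env=server_env(mode))
```

Copying the environment rather than mutating `os.environ` keeps the mode setting local to the child process.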

Usage examples

Configuring an MCP client

stdio mode configuration

{
  "mcpServers": {
    "docmind-parser-mcp": {
      "name": "docmind-parser-mcp",
      "command": "uvx",
      "args": [
        "docmind-parser-mcp"
      ],
      "env": {
        "SERVER_PROTOCOL_MODE": "stdio",
        "ALIBABA_CLOUD_ACCESS_KEY_ID": "YOUR_ALIBABA_CLOUD_ACCESS_KEY_ID",
        "ALIBABA_CLOUD_ACCESS_KEY_SECRET": "YOUR_ALIBABA_CLOUD_ACCESS_KEY_SECRET"
      }
    }
  }
}

SSE mode configuration

{
  "mcpServers": {
    "docmind-parser-mcp": {
      "url": "http://127.0.0.1:3001/sse",
      "transportType": "sse"
    }
  }
}
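When the client configuration is generated by a script rather than edited by hand, the two JSON documents above can be produced from one function. A minimal sketch; `client_config` is a hypothetical helper, and the keys and values are copied from the configurations above rather than from any client's schema:

```python
import json


def client_config(mode="stdio",
                  access_key_id="YOUR_ALIBABA_CLOUD_ACCESS_KEY_ID",
                  access_key_secret="YOUR_ALIBABA_CLOUD_ACCESS_KEY_SECRET"):
    """Build the mcpServers entry shown above for the given mode."""
    if mode == "stdio":
        entry = {
            "name": "docmind-parser-mcp",
            "command": "uvx",
            "args": ["docmind-parser-mcp"],
            "env": {
                "SERVER_PROTOCOL_MODE": "stdio",
                "ALIBABA_CLOUD_ACCESS_KEY_ID": access_key_id,
                "ALIBABA_CLOUD_ACCESS_KEY_SECRET": access_key_secret,
            },
        }
    elif mode == "sse":
        # In SSE mode the client only needs the endpoint; credentials
        # are set in the environment of the separately started server.
        entry = {"url": "http://127.0.0.1:3001/sse", "transportType": "sse"}
    else:
        raise ValueError(f"unsupported mode: {mode!r}")
    return {"mcpServers": {"docmind-parser-mcp": entry}}


# Print the SSE variant as JSON, ready to paste into a client config file.
print(json.dumps(client_config("sse"), indent=2))
```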

Python client example

import asyncio
from mcp.client.stdio import stdio_client
from mcp import ClientSession, StdioServerParameters

# Configure the server parameters
server_params = StdioServerParameters(
    command='uvx',
    args=['docmind-parser-mcp'],
    env={
        "SERVER_PROTOCOL_MODE": "stdio",
        "ALIBABA_CLOUD_ACCESS_KEY_ID": "YOUR_ALIBABA_CLOUD_ACCESS_KEY_ID",
        "ALIBABA_CLOUD_ACCESS_KEY_SECRET": "YOUR_ALIBABA_CLOUD_ACCESS_KEY_SECRET"
    }
)

async def main():
    # Create the stdio client
    async with stdio_client(server_params) as (stdio, write):
        # Create the ClientSession object
        async with ClientSession(stdio, write) as session:
            # Initialize the session
            await session.initialize()

            # List the available tools
            response = await session.list_tools()
            print("Available tools:", response)

            # Call the conversion tool
            response = await session.call_tool(
                'convert_to_markdown',
                {'uri': 'https://example.com/document.pdf'}
            )
            print("Conversion result:", response)

if __name__ == '__main__':
    asyncio.run(main())
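The client example above prints the raw response object. To get at the converted text, the text items can be pulled out of the result. The sketch below assumes the result follows the MCP Python SDK's `CallToolResult` shape (an `isError` flag and a `content` list whose text items have `type == "text"` and a `text` attribute); that this server returns its Markdown as text items is an assumption, and `extract_markdown` is a hypothetical helper.

```python
from types import SimpleNamespace


def extract_markdown(result):
    """Join the text items of a tool-call result into one string."""
    if getattr(result, "isError", False):
        raise RuntimeError("tool call reported an error")
    # Keep only text items; other content types (e.g. images) are skipped.
    parts = [
        item.text
        for item in result.content
        if getattr(item, "type", None) == "text"
    ]
    return "\n".join(parts)


# Stand-in object for demonstration; a real result comes from session.call_tool().
fake = SimpleNamespace(
    isError=False,
    content=[SimpleNamespace(type="text", text="# Title")],
)
print(extract_markdown(fake))
```

In the client example, this would replace the final `print` with something like `markdown = extract_markdown(response)`.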

Development

Building the package

cd docmind-parse-mcp
uv build

After the build completes, the generated package files are in the dist/ directory.

Built Distribution

docmind_parser_mcp-0.1.3-py3-none-any.whl (11.5 kB)

Hashes for docmind_parser_mcp-0.1.3-py3-none-any.whl

Algorithm    Hash digest
SHA256       e638cbb2a08d4d865ad16818981587bd25fa4efb08ff09395544fa357c46ac45
MD5          4fade848d3a4c4baeed36952a787669e
BLAKE2b-256  3b3074816e07fbe11b09970d755215c4d8af71184a94e5efe1419e7ff8c94c11
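A downloaded wheel can be checked against the SHA256 digest listed above. A minimal sketch using only the standard library; the helper names `sha256_of` and `matches_published_hash` are hypothetical, and the local path to the wheel is whatever your download produced:

```python
import hashlib

# SHA256 digest listed above for docmind_parser_mcp-0.1.3-py3-none-any.whl.
EXPECTED_SHA256 = "e638cbb2a08d4d865ad16818981587bd25fa4efb08ff09395544fa357c46ac45"


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of(path) == expected
```

For example, `matches_published_hash("docmind_parser_mcp-0.1.3-py3-none-any.whl")` should return True for an intact copy of that release file.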
