MCP provider for document parsing and conversion to Markdown

MCP Document Parse Tool

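Overview

This is an MCP (Model Context Protocol) tool that parses documents in common formats (PDF, Word, Excel, PowerPoint, and others) and returns their content. It exposes a simple interface, so you can add document parsing to your own applications.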

Supported file formats

  • PDF (.pdf) - both editable (text-based) PDFs and scanned documents
  • Word (.doc, .docx)
  • Excel (.xls, .xlsx)
  • PowerPoint (.ppt, .pptx)

Installation

Install the released version with uv:

uv tool install mcp-document-parse

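Environment variables

The MCP client configuration example in this README passes two variables to the server:

  • NIUTRANS_API_KEY
  • NIUTRANS_DOCUMENT_APPID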

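Billing

This tool calls the document parsing API of the NiuTrans open platform, which is billed as follows:

File type            Rate
PDF / Word / PPT     1 page = 2 points
Excel                2,000 characters = 2 points

💡 Free quota: the platform grants 100 points per day at no charge.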
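
The billing rules above reduce to simple arithmetic. The following is a minimal sketch for estimating the cost of a job before submitting it. The function name `estimate_points` is hypothetical, and rounding partial 2,000-character blocks up to a full unit is an assumption; the source does not say how the platform rounds.

```python
import math

POINTS_PER_PAGE = 2           # PDF / Word / PPT: 1 page = 2 points
EXCEL_CHARS_PER_UNIT = 2000   # Excel: 2,000 characters = 2 points
POINTS_PER_EXCEL_UNIT = 2
DAILY_FREE_POINTS = 100       # free points granted per day

PAGE_BILLED = {"pdf", "doc", "docx", "ppt", "pptx"}
CHAR_BILLED = {"xls", "xlsx"}


def estimate_points(file_type: str, pages: int = 0, characters: int = 0) -> int:
    """Estimate the points a single document would cost (hypothetical helper)."""
    kind = file_type.lower().lstrip(".")
    if kind in PAGE_BILLED:
        return pages * POINTS_PER_PAGE
    if kind in CHAR_BILLED:
        # Assumption: a partial 2,000-character block is billed as a full one.
        units = math.ceil(characters / EXCEL_CHARS_PER_UNIT)
        return units * POINTS_PER_EXCEL_UNIT
    raise ValueError(f"unsupported file type: {file_type!r}")


print(estimate_points("pdf", pages=10))          # 10 pages at 2 points each
print(DAILY_FREE_POINTS // POINTS_PER_PAGE)      # pages covered by the free quota
```

Under these rules the daily free quota covers 50 pages of PDF, Word, or PowerPoint content.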

Requirements

  • Python >= 3.9
  • Dependencies are declared in pyproject.toml

MCP client configuration example

If you installed the tool with uv tool install, configure it in mcp.json as follows:

{
  "mcpServers": {
    "document_parse": {
      "type": "stdio",
      "command": "uv",
      "args": [
        "tool",
        "run",
        "mcp-document-parse"
      ],
      "env": {
        "NIUTRANS_API_KEY": "${env.NIUTRANS_API_KEY}",
        "NIUTRANS_DOCUMENT_APPID": "${env.NIUTRANS_DOCUMENT_APPID}"
      }
    }
  }
}
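
The `${env.…}` placeholders rely on the MCP client expanding environment variables, and the syntax for that differs between clients. If your client does not expand them, one option is to generate the same structure with the values filled in. The sketch below does that; `build_mcp_config` is a hypothetical helper, not part of this package.

```python
import json
import os
from typing import Mapping, Optional

REQUIRED_VARS = ("NIUTRANS_API_KEY", "NIUTRANS_DOCUMENT_APPID")


def build_mcp_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the mcp.json structure with credentials resolved (hypothetical helper)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "mcpServers": {
            "document_parse": {
                "type": "stdio",
                "command": "uv",
                "args": ["tool", "run", "mcp-document-parse"],
                # Only the two variables from the README example are passed on.
                "env": {name: env[name] for name in REQUIRED_VARS},
            }
        }
    }


# Placeholder values for illustration; call build_mcp_config() with no
# argument to read the real values from the current environment.
example = build_mcp_config(
    {"NIUTRANS_API_KEY": "your-api-key", "NIUTRANS_DOCUMENT_APPID": "your-app-id"}
)
print(json.dumps(example, indent=2))
```

A file written this way contains the credentials in plain text, so keep it out of version control.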

After starting an MCP-capable application, run ListTools to see the parse_document_by_path tool. ListResources is also supported and exposes the document://supported-types resource.

Tool reference

parse_document_by_path

Converts the file at the given path to Markdown.

Parameters:

  • file_path (str): absolute path to the file; supported formats are pdf, doc, docx, xls, xlsx, ppt, and pptx
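
Because file_path must be absolute and must point to a supported format, a caller can reject bad input before spending a tool call. This is a minimal client-side sketch: `check_file_path` is a hypothetical helper, and matching the extension case-insensitively is an assumption about how the server behaves.

```python
from pathlib import Path

# Extensions listed for the file_path parameter.
SUPPORTED_EXTENSIONS = {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx"}


def check_file_path(file_path: str) -> str:
    """Validate a path before passing it to parse_document_by_path (hypothetical helper)."""
    path = Path(file_path)
    if not path.is_absolute():
        raise ValueError(f"file_path must be absolute: {file_path!r}")
    # Assumption: extension matching is case-insensitive.
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {path.suffix!r}")
    return str(path)


# Build an absolute path from the current directory for illustration.
print(check_file_path(str(Path.cwd() / "report.pdf")))
```

The helper does not check that the file exists; that is left to the tool itself.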

Returns:

  • Success: {"status": "success", "text_content": "<file content>", "filename": "<file name>"}
  • Failure: {"status": "error", "error": "<error message>"}
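
The two result shapes above can be handled with a small dispatcher on the status field. The sketch below assumes the result has already been decoded into a Python dict (for example with json.loads, if your client hands back JSON text); `extract_text` is a hypothetical helper.

```python
import json


def extract_text(result: dict) -> str:
    """Return the Markdown text from a tool result, or raise on failure (hypothetical helper)."""
    status = result.get("status")
    if status == "success":
        return result["text_content"]
    if status == "error":
        raise RuntimeError(result.get("error", "unknown error"))
    raise ValueError(f"unexpected status: {status!r}")


# Illustrative payload in the documented success shape, not real tool output.
raw = '{"status": "success", "text_content": "# Title", "filename": "report.pdf"}'
print(extract_text(json.loads(raw)))
```

Raising on the error shape keeps failures from being mistaken for empty documents further down the pipeline.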

document://supported-types

Returns information about the supported file types.

Returns:

  • A JSON object listing the supported file types with a description of each

License

MIT License

Contact

For questions or suggestions, contact tianfengning@niutrans.com