Make AI understand any complex document - MCP server

📄 Document Analyzer MCP Server

PyPI version | License: MIT | Python 3.10+ | MCP

Make AI understand any complex document - an MCP server that addresses AI context limitations


English Documentation

🎯 Key Features

  • Smart Document Analysis - Auto-detect sections, handle merged cells
  • Multi-format Support - Excel (.xlsx, .xls) supported now; PDF and Word support in development
  • Precise Field Mapping - Field mapping table + section-level reading
  • High Performance - Structured caching + lazy loading
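
To make the "field mapping table + section-level reading" idea concrete, here is a minimal sketch in plain Python. It is not the package's implementation: the header heuristic, the `Section1_CompanyName` key scheme (section name with spaces removed, underscore, field label), and every function name below are assumptions chosen to match the field key shown in the usage example.

```python
from typing import Dict, List, Optional

Row = List[Optional[str]]

def is_section_header(row: Row) -> bool:
    # Hypothetical heuristic: a header merged across the row leaves
    # only its first cell populated once the sheet is read into a grid.
    return bool(row) and row[0] is not None and all(c is None for c in row[1:])

def build_field_map(rows: List[Row]) -> Dict[str, str]:
    """Map "<Section>_<Label>" keys to values (assumed key scheme)."""
    field_map: Dict[str, str] = {}
    section = "Unsectioned"
    for row in rows:
        if not row or row[0] is None:
            continue  # skip blank rows
        if is_section_header(row):
            section = row[0].replace(" ", "")
            continue
        field_map[f"{section}_{row[0]}"] = row[1] or ""
    return field_map

def read_section_fields(field_map: Dict[str, str], section_name: str) -> Dict[str, str]:
    """Return only the fields of one section, keyed by their bare label."""
    prefix = section_name.replace(" ", "") + "_"
    return {k[len(prefix):]: v for k, v in field_map.items() if k.startswith(prefix)}

# Illustrative data only: label/value rows under merged section headers.
ROWS: List[Row] = [
    ["Section 1", None],
    ["CompanyName", "Acme Ltd"],
    ["Address", "1 Main St"],
    ["Section 2", None],
    ["Revenue", "100"],
]

FIELD_MAP = build_field_map(ROWS)
print(FIELD_MAP["Section1_CompanyName"])
print(read_section_fields(FIELD_MAP, "Section 1"))
```

Building the map once and answering later lookups from it is also the general shape of the caching and lazy-loading claim: the expensive parse happens a single time, and each section or field read touches only the keys it needs.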

🚀 Quick Start

Installation

macOS / Linux (pipx recommended)

# Install pipx
brew install pipx  # macOS
# or sudo apt install pipx  # Ubuntu/Debian

# Install doc-mcp-server
pipx install doc-mcp-server

Windows

pip install doc-mcp-server

For more installation options, see the Full Installation Guide.

Configure Claude Code

Add the following to ~/.claude.json or to your project's config file:

{
  "mcpServers": {
    "document-analyzer": {
      "command": "doc-mcp-server"
    }
  }
}

For detailed configuration, see the Quick Start Guide.
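
If the config file already lists other MCP servers, the new entry has to be merged in rather than pasted over the existing "mcpServers" object. A minimal sketch of that merge, assuming the file is plain JSON shaped like the snippet above; the helper name `with_mcp_server` and the pre-existing `theme` and `other` entries are hypothetical:

```python
import json

def with_mcp_server(config: dict, name: str, command: str) -> dict:
    """Return a copy of config with one entry added under "mcpServers"."""
    servers = dict(config.get("mcpServers", {}))  # keep existing servers
    servers[name] = {"command": command}
    return {**config, "mcpServers": servers}

# Hypothetical existing config; only the "mcpServers" key comes from this README.
existing = json.loads(
    '{"theme": "dark", "mcpServers": {"other": {"command": "other-server"}}}'
)
updated = with_mcp_server(existing, "document-analyzer", "doc-mcp-server")
print(json.dumps(updated, indent=2))
```

The function returns a new dictionary instead of mutating its input, so the original config can be kept as a backup before the result is written to disk.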

📚 Full Documentation

💡 Usage Example

# 1. Analyze document structure
analyze_document(file_path="/path/to/document.xlsx")

# 2. Read a specific section
read_section(file_path="/path/to/document.xlsx", section_name="Section 1")

# 3. Read a single field
read_field(file_path="/path/to/document.xlsx", field_key="Section1_CompanyName")
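
The calls above are written as function calls for readability; in practice the AI client issues them as MCP tool invocations. Under the MCP specification a tool invocation is a JSON-RPC 2.0 request with method `tools/call`. The sketch below only builds such request payloads and does not talk to a server; the tool and argument names are taken from the example above, while the helper `tool_call_request` is hypothetical.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per request

def tool_call_request(tool: str, **arguments) -> dict:
    """Build a JSON-RPC 2.0 "tools/call" request for one MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

path = "/path/to/document.xlsx"
requests = [
    tool_call_request("analyze_document", file_path=path),
    tool_call_request("read_section", file_path=path, section_name="Section 1"),
    tool_call_request("read_field", file_path=path, field_key="Section1_CompanyName"),
]
for request in requests:
    print(json.dumps(request))
```

Analyzing first and then reading a section or a single field keeps each response small, which is the point of the server: the model never has to load the whole document into its context.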

🤝 Contributing & Feedback


📄 License

MIT License - see LICENSE for details


Made with ❤️ by Yang Jiahui
