Make AI understand any complex document - MCP server
📄 Document Analyzer MCP Server
Make AI understand any complex document - an MCP server that works around AI context limitations
English Documentation
🎯 Key Features
- ✅ Smart Document Analysis - Auto-detect sections, handle merged cells
- ✅ Multi-format Support - Excel (.xlsx, .xls) | PDF/Word in development
- ✅ Precise Field Mapping - Field mapping table + section-level reading
- ✅ High Performance - Structured caching + lazy loading
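To make the merged-cell handling and field mapping ideas concrete, here is a minimal sketch in plain Python. It is not the package's implementation: the grid layout (section, label, value columns), the helper names `fill_merged_cells` and `build_field_map`, and the sample data are all assumptions for illustration. Only the `<section>_<field>` key shape is taken from the usage example in this README.

```python
from typing import Dict, List, Optional, Tuple

Grid = List[List[Optional[str]]]
# A merged range as (first_row, first_col, last_row, last_col), 0-based, inclusive.
MergedRange = Tuple[int, int, int, int]


def fill_merged_cells(grid: Grid, merged: List[MergedRange]) -> Grid:
    """Copy each merged range's top-left value into every cell of that range."""
    filled = [row[:] for row in grid]  # work on a copy, leave the input untouched
    for r0, c0, r1, c1 in merged:
        value = grid[r0][c0]
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                filled[r][c] = value
    return filled


def build_field_map(grid: Grid) -> Dict[str, str]:
    """Map "<section>_<label>" keys to values.

    Assumed layout: column 0 = section, column 1 = field label, column 2 = value.
    """
    field_map: Dict[str, str] = {}
    for section, label, value in grid:
        if section and label and value is not None:
            field_map[f"{section}_{label}"] = value
    return field_map


# Hypothetical sheet: the section cell spans rows 0-1 of column 0.
rows: Grid = [
    ["Section1", "CompanyName", "Acme Ltd"],
    [None, "Address", "1 Main St"],
    ["Section2", "Contact", "Jane"],
]
merged_ranges = [(0, 0, 1, 0)]

fields = build_field_map(fill_merged_cells(rows, merged_ranges))
print(fields["Section1_Address"])  # 1 Main St
```

Filling merged ranges before building keys means every row carries its own section name, so a single field can be looked up without re-reading the rest of the sheet.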
🚀 Quick Start
Installation
macOS / Linux (pipx recommended)
# Install pipx
brew install pipx # macOS
# or sudo apt install pipx # Ubuntu/Debian
# Install doc-mcp-server
pipx install doc-mcp-server
Windows
pip install doc-mcp-server
For more installation options, see the Full Installation Guide.
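After installing, it can help to confirm that the `doc-mcp-server` command is actually reachable, since MCP clients launch it by name. The following is a small sketch using only the standard library; the helper name `find_server_command` is hypothetical.

```python
import shutil
from typing import Optional


def find_server_command(name: str = "doc-mcp-server") -> Optional[str]:
    """Return the absolute path of the command if it is on PATH, else None."""
    return shutil.which(name)


path = find_server_command()
if path is None:
    # With pipx, running `pipx ensurepath` and reopening the shell may fix this.
    print("doc-mcp-server not found on PATH")
else:
    print(f"found: {path}")
```

If the command is not found, the `"command"` value in the client configuration can point at an absolute path instead of the bare name.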
Configure Claude Code
Add the following to ~/.claude.json, or to your project's config file (for Claude Code, typically .mcp.json in the project root):
{
  "mcpServers": {
    "document-analyzer": {
      "command": "doc-mcp-server"
    }
  }
}
For detailed configuration, see the Quick Start Guide.
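If the config file already contains other settings, the entry should be merged in rather than pasted over the whole file. Below is a minimal sketch of such a merge. The helper name `add_mcp_server` and the `"theme"` key in the demo are hypothetical, and the demo writes to a temporary file instead of a real `~/.claude.json`.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, command: str) -> dict:
    """Add or update one entry under "mcpServers", preserving all other keys."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config.setdefault("mcpServers", {})[name] = {"command": command}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway file that already holds an unrelated (made-up) setting.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "claude.json"
    demo_path.write_text('{"theme": "dark"}', encoding="utf-8")
    result = add_mcp_server(demo_path, "document-analyzer", "doc-mcp-server")
    print(json.dumps(result, indent=2))
```

Back up the real config file before running anything like this against it; the sketch does no validation beyond parsing the JSON.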
📚 Full Documentation
- Installation Guide - Platform-specific installation steps
- Update Guide - How to upgrade to the latest version
- Quick Start - Configuration and basic usage
- Usage Guide - Complete API and examples
- Troubleshooting - Common issues and solutions
💡 Usage Example
# 1. Analyze document structure
analyze_document(file_path="/path/to/document.xlsx")
# 2. Read specific section
read_section(file_path="/path/to/document.xlsx", section_name="Section 1")
# 3. Read single field
read_field(file_path="/path/to/document.xlsx", field_key="Section1_CompanyName")
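The calls above are written as function calls, but an MCP client sends them to the server as JSON-RPC 2.0 `tools/call` requests. As a rough sketch of what that message might look like, assuming the server registers the tools under exactly the names and argument names shown in this README (the helper `tool_call_request` is hypothetical):

```python
import json
from typing import Any, Dict


def tool_call_request(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


request = tool_call_request(
    1,
    "read_field",
    {"file_path": "/path/to/document.xlsx", "field_key": "Section1_CompanyName"},
)
print(request)
```

In normal use the AI client builds and sends these messages itself; the sketch is only meant to show how the three tools map onto the protocol.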
🤝 Contributing & Feedback
- Report Issues: GitHub Issues
- Contribute Code: CONTRIBUTING.md
📄 License
MIT License - see LICENSE for details
Made with ❤️ by Yang Jiahui
File details
Details for the file doc_mcp_server-0.1.2.tar.gz.
File metadata
- Download URL: doc_mcp_server-0.1.2.tar.gz
- Upload date:
- Size: 21.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | be179882889fc9a5ab950956d333c1fc92a69c4363c665bc067c635c70a19387 |
| MD5 | 8a4c553b4e0ca99d53cec55eb916792b |
| BLAKE2b-256 | 28586f034d3636a2c7c6a5a209289c82ceac4dbeccedf6b246c23f0f7ffbcc85 |
File details
Details for the file doc_mcp_server-0.1.2-py3-none-any.whl.
File metadata
- Download URL: doc_mcp_server-0.1.2-py3-none-any.whl
- Upload date:
- Size: 15.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 169f9bb0748747807ac80abd46d06fe57cf847a0d0ccbcd0de1f4c7545c1441a |
| MD5 | 8f697953d6cab6387355d4925aca2047 |
| BLAKE2b-256 | a459723879f428217739fe570f9b95e78435605e6d88ebfdd5b10819d27c0ae8 |