An MCP-enabled multi-format document reader supporting DOCX, PDF, TXT, and Excel files

MCP Document Reader

MCP (Model Context Protocol) Document Reader - A powerful MCP tool for reading documents in multiple formats, enabling AI agents to truly "read" your documents.

GitHub Repository: https://github.com/xt765/mcp_documents_reader
Gitee Repository: https://gitee.com/xt765/mcp_documents_reader

Features

  • Multi-format Support: Reads four mainstream document formats: Excel (XLSX/XLS), DOCX, PDF, and TXT
  • MCP Protocol: Complies with the MCP standard and can be used as a tool in MCP-enabled AI tools such as Trae IDE
  • Easy Integration: Simple configuration for immediate use
  • Reliable Performance: Tested and confirmed working in Trae IDE
  • File System Support: Reads documents directly from the file system

Supported Formats

| Format | Extensions | MIME Type | Features |
|--------|------------|-----------|----------|
| Excel | .xlsx, .xls | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet (.xlsx), application/vnd.ms-excel (.xls) | Sheet and cell data extraction |
| DOCX | .docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document | Text and structure extraction |
| PDF | .pdf | application/pdf | Text extraction |
| Text | .txt | text/plain | Plain text reading |
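
As an illustration of how a single entry point can serve all four formats, the sketch below maps a file extension to a format label using the extensions listed in the table above. The names `SUPPORTED_FORMATS` and `detect_format` are hypothetical and are not taken from this project's source code.

```python
from pathlib import Path

# Hypothetical mapping built from the "Supported Formats" table;
# the real package may organise this differently.
SUPPORTED_FORMATS = {
    ".xlsx": "excel",
    ".xls": "excel",
    ".docx": "docx",
    ".pdf": "pdf",
    ".txt": "text",
}


def detect_format(filename: str) -> str:
    """Return the format label for a filename, or raise ValueError if unsupported."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match on the extension
    try:
        return SUPPORTED_FORMATS[suffix]
    except KeyError:
        raise ValueError(
            f"Unsupported document type: {suffix or '(no extension)'}"
        ) from None
```

A caller could use the returned label to pick the matching reader, and report the `ValueError` message to the AI assistant when a file type is not supported.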

Installation

Prerequisites

  • Python 3.8 or higher
  • MCP-enabled AI tool such as Trae IDE

Installation Steps

# Clone the repository
git clone https://github.com/xt765/mcp_documents_reader.git
cd mcp_documents_reader

# Install dependencies
pip install -e .

Configuration

Using in Trae IDE

Add the following to your Trae IDE's MCP configuration:

Option 1: Using GitHub repository (Recommended)

{
  "mcpServers": {
    "mcp-document-reader": {
      "command": "uvx",
      "args": [
        "--from",
        "git+https://github.com/xt765/mcp_documents_reader",
        "mcp_documents_reader"
      ]
    }
  }
}

Option 2: Using Gitee repository

{
  "mcpServers": {
    "mcp-document-reader": {
      "command": "uvx",
      "args": [
        "--from",
        "git+https://gitee.com/xt765/mcp_documents_reader",
        "mcp_documents_reader"
      ]
    }
  }
}

Environment Variables

  • DOCUMENT_DIRECTORY - Directory where documents are stored (default: "./documents")
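
The following is a minimal sketch of how a filename could be resolved against `DOCUMENT_DIRECTORY`. It assumes that relative paths are joined to that directory and absolute paths are used unchanged. This behaviour is an assumption, not confirmed from the project's source, and `resolve_document_path` is a hypothetical helper name.

```python
import os
from pathlib import Path


def resolve_document_path(filename: str) -> Path:
    """Resolve a filename against DOCUMENT_DIRECTORY (hypothetical helper)."""
    # Fall back to the documented default when the variable is unset.
    base = Path(os.environ.get("DOCUMENT_DIRECTORY", "./documents"))
    candidate = Path(filename)
    # Assumption: absolute paths are used as given; relative paths are
    # joined to the base directory.
    return candidate if candidate.is_absolute() else base / candidate
```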

Usage

As an MCP Tool

After configuration, AI assistants can directly call the following tool:

read_document (Recommended)

Read any supported document type with a unified interface.

read_document(filename="example.docx")
read_document(filename="example.pdf")
read_document(filename="example.xlsx")
read_document(filename="example.txt")

Tool Interface Details

read_document

Read any supported document type.

Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| filename | string | Yes | Document file path, supports absolute or relative paths |
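
For readers curious what a call looks like on the wire, the sketch below builds the JSON-RPC 2.0 request that an MCP client sends to invoke a tool (method `tools/call`). The helper name `build_tool_call` is hypothetical. In practice the AI tool's MCP client constructs and sends this message for you.

```python
import json


def build_tool_call(filename: str, request_id: int = 1) -> str:
    """Serialize an MCP tools/call request for the read_document tool."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "read_document",
            "arguments": {"filename": filename},
        },
    }
    return json.dumps(request)


print(build_tool_call("example.docx"))
```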

License

MIT
