An MCP server for Upstage document parsing and information extraction
Upstage MCP Server
A Model Context Protocol (MCP) server for Upstage AI's document digitization and information extraction capabilities
📋 Overview
The Upstage MCP Server connects AI assistants to Upstage AI's document processing APIs. It lets AI models such as Claude extract and structure content from PDFs, images, and Office files.
✨ Key Features
- Document Digitization: Extract structured content from documents while preserving layout.
- Information Extraction: Extract specific data points based on intelligent schemas.
- Multi-format Support: JPEG, PNG, BMP, PDF, TIFF, HEIC, DOCX, PPTX, XLSX.
- Claude Desktop Integration: Seamless integration with Claude and other MCP clients.
🔑 Prerequisites
Before using this server, you'll need:
- Upstage API Key: Obtain your API key from Upstage API
- Python 3.10+: The server requires Python 3.10 or higher.
- uv package manager: For dependency management and installation.
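The prerequisites above can be verified from a script before going further. This is a minimal sketch, not part of the project: `check_prerequisites` is a hypothetical helper, and it assumes the API key is exposed as the `UPSTAGE_API_KEY` environment variable in the current shell (when the server runs under Claude Desktop, the key comes from the configuration file instead).

```python
import os
import shutil
import sys


def check_prerequisites(env=None, min_version=(3, 10)):
    """Return a list of problems; an empty list means every check passed."""
    env = os.environ if env is None else env
    problems = []

    # The server requires Python 3.10 or higher.
    if sys.version_info < min_version:
        found = sys.version.split()[0]
        problems.append(
            f"Python {min_version[0]}.{min_version[1]}+ required, found {found}"
        )

    # uv must be discoverable on PATH for the setup commands to work.
    if shutil.which("uv", path=env.get("PATH")) is None:
        problems.append("uv not found on PATH")

    # Assumption: the key is provided through this environment variable.
    if not env.get("UPSTAGE_API_KEY"):
        problems.append("UPSTAGE_API_KEY is not set")

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("MISSING:", problem)
```

Running the script prints one line per missing prerequisite and nothing when all checks pass.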
🚀 Local/Dev Setup Instructions
Step 1: Clone the Repository
# Clone the repository
git clone https://github.com/PritamPatil2603/upstage-mcp-server.git
# Navigate to the project directory
cd upstage-mcp-server
Step 2: Set Up Python Environment
# Install uv if not already installed
pip install uv
# Create a virtual environment
uv venv
# Activate the virtual environment
# On Windows, run:
# .venv\Scripts\activate
# On macOS/Linux, run:
source .venv/bin/activate
# Install dependencies in editable mode
uv pip install -e .
Step 3: Configure Claude Desktop
- Download Claude Desktop.
- Open Claude Desktop:
  - Navigate to Claude → Settings → Developer → Edit Config
- Edit claude_desktop_config.json: Add the following configuration:
For Windows:
{
  "mcpServers": {
    "upstage-mcp-server": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "C:\\path\\to\\cloned\\upstage-mcp-server",
        "python",
        "-m",
        "upstage_mcp.server"
      ],
      "env": {
        "UPSTAGE_API_KEY": "your_api_key_here"
      }
    }
  }
}
Replace C:\\path\\to\\cloned\\upstage-mcp-server with the actual repository path on your system, keeping the doubled backslashes that JSON requires.
For macOS/Linux:
{
  "mcpServers": {
    "upstage-mcp-server": {
      "command": "/Users/username/.local/bin/uv",
      "args": [
        "run",
        "--directory",
        "/path/to/cloned/upstage-mcp-server",
        "python",
        "-m",
        "upstage_mcp.server"
      ],
      "env": {
        "UPSTAGE_API_KEY": "your_api_key_here"
      }
    }
  }
}
Replace the following:
- /Users/username/.local/bin/uv with the full path to your uv executable (find it using which uv)
- /path/to/cloned/upstage-mcp-server with the absolute path to your repository
Tip for macOS/Linux users: If you're experiencing connection issues, using the full path to the uv executable is often more reliable than just uv. Find the path using which uv in your terminal.
- Once the steps above are complete, restart Claude Desktop.
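Hand-editing JSON is error-prone, especially with Windows paths that need escaped backslashes. The sketch below builds the same server entry programmatically and merges it into an existing configuration without dropping other servers. `build_server_entry` and `merge_into_config` are hypothetical helper names, and `other-server` is a made-up existing entry; the script only prints the result and does not write to claude_desktop_config.json.

```python
import json
import shutil


def build_server_entry(repo_dir, api_key, uv_path=None):
    """Build the "upstage-mcp-server" entry described in the steps above."""
    return {
        # Prefer an explicit path, then whatever is on PATH, then plain "uv".
        "command": uv_path or shutil.which("uv") or "uv",
        "args": ["run", "--directory", repo_dir, "python", "-m", "upstage_mcp.server"],
        "env": {"UPSTAGE_API_KEY": api_key},
    }


def merge_into_config(config, entry, name="upstage-mcp-server"):
    """Return a copy of config with the entry added; other servers are kept."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # Hypothetical existing configuration with one unrelated server.
    existing = {"mcpServers": {"other-server": {"command": "node", "args": []}}}
    entry = build_server_entry(
        "/path/to/cloned/upstage-mcp-server", "your_api_key_here"
    )
    # json.dumps escapes backslashes, so Windows paths come out valid.
    print(json.dumps(merge_into_config(existing, entry), indent=2))
```

Paste the printed JSON into the config file, or adapt the script to read and write the file once you have a backup.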
🛠️ Available Tools
The server exposes two main tools for AI models:
- Document Parsing (parse_document):
  - Description: Processes documents and extracts their content with structure preservation.
  - Parameters:
    - file_path: Path to the document file to be processed.
  - Example Query to Claude:
    Can you parse this document located at "C:\Users\username\Documents\contract.pdf" and summarize its contents?
- Information Extraction (extract_information):
  - Description: Extracts structured information from documents according to schemas.
  - Parameters:
    - file_path: Path to the document file to process.
    - schema_path (optional): Path to a JSON file containing the extraction schema.
    - auto_generate_schema (default: true): Whether to automatically generate a schema.
  - Example Query to Claude:
    Extract the invoice number, date, and total amount from this document at "C:\Users\username\Documents\invoice.pdf".
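For readers curious about what an MCP client sends when Claude invokes one of these tools, the sketch below builds the JSON-RPC `tools/call` request defined by the Model Context Protocol. The tool and parameter names come from the list above; `build_tool_call` is a hypothetical helper, and the exact argument types the server accepts are an assumption. A real session also performs an initialization handshake first, which Claude Desktop handles for you.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP "tools/call" request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Request for the parse_document tool with its single parameter.
parse_request = build_tool_call(
    1,
    "parse_document",
    {"file_path": "C:\\Users\\username\\Documents\\contract.pdf"},
)

# Request for extract_information, relying on automatic schema generation.
extract_request = build_tool_call(
    2,
    "extract_information",
    {
        "file_path": "C:\\Users\\username\\Documents\\invoice.pdf",
        "auto_generate_schema": True,
    },
)

print(json.dumps(parse_request, indent=2))
print(json.dumps(extract_request, indent=2))
```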
📂 Output Files
The server saves processing results in these locations:
- Document Parsing Results: upstage_mcp/outputs/document_parsing/
- Information Extraction Results: upstage_mcp/outputs/information_extraction/
- Generated Schemas: upstage_mcp/outputs/information_extraction/schemas/
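To find the most recent results quickly, the output directories can be scanned by modification time. This is a minimal sketch that assumes the paths above are relative to the cloned repository root; `latest_outputs` is a hypothetical helper, and it makes no assumption about file names or formats inside those folders.

```python
from pathlib import Path

# Output locations listed above, assumed to be relative to the repository root.
OUTPUT_DIRS = {
    "document_parsing": "upstage_mcp/outputs/document_parsing",
    "information_extraction": "upstage_mcp/outputs/information_extraction",
    "schemas": "upstage_mcp/outputs/information_extraction/schemas",
}


def latest_outputs(repo_root, limit=5):
    """Map each output category to its newest files, most recent first."""
    results = {}
    for label, relative in OUTPUT_DIRS.items():
        folder = Path(repo_root) / relative
        # A missing folder simply means nothing has been written there yet.
        files = [p for p in folder.iterdir() if p.is_file()] if folder.is_dir() else []
        files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
        results[label] = files[:limit]
    return results


if __name__ == "__main__":
    for label, files in latest_outputs(".").items():
        print(label)
        for path in files:
            print("  ", path)
```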
🔧 Troubleshooting
Common Issues
- API Key Not Found: Ensure your Upstage API key is correctly set in environment variables or the .env file.
- File Not Found: Verify that the file path is correct and accessible to the server.
- Server Not Starting: Check that you've activated the virtual environment and installed all dependencies.
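For "File Not Found" errors, it can help to check a path locally before asking Claude to process it. The sketch below is illustrative only: `preflight` is a hypothetical helper, and the suffix set is an assumed mapping from the formats listed under Key Features to common file extensions, not something the server is confirmed to enforce.

```python
import os
from pathlib import Path

# Assumed extensions for JPEG, PNG, BMP, PDF, TIFF, HEIC, DOCX, PPTX, XLSX.
SUPPORTED_SUFFIXES = {
    ".jpg", ".jpeg", ".png", ".bmp", ".pdf", ".tif", ".tiff",
    ".heic", ".docx", ".pptx", ".xlsx",
}


def preflight(file_path):
    """Return a list of likely problems with a document path (empty if none)."""
    problems = []
    path = Path(file_path).expanduser()

    # Relative paths depend on the server's working directory, which may differ.
    if not path.is_absolute():
        problems.append("path is relative; use an absolute path")

    if not path.is_file():
        problems.append("file does not exist or is not a regular file")
    elif not os.access(path, os.R_OK):
        problems.append("file is not readable by the current user")

    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        problems.append(f"unsupported extension {path.suffix!r}")

    return problems


if __name__ == "__main__":
    for problem in preflight("/path/to/document.pdf"):
        print("PROBLEM:", problem)
```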
Checking Logs
Claude Desktop logs can be found at:
- Windows: %APPDATA%\Claude\logs\mcp-server-upstage-mcp-server.log
- macOS: ~/Library/Logs/Claude/mcp-server-upstage-mcp-server.log
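The log locations above can be resolved and read from a short script. This is a sketch under the assumption that the file name and directories match the two entries listed; `log_path` and `tail` are hypothetical helpers, and no location is assumed for other platforms.

```python
import os
import sys
from collections import deque
from pathlib import Path

LOG_NAME = "mcp-server-upstage-mcp-server.log"


def log_path(platform=None, env=None):
    """Return the Claude Desktop log path for Windows or macOS, else None."""
    platform = sys.platform if platform is None else platform
    env = os.environ if env is None else env
    if platform.startswith("win"):
        return Path(env.get("APPDATA", "")) / "Claude" / "logs" / LOG_NAME
    if platform == "darwin":
        return Path.home() / "Library" / "Logs" / "Claude" / LOG_NAME
    return None  # No log location is documented for other platforms.


def tail(path, count=20):
    """Return the last `count` lines of a text file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


if __name__ == "__main__":
    path = log_path()
    if path is not None and path.exists():
        print("\n".join(tail(path)))
    else:
        print("No Claude Desktop log found for this platform")
```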
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request to enhance the project or add new features.
📄 License
This project is licensed under the MIT License.
File details
Details for the file upstage_mcp_server-0.1.0.tar.gz.
File metadata
- Download URL: upstage_mcp_server-0.1.0.tar.gz
- Upload date:
- Size: 5.9 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e80a76e5238ea3e38767f373578f14d343b053a7c7bf09e9bc5947b5aa846a9d |
| MD5 | c0fa0272d6b39026b47a0135b1c74c06 |
| BLAKE2b-256 | bfa7e73bcbe52836f1ef2dc12961ea5ed98d5b8b2c40b3ce94258be1c82a81da |
File details
Details for the file upstage_mcp_server-0.1.0-py3-none-any.whl.
File metadata
- Download URL: upstage_mcp_server-0.1.0-py3-none-any.whl
- Upload date:
- Size: 12.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c5001be3c7b3aaa9cbd77fa8475b299c868033745b1de8b96bcc592799b4be00 |
| MD5 | 28a727fa9ffce0bf32379f0c71f1ffd3 |
| BLAKE2b-256 | 6904d5c41e6cc8f73257b2ef8af9ddeb74720a95b4b77b55177996a165ccdc5e |