MCP server for extracting images from PDFs
PDF Image Extractor MCP Server
A Model Context Protocol (MCP) server that extracts images from PDF files. This server runs locally on your machine, allowing LLMs to access and analyze images embedded within your local PDF documents.
Features
- Local File Access: Smart searching for PDFs in your current directory, Downloads, Desktop, or temp folder.
- Pagination: Efficiently handles PDFs with many images by extracting them in batches.
- Native Processing: Uses PyMuPDF for high-fidelity extraction.
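The three features above can be sketched roughly as follows. This is an illustrative sketch only, not the server's actual code: the function names (`candidate_dirs`, `find_pdf`, `batch_slice`, `extract_image_batch`), the search order, and the default batch size are all assumptions. The PyMuPDF calls used (`fitz.open`, `Page.get_images`, `Document.extract_image`) are part of that library's public API.

```python
import tempfile
from pathlib import Path


def candidate_dirs():
    """Folders to search, in priority order (the order is an assumption)."""
    home = Path.home()
    return [Path.cwd(), home / "Downloads", home / "Desktop", Path(tempfile.gettempdir())]


def find_pdf(name, dirs=None):
    """Return the first existing file matching `name`, or None."""
    given = Path(name).expanduser()
    if given.is_absolute():
        return given if given.is_file() else None
    for folder in (candidate_dirs() if dirs is None else dirs):
        candidate = Path(folder) / given
        if candidate.is_file():
            return candidate
    return None


def batch_slice(items, page, page_size):
    """Return (items on this zero-based page, whether more pages remain)."""
    if page < 0 or page_size < 1:
        raise ValueError("page must be >= 0 and page_size >= 1")
    start = page * page_size
    end = start + page_size
    return items[start:end], end < len(items)


def extract_image_batch(pdf_path, page=0, page_size=5):
    """Extract one batch of embedded images as (extension, bytes) pairs."""
    import fitz  # PyMuPDF; imported here so the helpers above work without it

    with fitz.open(pdf_path) as doc:
        # Collect unique image xrefs across all pages, preserving order.
        xrefs = []
        for pdf_page in doc:
            for image in pdf_page.get_images(full=True):
                if image[0] not in xrefs:
                    xrefs.append(image[0])
        batch, has_more = batch_slice(xrefs, page, page_size)
        images = []
        for xref in batch:
            info = doc.extract_image(xref)
            images.append((info["ext"], info["image"]))
    return images, has_more
```

A caller would request page 0 first and keep asking for the next page while the second return value is true, so a PDF with many images never has to be returned in a single response.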
Installation
This server is designed to be run with uv.
- Clone the repository:
git clone https://github.com/maxrabin/pdf-image-extractor-mcp.git
cd pdf-image-extractor-mcp
- Install dependencies:
uv sync
Usage & Configuration
This server communicates via stdio (standard input/output), meaning it must be run as a local command by your MCP client.
Claude Desktop Configuration
Edit your claude_desktop_config.json (usually located at ~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
{
"mcpServers": {
"pdf-image-extractor": {
"command": "uv",
"args": [
"run",
"--directory",
"/ABSOLUTE/PATH/TO/pdf-image-extractor-mcp",
"pdf-image-extractor-mcp"
]
}
}
}
Replace /ABSOLUTE/PATH/TO/ with the actual path to where you cloned this repository.
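If you would rather not edit the JSON by hand, a small script can merge the entry into an existing config without touching other servers. This is a hedged sketch: `server_entry` and `add_server` are hypothetical helper names, not part of this project, and the script assumes the config file either does not exist yet or already contains valid JSON.

```python
import json
from pathlib import Path


def server_entry(repo_dir):
    """Build the entry shown above for a given checkout directory."""
    return {
        "command": "uv",
        "args": ["run", "--directory", str(repo_dir), "pdf-image-extractor-mcp"],
    }


def add_server(config_path, repo_dir, name="pdf-image-extractor"):
    """Merge the server entry into the config file, keeping other servers."""
    config_path = Path(config_path)
    # Start from the existing config if there is one (assumed to be valid JSON).
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})[name] = server_entry(repo_dir)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `add_server(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json", "/ABSOLUTE/PATH/TO/pdf-image-extractor-mcp")` would write the macOS config described above. Back up the file first, and restart Claude Desktop afterwards so it rereads the config.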
Cursor Configuration
- Open Cursor Settings.
- Navigate to Features -> MCP.
- Click + Add New MCP Server.
- Enter the following:
  - Name: pdf-image-extractor (or any name you prefer)
  - Type: stdio (or Command)
  - Command: uv run --directory /ABSOLUTE/PATH/TO/pdf-image-extractor-mcp pdf-image-extractor-mcp
Testing Locally
You can verify the server works by running it directly from the command line. It should wait for input without crashing:
uv run pdf-image-extractor-mcp
(You won't see any output until you send a valid JSON-RPC message; a process that keeps running without an error confirms that the server started.)
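To go one step further than checking startup, you can send the server a first JSON-RPC message yourself. The sketch below builds an MCP `initialize` request and writes it to the server's stdin as a single line, which is how the MCP stdio transport frames messages. The helper names, the `clientInfo` values, and the protocol version string `2024-11-05` are assumptions chosen for illustration; the server may negotiate a different version in its reply.

```python
import json
import subprocess


def initialize_request(request_id=1):
    """Build a JSON-RPC 2.0 `initialize` request for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # assumed; the server may answer with another
            "capabilities": {},
            "clientInfo": {"name": "smoke-test", "version": "0.0.0"},  # placeholder values
        },
    }


def encode_message(message):
    """Serialize one message as a single newline-terminated line of JSON."""
    return json.dumps(message) + "\n"


def smoke_test(command=("uv", "run", "pdf-image-extractor-mcp")):
    """Start the server, send `initialize`, and return the parsed first reply."""
    with subprocess.Popen(
        command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    ) as proc:
        try:
            proc.stdin.write(encode_message(initialize_request()))
            proc.stdin.flush()
            line = proc.stdout.readline()  # blocks until the server replies or exits
            if not line:
                raise RuntimeError("server exited without sending a response")
            return json.loads(line)
        finally:
            proc.kill()
```

Run `smoke_test()` from the repository directory. A reply whose `id` matches the request and that contains a `result` object suggests the server is speaking the protocol correctly.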
Development
See CONTRIBUTING.md for development instructions.
File details
Details for the file pdf_image_extractor_mcp-0.1.0.tar.gz.
File metadata
- Download URL: pdf_image_extractor_mcp-0.1.0.tar.gz
- Upload date:
- Size: 5.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9221274620c739f435b30e4da47c610d6666b2f4bfd8ff51c125c214dc897c9f |
| MD5 | 9830c8dc57f446bba748d822a94129c8 |
| BLAKE2b-256 | 8bb6e561e3dc326c024bdb18c1bfae8967b6f9e8e4db6d2190e60a7a0c883652 |
Provenance
The following attestation bundles were made for pdf_image_extractor_mcp-0.1.0.tar.gz:
Publisher: release.yml on maxrabin/pdf-image-mcp
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: pdf_image_extractor_mcp-0.1.0.tar.gz
  - Subject digest: 9221274620c739f435b30e4da47c610d6666b2f4bfd8ff51c125c214dc897c9f
- Sigstore transparency entry: 769170948
- Sigstore integration time:
- Permalink: maxrabin/pdf-image-mcp@0daab2d5abcecdb08f57e5251e7d4ca4d2a6d786
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/maxrabin
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@0daab2d5abcecdb08f57e5251e7d4ca4d2a6d786
- Trigger Event: release
File details
Details for the file pdf_image_extractor_mcp-0.1.0-py3-none-any.whl.
File metadata
- Download URL: pdf_image_extractor_mcp-0.1.0-py3-none-any.whl
- Upload date:
- Size: 4.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c5a6eb6443797f82f0dee1b4eeccaa7b6b85d89968bf3d2573c9e646e396e7d5 |
| MD5 | e1f55461bb596434863c9cde5a88de46 |
| BLAKE2b-256 | 81f3cb28d52659df4395bcc00e1e4caa24349deafba1fd150c0063faa6e09225 |
Provenance
The following attestation bundles were made for pdf_image_extractor_mcp-0.1.0-py3-none-any.whl:
Publisher: release.yml on maxrabin/pdf-image-mcp
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: pdf_image_extractor_mcp-0.1.0-py3-none-any.whl
  - Subject digest: c5a6eb6443797f82f0dee1b4eeccaa7b6b85d89968bf3d2573c9e646e396e7d5
- Sigstore transparency entry: 769170952
- Sigstore integration time:
- Permalink: maxrabin/pdf-image-mcp@0daab2d5abcecdb08f57e5251e7d4ca4d2a6d786
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/maxrabin
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@0daab2d5abcecdb08f57e5251e7d4ca4d2a6d786
- Trigger Event: release