Lizeur is an MCP server for extracting content from PDFs.
Lizeur - PDF Content Extraction MCP Server
Lizeur is a Model Context Protocol (MCP) server that enables AI assistants to extract and read content from PDF documents using Mistral AI's OCR capabilities. It provides a simple interface for converting PDF files to markdown text that can be easily consumed by AI models.
Features
- PDF OCR Processing: Uses Mistral AI's latest OCR model to extract text from PDF documents
- Intelligent Caching: Automatically caches processed documents to avoid re-processing
- Markdown Output: Returns clean markdown text for easy integration with AI workflows
- FastMCP Integration: Built with FastMCP for optimal performance and ease of use
Prerequisites
- Python 3.10
- UV package manager
- Mistral AI API key
Installation
From PyPI
pip install lizeur
Manual
1. Clone the Repository
git clone https://github.com/SilverBzH/lizeur
cd lizeur
2. Create and Activate Virtual Environment
# Create a virtual environment
uv venv --python 3.10
# Activate the virtual environment
# On macOS/Linux:
source .venv/bin/activate
# On Windows:
# .venv\Scripts\activate
3. Install Dependencies and Build
# Install dependencies
uv sync
# Build the package
uv build
4. Install System-Wide
# Install the package system-wide
uv pip install --system .
This will install the lizeur command globally on your system.
MCP Configuration
Add the following configuration to your mcp.json file:
{
  "mcpServers": {
    "lizeur": {
      "command": "lizeur",
      "env": {
        "MISTRAL_API_KEY": "your-mistral-api-key-here",
        "CACHE_PATH": "your cache path"
      }
    }
  }
}
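Hand-edited JSON is easy to break (a single trailing comma is enough to make the file unparseable), so one option is to generate the entry programmatically. The following is a minimal sketch, assuming the `mcpServers` layout shown in this section; `build_lizeur_config` is a hypothetical helper for illustration and is not part of Lizeur.

```python
import json


def build_lizeur_config(api_key: str, cache_path: str) -> dict:
    """Return an mcp.json-style dict for the lizeur server (hypothetical helper)."""
    return {
        "mcpServers": {
            "lizeur": {
                "command": "lizeur",
                "env": {
                    "MISTRAL_API_KEY": api_key,
                    "CACHE_PATH": cache_path,
                },
            }
        }
    }


if __name__ == "__main__":
    config = build_lizeur_config("your-mistral-api-key-here", "your cache path")
    # json.dumps always emits syntactically valid JSON (no trailing commas).
    print(json.dumps(config, indent=2))
```

The printed output can be pasted into mcp.json, or merged into an existing `mcpServers` object if other servers are already configured.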
Usage
Once configured, the MCP server provides a read_pdf tool that can be used by AI assistants:
- Function: read_pdf
- Parameter: absolute_path (string) - The absolute path to the PDF file
- Returns: Markdown text extracted from the first page of the PDF
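For orientation, MCP clients invoke tools by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message for `read_pdf`. `make_read_pdf_request` is a hypothetical helper written for this example; in practice the AI assistant's MCP client constructs and sends the message for you.

```python
import json
import os


def make_read_pdf_request(absolute_path: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 'tools/call' message for read_pdf (illustrative only)."""
    # The tool expects an absolute path, so reject relative ones early.
    if not os.path.isabs(absolute_path):
        raise ValueError(f"expected an absolute path, got {absolute_path!r}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "read_pdf",
            "arguments": {"absolute_path": absolute_path},
        },
    }


if __name__ == "__main__":
    request = make_read_pdf_request(os.path.abspath("document.pdf"))
    print(json.dumps(request, indent=2))
```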
Example Usage in AI Assistant
The AI assistant can now use the tool like this:
What does the OP code look like for this specific controller? Here is the doc: /path/to/document.pdf
The MCP server will:
- Check if the document is already cached
- If not cached, upload the PDF to Mistral AI for OCR processing. This uses your Mistral API key and incurs API costs.
- Extract the text and convert it to markdown
- Cache the result for future use
- Return the markdown content
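The steps above follow a check-then-compute caching pattern, which can be sketched as follows. This is not Lizeur's actual implementation: the cache key (a SHA-256 hash of the file contents), the `.md` file layout, and the names `read_pdf_cached` and `run_ocr` are all assumptions made for illustration, and the OCR call is replaced by a stub so the example runs without an API key.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def read_pdf_cached(pdf_path: Path, cache_dir: Path,
                    run_ocr: Callable[[Path], str]) -> str:
    """Return markdown for pdf_path, calling run_ocr only on a cache miss."""
    # Assumption: key the cache on the file contents, so a renamed copy of
    # the same PDF still hits the cache.
    digest = hashlib.sha256(pdf_path.read_bytes()).hexdigest()
    cache_file = cache_dir / f"{digest}.md"

    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")

    markdown = run_ocr(pdf_path)  # the slow, billable step
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(markdown, encoding="utf-8")
    return markdown


if __name__ == "__main__":
    def fake_ocr(path: Path) -> str:
        # Stand-in for the real OCR request.
        print(f"OCR called for {path.name}")
        return "# Extracted text"

    with tempfile.TemporaryDirectory() as tmp:
        pdf = Path(tmp) / "document.pdf"
        pdf.write_bytes(b"%PDF-1.4 placeholder")
        cache = Path(tmp) / "cache"
        read_pdf_cached(pdf, cache, fake_ocr)  # cache miss: runs fake_ocr
        read_pdf_cached(pdf, cache, fake_ocr)  # cache hit: fake_ocr not called
```

Whatever the real key scheme is, the point of the pattern is the same: the paid OCR request happens at most once per document.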
Development
Local Development Setup
# Install in development mode
uv pip install -e .
# Run the server directly
python main.py
Project Structure
- main.py - Main server implementation with FastMCP integration
- pyproject.toml - Project configuration and dependencies
- uv.lock - Locked dependency versions
Dependencies
- mcp[cli]>=1.12.4 - Model Context Protocol implementation
- mistralai>=0.0.10 - Mistral AI Python client
License
This project is licensed under the MIT License.
Support
For issues and questions, please refer to the project repository or contact the maintainers.
File details
Details for the file lizeur-0.1.1.tar.gz.
File metadata
- Download URL: lizeur-0.1.1.tar.gz
- Upload date:
- Size: 36.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d1d3fce4c5ca3f8e02fcc1843cf1823f4923e8ac046a53e178a2f42229a45344 |
| MD5 | c44fd707f396969ca163f8666a035c7e |
| BLAKE2b-256 | 56ceeb75c829e588db96f45bacfca4e2f867870d49df2208a7b0d9f6a9301e3c |
File details
Details for the file lizeur-0.1.1-py3-none-any.whl.
File metadata
- Download URL: lizeur-0.1.1-py3-none-any.whl
- Upload date:
- Size: 42.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c9d3d6e513dcbf07ac3c1bbc6377dbb97fdf723b9506fd5a9292c7dd1cb51212 |
| MD5 | a9e5fee4cf11436015e89b9cb7d10046 |
| BLAKE2b-256 | 86e32293960f039894896448a44e87f31e1fb911fa4c6eff92d56adafe818ea8 |