AI-powered PDF to HTML conversion using mistral-ocr and pixtral-12b
PDF2HTML AI
A Python package for converting PDF documents to accessible HTML using Mistral OCR and Pixtral 12B. This tool processes PDFs and generates WCAG-compliant HTML output with enhanced accessibility features.
Features
- Process local PDF files or PDFs downloaded from URLs
- OCR processing with Mistral OCR
- Generate accessible alt text for images using Pixtral 12B
- Convert to WCAG-compliant accessible HTML
- Enhance tables with proper accessibility features
- Save output as HTML file
- Option to open result in browser
Requirements
- Python 3.10+
- Mistral API key
- Required Python packages (automatically installed with pip):
  - mistralai
  - requests
  - python-dotenv
  - PyPDF2
Installation
Option 1: Install from PyPI (Recommended)
pip install pdf2html-ai
Option 2: Install from Repository
- Clone the repository:
  git clone https://github.com/mystique920/ai-powered-pdf2html
  cd ai-powered-pdf2html
- Install required packages:
  pip install -r requirements.txt
Quick Start
After installing the package, you can use it in your Python code or via the command line.
Using as a Python Package
from pdf2html_ai import process_pdf_with_ocr, convert_ocr_to_accessible_html
from mistralai import Mistral

# Initialize Mistral client
client = Mistral(api_key="your_api_key_here")

# Process a local PDF file
with open("document.pdf", "rb") as f:
    file_content = f.read()

ocr_result = process_pdf_with_ocr(client, file_content, "document.pdf")
html_content = convert_ocr_to_accessible_html(client, ocr_result)

# Save the HTML output
with open("output.html", "w", encoding="utf-8") as f:
    f.write(html_content)
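The example above reads a local file. For a PDF hosted at a URL, the bytes can be fetched first and then passed to process_pdf_with_ocr in the same way. The following is a minimal sketch, not the package's own download code: the helper names filename_from_url and download_pdf are hypothetical, and the standard library's urllib is used here only to keep the snippet self-contained.

```python
import posixpath
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url, default="document.pdf"):
    """Derive a file name from the URL path, falling back to a default."""
    name = posixpath.basename(urlparse(url).path)
    return name or default


def download_pdf(url, timeout=30):
    """Fetch a PDF over HTTP(S) and return (bytes, file name)."""
    with urlopen(url, timeout=timeout) as response:
        content = response.read()
    # PDF files start with the "%PDF" magic bytes; reject anything else early.
    if not content.startswith(b"%PDF"):
        raise ValueError(f"{url} does not look like a PDF")
    return content, filename_from_url(url)


# Usage (needs network access and the client created in the example above):
# file_content, name = download_pdf("https://example.com/document.pdf")
# ocr_result = process_pdf_with_ocr(client, file_content, name)
```

Checking the magic bytes before calling the API avoids spending OCR requests on an HTML error page returned in place of the PDF.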
Using the Command Line
Process a local PDF file:
python -m pdf2html_ai.processor --file path/to/your/document.pdf
Process a PDF from a URL:
python -m pdf2html_ai.processor --url https://example.com/document.pdf
Example Scripts
Several example scripts are provided to help you get started:
- examples/example.py - An interactive example that guides you through the options
- tests/test_local_pdf.py - A test script for processing a local PDF file
- tests/test_url_pdf.py - A test script for processing a PDF from a URL
To run the interactive example:
python examples/example.py
Command-line Options
- --file, -f: Path to local PDF file
- --url, -u: URL to PDF file
- --api-key, -k: Mistral API key
- --output, -o: Output HTML file path (default: output.html)
- --max-images, -m: Maximum number of images to process (default: all)
- --open-browser, -b: Open the output HTML in browser after processing
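To make the option list concrete, here is a sketch of an argparse parser that accepts the same flags and defaults. It is an illustration built from the list above, not the parser shipped in pdf2html_ai.processor; the function name build_parser and the choice of None to mean "all images" are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented command-line options."""
    parser = argparse.ArgumentParser(prog="pdf2html_ai.processor")
    parser.add_argument("--file", "-f", help="Path to local PDF file")
    parser.add_argument("--url", "-u", help="URL to PDF file")
    parser.add_argument("--api-key", "-k", help="Mistral API key")
    parser.add_argument("--output", "-o", default="output.html",
                        help="Output HTML file path")
    # Assumption: None stands for "process all images".
    parser.add_argument("--max-images", "-m", type=int, default=None,
                        help="Maximum number of images to process")
    parser.add_argument("--open-browser", "-b", action="store_true",
                        help="Open the output HTML in browser after processing")
    return parser


args = build_parser().parse_args(["--file", "document.pdf", "-m", "5"])
print(args.file, args.output, args.max_images, args.open_browser)
```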
Examples
Process a local file with a custom API key and open in browser:
python -m pdf2html_ai.processor --file document.pdf --api-key YOUR_API_KEY --open-browser
Process a PDF from URL and save to a custom output file:
python -m pdf2html_ai.processor --url https://example.com/document.pdf --output result.html
Process a file but limit image processing to 5 images:
python -m pdf2html_ai.processor --file document.pdf --max-images 5
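The same commands can be driven from a Python script, for example to convert every PDF in a folder. The sketch below only assembles and runs the command lines documented above; build_command and convert_folder are hypothetical helper names, and writing each output next to its source PDF is an assumption made for the example.

```python
import subprocess
import sys
from pathlib import Path


def build_command(pdf_path, output_path, max_images=None):
    """Assemble the documented CLI invocation as an argument list."""
    cmd = [sys.executable, "-m", "pdf2html_ai.processor",
           "--file", str(pdf_path), "--output", str(output_path)]
    if max_images is not None:
        cmd += ["--max-images", str(max_images)]
    return cmd


def convert_folder(folder, max_images=None):
    """Run the CLI once per PDF in `folder`; return the output paths."""
    outputs = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        out = pdf.with_suffix(".html")
        # check=True raises CalledProcessError if a conversion fails.
        subprocess.run(build_command(pdf, out, max_images), check=True)
        outputs.append(out)
    return outputs
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.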
API Key Setup
The script requires a valid Mistral API key to function. There are two ways to provide the API key:
- Create a .env file in your project directory with the following content:
  MISTRAL_API_KEY=your_api_key_here
- Provide the API key directly using the --api-key command-line argument:
  python -m pdf2html_ai.processor --file document.pdf --api-key your_api_key_here
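When using the package from your own code, the two sources of the key can be combined in a small helper. This is a sketch under stated assumptions, not the package's actual lookup logic: the name resolve_api_key is hypothetical, and giving an explicitly passed key priority over the environment is an assumed precedence. If python-dotenv is installed, calling its load_dotenv() beforehand copies the values from a .env file into the environment.

```python
import os

# Optional: populate os.environ from a .env file first.
# from dotenv import load_dotenv
# load_dotenv()


def resolve_api_key(cli_value=None, environ=None):
    """Return the API key from an explicit value or MISTRAL_API_KEY."""
    environ = os.environ if environ is None else environ
    # Assumption: an explicitly passed key wins over the environment.
    key = (cli_value or environ.get("MISTRAL_API_KEY", "")).strip()
    if not key:
        raise RuntimeError(
            "No Mistral API key found: set MISTRAL_API_KEY or pass --api-key"
        )
    return key
```

The returned value can then be passed to Mistral(api_key=...) as in the Quick Start example, which keeps the key out of source files.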
Notes
- Processing large PDFs may take some time
- Image alt text generation uses the Pixtral 12B model
- The HTML output is designed to be WCAG-compliant for accessibility
- You can limit the number of images processed to save API usage
- The code for this application is based on a public Google Colab notebook
- The original code is from this repository: https://github.com/coldplazma/Accessible-OCR-Mistral-
- This tool was mostly modified and extended using AI tools. Use it at your own risk.