AI-powered PDF-to-HTML conversion using mistral-ocr and pixtral-12b

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

Mistral OCR Processor

A command-line tool for processing PDF documents with Mistral OCR and generating accessible HTML output. This tool is adapted from a Google Colab notebook to run locally on your machine.

Quick Start

Several scripts are provided to help you get started:

mistral_ocr.py - The main script for processing PDFs
example.py - An interactive example that guides you through the options
test_local_pdf.py - A test script for processing a local PDF file
test_url_pdf.py - A test script for processing a PDF from a URL

To run the interactive example:

python example.py

To test with a local PDF:

python test_local_pdf.py

To test with a PDF from a URL:

python test_url_pdf.py

Features

Process local PDF files or download from URLs
OCR processing with Mistral OCR
Generate accessible alt text for images using Pixtral 12B
Convert to accessible HTML designed to be WCAG-compliant
Enhance tables with proper accessibility features
Save output as HTML file
Option to open result in browser
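
To illustrate how generated alt text might end up in the HTML output, here is a minimal sketch. It is not the tool's actual code: the function name `image_to_html` and the sample file name are hypothetical, and the alt text string stands in for whatever Pixtral 12B returns.

```python
from html import escape


def image_to_html(src: str, alt_text: str) -> str:
    """Return an <img> element whose attributes are safely escaped."""
    # Collapse whitespace so multi-line model output fits in one attribute.
    alt = " ".join(alt_text.split())
    return f'<img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}">'


# Hypothetical image name and alt text, for illustration only.
print(image_to_html("img-0.jpeg", 'Bar chart of "Q1" sales\nby region'))
```

Escaping the alt text matters because model output can contain quotes or angle brackets that would otherwise break the attribute.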

Requirements

Python 3.6+
Mistral API key
Required Python packages:
- mistralai
- requests

Installation

Clone the repository:

git clone https://github.com/mystique920/ai-powered-pdf2html
cd ai-powered-pdf2html

Ensure you have Python installed
Install required packages using pip:
```
pip install -r requirements.txt
```
Or install packages individually:
```
pip install mistralai requests
```
Make the script executable (optional):
```
chmod +x mistral_ocr.py
```

Usage

Basic Usage

Process a local PDF file:

python mistral_ocr.py --file path/to/your/document.pdf

Process a PDF from a URL:

python mistral_ocr.py --url https://example.com/document.pdf

Command-line Options

--file, -f: Path to local PDF file
--url, -u: URL to PDF file
--api-key, -k: Mistral API key
--output, -o: Output HTML file path (default: output.html)
--max-images, -m: Maximum number of images to process (default: all)
--open-browser, -b: Open the output HTML in browser after processing
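
The options above map naturally onto `argparse`. The following is a sketch of how such a parser could be declared, not the parser shipped in mistral_ocr.py. Treating `--file` and `--url` as mutually exclusive, and using `None` to represent "all images", are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented options."""
    parser = argparse.ArgumentParser(
        description="Process a PDF with Mistral OCR and write accessible HTML."
    )
    # Assumption: exactly one input source is given per run.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--file", "-f", help="Path to local PDF file")
    source.add_argument("--url", "-u", help="URL to PDF file")
    parser.add_argument("--api-key", "-k", help="Mistral API key")
    parser.add_argument("--output", "-o", default="output.html",
                        help="Output HTML file path")
    # Assumption: None stands for "process all images".
    parser.add_argument("--max-images", "-m", type=int, default=None,
                        help="Maximum number of images to process")
    parser.add_argument("--open-browser", "-b", action="store_true",
                        help="Open the output HTML in browser after processing")
    return parser


# Parse an explicit argument list instead of sys.argv, for illustration.
args = build_parser().parse_args(["--file", "document.pdf", "-m", "5", "-b"])
print(args.file, args.output, args.max_images, args.open_browser)
```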

Examples

Process a local file with a custom API key and open in browser:

python mistral_ocr.py --file document.pdf --api-key YOUR_API_KEY --open-browser

Process a PDF from URL and save to a custom output file:

python mistral_ocr.py --url https://example.com/document.pdf --output result.html

Process a file but limit image processing to 5 images:

python mistral_ocr.py --file document.pdf --max-images 5
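
One plausible way to apply an image cap like `--max-images` is to truncate the sequence of extracted images before any alt text requests are made. This is only a sketch: the helper name `limit_images` and the placeholder image identifiers are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def limit_images(images: Iterable[T], max_images: Optional[int] = None) -> Iterator[T]:
    """Yield at most max_images items; None means no limit."""
    return islice(images, max_images)


# Placeholder identifiers standing in for extracted images.
print(list(limit_images(["img-0", "img-1", "img-2"], 2)))
```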

API Key Setup

The script requires a valid Mistral API key to function. There are two ways to provide the API key:

Create a .env file in the same directory as the script with the following content:
```
MISTRAL_API_KEY=your_api_key_here
```

Provide the API key directly using the --api-key command-line argument:

python mistral_ocr.py --file document.pdf --api-key your_api_key_here
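
The two options can be combined into a single lookup. The sketch below uses only the standard library and is not taken from the script. The precedence order (command line first, then the process environment, then the .env file), the environment variable check itself, and the function names are all assumptions.

```python
import os
from pathlib import Path
from typing import Optional


def read_env_file(path: Path) -> dict:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values = {}
    if not path.is_file():
        return values
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_api_key(cli_key: Optional[str], env_path: Path = Path(".env")) -> str:
    """Return the key from the CLI argument, the environment, or the .env file."""
    key = (
        cli_key
        or os.environ.get("MISTRAL_API_KEY")
        or read_env_file(env_path).get("MISTRAL_API_KEY")
    )
    if not key:
        raise SystemExit("No Mistral API key found: use --api-key or a .env file.")
    return key


# An explicit --api-key value wins over the other sources in this sketch.
print(resolve_api_key("key_from_cli"))
```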

Notes

Processing large PDFs may take some time
Image alt text generation uses the Pixtral 12B model
The HTML output is designed to be WCAG-compliant for accessibility
You can limit the number of images processed to save API usage
The code for this application is based on a public Google Colab notebook
The original code is from this repository: https://github.com/coldplazma/Accessible-OCR-Mistral-

Release history

0.1.3 - Mar 14, 2025
0.1.1 - Mar 14, 2025
0.1.0 - Mar 14, 2025 (this version)

Download files


Source Distribution

pdf2html_ai-0.1.0.tar.gz (13.5 kB)

Uploaded Mar 14, 2025 Source

Built Distribution


pdf2html_ai-0.1.0-py3-none-any.whl (12.3 kB)

Uploaded Mar 14, 2025 Python 3

File details

Details for the file pdf2html_ai-0.1.0.tar.gz.

File metadata

Download URL: pdf2html_ai-0.1.0.tar.gz
Upload date: Mar 14, 2025
Size: 13.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for pdf2html_ai-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`2f901077a88d994144bc1a20d0786aee7a021a5f18ece1d6c84d356e4ff89fd3`
MD5	`44a07e0b04b81f0812751c4b2413796b`
BLAKE2b-256	`b19f7e55d16b226d26b4e62b06d5b443579b146daec4c869647f52abfe80dc54`
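
A downloaded file can be checked against the published SHA256 digest with the standard library. The sketch below shows one way to do it. The function names are hypothetical, and the usage comment assumes the archive was saved to the current directory.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare the computed digest with the published one, ignoring case."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


# Usage, assuming the archive is in the current directory:
# matches_published_hash(
#     Path("pdf2html_ai-0.1.0.tar.gz"),
#     "2f901077a88d994144bc1a20d0786aee7a021a5f18ece1d6c84d356e4ff89fd3",
# )
```

Reading in chunks keeps memory use flat regardless of file size, which matters little for a 13.5 kB archive but makes the helper reusable.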


File details

Details for the file pdf2html_ai-0.1.0-py3-none-any.whl.

File metadata

Download URL: pdf2html_ai-0.1.0-py3-none-any.whl
Upload date: Mar 14, 2025
Size: 12.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for pdf2html_ai-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a38561e428e441cc02c0d9474018273ded7e3192f10c1d7fc1cc21ea25157f61`
MD5	`d937f130be10845b044586fb1fb0efb7`
BLAKE2b-256	`1be3f89c9422e226c30f63175f34fd0e3086f3af7abcf1f866ccc03399c352f7`


pdf2html-ai 0.1.0
