Convert handwritten PDF notes to text using OCR and LLM
Project description
Hand2Text
A Python package that converts handwritten PDF notes to text using OCR and AI.
Overview
Hand2Text helps you convert your handwritten PDF notes into editable text. It's designed for students, researchers, or anyone who takes handwritten notes and wants to digitize them.
The process is straightforward:
- PDF to Images: Breaks down your PDF into individual page images
- Text Extraction: Uses AI to read your handwriting and convert it to text
- Vision AI First: OpenAI's latest models can read handwriting directly from images
- OCR Backup: Falls back to traditional OCR + AI cleanup if needed
Installation
Quick Install
pip install hand2text
Prerequisites
- Python 3.10 or higher
- Tesseract OCR (required for fallback method)
- OpenAI API key
Setup
-
Install the package:
pip install hand2text
-
Get an OpenAI API key from https://platform.openai.com/api-keys
-
Create a
.envfile in your working directory:OPENAI_API_KEY=your_key_here TESSERACT_PATH=C:\Program Files\Tesseract-OCR\tesseract.exe # Only needed on Windows for OCR fallback
Usage
Command Line
Processing a PDF is as simple as:
hand2text path/to/your/notes.pdf
This creates a text folder with your converted notes. The images are cleaned up automatically, so you just get the text files you care about.
Python API
If you want to integrate this into your own code:
from hand2text import main
# Process with default output folders
main("path/to/your/notes.pdf")
# Or specify where you want the output
main("notes.pdf", "temp_images", "my_text_output")
How It Works
PDF to Image Conversion
Uses PyMuPDF to convert each page of the PDF to a PNG image.
Text Extraction
Hand2Text tries to be smart about extracting your handwritten text:
Primary Method: Vision AI
First, it sends your handwritten pages directly to OpenAI's vision models (like GPT-4o). These models have gotten surprisingly good at reading handwriting - often better than traditional OCR.
Fallback Method: OCR + AI Cleanup
If the vision models aren't available or fail, Hand2Text falls back to:
- OCR: Uses Tesseract to scan the text (with some image preprocessing to help it out)
- AI Cleanup: Sends the messy OCR output to GPT-3.5 to fix obvious mistakes and clean things up
Example Output
$ hand2text lecture_notes.pdf
[MAIN] Starting pipeline with lecture_notes.pdf -> lecture_notes_images -> lecture_notes_text
[MAIN] Finished PDF to images. Listing images...
[MAIN] Found images: ['page_1.png', 'page_2.png', 'page_3.png']
[VISION] Trying model: gpt-4o
[VISION] Successfully used model: gpt-4o
[MAIN] Saved transcribed text to lecture_notes_text/page_1.txt
...
[COMBINE] Combined 3 text files into lecture_notes_text/lecture_notes_combined.txt
What You Need
- OpenAI API Key: This does the heavy lifting for reading your handwriting
- Tesseract OCR: Optional backup if you want the OCR fallback (most people won't need this)
- Python 3.10+: Any recent Python version will work
Development
Setting up for development
- Clone the repository
- Install dependencies:
poetry install - Install pre-commit hooks:
poetry run pre-commit install
Code quality
This project uses modern Python tooling:
- Ruff: Fast linting and formatting
- MyPy: Type checking
- Pre-commit: Automatic checks before commits
Run checks manually:
poetry run ruff check hand2text/ # Linting
poetry run ruff format hand2text/ # Formatting
poetry run mypy hand2text/ # Type checking
Contributing
- Fork the repository
- Create a feature branch
- Make your changes (linting runs automatically on commit)
- Submit a pull request
License
MIT License - see LICENSE file for details.
Links
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file hand2text-0.1.2.tar.gz.
File metadata
- Download URL: hand2text-0.1.2.tar.gz
- Upload date:
- Size: 14.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.2 CPython/3.12.3 Windows/11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
4c2ab5681eac93d5e4449e5abd9781e592104c3f48c0b8412ecf627b783c1dc7
|
|
| MD5 |
e5e127f523d0c2ef0ed559a048f9070c
|
|
| BLAKE2b-256 |
68ef6f1ecb1429181bc705fc4cffa47a4679f5145901137c503ac99b33cd0dcc
|
File details
Details for the file hand2text-0.1.2-py3-none-any.whl.
File metadata
- Download URL: hand2text-0.1.2-py3-none-any.whl
- Upload date:
- Size: 11.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.8.2 CPython/3.12.3 Windows/11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9b1e96cec4a5844e7aabc4d8304bd413d4398cd125cb34d9462eea439dc5945b
|
|
| MD5 |
a9bb1b0549cf20ac3fcf57705fb77a34
|
|
| BLAKE2b-256 |
8780626706297956098417757a1fa8e6f916059a43e72d7ec5383eef591a414f
|