Skip to main content

Convert handwritten PDF notes to text using OCR and LLM

Project description

Hand2Text

A Python package that converts handwritten PDF notes to text using OCR and AI.

PyPI version Python 3.10+

Overview

Hand2Text helps you convert your handwritten PDF notes into editable text. It's designed for students, researchers, or anyone who takes handwritten notes and wants to digitize them.

The process is straightforward:

  1. PDF to Images: Breaks down your PDF into individual page images
  2. Text Extraction: Uses AI to read your handwriting and convert it to text
    • Vision AI First: OpenAI's latest models can read handwriting directly from images
    • OCR Backup: Falls back to traditional OCR + AI cleanup if needed

Installation

Quick Install

pip install hand2text

Prerequisites

  • Python 3.10 or higher
  • Tesseract OCR (required for fallback method)
  • OpenAI API key

Setup

  1. Install the package:

    pip install hand2text
    
  2. Get an OpenAI API key from https://platform.openai.com/api-keys

  3. Create a .env file in your working directory:

    OPENAI_API_KEY=your_key_here
    TESSERACT_PATH=C:\Program Files\Tesseract-OCR\tesseract.exe  # Only needed on Windows for OCR fallback
    

Usage

Command Line

Processing a PDF is as simple as:

hand2text path/to/your/notes.pdf

This creates a text folder with your converted notes. The images are cleaned up automatically, so you just get the text files you care about.

Python API

If you want to integrate this into your own code:

from hand2text import main

# Process with default output folders
main("path/to/your/notes.pdf")

# Or specify where you want the output
main("notes.pdf", "temp_images", "my_text_output")

How It Works

PDF to Image Conversion

Uses PyMuPDF to convert each page of the PDF to a PNG image.

Text Extraction

Hand2Text tries to be smart about extracting your handwritten text:

Primary Method: Vision AI

First, it sends your handwritten pages directly to OpenAI's vision models (like GPT-4o). These models have gotten surprisingly good at reading handwriting - often better than traditional OCR.

Fallback Method: OCR + AI Cleanup

If the vision models aren't available or fail, Hand2Text falls back to:

  1. OCR: Uses Tesseract to scan the text (with some image preprocessing to help it out)
  2. AI Cleanup: Sends the messy OCR output to GPT-3.5 to fix obvious mistakes and clean things up

Example Output

$ hand2text lecture_notes.pdf
[MAIN] Starting pipeline with lecture_notes.pdf -> lecture_notes_images -> lecture_notes_text
[MAIN] Finished PDF to images. Listing images...
[MAIN] Found images: ['page_1.png', 'page_2.png', 'page_3.png']
[VISION] Trying model: gpt-4o
[VISION] Successfully used model: gpt-4o
[MAIN] Saved transcribed text to lecture_notes_text/page_1.txt
...
[COMBINE] Combined 3 text files into lecture_notes_text/lecture_notes_combined.txt

What You Need

  • OpenAI API Key: This does the heavy lifting for reading your handwriting
  • Tesseract OCR: Optional backup if you want the OCR fallback (most people won't need this)
  • Python 3.10+: Any recent Python version will work

Development

Setting up for development

  1. Clone the repository
  2. Install dependencies:
    poetry install
    
  3. Install pre-commit hooks:
    poetry run pre-commit install
    

Code quality

This project uses modern Python tooling:

  • Ruff: Fast linting and formatting
  • MyPy: Type checking
  • Pre-commit: Automatic checks before commits

Run checks manually:

poetry run ruff check hand2text/     # Linting
poetry run ruff format hand2text/    # Formatting
poetry run mypy hand2text/           # Type checking

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes (linting runs automatically on commit)
  4. Submit a pull request

License

MIT License - see LICENSE file for details.

Links

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

hand2text-0.1.2.tar.gz (14.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

hand2text-0.1.2-py3-none-any.whl (11.1 kB view details)

Uploaded Python 3

File details

Details for the file hand2text-0.1.2.tar.gz.

File metadata

  • Download URL: hand2text-0.1.2.tar.gz
  • Upload date:
  • Size: 14.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.2 CPython/3.12.3 Windows/11

File hashes

Hashes for hand2text-0.1.2.tar.gz
Algorithm Hash digest
SHA256 4c2ab5681eac93d5e4449e5abd9781e592104c3f48c0b8412ecf627b783c1dc7
MD5 e5e127f523d0c2ef0ed559a048f9070c
BLAKE2b-256 68ef6f1ecb1429181bc705fc4cffa47a4679f5145901137c503ac99b33cd0dcc

See more details on using hashes here.

File details

Details for the file hand2text-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: hand2text-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 11.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.2 CPython/3.12.3 Windows/11

File hashes

Hashes for hand2text-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 9b1e96cec4a5844e7aabc4d8304bd413d4398cd125cb34d9462eea439dc5945b
MD5 a9bb1b0549cf20ac3fcf57705fb77a34
BLAKE2b-256 8780626706297956098417757a1fa8e6f916059a43e72d7ec5383eef591a414f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page