py-reform: PDF & Image Dewarping Library
A Python library for dewarping/straightening/reformatting document images and PDFs.
Features
- Dewarp/straighten single images
- Process entire PDFs or selected pages
- Return PIL images for further processing
- Save results as images or PDFs
- Progress tracking with tqdm
- Flexible error handling
- Automatic EXIF orientation handling
- Multiple dewarping models
Installation
pip install py-reform
Quick Start
Process a Single Image
from py_reform import straighten
# Process a single image
straight_image = straighten("curved_page.jpg")
straight_image.save("straight_page.jpg")
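To apply the single-image call to a whole folder, a small `pathlib` loop is usually enough. The sketch below is not part of py-reform: `find_images`, `straighten_folder`, the `straight_` filename prefix, and the suffix set are hypothetical names chosen for illustration. It assumes only what the example above shows, namely that `straighten` accepts a path string and returns an object with a `.save()` method. The function is passed in as an argument so the loop can be tried out without the library installed.

```python
from pathlib import Path
from typing import Callable, Iterable

# Assumed set of image formats; adjust to whatever your scans use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_images(folder: Path, suffixes: Iterable[str] = IMAGE_SUFFIXES) -> list[Path]:
    """Return image files in `folder`, sorted by name, matching suffixes case-insensitively."""
    wanted = {s.lower() for s in suffixes}
    return sorted(p for p in folder.iterdir() if p.is_file() and p.suffix.lower() in wanted)


def straighten_folder(src: Path, dst: Path, straighten_fn: Callable) -> list[Path]:
    """Run `straighten_fn` on every image in `src` and save the results under `dst`."""
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for path in find_images(src):
        # Pass a string path, as in the single-image example.
        result = straighten_fn(str(path))
        out_path = dst / f"straight_{path.name}"
        result.save(out_path)
        written.append(out_path)
    return written
```

With py-reform installed, this would be called as `straighten_folder(Path("scans"), Path("straightened"), straighten)` after `from py_reform import straighten`.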
Process a PDF
from py_reform import straighten, save_pdf
# Process a PDF (all pages)
straight_pages = straighten("document.pdf")
# Save processed pages as a new PDF
save_pdf(straight_pages, "straight_document.pdf")
Process Specific PDF Pages
from py_reform import straighten
# Process specific PDF pages
straight_pages = straighten("document.pdf", pages=[0, 2, 5])
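The `pages=[0, 2, 5]` call suggests that page indices are zero-based, although this is an assumption rather than something stated here. Under that assumption, a small helper can turn a human-style page spec such as `"1-3,6"` into the list of indices to pass in. `parse_page_spec` is a hypothetical name and not part of py-reform.

```python
def parse_page_spec(spec: str) -> list[int]:
    """Convert a 1-based page spec such as "1-3,6" into sorted zero-based indices."""
    indices = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        # Shift from 1-based page numbers to 0-based indices.
        indices.update(range(start - 1, end))
    return sorted(indices)
```

For example, `parse_page_spec("1, 3, 6")` yields the same list as the call above, so it could be used as `straighten("document.pdf", pages=parse_page_spec("1, 3, 6"))`.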
Choose a Different Dewarping Model
By default we use UVDoc, which copes with a wide range of problematic images. If the image only needs to be rotated, use deskew instead.
from py_reform import straighten
# Use the rotation-based deskew model
straight_image = straighten("document.jpg", model="deskew")
# Use the UVDoc model with custom parameters
straight_image = straighten("document.jpg", model="uvdoc", device="cpu")
# Configure deskew model parameters
straight_image = straighten("document.jpg", model="deskew", max_angle=15.0, num_peaks=30)
Create Before/After Comparisons
from py_reform import straighten
from py_reform.utils import create_comparison
# Straighten the page first
straight_image = straighten("curved_page.jpg")
# Create a side-by-side comparison of the original and the result
comparison = create_comparison(["curved_page.jpg", straight_image])
comparison.save("comparison.jpg")
Error Handling
from py_reform import straighten
# Default: stop on the first error
result = straighten("document.pdf", errors="raise")
# Skip pages that fail and log a warning
result = straighten("document.pdf", errors="ignore")
# Keep the original page when processing fails and log a warning
result = straighten("document.pdf", errors="warn")
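The practical difference between the three modes is what ends up in the result list when one page fails. The sketch below illustrates that behaviour as described in the comments above. It is not py-reform's implementation: `process_pages`, its parameters, and the logger name are hypothetical.

```python
import logging
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
logger = logging.getLogger("reform_sketch")  # hypothetical logger name


def process_pages(
    pages: Iterable[T],
    transform: Callable[[T], T],
    errors: str = "raise",
) -> List[T]:
    """Apply `transform` to each page, handling failures according to `errors`."""
    if errors not in {"raise", "ignore", "warn"}:
        raise ValueError(f"unknown errors mode: {errors!r}")
    results = []
    for index, page in enumerate(pages):
        try:
            results.append(transform(page))
        except Exception as exc:
            if errors == "raise":
                raise  # stop at the first failure
            if errors == "ignore":
                # Drop the failing page from the output.
                logger.warning("Skipping page %d: %s", index, exc)
            else:
                # "warn": keep the unprocessed page in its place.
                logger.warning("Keeping original page %d: %s", index, exc)
                results.append(page)
    return results
```

Note that in this sketch `"ignore"` shortens the output, so positions no longer line up with the input pages, while `"warn"` preserves the page count.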
Working with Image Orientation
The library automatically handles EXIF orientation data in JPEG files, ensuring that images are correctly oriented before processing. You can also use these utilities directly:
from py_reform.utils import open_image, auto_rotate_image
import PIL.Image
# Open an image with automatic orientation correction
img = open_image("photo.jpg")
# Or correct orientation of an already opened image
img = PIL.Image.open("photo.jpg")
img = auto_rotate_image(img)
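For readers unfamiliar with the EXIF orientation tag: it takes a value from 1 to 8, and each value names the flip or rotation needed to display the stored pixels upright. The sketch below shows that mapping on a tiny grid of numbers using only the standard library. It illustrates the concept and makes no claim about how `auto_rotate_image` is implemented; all names in it are hypothetical. On real images, Pillow's `ImageOps.exif_transpose` applies the same corrections.

```python
from typing import Callable, Dict, List

Grid = List[List[int]]  # a toy "image": rows of pixel values


def _copy(g: Grid) -> Grid:
    return [row[:] for row in g]


def _mirror_lr(g: Grid) -> Grid:
    return [row[::-1] for row in g]


def _mirror_tb(g: Grid) -> Grid:
    return [row[:] for row in g[::-1]]


def _rotate_180(g: Grid) -> Grid:
    return [row[::-1] for row in g[::-1]]


def _transpose(g: Grid) -> Grid:
    return [list(col) for col in zip(*g)]


def _rotate_90_cw(g: Grid) -> Grid:
    return [list(col) for col in zip(*g[::-1])]


def _rotate_90_ccw(g: Grid) -> Grid:
    return [list(col) for col in zip(*g)][::-1]


def _transverse(g: Grid) -> Grid:
    # Flip across the anti-diagonal: a transpose followed by a 180-degree turn.
    return _rotate_180(_transpose(g))


# EXIF orientation value -> correction that makes the image upright.
ORIENTATION_FIXES: Dict[int, Callable[[Grid], Grid]] = {
    1: _copy,
    2: _mirror_lr,
    3: _rotate_180,
    4: _mirror_tb,
    5: _transpose,
    6: _rotate_90_cw,
    7: _transverse,
    8: _rotate_90_ccw,
}


def apply_orientation(grid: Grid, orientation: int) -> Grid:
    """Return a corrected copy of `grid`; unknown values leave it unchanged."""
    return ORIENTATION_FIXES.get(orientation, _copy)(grid)
```

Orientation 6 is the common case for phone photos taken in portrait mode: the stored pixels need a 90-degree clockwise turn before further processing.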
Available Models
Examples
Citation
The UVDoc model is based on original work by Floor Verhoeven, Tanguy Magne, and Olga Sorkine-Hornung. If you use py-reform with the UVDoc model, please consider citing their work:
@inproceedings{UVDoc,
title={{UVDoc}: Neural Grid-based Document Unwarping},
author={Floor Verhoeven and Tanguy Magne and Olga Sorkine-Hornung},
booktitle = {SIGGRAPH ASIA, Technical Papers},
year = {2023},
url={https://doi.org/10.1145/3610548.3618174}
}
Original UVDoc repository: https://github.com/tanguymagne/UVDoc/
Anything else??
I'm pretty sure I wrote about two lines of code for this, the rest was all Cursor and Claude 3.7 Sonnet. My job was mostly making demands around pathlib and ditching OpenCV.