A GUI tool for deskewing scanned PDF documents using PyQt6 and OpenCV
PDF Deskew Tool
Overview
PDF Deskew Tool is a graphical user interface (GUI) application that corrects skewed pages in scanned PDF documents. It uses PyMuPDF for PDF processing, OpenCV for image processing, and the deskew library for skew detection to process each page of a PDF and write a corrected version that is easier to read. The tool supports Chinese and English interfaces, theme switching, file drag-and-drop, and detailed progress feedback.
Features
- Multi-language Support: Supports both Chinese and English interfaces with easy language switching.
- Drag-and-Drop File Selection: Simply drag and drop your PDF files for easy selection.
- Batch Processing: Process multiple PDF files simultaneously to improve work efficiency.
- Real-time Progress Feedback: Display progress bars and percentages to track processing status.
- Theme Switching: Offers multiple interface themes for personalized appearance.
- Customizable Settings:
  - DPI Configuration: Customize rendering DPI to meet different quality requirements.
  - Background Color Selection: Choose or customize background colors to optimize correction results.
  - Image Enhancement: Remove watermarks, enhance contrast, denoise, and sharpen images.
- Logging: Records important information and errors during processing for debugging and user feedback.
- Intuitive Interface: User-friendly design with icons and tooltips for enhanced usability.
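To give a feel for what the DPI setting means in practice, the sketch below converts a page size into rendered pixel dimensions. PDF page sizes are measured in points (72 per inch), so the scale factor is `dpi / 72`. The function name `rendered_size` is hypothetical and is not part of pdf-deskew; it only illustrates the arithmetic.

```python
def rendered_size(width_pt: float, height_pt: float, dpi: int = 300) -> tuple[int, int]:
    """Return the (width, height) in pixels of a page rendered at the given DPI.

    Hypothetical helper for illustration; not part of pdf-deskew.
    """
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    zoom = dpi / 72  # PDF points are 1/72 of an inch
    return round(width_pt * zoom), round(height_pt * zoom)


# A US Letter page is 612 x 792 points.
print(rendered_size(612, 792, 300))  # (2550, 3300)
print(rendered_size(612, 792, 600))  # (5100, 6600)
```

Doubling the DPI doubles each dimension and therefore quadruples the number of pixels per page, which is worth keeping in mind for memory use and processing time on large documents.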
Installation
Recommended: Using uv
uv tool install pdf-deskew
This will automatically create two executable commands: pdf-deskew (GUI) and pdf-deskew-cli (CLI).
Alternative: Using pip
pip install pdf-deskew
From Source (Development)
git clone https://github.com/tinnci/pdf_deskew.git
cd pdf_deskew
# Create virtual environment
uv venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install in development mode
uv pip install -e .
Dependencies
The tool automatically installs the following dependencies:
- PyQt6 (>=6.7.1): GUI framework
- PyMuPDF (>=1.24.13): PDF processing
- OpenCV (>=4.10.0.84): Image processing
- Pillow (>=11.0.0): Image manipulation
- numpy (>=2.1.2): Numerical computing
- deskew (>=1.5.1): Skew detection
- qt-material (>=2.14): Theme support
- tqdm (>=4.66.6): Progress bars
Usage
GUI Application
Start the application:
pdf-deskew
Interface Guide:
- File Selection:
  - Input PDF: Click "Browse" button or drag-and-drop a PDF file
  - Output PDF: Specify save location (default: input_filename_deskewed.pdf)
- Processing Options:
  - Use Recommended Settings: DPI=300, white background
  - Custom Settings: Adjust DPI, background color, watermark removal, image enhancement
  - Image Processing:
    - Remove watermarks (Inpainting)
    - Enhance images (contrast, denoising, sharpening)
    - Convert to grayscale
- Language & Theme:
  - Switch between English and Chinese
  - Choose from multiple interface themes
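The background color option matters because rotating a page to undo skew leaves empty corners that have to be filled with something. The sketch below shows the geometry only, assuming the page is rotated about its center and the canvas is enlarged to hold the whole rotated page. Whether pdf-deskew enlarges the canvas or crops to the original size is not stated here, and `rotated_canvas_size` is a hypothetical name.

```python
import math


def rotated_canvas_size(width: int, height: int, angle_deg: float) -> tuple[int, int]:
    """Smallest canvas (in pixels) that holds a width x height image rotated by angle_deg.

    Hypothetical helper for illustration; not part of pdf-deskew.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = abs(math.cos(theta)), abs(math.sin(theta))
    # Round before ceil so floating-point noise (e.g. at 90 degrees)
    # does not add a spurious extra pixel.
    new_w = math.ceil(round(width * cos_t + height * sin_t, 6))
    new_h = math.ceil(round(width * sin_t + height * cos_t, 6))
    return new_w, new_h


# A Letter page at 300 DPI, corrected by a 2 degree rotation.
w, h = rotated_canvas_size(2550, 3300, 2.0)
print(w - 2550, h - 3300)  # extra pixels that need background fill
```

Even a small correction angle adds a visible border, so a fill color that matches the paper (white for most scans) keeps the corrected page looking clean.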
Command-Line Tool
View help:
pdf-deskew-cli --help
Basic usage:
# Simple conversion
pdf-deskew-cli input.pdf
# Specify output
pdf-deskew-cli input.pdf -o output.pdf
# Custom DPI
pdf-deskew-cli input.pdf -d 600
# With enhancements
pdf-deskew-cli input.pdf --enhance --remove-watermark
# Change background
pdf-deskew-cli input.pdf --bg-color black
Command-line Arguments:
- input: Input PDF file path (required)
- -o, --output: Output file path (default: input_deskewed.pdf)
- -d, --dpi: Rendering DPI, range 72-1200 (default: 300)
- --bg-color: Background color, white or black (default: white)
- --enhance: Enable image enhancement
- --remove-watermark: Enable watermark removal
- -v, --version: Show version number
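For readers who want to see how the arguments above fit together, here is a minimal `argparse` sketch that mirrors the documented interface. It is not the project's actual parser. The helper names (`dpi_type`, `build_parser`, `parse`), the version string, and the choice to place the default output next to the input file are all assumptions made for illustration.

```python
import argparse
from pathlib import Path


def dpi_type(value: str) -> int:
    """Parse a DPI value and enforce the documented 72-1200 range."""
    dpi = int(value)
    if not 72 <= dpi <= 1200:
        raise argparse.ArgumentTypeError("DPI must be between 72 and 1200")
    return dpi


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the flags listed in this README (illustrative only)."""
    parser = argparse.ArgumentParser(prog="pdf-deskew-cli")
    parser.add_argument("input", type=Path, help="Input PDF file path")
    parser.add_argument("-o", "--output", type=Path, default=None, help="Output file path")
    parser.add_argument("-d", "--dpi", type=dpi_type, default=300, help="Rendering DPI (72-1200)")
    parser.add_argument("--bg-color", choices=["white", "black"], default="white")
    parser.add_argument("--enhance", action="store_true", help="Enable image enhancement")
    parser.add_argument("--remove-watermark", action="store_true", help="Enable watermark removal")
    # The version string here is a placeholder for the sketch.
    parser.add_argument("-v", "--version", action="version", version="%(prog)s 0.1.1")
    return parser


def parse(argv: list[str]) -> argparse.Namespace:
    """Parse argv and fill in the default output name, <input stem>_deskewed.pdf."""
    args = build_parser().parse_args(argv)
    if args.output is None:
        # Assumption: the default output sits next to the input file.
        args.output = args.input.with_name(f"{args.input.stem}_deskewed.pdf")
    return args


args = parse(["scan.pdf", "-d", "600", "--enhance"])
print(args.output, args.dpi, args.bg_color, args.enhance, args.remove_watermark)
```

Validating the DPI range in a `type=` callable means an out-of-range value is rejected with a normal usage error before any processing starts.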
System Requirements
- Operating System: Windows, macOS, or Linux
- Python: 3.12 or higher
- Optional: uv package manager (recommended)
Notes
- Special Characters in Paths: If your file paths contain spaces or special characters, use quotes to avoid errors.
- Temporary Files: The application creates a temporary folder for intermediate images, which is automatically cleaned up after processing.
- Logging: Processing logs are recorded in pdf_deskew.log for debugging purposes.
- Theme Switching: Theme changes take effect immediately without requiring application restart.
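As an illustration of the note about special characters in paths, the sketch below builds a `pdf-deskew-cli` command as an argument list and prints a shell-safe version of it. It uses only the `-d` flag documented above; the helper name `build_command` is hypothetical, and `shlex.join` produces POSIX-style quoting, so the printed form applies to shells such as bash rather than to the Windows command prompt.

```python
import shlex
from pathlib import Path


def build_command(pdf: Path, dpi: int = 300) -> list[str]:
    """Return the CLI invocation as an argument list (hypothetical helper)."""
    return ["pdf-deskew-cli", str(pdf), "-d", str(dpi)]


cmd = build_command(Path("annual report (final).pdf"))

# Shell-safe form, suitable for pasting into a POSIX shell.
print(shlex.join(cmd))  # pdf-deskew-cli 'annual report (final).pdf' -d 300

# When calling from Python, pass the list itself so no quoting is needed:
#   subprocess.run(cmd, check=True)
```

Passing the argument list directly to `subprocess.run` bypasses the shell, so spaces and parentheses in file names need no escaping at all.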
Development
To contribute to this project:
- Clone the Repository:
  git clone https://github.com/tinnci/pdf_deskew.git
  cd pdf_deskew
- Set Up Environment:
  uv venv .venv
  source .venv/bin/activate # On Windows: .venv\Scripts\activate
  uv pip install -e .
- Run Tests:
  pytest
- Submit Changes:
  git add .
  git commit -m "Description of changes"
  git push origin your-branch
License
This project is licensed under the MIT License. You are free to use and modify it.
Support
For issues or questions:
- GitHub Issues: https://github.com/tinnci/pdf_deskew/issues
- Email: luoyido@outlook.com
Thank you for using PDF Deskew Tool! If you find it useful, please give us a ⭐ on GitHub and share it with others who might benefit.
File details
Details for the file pdf_deskew-0.1.1.tar.gz.
File metadata
- Download URL: pdf_deskew-0.1.1.tar.gz
- Upload date:
- Size: 24.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a9824d18142a8cc246bc711ecd3ee0b7a523d66abcf8f5ade84a879afb1970da |
| MD5 | 7dd41020df54fb48a026864df4bdc023 |
| BLAKE2b-256 | 78ee9841ec1fa6089044ef2dac420429dbc5afaae6d64aec65d09300474b54bd |
File details
Details for the file pdf_deskew-0.1.1-py3-none-any.whl.
File metadata
- Download URL: pdf_deskew-0.1.1-py3-none-any.whl
- Upload date:
- Size: 23.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a32ee0baf17f67ad89f805b224f7261892060e0d0fe076c61afc5c02d84eac73 |
| MD5 | 54ab6b3fed39c343f96a2e822c426f55 |
| BLAKE2b-256 | 1f8f3a0791d2331c0d3f019068d514c238ae1d5c9e19f4ac01916e81baa95c58 |