A Python module to automatically detect and correct the orientation of pages in PDF documents.
PDF Orientation Corrector
Overview
The PDF Orientation Corrector is a Python module designed for automatic detection and correction of the orientation of pages in PDF documents.
Key Features
- Automated Page Orientation Correction: Detects and corrects the orientation of text on each page of a PDF.
- Batch Processing: Enhances performance when dealing with large documents by processing pages in batches.
- Image Preprocessing: Preprocesses page images with PIL (Pillow) before OCR to improve the accuracy of the results.
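To illustrate the detection step, here is a minimal sketch of reading a rotation angle from Tesseract's orientation and script detection (OSD) report. It is not the module's own code: the helper name `parse_rotation` and the `min_confidence` parameter are hypothetical, and the sample report uses illustrative values in the plain-text format that OSD produces.

```python
import re

# Illustrative sample of the plain-text report produced by Tesseract's OSD mode.
# The values are made up for demonstration.
SAMPLE_OSD = """Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 3.04
Script: Latin
Script confidence: 1.11
"""


def parse_rotation(osd_text: str, min_confidence: float = 0.0) -> int:
    """Return the suggested rotation in degrees, or 0 if none can be trusted.

    Hypothetical helper: it reads the "Rotate" field and ignores the result
    when the reported orientation confidence is below min_confidence.
    """
    rotate = re.search(r"^Rotate:\s*(\d+)", osd_text, re.MULTILINE)
    confidence = re.search(
        r"^Orientation confidence:\s*([\d.]+)", osd_text, re.MULTILINE
    )
    if rotate is None:
        return 0  # no usable report, leave the page untouched
    if confidence is not None and float(confidence.group(1)) < min_confidence:
        return 0  # detection too uncertain to act on
    return int(rotate.group(1)) % 360


print(parse_rotation(SAMPLE_OSD))                      # 90
print(parse_rotation(SAMPLE_OSD, min_confidence=5.0))  # 0
```

With pytesseract installed, the report text would typically come from `pytesseract.image_to_osd(page_image)`. The direction in which the angle should be applied to the PDF page is worth verifying on a sample document.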
Prerequisites
Before you begin, ensure you have met the following requirements:
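- Python 3.x
- Libraries: PyPDF2, pytesseract, pdf2image, Pillow (PIL), and concurrent.futures (part of the standard library)
- System tools: the Tesseract OCR engine (required by pytesseract) and Poppler (required by pdf2image)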
Installation
Clone the repository or download the source code. Install the required dependencies via pip:
pip install PyPDF2 pytesseract pdf2image Pillow
Usage
The module can be used in two ways:
As a Script
Run the script from the command line, providing the necessary arguments:
python pdf_orientation_corrector.py input.pdf output.pdf --batch_size 20 --dpi 300 --verbose
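The `--batch_size` flag suggests that pages are handled in fixed-size groups. The sketch below shows one way such grouping could work, assuming 1-indexed inclusive page ranges (the convention used by the `first_page` and `last_page` arguments of pdf2image). The names `batch_ranges` and `process_in_batches` are hypothetical and are not taken from the module.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Tuple


def batch_ranges(total_pages: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (first_page, last_page) pairs, 1-indexed and inclusive."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for first in range(1, total_pages + 1, batch_size):
        yield first, min(first + batch_size - 1, total_pages)


def process_in_batches(
    total_pages: int,
    batch_size: int,
    worker: Callable[[int, int], object],
) -> List[object]:
    """Run worker(first_page, last_page) for every batch on a thread pool.

    Executor.map returns results in the order of its inputs, so the output
    list lines up with the page order of the document.
    """
    ranges = list(batch_ranges(total_pages, batch_size))
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda pages: worker(*pages), ranges))


print(list(batch_ranges(45, 20)))  # [(1, 20), (21, 40), (41, 45)]
```

Converting only one range of pages to images at a time keeps memory use bounded on large documents, which is the usual motivation for batching at 300 DPI.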
As a Library
Import the module in your Python script:
import pdf_orientation_corrector
# Use the module functions as needed
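The functions exported by the module are not listed here, so the following sketch drives the documented command line from Python instead. It assumes that `pdf_orientation_corrector.py` is in the working directory and accepts the flags shown in the script example; `build_command` and `correct_pdf` are hypothetical helper names, not part of the module.

```python
import subprocess
import sys
from typing import List


def build_command(
    input_pdf: str,
    output_pdf: str,
    batch_size: int = 20,
    dpi: int = 300,
    verbose: bool = False,
    script: str = "pdf_orientation_corrector.py",  # assumed script location
) -> List[str]:
    """Assemble the argument list for the documented command line."""
    cmd = [
        sys.executable, script, input_pdf, output_pdf,
        "--batch_size", str(batch_size),
        "--dpi", str(dpi),
    ]
    if verbose:
        cmd.append("--verbose")
    return cmd


def correct_pdf(input_pdf: str, output_pdf: str, **options) -> None:
    """Run the script and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_command(input_pdf, output_pdf, **options), check=True)


print(build_command("input.pdf", "output.pdf", verbose=True)[1:])
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.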
Author
James Standbridge
Email: james.standbridge.git@gmail.com