A Python module to automatically detect and correct the orientation of pages in PDF documents.

PDF Orientation Corrector

Overview

The PDF Orientation Corrector is a Python module that automatically detects and corrects the orientation of pages in PDF documents.

Key Features

  • Automated Page Orientation Correction: Detects the orientation of the text on each page of a PDF and rotates the page to correct it.
  • Batch Processing: Processes pages in batches to improve performance on large documents.
  • Image Preprocessing: Preprocesses page images with PIL (Pillow) to improve the accuracy of OCR results.
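As an illustration of how the detection and preprocessing features could fit together, here is a minimal sketch. It is not the module's actual implementation: the function names `parse_osd_rotation` and `detect_rotation` are hypothetical, and the choice of grayscale plus autocontrast as preprocessing is an assumption. It relies on `pytesseract.image_to_osd`, which returns Tesseract's orientation and script detection report as text, including a `Rotate:` line.

```python
import re


def parse_osd_rotation(osd_text):
    """Return the angle from the 'Rotate:' line of a Tesseract OSD report.

    Falls back to 0 when the report has no such line.
    """
    match = re.search(r"^Rotate:\s*(\d+)", osd_text, flags=re.MULTILINE)
    return int(match.group(1)) % 360 if match else 0


def detect_rotation(image):
    """Hypothetical helper: preprocess a PIL image, then ask Tesseract for its rotation."""
    # Imported here so the parser above stays usable without the OCR stack.
    import pytesseract
    from PIL import ImageOps

    gray = ImageOps.grayscale(image)    # drop colour information
    gray = ImageOps.autocontrast(gray)  # stretch contrast before OCR
    return parse_osd_rotation(pytesseract.image_to_osd(gray))
```

Keeping the text parsing separate from the OCR call makes the parsing step easy to check on its own, without Tesseract installed.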

Prerequisites

Before you begin, ensure you have met the following requirements:

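  • Python 3.x
  • Libraries: PyPDF2, pytesseract, pdf2image, Pillow (PIL), and concurrent.futures (part of the standard library)

Note that pytesseract and pdf2image are wrappers around external programs: they need the Tesseract OCR engine and Poppler, respectively, installed on the system.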

Installation

Clone the repository or download the source code. Install the required dependencies via pip:

pip install PyPDF2 pytesseract pdf2image Pillow
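After installing, it can help to confirm that everything is actually available. The sketch below is an optional check, not part of the module; the helper names are hypothetical. It also looks for the `tesseract` and `pdftoppm` executables, on the assumption that the system-level Tesseract and Poppler installs expose them on `PATH`.

```python
import importlib.util
import shutil

# Import names of the pip packages listed above (Pillow is imported as PIL).
PYTHON_MODULES = ["PyPDF2", "pytesseract", "pdf2image", "PIL"]
# Executables assumed to be provided by Tesseract and Poppler.
EXTERNAL_TOOLS = ["tesseract", "pdftoppm"]


def missing_modules(names=PYTHON_MODULES):
    """Return the module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_tools(names=EXTERNAL_TOOLS):
    """Return the executables that are not on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    print("Missing Python modules:", missing_modules() or "none")
    print("Missing external tools:", missing_tools() or "none")
```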

Usage

The module can be used in two ways:

As a Script

Run the script from the command line, providing the necessary arguments:

python pdf_orientation_corrector.py input.pdf output.pdf --batch_size 20 --dpi 300 --verbose
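The README does not say how `--batch_size` is applied internally. One plausible reading is that the page range is split into consecutive chunks that are rendered one chunk at a time, which keeps memory use bounded. The sketch below illustrates that idea only; `page_batches` and `render_batches` are hypothetical names. It uses the `first_page` and `last_page` parameters of `pdf2image.convert_from_path`, which are 1-based and inclusive.

```python
def page_batches(total_pages, batch_size):
    """Yield (first_page, last_page) pairs, 1-based and inclusive."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for first in range(1, total_pages + 1, batch_size):
        yield first, min(first + batch_size - 1, total_pages)


def render_batches(pdf_path, total_pages, batch_size=20, dpi=300):
    """Hypothetical helper: render a PDF to images one batch at a time."""
    # Imported here so page_batches can be used without pdf2image installed.
    from pdf2image import convert_from_path

    for first, last in page_batches(total_pages, batch_size):
        yield convert_from_path(pdf_path, dpi=dpi, first_page=first, last_page=last)
```

With the values from the command above, a 45-page document would be split into pages 1 to 20, 21 to 40, and 41 to 45.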

As a Library

Import the module in your Python script:

import pdf_orientation_corrector
# Use the module functions as needed
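The module's function names are not documented here, so the sketch below does not call them. Instead it drives the documented command-line interface from Python with `subprocess`. The helper names `build_command` and `correct_pdf` are hypothetical, the script path is assumed to be relative to the working directory, and the default values are simply copied from the example command rather than taken from the module.

```python
import subprocess
import sys


def build_command(input_pdf, output_pdf, batch_size=20, dpi=300, verbose=False,
                  script="pdf_orientation_corrector.py"):
    """Build the argument list for the documented command-line call."""
    cmd = [
        sys.executable, script, str(input_pdf), str(output_pdf),
        "--batch_size", str(batch_size),
        "--dpi", str(dpi),
    ]
    if verbose:
        cmd.append("--verbose")
    return cmd


def correct_pdf(input_pdf, output_pdf, **options):
    """Run the script and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_command(input_pdf, output_pdf, **options), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.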

Author

James Standbridge
Email: james.standbridge.git@gmail.com
