A GUI tool for deskewing scanned PDF documents using PyQt6 and OpenCV

PDF Deskew Tool


English

Overview

PDF Deskew Tool is a GUI and CLI application that automatically detects and corrects skewed pages in scanned PDF documents. It uses PyMuPDF and OpenCV for image processing, so that corrected pages come out straight and easy to read.

Key Features

  • Smart Deskewing: Automatically detects and corrects rotation angles.
  • Batch Processing: Handle multiple PDF files simultaneously.
  • Image Enhancement:
    • Watermark Removal: Advanced inpainting to clean up documents.
    • Quality Boost: Contrast enhancement, denoising, and sharpening.
    • Grayscale Conversion: Reduce file size and improve clarity.
  • User Friendly:
    • Modern GUI: Built with PyQt6 and Material Design themes.
    • Drag & Drop: Easy file selection.
    • Bilingual: Full support for English and Chinese.
  • Flexible CLI: Robust command-line interface for automation and power users.
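
The Smart Deskewing feature comes down to estimating a page's skew angle and rotating by the opposite amount. The package's own detection code is not shown here, so the following is only a minimal, standard-library sketch of one common technique, projection-profile scoring, run on synthetic points. The function names and the synthetic page are hypothetical and are not part of pdf-deskew.

```python
import math


def projection_score(points, angle_deg):
    """Score how well rows line up after undoing a candidate skew angle."""
    theta = math.radians(angle_deg)
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    row_counts = {}
    for x, y in points:
        # Rotate the point by -angle and keep only its vertical coordinate.
        row = round(y * cos_t - x * sin_t)
        row_counts[row] = row_counts.get(row, 0) + 1
    # Straight text lines pile their pixels into few rows, which
    # maximizes the sum of squared row counts.
    return sum(count * count for count in row_counts.values())


def estimate_skew(points, max_angle=10.0, step=0.5):
    """Return the candidate angle (degrees) with the sharpest row profile."""
    steps = int(max_angle / step)
    candidates = [i * step for i in range(-steps, steps + 1)]
    return max(candidates, key=lambda angle: projection_score(points, angle))


def synthetic_page(skew_deg, baselines=(20, 60, 100, 140), width=200):
    """Hypothetical test input: text baselines tilted by skew_deg."""
    slope = math.tan(math.radians(skew_deg))
    return [(x, y0 + x * slope) for y0 in baselines for x in range(width)]


page = synthetic_page(3.0)
detected = estimate_skew(page)
print(f"estimated skew: {detected} degrees")
```

A real page would first be rendered to an image and thresholded to obtain the foreground pixel coordinates; a coarse grid search like this one trades precision for simplicity.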

Installation

Recommended: Using uv

uv tool install pdf-deskew

Using pip

pip install pdf-deskew

Usage

GUI Application

Run the following command to open the GUI:

pdf-deskew

Command Line Interface (CLI)

# Basic usage (output saved as input_deskewed.pdf)
pdf-deskew-cli input.pdf

# Custom output and DPI
pdf-deskew-cli input.pdf -o output.pdf -d 600

# Enable all enhancements
pdf-deskew-cli input.pdf --enhance --remove-watermark
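
For automation, the CLI can be driven from a short script. The sketch below uses only the flags shown above (`-o`, `-d`, `--enhance`, `--remove-watermark`) and mirrors the `input_deskewed.pdf` naming by passing `-o` explicitly. The helper names and the `scans` folder are hypothetical, and placing the output next to the input file is an assumption.

```python
import subprocess
from pathlib import Path


def default_output(pdf: Path) -> Path:
    """Mirror the documented naming: input.pdf -> input_deskewed.pdf."""
    return pdf.with_name(f"{pdf.stem}_deskewed{pdf.suffix}")


def build_command(pdf, dpi=None, enhance=False, remove_watermark=False):
    """Build a pdf-deskew-cli argument list for one file."""
    pdf = Path(pdf)
    cmd = ["pdf-deskew-cli", str(pdf), "-o", str(default_output(pdf))]
    if dpi is not None:
        cmd += ["-d", str(dpi)]
    if enhance:
        cmd.append("--enhance")
    if remove_watermark:
        cmd.append("--remove-watermark")
    return cmd


def deskew_folder(folder, **options):
    """Run the CLI once per PDF in a folder, skipping earlier outputs."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        if pdf.stem.endswith("_deskewed"):
            continue
        subprocess.run(build_command(pdf, **options), check=True)


if __name__ == "__main__":
    deskew_folder("scans", dpi=600, enhance=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces, and `check=True` stops the batch at the first failing file.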

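Simplified Chinese

Overview

PDF Deskew Tool is a graphical (GUI) and command-line (CLI) application for automatically detecting and correcting page skew in scanned PDF documents. By combining PyMuPDF and OpenCV, it provides high-quality image processing so that your documents come out neatly aligned and easy to read.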

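Key Features

  • Smart Deskewing: Automatically detects and corrects the page rotation angle.
  • Batch Processing: Process multiple PDF files at once for better efficiency.
  • Image Enhancement:
    • Watermark Removal: Uses an advanced inpainting algorithm to clean up the document background.
    • Quality Boost: Contrast enhancement, denoising, and sharpening.
    • Grayscale Conversion: Reduces file size and improves text clarity.
  • User Friendly:
    • Modern Interface: Built with PyQt6 and Material Design themes.
    • Drag & Drop: Drag files directly into the window for processing.
    • Bilingual: Full switching between the Chinese and English interfaces.
  • Flexible CLI: Robust CLI support for power users and automation scripts.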

Installation

Recommended: Using uv

uv tool install pdf-deskew

Using pip

pip install pdf-deskew

Usage

GUI Application

Run the following command to start the program:

pdf-deskew

Command Line Interface (CLI)

# Basic usage (output defaults to input_deskewed.pdf)
pdf-deskew-cli input.pdf

# Specify the output path and DPI
pdf-deskew-cli input.pdf -o output.pdf -d 600

# Enable all enhancements
pdf-deskew-cli input.pdf --enhance --remove-watermark

System Requirements

  • OS: Windows, macOS, or Linux
  • Python: 3.12 or higher
  • Dependencies: PyQt6, PyMuPDF, OpenCV, Pillow, numpy, deskew, qt-material, tqdm
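
Before installing, it can help to confirm that the interpreter meets the Python 3.12 requirement and to see which of the listed dependencies are already importable. This is a small sketch, not part of pdf-deskew; the import names in the mapping (for example `fitz` for PyMuPDF and `cv2` for OpenCV) are assumptions based on each package's usual module name.

```python
import sys
from importlib.util import find_spec

MIN_PYTHON = (3, 12)

# Distribution name -> module name (assumed usual import names).
IMPORT_NAMES = {
    "PyQt6": "PyQt6",
    "PyMuPDF": "fitz",
    "OpenCV": "cv2",
    "Pillow": "PIL",
    "numpy": "numpy",
    "deskew": "deskew",
    "qt-material": "qt_material",
    "tqdm": "tqdm",
}


def python_ok(version=sys.version_info):
    """True if the (major, minor) version meets the stated minimum."""
    return tuple(version[:2]) >= MIN_PYTHON


def missing_packages():
    """List the dependencies whose modules cannot be found."""
    return [pkg for pkg, module in IMPORT_NAMES.items() if find_spec(module) is None]


if __name__ == "__main__":
    print("Python OK:", python_ok())
    print("Missing:", ", ".join(missing_packages()) or "none")
```

Installing with `uv tool install` or `pip install` normally pulls in the dependencies automatically, so a non-empty "Missing" list mainly matters for source checkouts.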

Development

git clone https://github.com/tinnci/pdf_deskew.git
cd pdf_deskew
uv venv
# Windows: .venv\Scripts\activate | Linux/macOS: source .venv/bin/activate
uv pip install -e .
pytest

License

This project is licensed under the MIT License.

Support

Download files

Source Distribution

pdf_deskew-0.1.6.tar.gz (24.0 kB)

Uploaded Source

Built Distribution

pdf_deskew-0.1.6-py3-none-any.whl (23.8 kB)

Uploaded Python 3

File details

Details for the file pdf_deskew-0.1.6.tar.gz.

File metadata

  • Download URL: pdf_deskew-0.1.6.tar.gz
  • Upload date:
  • Size: 24.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pdf_deskew-0.1.6.tar.gz
Algorithm    Hash digest
SHA256       828d4d39981b93383f47ae8b21ec1eacc6ab412dec31f9512b7fa70cdce85b07
MD5          64ca9b19fba5cc5ad0e50cf1876e7de1
BLAKE2b-256  4f6c0956112c375eab583f85754e2f8f0790a1d5dcc4e8ad9eb8da7ca0e2b8a0
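
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's standard `hashlib`; the helper names are hypothetical, and the local file path in the usage section assumes the archive was saved to the current directory.

```python
import hashlib

# SHA256 digest listed above for pdf_deskew-0.1.6.tar.gz.
EXPECTED_SDIST_SHA256 = (
    "828d4d39981b93383f47ae8b21ec1eacc6ab412dec31f9512b7fa70cdce85b07"
)


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Compare a file's SHA256 with an expected hex digest, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed location of the downloaded source distribution.
    ok = verify("pdf_deskew-0.1.6.tar.gz", EXPECTED_SDIST_SHA256)
    print("hash matches" if ok else "hash mismatch")
```

SHA256 is the digest to rely on here; MD5 is listed for completeness but is not suitable for integrity checks against tampering.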

Provenance

The following attestation bundles were made for pdf_deskew-0.1.6.tar.gz:

Publisher: release.yml on Tinnci/pdf_deskew

File details

Details for the file pdf_deskew-0.1.6-py3-none-any.whl.

File metadata

  • Download URL: pdf_deskew-0.1.6-py3-none-any.whl
  • Upload date:
  • Size: 23.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for pdf_deskew-0.1.6-py3-none-any.whl
Algorithm    Hash digest
SHA256       717bbbc7a3f840c1158462ce3bc31a94fdc8dbf9051e59357231b47bdb59e54f
MD5          297b165a7ec6e84a6f5a30e85c649c25
BLAKE2b-256  45ecb67b19f6710d589f0f508bdbb79ebb45ed555238b8190f3470ca01c73722

Provenance

The following attestation bundles were made for pdf_deskew-0.1.6-py3-none-any.whl:

Publisher: release.yml on Tinnci/pdf_deskew
