
CLI wrapper for processing multiple PDF files in parallel with Ghostscript (compression, PDF/A conversion, etc.)


gs-batch-pdf

gs-batch-pdf is a command-line tool for batch (parallel) processing PDF files using Ghostscript, applying the same set of gs options to all files specified while taking care of file renaming.

It offers convenient default settings for compression and PDF/A conversion, and it also lets you pass any custom Ghostscript options.

Features

  • Batch process multiple PDF files
  • Compress PDFs with various quality settings
  • Convert PDFs to PDF/A format[^1]
  • Apply custom Ghostscript options
  • Multi-threaded processing for improved performance
  • Progress tracking with tqdm
  • Automatic file renaming with customizable prefixes and suffixes
  • Option to keep either the smaller file or the new file after processing
  • Cross-platform support (Windows, Linux, macOS)

[^1]: Ghostscript 10.04.0 or higher is required for correct PDF/A level 2 or 3 conversion.
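The batch and multi-threaded features listed above can be pictured with a small thread-pool sketch. This is not the tool's actual implementation: `process_all` and the `worker` callable are hypothetical names, and progress reporting (which the tool does with tqdm) is left out to keep the example dependency-free.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path


def process_all(files, worker, max_workers=4):
    """Run `worker(path)` for every file in a thread pool and collect the results.

    Threads are a reasonable fit because the heavy work would happen in
    external Ghostscript processes rather than in Python code.
    Returns a dict mapping each input path to the worker's return value,
    or to the exception it raised.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Remember which future belongs to which input file.
        futures = {pool.submit(worker, Path(f)): Path(f) for f in files}
        for future in as_completed(futures):
            src = futures[future]
            try:
                results[src] = future.result()
            except Exception as exc:
                # One failing file should not abort the whole batch.
                results[src] = exc
    return results
```

In a real run, `worker` would be a function that invokes Ghostscript on one file and returns the output path.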

Installation

To install gs-batch-pdf, make sure you have Python 3.12+ and pipx[^2] installed, then run:

pipx install gs-batch-pdf

[^2]: pipx installs the package in an isolated virtual environment while keeping its commands available from the command line.

Note: This tool requires Ghostscript to be installed on your system. Make sure you have Ghostscript installed and accessible from the command line.
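A quick way to confirm that Ghostscript is reachable, and recent enough for the PDF/A note above, is sketched below. The executable names (`gs`, `gswin64c`, `gswin32c`) and the helper names are assumptions made for illustration; the tool itself may locate Ghostscript differently.

```python
import re
import shutil
import subprocess
from typing import Optional

# Assumed executable names: "gs" on Linux/macOS, "gswin64c"/"gswin32c" on Windows.
CANDIDATES = ("gs", "gswin64c", "gswin32c")


def find_ghostscript() -> Optional[str]:
    """Return the path of the first Ghostscript executable found on PATH, if any."""
    for name in CANDIDATES:
        path = shutil.which(name)
        if path:
            return path
    return None


def parse_version(text: str) -> tuple:
    """Turn a version string such as '10.04.0' into a comparable tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if not match:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def ghostscript_version(executable: str) -> tuple:
    """Ask Ghostscript for its version via `--version`."""
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True, check=True
    )
    return parse_version(result.stdout)
```

For example, `ghostscript_version(find_ghostscript()) >= (10, 4, 0)` would indicate a version suitable for PDF/A-2 or PDF/A-3 conversion, provided `find_ghostscript()` did not return `None`.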

Usage

Basic usage:

After installation, gs_batch and its alias gsb are available from the command line.

gs_batch [OPTIONS] FILES...

Options:

  • --options TEXT: Arbitrary Ghostscript options and switches.
  • --compress TEXT: Compression quality level (e.g., /screen, [/ebook], /printer, /prepress, /default).
  • --pdfa INTEGER: PDF/A version (1 for PDF/A-1, 2 for [PDF/A-2], 3 for PDF/A-3).
  • --prefix TEXT: Prefix to add to the output file name.
  • --suffix TEXT: Suffix to add to the output file name before the extension.
  • --keep_smaller / --keep_new: Keep the smaller of the original and new files, or always keep the new file [default: keep_smaller].
  • --force: Allow overwriting the original file.
  • --open_path / --no_open_path: Open the output file path in the filesystem.
  • --filter TEXT: Filter input files by extension; may be a comma-separated list (e.g., 'pdf,png') [default: pdf].
  • --help: Show this message and exit.
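To make the --filter, --prefix and --suffix options more concrete, here is a minimal sketch of how such filtering and renaming could work. The function names and the exact rule (prefix + file stem + suffix + extension, resolved relative to the input file's folder) are assumptions, not the tool's documented behavior.

```python
from pathlib import Path


def filter_by_extension(files, extensions="pdf"):
    """Keep only files whose extension appears in a comma-separated list like 'pdf,png'."""
    wanted = {
        ext.strip().lower().lstrip(".")
        for ext in extensions.split(",")
        if ext.strip()
    }
    return [f for f in files if f.suffix.lower().lstrip(".") in wanted]


def output_path(src, prefix="", suffix=""):
    """Build the output name as prefix + stem + suffix + extension.

    Assumption: a prefix containing a folder part (e.g. './compressed/')
    redirects the output into that folder.
    """
    return src.parent / f"{prefix}{src.stem}{suffix}{src.suffix}"
```

With these assumptions, `output_path(Path("file1.pdf"), "./compressed/", "_v1")` yields `compressed/file1_v1.pdf`, while calling it without a prefix or suffix returns the input path itself, which corresponds to the in-place (overwrite) case.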

Examples

  1. Compress multiple PDF files using ebook quality in place (overwrite)[^3]:

gs_batch --compress=/ebook file1.pdf file2.pdf file3.pdf

[^3]: When no --prefix is provided and --force has not been set, you will be prompted for permission to overwrite the original files.

  2. Convert PDFs to PDF/A-2 format in place (overwrite):

gs_batch --pdfa=2 file1.pdf file2.pdf

  3. Compress all PDFs in a folder and convert them to PDF/A-2 using glob patterns (by default, the file list is filtered by the pdf extension):

# finds all PDFs in the current folder
gs_batch --compress --pdfa --force *

# finds all PDFs in the folder and its subfolders, recursively
gs_batch --compress --pdfa --force **/*

  4. Apply custom Ghostscript options:

gs_batch --options="-dCompatibilityLevel=1.4 -dColorImageResolution=72" file.pdf

  5. Add a prefix (which may also specify a new folder) and a suffix to output files:

gs_batch --prefix="./compressed/" --suffix="_v1" --compress=/screen file*.pdf
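For readers curious about what such options could translate to, the sketch below assembles a plain Ghostscript command line from equivalent settings. The README does not show the exact arguments gs-batch-pdf passes, so `build_gs_command` is a hypothetical helper; only the Ghostscript switches themselves (`-sDEVICE=pdfwrite`, `-dPDFSETTINGS`, `-dPDFA`, `-dPDFACompatibilityPolicy`, `-sOutputFile`) are standard.

```python
import shlex


def build_gs_command(src, dst, compress=None, pdfa=None, options="", executable="gs"):
    """Assemble a Ghostscript argument list for a single input file.

    This only illustrates the general shape of a pdfwrite invocation; it is
    not the command that gs-batch-pdf actually runs.
    """
    cmd = [executable, "-sDEVICE=pdfwrite", "-dNOPAUSE", "-dBATCH", "-dQUIET"]
    if compress:
        # e.g. "/screen", "/ebook", "/printer", "/prepress", "/default"
        cmd.append(f"-dPDFSETTINGS={compress}")
    if pdfa:
        # A complete PDF/A conversion usually also needs colour conversion
        # settings and a PDF/A definition file; those are omitted here.
        cmd += [f"-dPDFA={pdfa}", "-dPDFACompatibilityPolicy=1"]
    # Split a free-form option string the way a shell would.
    cmd += shlex.split(options)
    cmd += [f"-sOutputFile={dst}", str(src)]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`; passing a list rather than a single string avoids shell-quoting problems with file names that contain spaces.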

Output

After processing, gs-batch-pdf will display a summary table showing the original size, new size, compression ratio, and which file was kept for each processed PDF. The tool will also attempt to open the output folder in your default file manager.
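The decision behind the summary table can be illustrated with a short sketch. The field names, the definition of the ratio (new size divided by original size) and the tie-breaking rule are assumptions; the README does not specify them.

```python
def summarize(original_size, new_size, keep_smaller=True):
    """Return one summary row for a processed file.

    Assumptions: the ratio is new_size / original_size, and when both files
    have the same size the original is kept under keep_smaller.
    """
    ratio = new_size / original_size if original_size else 0.0
    if keep_smaller and new_size >= original_size:
        kept = "original"
    else:
        kept = "new"
    return {
        "original_size": original_size,
        "new_size": new_size,
        "ratio": ratio,
        "kept": kept,
    }
```

Under these assumptions, a file that shrinks from 1000 to 400 bytes is reported with a ratio of 0.4 and the new file is kept, whereas a file that grows is kept in its original form unless --keep_new is given.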

Requirements

  • Python 3.12+
  • Ghostscript
  • click
  • tqdm
  • showinfm

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License.

Acknowledgements

gs-batch-pdf uses Ghostscript for PDF processing. Ghostscript is released under the GNU Affero General Public License (AGPL).
