CLI wrapper for processing multiple PDF files in parallel with Ghostscript (compression, PDF/A conversion, etc.)
gs-batch-pdf
A command-line tool for batch processing PDF files using Ghostscript with parallel execution. Process multiple PDFs simultaneously while applying compression, PDF/A conversion, or custom Ghostscript options.
Features
- Parallel Processing: Multi-threaded execution for faster batch operations
- Compression: Multiple quality levels (/screen, /ebook, /printer, /prepress, /default)
- PDF/A Conversion: Support for PDF/A-1, PDF/A-2, and PDF/A-3 standards[^1]
- Recursive Search: Process entire directory trees with the -r flag
- Smart File Management: Keep smaller files automatically or always keep new versions
- Error Handling: Configurable error behavior (prompt, skip, or abort)
- Custom Ghostscript Options: Full access to Ghostscript's command-line options
- Progress Tracking: Real-time progress bars for each file being processed
- Flexible Output: Add prefixes/suffixes to output filenames, organize into folders
- Cross-platform: Windows, Linux, and macOS support
[^1]: Requires Ghostscript version 10.04.0 or higher for correct PDF/A-2 and PDF/A-3 conversion.
Installation
Prerequisites
- Python 3.12+: Required to run the tool
- Ghostscript: Required for PDF processing. Install from ghostscript.com
Install gs-batch-pdf
Using pipx (recommended)[^2]:
pipx install gs-batch-pdf
Or using pip:
pip install gs-batch-pdf
[^2]: pipx installs the package in an isolated virtual environment while making commands globally available.
Usage
The tool is available via two commands: gs_batch or its shorter alias gsb.
Basic Syntax
gsb [OPTIONS] FILES_OR_DIRECTORIES...
or (my preference):
gsb FILES_OR_DIRECTORIES... [OPTIONS]
Quick Start
# Compress all PDFs in current directory (default: /ebook quality)
gsb . --compress
# Compress PDFs recursively in a directory tree
gsb ./docs/ -r --compress
# Convert a single PDF to PDF/A-2
gsb file.pdf --pdfa
# Compress and convert to PDF/A with custom output
gsb *.pdf --compress --pdfa --prefix "processed_"
Note: When using options that can take optional values (like --compress or --pdfa), place them after the file arguments for simplest usage, or see Using Options with File Arguments for alternatives.
Options
Processing Options
- --compress [LEVEL]: Compress PDFs with quality level
  - Levels: /screen, /ebook (default), /printer, /prepress, /default
  - Use without value for /ebook quality
- --pdfa [VERSION]: Convert to PDF/A format
  - Versions: 1 (PDF/A-1), 2 (PDF/A-2, default), 3 (PDF/A-3)
  - Use without value for PDF/A-2
- --options TEXT: Pass arbitrary Ghostscript options
  - Example: --options "-dColorImageResolution=100 -dCompatibilityLevel=1.4"
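For orientation, the sketch below shows roughly what a compression run might look like on the Ghostscript side. It is not taken from the gs-batch-pdf source: build_gs_command is a hypothetical helper, and the exact arguments the tool passes are an assumption. Only the Ghostscript switches themselves (-sDEVICE=pdfwrite, -dPDFSETTINGS, -dNOPAUSE, -dBATCH, -sOutputFile) are standard.

```python
import shlex

# Quality levels listed for --compress above.
LEVELS = {"/screen", "/ebook", "/printer", "/prepress", "/default"}


def build_gs_command(src, dst, level="/ebook", extra_options="", gs="gs"):
    """Return an argv list for a Ghostscript pdfwrite run (hypothetical helper)."""
    if level not in LEVELS:
        raise ValueError(f"unknown compression level: {level}")
    return [
        gs,
        "-sDEVICE=pdfwrite",
        f"-dPDFSETTINGS={level}",
        "-dNOPAUSE",
        "-dBATCH",
        # Assumption: a string given to --options is split shell-style.
        *shlex.split(extra_options),
        f"-sOutputFile={dst}",
        src,
    ]


cmd = build_gs_command("in.pdf", "out.pdf", "/screen", "-dCompatibilityLevel=1.4")
print(" ".join(cmd))
```

Building the command as a list rather than a single string avoids quoting problems with file names that contain spaces.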
File Management Options
- --prefix TEXT: Add prefix to output filenames
  - Can include path: --prefix "output/" creates files in output directory
  - Relative paths calculated from input file location, not current directory
- --suffix TEXT: Add suffix before file extension
  - Example: --suffix "_compressed" → file_compressed.pdf
- --keep_smaller/--keep_new: Choose which file to keep (default: --keep_smaller)
  - --keep_smaller: Keep whichever file is smaller (original or processed)
  - --keep_new: Always keep the processed file
  - Note: PDF/A conversion always keeps new file
- -f, --force: Allow overwriting original files without confirmation
  - Required when no prefix specified and files would be overwritten
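The naming rules for --prefix and --suffix can be illustrated with a small sketch. The function output_path is hypothetical and only mirrors the behavior described in the list above (prefix and suffix wrapped around the file stem, with the result resolved relative to the input file's directory); the tool's actual implementation may differ.

```python
from pathlib import PurePosixPath


def output_path(src, prefix="", suffix=""):
    """Hypothetical naming rule: <input dir>/<prefix><stem><suffix><ext>."""
    src = PurePosixPath(src)
    # A prefix containing "/" adds a directory relative to the input file,
    # not relative to the current working directory.
    return src.parent / f"{prefix}{src.stem}{suffix}{src.suffix}"


print(output_path("docs/file.pdf", suffix="_compressed"))
print(output_path("docs/file.pdf", prefix="output/"))
print(output_path("docs/file.pdf", prefix="processed_", suffix="_v1"))
```

With neither prefix nor suffix, the output path equals the input path, which is the overwrite case that --force covers.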
Search Options
- --filter TEXT: Filter files by extension (default: pdf)
  - Supports comma-separated list: --filter pdf,png
- -r, --recursive: Search directories recursively
  - Without this flag, only processes files in top-level directories
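A minimal sketch of how --filter and -r might combine during file discovery is shown below. The helper find_files is hypothetical, and matching extensions case-insensitively is an assumption, since the README does not say how case is handled.

```python
import tempfile
from pathlib import Path


def find_files(root, extensions="pdf", recursive=False):
    """Hypothetical discovery step for --filter and -r."""
    wanted = {
        "." + ext.strip().lower().lstrip(".")
        for ext in extensions.split(",")
        if ext.strip()
    }
    root = Path(root)
    # rglob walks the whole tree; glob stays in the top-level directory.
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(p for p in candidates if p.is_file() and p.suffix.lower() in wanted)


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "sub").mkdir()
    for name in ("a.pdf", "b.png", "c.txt", "sub/d.pdf"):
        (base / name).touch()
    top_level = [p.relative_to(base).as_posix() for p in find_files(base)]
    whole_tree = [
        p.relative_to(base).as_posix()
        for p in find_files(base, "pdf,png", recursive=True)
    ]
    print(top_level)
    print(whole_tree)
```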
Error Handling Options
- --on-error [MODE]: Control behavior when file processing errors occur
  - prompt (default): Interactively ask user whether to retry, skip, or abort
  - skip: Automatically skip failed files and continue processing
  - abort: Stop processing immediately on first error
Other Options
- --timeout INTEGER: Maximum processing time per file in seconds (default: 300)
  - Set to 0 to disable timeout protection
  - Prevents indefinite hangs on problematic PDFs
- --open_path/--no_open_path: Open output location in file manager (default: enabled)
- -v, --verbose: Show detailed Ghostscript command output
- --version: Show version information
- --help: Display help message
Using Options with File Arguments
When using options that accept optional values (--compress, --pdfa), you have three approaches:
1. Place options after file arguments (recommended):
gsb *.pdf --compress
gsb * --compress --pdfa
2. Provide explicit values:
gsb --compress /ebook *.pdf
gsb --pdfa 2 *.pdf
3. Use -- separator:
gsb --compress -- *.pdf
gsb --pdfa -- *.pdf
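To see why the placement matters, consider a toy parser in which an option with an optional value greedily takes the next token unless that token looks like a flag. This is a simplified model written for illustration, not the parser gs-batch-pdf actually uses; toy_parse and DEFAULTS are hypothetical names.

```python
DEFAULTS = {"--compress": "/ebook", "--pdfa": "2"}


def toy_parse(argv):
    """Greedy toy parser: an option takes the next token unless it looks like a flag."""
    options, files = {}, []
    only_files = False
    i = 0
    while i < len(argv):
        token = argv[i]
        if only_files:
            files.append(token)
        elif token == "--":
            only_files = True  # everything after "--" is a file argument
        elif token in DEFAULTS:
            following = argv[i + 1] if i + 1 < len(argv) else None
            if following is not None and not following.startswith("-"):
                options[token] = following  # the next token is consumed as the value
                i += 1
            else:
                options[token] = DEFAULTS[token]
        else:
            files.append(token)
        i += 1
    return options, files


print(toy_parse(["--compress", "a.pdf", "b.pdf"]))       # a.pdf is swallowed as the level
print(toy_parse(["a.pdf", "b.pdf", "--compress"]))       # approach 1
print(toy_parse(["--compress", "/ebook", "a.pdf"]))      # approach 2
print(toy_parse(["--compress", "--", "a.pdf", "b.pdf"])) # approach 3
```

In this model the first call misreads a.pdf as the compression level, while each of the three approaches yields the intended option value and file list.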
Examples
Basic Compression
Compress multiple PDFs with /ebook quality (in-place)[^3]:
gsb file1.pdf file2.pdf file3.pdf --compress
Compress all PDFs in a directory:
gsb . --compress
Compress with specific quality level:
gsb document.pdf --compress /screen
# or with explicit value before files:
gsb --compress /screen *.pdf
[^3]: When no --prefix is provided and files would be overwritten, you'll be prompted for confirmation unless --force is used.
PDF/A Conversion
Convert to PDF/A-2 (default):
gsb report.pdf --pdfa
Convert to specific PDF/A version:
gsb document.pdf --pdfa 3
# or with explicit value before files:
gsb --pdfa 3 *.pdf
Compress and convert to PDF/A:
gsb invoice.pdf --compress --pdfa
Recursive Processing
Process entire directory tree:
# Find and compress all PDFs in current directory and subdirectories
gsb . -r --compress
# Process specific directory recursively
gsb ./documents/ -r --compress --pdfa
Process with force (no confirmation):
gsb . -r --compress --pdfa --force
Custom Output Organization
Add prefix to create organized output:
# Add prefix to filenames
gsb *.pdf --prefix "compressed_" --compress
# Create files in subdirectory
gsb *.pdf --prefix "output/" --compress
# Add both prefix and suffix
gsb *.pdf --prefix "processed_" --suffix "_v1" --compress
Keep new files regardless of size:
gsb document.pdf --compress --keep_new
Advanced Ghostscript Options
Apply custom Ghostscript settings:
gsb file.pdf --options "-dCompatibilityLevel=1.4 -dColorImageResolution=72"
Combine compression with custom options:
gsb report.pdf --compress /printer --options "-dCompatibilityLevel=1.7"
Error Handling
Interactive error handling (default):
# Prompts user on each error to retry, skip, or abort
gsb . -r --compress
Skip failed files automatically:
# Useful for batch processing where some files may be corrupted
gsb . -r --compress --on-error skip
Abort on first error:
# Stops immediately if any file fails (useful for CI/CD)
gsb *.pdf --compress --on-error abort
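The three modes can be pictured as a small driver loop. The sketch below is a hypothetical, sequential illustration (process_all, fake_process, and the prompt wording are made up for this example); it does not reproduce the tool's parallel execution or its actual prompts.

```python
def process_all(files, process, on_error="prompt", ask=input):
    """Hypothetical driver loop illustrating the three --on-error modes."""
    done, skipped = [], []
    for name in files:
        while True:
            try:
                process(name)
                done.append(name)
                break
            except Exception as exc:
                if on_error == "abort":
                    raise  # stop at the first failure
                if on_error == "skip":
                    skipped.append(name)
                    break
                # prompt mode: let the user decide
                choice = ask(f"{name} failed ({exc}). [r]etry/[s]kip/[a]bort? ")
                choice = choice.strip().lower()
                if choice == "r":
                    continue
                if choice == "s":
                    skipped.append(name)
                    break
                raise
    return done, skipped


def fake_process(name):
    """Stand-in for a Ghostscript run that fails on some inputs."""
    if "bad" in name:
        raise RuntimeError("Ghostscript failed")


print(process_all(["a.pdf", "bad.pdf", "c.pdf"], fake_process, on_error="skip"))
```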
Scripting and Automation
Silent processing for scripts:
gsb *.pdf --compress --force --no_open_path
Automated batch with error skipping:
# Best for unattended processing
gsb . -r --compress --force --no_open_path --on-error skip
Verbose output for debugging:
gsb document.pdf -v --compress --pdfa
Process with custom timeout for large files:
# Set 10 minute timeout for very large PDFs
gsb large-document.pdf --compress --timeout 600
# Disable timeout for files that take a long time
gsb complex-document.pdf --compress --timeout 0
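When gsb is driven from a Python script rather than a shell, the unattended invocation can be assembled as an argument list. The helper gsb_command below is hypothetical and uses only flags documented in this README; how gsb reports failures through its exit code is not described here, so the commented-out check=True call is an assumption.

```python
import subprocess


def gsb_command(paths, compress=None, recursive=False, on_error="skip", timeout=None):
    """Build an unattended gsb invocation from flags documented in this README."""
    cmd = ["gsb", *paths]
    if recursive:
        cmd.append("-r")
    if compress:
        cmd += ["--compress", compress]
    if timeout is not None:
        cmd += ["--timeout", str(timeout)]
    # No overwrite confirmation, no file manager window, no interactive prompts.
    cmd += ["--force", "--no_open_path", "--on-error", on_error]
    return cmd


cmd = gsb_command(["."], compress="/ebook", recursive=True)
print(" ".join(cmd))
# To run it (requires gs-batch-pdf and Ghostscript to be installed):
# subprocess.run(cmd, check=True)  # assumes a non-zero exit code on failure
```

File arguments come first and --compress is given an explicit value, so the optional-value ambiguity described under Using Options with File Arguments does not arise.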
Output
After processing completes, gs-batch-pdf displays a detailed summary table:
Processing 3 file(s):
1) document1.pdf
2) document2.pdf
3) document3.pdf
[Progress bars shown during processing...]
# | Original | New | Ratio | Keeping | Filename
1 | 1,234 KB | 856 KB | 69.400% | new | /path/to/document1.pdf
2 | 789 KB | 654 KB | 82.900% | new | /path/to/document2.pdf
3 | 456 KB | 512 KB | 112.300% | original | /path/to/document3.pdf
Total time: 12.34 seconds
The summary shows:
- Original: Size of the input file
- New: Size of the processed file
- Ratio: New size as percentage of original (lower is better for compression)
- Keeping: Which version was kept based on --keep_smaller or --keep_new
- Filename: Absolute path to the output file
By default, the tool opens the output location in your file manager after processing (disable with --no_open_path).
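The Ratio and Keeping columns can be reproduced with a few lines of arithmetic. The sketch below is a reconstruction, not the tool's code: summarize is a hypothetical function, rounding the ratio to three decimals before formatting is inferred from the sample rows (whose KB sizes are themselves rounded), and treating equal sizes as "original" is an assumption.

```python
def summarize(original_size, new_size, keep_new=False):
    """Hypothetical reconstruction of the Ratio and Keeping columns."""
    ratio = round(new_size / original_size, 3)
    # --keep_smaller (default): keep the new file only if it is strictly smaller.
    keeping = "new" if keep_new or new_size < original_size else "original"
    return f"{ratio:.3%}", keeping


for original, new in [(1234, 856), (789, 654), (456, 512)]:
    print(original, new, *summarize(original, new))
```

For the third row the processed file is larger than the input, so under --keep_smaller the original is kept; passing keep_new=True would keep the processed file instead.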
Troubleshooting
Ghostscript Not Found
If you get an error about Ghostscript not being found:
- Verify Ghostscript is installed: gs --version (Linux/macOS) or gswin64c --version (Windows)
- Ensure Ghostscript is in your system PATH
- On Windows, you may need to restart your terminal after installation
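The PATH check can also be done from Python. The sketch below is a hypothetical helper, not part of gs-batch-pdf; it looks for the common Ghostscript executable names (gs on Linux/macOS, gswin64c or gswin32c on Windows) using the standard library's shutil.which.

```python
import shutil


def find_ghostscript(candidates=("gs", "gswin64c", "gswin32c"), which=shutil.which):
    """Return the first Ghostscript executable found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


print(find_ghostscript() or "Ghostscript not found on PATH")
```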
PDF/A Conversion Issues
For PDF/A-2 and PDF/A-3 conversion, ensure you're using Ghostscript 10.04.0 or higher:
gs --version
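The version comparison can be scripted as follows. This is a minimal sketch with hypothetical helper names; it assumes the version string is purely dotted numbers such as 10.04.0, which is what gs --version normally prints.

```python
import subprocess

MINIMUM = (10, 4, 0)  # 10.04.0, the minimum stated above for PDF/A-2 and PDF/A-3


def parse_version(text):
    """Turn '10.04.0' into (10, 4, 0)."""
    return tuple(int(part) for part in text.strip().split("."))


def supports_pdfa_2_and_3(version_text, minimum=MINIMUM):
    """Compare numerically so that 10.04.0 ranks above 9.56.1."""
    return parse_version(version_text) >= minimum


def installed_gs_version(executable="gs"):
    """Ask Ghostscript for its version; raises if the executable is missing."""
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


print(supports_pdfa_2_and_3("10.04.0"))
print(supports_pdfa_2_and_3("9.56.1"))
```

Comparing integer tuples avoids the pitfall of string comparison, where "9.56.1" would sort after "10.04.0".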
Permission Errors
If you encounter permission errors when processing files:
- Use --prefix to write to a different directory
- Check file permissions on both input and output locations
- On Windows, ensure files aren't open in another program
Timeout Issues
If processing is hanging or taking too long:
- The default timeout is 5 minutes (300 seconds) per file
- For large or complex PDFs, increase the timeout: --timeout 600 (10 minutes)
- To disable timeout protection: --timeout 0
- Some corrupted PDFs may cause Ghostscript to hang indefinitely; timeout protection will terminate these processes
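Per-file timeout protection of this kind can be sketched with the standard library. The function run_with_timeout below is hypothetical and only illustrates the idea (a limit in seconds, with 0 meaning no limit); it is not the tool's implementation, and a short-lived Python child process stands in for Ghostscript.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds=300):
    """Run cmd; return 'ok', 'failed', or 'timeout'. A value of 0 disables the limit."""
    limit = timeout_seconds if timeout_seconds > 0 else None
    try:
        result = subprocess.run(cmd, timeout=limit, capture_output=True)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"


sleeper = [sys.executable, "-c", "import time; time.sleep(5)"]
print(run_with_timeout(sleeper, timeout_seconds=1))
print(run_with_timeout([sys.executable, "-c", "pass"]))
```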
Contributing
Contributions are welcome! Please feel free to:
- Report bugs or request features via GitHub Issues
- Submit Pull Requests for improvements
- Share feedback and suggestions
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgements
This tool is built on top of Ghostscript, released under the GNU Affero General Public License (AGPL). gs-batch-pdf is a CLI wrapper and is independently licensed under MIT.