PDF Splitter CLI
A modern command-line tool that splits large PDF files into smaller chunks, with real-time progress bars and automatic filename generation.
✨ Features
- 📄 Split PDF files by specified number of pages per chunk
- 🎯 Real-time progress bars showing file creation progress
- 📁 Smart filename generation based on original filename
- 🔢 Sequential numbering (e.g., document_1.pdf, document_2.pdf)
- 📂 Configurable output folders
- 🖥️ Modern CLI with rich help and validation
- 📃 Individual page splitting support
- 🎨 Colorized output for better user experience
- 🛠️ Robust error handling with fallback methods (pdftk, qpdf)
- ⚡ Memory-efficient processing for large files
- 🔧 Cross-platform (Windows, macOS, Linux)
🚀 Installation
📥 For Non-Technical Users (Recommended)
Download the standalone executable - no Python installation required!
- Windows: Download pdf-splitter.exe from Releases
Quick Start with Executable:
# Windows
pdf-splitter.exe document.pdf
# macOS/Linux (make executable first)
chmod +x pdf-splitter
./pdf-splitter document.pdf
🐍 For Python Developers
pip install pdf-splitter-cli
Requirements: Python 3.8+
📖 Quick Start
# Basic usage - split every 5 pages (default)
pdf-splitter document.pdf
# Custom chunk size - split every 10 pages
pdf-splitter document.pdf -p 10
# Custom output folder
pdf-splitter document.pdf -o my_chunks
# Split into individual pages
pdf-splitter document.pdf -p 1
# Disable progress bars (useful for scripts)
pdf-splitter document.pdf --no-progress
📋 Usage
Command Structure
pdf-splitter <input_pdf> [OPTIONS]
Options
- -p, --pages-per-chunk INTEGER: Pages per output file (default: 5)
- -o, --output-folder TEXT: Output folder (default: "output_chunks")
- --no-progress: Disable progress bars
- --help: Show help message
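The option list above can be mirrored in a few lines of Python. This is only a sketch: the tool itself is built on click, and the stdlib `argparse` parser below is a stand-in chosen so the example runs without extra packages. The `positive_int` helper and the rule that the chunk size must be at least 1 are assumptions for illustration.

```python
import argparse


def positive_int(text: str) -> int:
    """argparse type that rejects zero and negative chunk sizes (assumed rule)."""
    value = int(text)
    if value < 1:
        raise argparse.ArgumentTypeError("pages per chunk must be at least 1")
    return value


def build_parser() -> argparse.ArgumentParser:
    # Flags and defaults mirror the documented options; the parser is a
    # stdlib stand-in, not the tool's real click-based implementation.
    parser = argparse.ArgumentParser(prog="pdf-splitter")
    parser.add_argument("input_pdf", help="PDF file to split")
    parser.add_argument("-p", "--pages-per-chunk", type=positive_int, default=5)
    parser.add_argument("-o", "--output-folder", default="output_chunks")
    parser.add_argument("--no-progress", action="store_true")
    return parser


args = build_parser().parse_args(["document.pdf", "-p", "10"])
print(args.input_pdf, args.pages_per_chunk, args.output_folder, args.no_progress)
```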
Examples
Basic Splitting
pdf-splitter document.pdf
Output: document_1.pdf, document_2.pdf, etc. in output_chunks/
Custom Page Count
pdf-splitter document.pdf -p 10
pdf-splitter document.pdf --pages-per-chunk 10
Custom Output Folder
pdf-splitter document.pdf -p 3 -o my_output
Individual Pages
pdf-splitter report.pdf -p 1
Output: report_1.pdf, report_2.pdf, etc. (one page each)
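To make the chunking arithmetic behind these examples concrete, the sketch below computes which pages would go into each output file. `chunk_ranges` is a hypothetical helper, not a function exported by the package, and it only does the page math; copying the pages with pypdf is not shown.

```python
from typing import List, Tuple


def chunk_ranges(total_pages: int, pages_per_chunk: int = 5) -> List[Tuple[int, int]]:
    """Return 1-based, inclusive (first_page, last_page) pairs, one per output file."""
    if total_pages < 0 or pages_per_chunk < 1:
        raise ValueError("total_pages must be >= 0 and pages_per_chunk >= 1")
    ranges = []
    for first in range(1, total_pages + 1, pages_per_chunk):
        # The final chunk may be shorter than pages_per_chunk.
        last = min(first + pages_per_chunk - 1, total_pages)
        ranges.append((first, last))
    return ranges


print(chunk_ranges(12, 5))      # [(1, 5), (6, 10), (11, 12)]
print(len(chunk_ranges(3, 1)))  # 3
```

With `-p 1` every range collapses to a single page, which matches the individual-page mode.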
🎯 Progress Bars
The tool shows real-time progress as files are created:
Creating PDF files [████████████████████] 100% (8/8 files) 00:00:15
- File-based progress: Tracks each output file completion
- ETA display: Shows estimated time remaining
- Percentage complete: Visual progress indicator
- Disable option: Use --no-progress for scripting
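As an illustration of how a file-based progress line like the sample above could be assembled, here is a minimal sketch. `render_progress` and its parameters are hypothetical names; the tool's actual rendering code may differ, and the time value is simply passed in by the caller.

```python
def render_progress(done: int, total: int, seconds: int, width: int = 20) -> str:
    """Format one progress line in the style of the sample shown above."""
    fraction = done / total if total else 1.0
    filled = int(width * fraction)
    bar = "█" * filled + " " * (width - filled)
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    clock = f"{hours:02d}:{minutes:02d}:{secs:02d}"
    return f"Creating PDF files [{bar}] {fraction:.0%} ({done}/{total} files) {clock}"


print(render_progress(4, 8, 7))
print(render_progress(8, 8, 15))
```

Progress is counted per finished output file rather than per page, so the bar advances once for each chunk written.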
🛠️ Advanced Features
Large File Support
- Memory-efficient processing for multi-GB files
- Automatic garbage collection after each chunk
- Error recovery continues processing if individual pages fail
- File size warnings for files >100MB
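The size warning and the per-chunk garbage collection can be pictured roughly as follows. This is a sketch under assumptions: the threshold is taken to mean 100 × 1024 × 1024 bytes, and `is_large` and `process_chunks` are hypothetical names rather than the package's API.

```python
import gc
import os
import tempfile

SIZE_WARNING_BYTES = 100 * 1024 * 1024  # assumed meaning of ">100MB"


def is_large(path: str, threshold: int = SIZE_WARNING_BYTES) -> bool:
    """True when the file on disk is bigger than the warning threshold."""
    return os.path.getsize(path) > threshold


def process_chunks(chunks, write_chunk) -> int:
    """Write chunks one at a time, collecting garbage after each one."""
    written = 0
    for chunk in chunks:
        write_chunk(chunk)
        written += 1
        gc.collect()  # free objects left over from the chunk just written
    return written


# Demo with a tiny placeholder file instead of a real multi-GB PDF.
with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as handle:
    handle.write(b"%PDF-1.4 tiny placeholder")
print(is_large(handle.name))               # False
print(is_large(handle.name, threshold=4))  # True
os.remove(handle.name)
print(process_chunks([(1, 5), (6, 10)], lambda chunk: None))  # 2
```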
Fallback Methods
If the primary pypdf method fails, the tool automatically tries:
- pdftk (if installed)
- qpdf (if installed)
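A fallback chain of this kind can be sketched as below. The command lines are the standard pdftk and qpdf invocations for extracting a page range; whether the tool builds exactly these commands is an assumption, and `fallback_command` and `pick_fallback` are hypothetical helper names.

```python
import shutil
from typing import Callable, List, Optional


def fallback_command(tool: str, src: str, first: int, last: int, dst: str) -> List[str]:
    """Build the argv that extracts pages first..last (1-based) from src into dst."""
    if tool == "pdftk":
        return ["pdftk", src, "cat", f"{first}-{last}", "output", dst]
    if tool == "qpdf":
        return ["qpdf", "--empty", "--pages", src, f"{first}-{last}", "--", dst]
    raise ValueError(f"unknown tool: {tool}")


def pick_fallback(which: Callable[[str], Optional[str]] = shutil.which) -> Optional[str]:
    """Return the first installed fallback tool, in the order listed above."""
    for tool in ("pdftk", "qpdf"):
        if which(tool):
            return tool
    return None


tool = pick_fallback()
if tool is None:
    print("no fallback tool installed")
else:
    # Pass this list to subprocess.run(..., check=True) to perform the split.
    print(fallback_command(tool, "document.pdf", 1, 5, "document_1.pdf"))
```

Looking the tools up with `shutil.which` keeps the check portable across Windows, macOS, and Linux.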
Error Handling
- Graceful degradation for corrupted PDFs
- Detailed error messages with suggested solutions
- Partial processing continues even if some pages fail
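The "keep going when a page fails" behaviour described above might look like the following sketch. `copy_pages` and the `flaky` stand-in are hypothetical; the real tool's error types and messages are not documented here.

```python
from typing import Callable, Iterable, List, Tuple


def copy_pages(
    page_numbers: Iterable[int],
    copy_page: Callable[[int], None],
) -> Tuple[List[int], List[Tuple[int, str]]]:
    """Attempt every page; collect failures instead of aborting on the first one."""
    copied: List[int] = []
    failed: List[Tuple[int, str]] = []
    for number in page_numbers:
        try:
            copy_page(number)
            copied.append(number)
        except Exception as exc:  # broad on purpose: any page error is recorded
            failed.append((number, str(exc)))
    return copied, failed


def flaky(number: int) -> None:
    # Hypothetical stand-in for a real page copy; page 3 plays the corrupted page.
    if number == 3:
        raise ValueError("unreadable page")


copied, failed = copy_pages(range(1, 6), flaky)
print(copied)  # [1, 2, 4, 5]
print(failed)  # [(3, 'unreadable page')]
```

Returning the failures alongside the successes lets the caller report exactly which pages were skipped.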
📁 Output File Naming
Files are automatically named using the original filename:
| Input | Output |
|---|---|
| document.pdf | document_1.pdf, document_2.pdf, ... |
| report.pdf | report_1.pdf, report_2.pdf, ... |
| /path/to/file.pdf | file_1.pdf, file_2.pdf, ... |
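The naming rule in the table can be expressed in a few lines. `chunk_paths` is a hypothetical helper written for illustration; it assumes the default output folder from the options list and is not part of the package's public API.

```python
from pathlib import Path
from typing import List


def chunk_paths(input_pdf: str, chunk_count: int, output_folder: str = "output_chunks") -> List[Path]:
    """Name each chunk <original stem>_<n>.pdf, numbering from 1."""
    stem = Path(input_pdf).stem  # drops the directory part and the .pdf suffix
    return [Path(output_folder) / f"{stem}_{i}.pdf" for i in range(1, chunk_count + 1)]


print([p.name for p in chunk_paths("/path/to/file.pdf", 2)])  # ['file_1.pdf', 'file_2.pdf']
```

Only the stem of the input is used, so the directory of the source file does not leak into the output names.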
Building Executables
For Developers: Create Your Own Executables
You can build standalone executables for distribution:
Windows
# Clone and setup
git clone https://github.com/jmxt3/pdf-splitter.git
cd pdf-splitter
uv sync
# Build Windows executable
powershell -ExecutionPolicy Bypass -File build_executable.ps1
macOS/Linux
# Clone and setup
git clone https://github.com/jmxt3/pdf-splitter.git
cd pdf-splitter
uv sync
# Build executable
./build_executable.sh
Cross-Platform Python Script
# Build for current platform
python build_executables.py
Output: Executables are created in release/ folder with README files for distribution.
🔧 Installation from Source
For development or latest features:
git clone https://github.com/jmxt3/pdf-splitter.git
cd pdf-splitter
uv sync # or pip install -e .
📄 License
MIT License - see LICENSE file for details.
🤝 Contributing
Contributions welcome! Please feel free to submit a Pull Request.
🐛 Issues
Found a bug or have a feature request? Please open an issue on GitHub.
📊 Dependencies
- click: Modern CLI framework
- pypdf: PDF processing library
🏷️ Version History
- 0.1.2: Updated PyPI page with complete README content including standalone executables
- 0.1.1: Added standalone executables for Windows, macOS, and Linux
- 0.1.0: Initial release with progress bars and robust error handling
📦 Distribution Options
For End Users
- Standalone Executables: Download from Releases
- No Python installation required
- Single file download
- Works immediately after download
For Developers
- PyPI Package: pip install pdf-splitter-cli
- Integrates with Python environments
- Easy to include in scripts
- Automatic dependency management
File details
Details for the file pdf_splitter_cli-0.1.2.tar.gz.
File metadata
- Download URL: pdf_splitter_cli-0.1.2.tar.gz
- Upload date:
- Size: 12.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 43cb32002fbf8ada067f5dbbab280840962f6393e058a60e6223ed24480006b6 |
| MD5 | f71c1796816577b299ed6be8bbc729ce |
| BLAKE2b-256 | 0f3de4561b865a2c04928414c92a7725498cf66d3b4ada0842b40d8c545773c9 |
File details
Details for the file pdf_splitter_cli-0.1.2-py3-none-any.whl.
File metadata
- Download URL: pdf_splitter_cli-0.1.2-py3-none-any.whl
- Upload date:
- Size: 9.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 95bbd5cfa161aa40276f88d7a2b32a23a2bfd73a20985570cff1cf08203aba57 |
| MD5 | 83517e61b4a8b637fae38794554e6e4f |
| BLAKE2b-256 | 5d694fc0f24ba6b98676f7050c001e2e56942e703513e4a3fd57d2a204c4b42b |
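The SHA256 digests listed in the file details can be checked against a downloaded file with the standard library. The sketch below assumes the source distribution was saved under its published filename in the current directory; `sha256_of` is a hypothetical helper name.

```python
import hashlib
import os


def sha256_of(path: str, block_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size blocks so large downloads never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


# Published SHA256 for pdf_splitter_cli-0.1.2.tar.gz, copied from the file details.
EXPECTED = "43cb32002fbf8ada067f5dbbab280840962f6393e058a60e6223ed24480006b6"
path = "pdf_splitter_cli-0.1.2.tar.gz"  # assumed local download location
if os.path.exists(path):
    print("match" if sha256_of(path) == EXPECTED else "MISMATCH")
else:
    print("file not found; download it first")
```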