
PDF Splitter CLI

A modern command-line tool that splits large PDF files into smaller chunks, with real-time progress bars and automatic filename generation.

✨ Features

  • 📄 Split PDF files by specified number of pages per chunk
  • 🎯 Real-time progress bars showing file creation progress
  • 📁 Smart filename generation based on original filename
  • 🔢 Sequential numbering (e.g., document_1.pdf, document_2.pdf)
  • 📂 Configurable output folders
  • 🖥️ Modern CLI with rich help and validation
  • 📃 Individual page splitting support
  • 🎨 Colorized output for better user experience
  • 🛠️ Robust error handling with fallback methods (pdftk, qpdf)
  • Memory-efficient processing for large files
  • 🔧 Cross-platform (Windows, macOS, Linux)

🚀 Installation

📥 For Non-Technical Users (Recommended)

Download the standalone executable - no Python installation required!

  1. Windows: Download pdf-splitter.exe from Releases

Quick Start with Executable:

# Windows
pdf-splitter.exe document.pdf

# macOS/Linux (make executable first)
chmod +x pdf-splitter
./pdf-splitter document.pdf

🐍 For Python Developers

pip install pdf-splitter-cli

Requirements: Python 3.8+

📖 Quick Start

# Basic usage - split every 5 pages (default)
pdf-splitter document.pdf

# Custom chunk size - split every 10 pages
pdf-splitter document.pdf -p 10

# Custom output folder
pdf-splitter document.pdf -o my_chunks

# Split into individual pages
pdf-splitter document.pdf -p 1

# Disable progress bars (useful for scripts)
pdf-splitter document.pdf --no-progress

📋 Usage

Command Structure

pdf-splitter <input_pdf> [OPTIONS]

Options

  • -p, --pages-per-chunk INTEGER: Pages per output file (default: 5)
  • -o, --output-folder TEXT: Output folder (default: "output_chunks")
  • --no-progress: Disable progress bars
  • --help: Show help message
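
For scripted use, the options above can be assembled programmatically. The following is a minimal sketch, assuming `pdf-splitter` is on the PATH; the helper names `build_command` and `run_splitter` are hypothetical and not part of the package.

```python
import subprocess
from typing import List, Optional


def build_command(input_pdf: str,
                  pages_per_chunk: Optional[int] = None,
                  output_folder: Optional[str] = None,
                  progress: bool = True) -> List[str]:
    """Build an argument list using only the flags documented above."""
    command = ["pdf-splitter", input_pdf]
    if pages_per_chunk is not None:
        command += ["-p", str(pages_per_chunk)]
    if output_folder is not None:
        command += ["-o", output_folder]
    if not progress:
        command.append("--no-progress")
    return command


def run_splitter(input_pdf: str, **options) -> subprocess.CompletedProcess:
    """Run the CLI and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_command(input_pdf, **options), check=True)
```

Passing the arguments as a list avoids shell quoting problems with filenames that contain spaces.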

Examples

Basic Splitting

pdf-splitter document.pdf

Output: document_1.pdf, document_2.pdf, etc. in output_chunks/
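
To illustrate how fixed-size chunking can work, here is a rough sketch built on pypdf (listed under Dependencies). It is not the package's actual implementation: `chunk_ranges` and `split_pdf` are hypothetical names, and pypdf is imported inside the function so the pure page-partitioning helper can be used without it.

```python
from pathlib import Path
from typing import List


def chunk_ranges(total_pages: int, pages_per_chunk: int = 5) -> List[range]:
    """Partition page indices 0..total_pages-1 into consecutive chunks."""
    if pages_per_chunk < 1:
        raise ValueError("pages_per_chunk must be at least 1")
    return [
        range(start, min(start + pages_per_chunk, total_pages))
        for start in range(0, total_pages, pages_per_chunk)
    ]


def split_pdf(input_pdf: str, pages_per_chunk: int = 5,
              output_folder: str = "output_chunks") -> List[Path]:
    """Write each chunk to <stem>_<n>.pdf and return the created paths."""
    from pypdf import PdfReader, PdfWriter  # lazy import; requires pypdf

    source = Path(input_pdf)
    out_dir = Path(output_folder)
    out_dir.mkdir(parents=True, exist_ok=True)

    reader = PdfReader(source)
    created = []
    ranges = chunk_ranges(len(reader.pages), pages_per_chunk)
    for number, pages in enumerate(ranges, start=1):
        writer = PdfWriter()
        for index in pages:
            writer.add_page(reader.pages[index])
        target = out_dir / f"{source.stem}_{number}.pdf"
        with target.open("wb") as handle:
            writer.write(handle)
        created.append(target)
    return created
```

With the default of 5 pages per chunk, a 12-page document yields three files: two with 5 pages and a final one with 2.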

Custom Page Count

pdf-splitter document.pdf -p 10
pdf-splitter document.pdf --pages-per-chunk 10

Custom Output Folder

pdf-splitter document.pdf -p 3 -o my_output

Individual Pages

pdf-splitter report.pdf -p 1

Output: report_1.pdf, report_2.pdf, etc. (one page each)

🎯 Progress Bars

The tool shows real-time progress as files are created:

Creating PDF files [████████████████████] 100% (8/8 files) 00:00:15

  • File-based progress: Tracks each output file completion
  • ETA display: Shows estimated time remaining
  • Percentage complete: Visual progress indicator
  • Disable option: Use --no-progress for scripting
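
A file-based progress line like the one above could be rendered roughly as follows. This is an illustrative sketch only: `render_progress` is a hypothetical helper, the `░` character for the unfilled part is an assumption, and the function simply formats whatever number of seconds it is given (the README does not say whether the trailing clock is elapsed or remaining time).

```python
def render_progress(done: int, total: int, clock_seconds: float,
                    width: int = 20) -> str:
    """Format one progress line for `done` of `total` output files."""
    fraction = done / total if total else 1.0
    filled = int(width * fraction)
    bar = "█" * filled + "░" * (width - filled)  # "░" is an assumed filler
    hours, remainder = divmod(int(clock_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    return (f"Creating PDF files [{bar}] {fraction:.0%} "
            f"({done}/{total} files) {hours:02d}:{minutes:02d}:{seconds:02d}")


if __name__ == "__main__":
    print(render_progress(4, 8, 7))
    print(render_progress(8, 8, 15))
```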

🛠️ Advanced Features

Large File Support

  • Memory-efficient processing for multi-GB files
  • Automatic garbage collection after each chunk
  • Error recovery continues processing if individual pages fail
  • File size warnings for files >100MB
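
The size warning and per-chunk garbage collection described above might look like the sketch below. The names, the warning text, and the reading of "100MB" as 100 × 1024 × 1024 bytes are all assumptions made for illustration.

```python
import gc
import os
from typing import Callable, Iterable, Optional

LARGE_FILE_BYTES = 100 * 1024 * 1024  # assumed interpretation of "100MB"


def size_warning(path: str, threshold: int = LARGE_FILE_BYTES) -> Optional[str]:
    """Return a warning string if the file exceeds the threshold, else None."""
    size = os.path.getsize(path)
    if size > threshold:
        megabytes = size / (1024 * 1024)
        return f"Warning: {path} is {megabytes:.1f} MB; splitting may take a while."
    return None


def process_chunks(chunks: Iterable, handle_chunk: Callable) -> int:
    """Handle chunks one at a time, collecting garbage after each one."""
    count = 0
    for chunk in chunks:
        handle_chunk(chunk)
        gc.collect()  # release per-chunk objects before starting the next chunk
        count += 1
    return count
```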

Fallback Methods

If the primary pypdf-based method fails, the tool automatically tries:

  1. pdftk (if installed)
  2. qpdf (if installed)
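
As a sketch of how such a fallback could be wired up, the snippet below checks which external tool is installed and builds a page-range extraction command for it. The command lines use the standard pdftk and qpdf syntax for extracting a page range; whether the package invokes the tools in exactly this way is not stated in this README, and both helper names are hypothetical.

```python
import shutil
from typing import List, Optional, Sequence


def first_available_tool(candidates: Sequence[str] = ("pdftk", "qpdf")) -> Optional[str]:
    """Return the first candidate found on PATH, or None if none is installed."""
    for tool in candidates:
        if shutil.which(tool):
            return tool
    return None


def fallback_command(tool: str, input_pdf: str, first_page: int,
                     last_page: int, output_pdf: str) -> List[str]:
    """Build the command that extracts pages first_page..last_page (1-based)."""
    page_range = f"{first_page}-{last_page}"
    if tool == "pdftk":
        return ["pdftk", input_pdf, "cat", page_range, "output", output_pdf]
    if tool == "qpdf":
        return ["qpdf", "--empty", "--pages", input_pdf, page_range, "--", output_pdf]
    raise ValueError(f"unsupported fallback tool: {tool}")
```

The resulting list can be passed to `subprocess.run(..., check=True)` once a tool has been found.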

Error Handling

  • Graceful degradation for corrupted PDFs
  • Detailed error messages with suggested solutions
  • Partial processing continues even if some pages fail
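
The "continue on failure" behaviour can be pictured with the small sketch below. It is a generic pattern rather than the package's code: `copy_pages_tolerantly` and its `copy_page` callback are hypothetical stand-ins for whatever copies a single page into the output file.

```python
from typing import Callable, Iterable, List, Tuple


def copy_pages_tolerantly(
    page_indices: Iterable[int],
    copy_page: Callable[[int], None],
) -> Tuple[List[int], List[Tuple[int, str]]]:
    """Copy each page, recording failures instead of aborting the whole run."""
    copied: List[int] = []
    failed: List[Tuple[int, str]] = []
    for index in page_indices:
        try:
            copy_page(index)
        except Exception as error:  # deliberately broad: a bad page must not stop the rest
            failed.append((index, str(error)))
        else:
            copied.append(index)
    return copied, failed
```

Returning the failed indices with their messages lets the caller report exactly which pages were skipped.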

📁 Output File Naming

Files are automatically named using the original filename:

Input               Output
document.pdf        document_1.pdf, document_2.pdf, ...
report.pdf          report_1.pdf, report_2.pdf, ...
/path/to/file.pdf   file_1.pdf, file_2.pdf, ...
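
The naming rule shown in the table (drop the directory, keep the stem, append a 1-based counter) can be expressed in a few lines. `output_names` is a hypothetical helper written for illustration, not a function exported by the package.

```python
from pathlib import Path
from typing import List


def output_names(input_pdf: str, chunk_count: int) -> List[str]:
    """Return <stem>_1.pdf .. <stem>_<chunk_count>.pdf for the given input path."""
    stem = Path(input_pdf).stem  # "file" for "/path/to/file.pdf"
    return [f"{stem}_{number}.pdf" for number in range(1, chunk_count + 1)]
```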

Building Executables

For Developers: Create Your Own Executables

You can build standalone executables for distribution:

Windows

# Clone and setup
git clone https://github.com/jmxt3/pdf-splitter.git
cd pdf-splitter
uv sync

# Build Windows executable
powershell -ExecutionPolicy Bypass -File build_executable.ps1

macOS/Linux

# Clone and setup
git clone https://github.com/jmxt3/pdf-splitter.git
cd pdf-splitter
uv sync

# Build executable
./build_executable.sh

Cross-Platform Python Script

# Build for current platform
python build_executables.py

Output: Executables are created in the release/ folder, along with README files for distribution.

🔧 Installation from Source

For development or latest features:

git clone https://github.com/jmxt3/pdf-splitter.git
cd pdf-splitter
uv sync  # or pip install -e .

📄 License

MIT License - see LICENSE file for details.

🤝 Contributing

Contributions welcome! Please feel free to submit a Pull Request.

🐛 Issues

Found a bug or have a feature request? Please open an issue on GitHub.

📊 Dependencies

  • click: Modern CLI framework
  • pypdf: PDF processing library

🏷️ Version History

  • 0.1.2: Updated PyPI page with complete README content including standalone executables
  • 0.1.1: Added standalone executables for Windows, macOS, and Linux
  • 0.1.0: Initial release with progress bars and robust error handling

📦 Distribution Options

For End Users

  • Standalone Executables: Download from Releases
    • No Python installation required
    • Single file download
    • Works immediately after download

For Developers

  • PyPI Package: pip install pdf-splitter-cli
    • Integrates with Python environments
    • Easy to include in scripts
    • Automatic dependency management
