
A modern command-line tool to split PDF files into smaller chunks with progress bars and automatic filename generation


PDF Splitter CLI



A modern command-line tool that splits large PDF files into smaller chunks, with real-time progress bars and automatic filename generation.

✨ Features

  • 📄 Split PDF files by specified number of pages per chunk
  • 🎯 Real-time progress bars showing file creation progress
  • 📁 Smart filename generation based on original filename
  • 🔢 Sequential numbering (e.g., document_1.pdf, document_2.pdf)
  • 📂 Configurable output folders
  • 🖥️ Modern CLI with rich help and validation
  • 📃 Individual page splitting support
  • 🎨 Colorized output for better user experience
  • 🛠️ Robust error handling with fallback methods (pdftk, qpdf)
  • Memory-efficient processing for large files
  • 🔧 Cross-platform (Windows, macOS, Linux)

🚀 Installation

📥 For Non-Technical Users (Recommended)

Download the standalone executable - no Python installation required!

  1. Windows: Download pdf-splitter.exe from Releases

Quick Start with Executable:

# Windows
pdf-splitter.exe document.pdf

# macOS/Linux (make executable first)
chmod +x pdf-splitter
./pdf-splitter document.pdf

🐍 For Python Developers

pip install pdf-splitter-cli

Requirements: Python 3.8+

📖 Quick Start

# Basic usage - split every 5 pages (default)
pdf-splitter document.pdf

# Custom chunk size - split every 10 pages
pdf-splitter document.pdf -p 10

# Custom output folder
pdf-splitter document.pdf -o my_chunks

# Split into individual pages
pdf-splitter document.pdf -p 1

# Disable progress bars (useful for scripts)
pdf-splitter document.pdf --no-progress

📋 Usage

Command Structure

pdf-splitter <input_pdf> [OPTIONS]

Options

  • -p, --pages-per-chunk INTEGER: Pages per output file (default: 5)
  • -o, --output-folder TEXT: Output folder (default: "output_chunks")
  • --no-progress: Disable progress bars
  • --help: Show help message
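For scripted use, the documented options can be assembled from Python. The sketch below is an illustration only: `build_command` and `run_splitter` are hypothetical helper names, and the code assumes the `pdf-splitter` executable is on `PATH`. It uses only the flags listed above.

```python
import subprocess
from typing import List


def build_command(pdf_path: str, pages_per_chunk: int = 5,
                  output_folder: str = "output_chunks",
                  show_progress: bool = True) -> List[str]:
    """Assemble an argv list from the documented options (hypothetical helper)."""
    if pages_per_chunk < 1:
        raise ValueError("pages_per_chunk must be at least 1")
    cmd = ["pdf-splitter", pdf_path,
           "-p", str(pages_per_chunk),
           "-o", output_folder]
    if not show_progress:
        # Progress bars are usually unwanted when output is captured by a script.
        cmd.append("--no-progress")
    return cmd


def run_splitter(pdf_path: str, **options) -> int:
    """Run the CLI and return its exit code; assumes pdf-splitter is installed."""
    return subprocess.run(build_command(pdf_path, **options)).returncode
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with filenames that contain spaces.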

Examples

Basic Splitting

pdf-splitter document.pdf

Output: document_1.pdf, document_2.pdf, etc. in output_chunks/

Custom Page Count

pdf-splitter document.pdf -p 10
pdf-splitter document.pdf --pages-per-chunk 10

Custom Output Folder

pdf-splitter document.pdf -p 3 -o my_output

Individual Pages

pdf-splitter report.pdf -p 1

Output: report_1.pdf, report_2.pdf, etc. (one page each)
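The examples above imply a simple mapping from page count to chunks. A minimal sketch of that arithmetic follows. It assumes 1-based page numbers and that the final chunk simply holds whatever pages remain; `chunk_ranges` is a hypothetical name, not part of the tool.

```python
from typing import List, Tuple


def chunk_ranges(total_pages: int, pages_per_chunk: int = 5) -> List[Tuple[int, int]]:
    """Return inclusive (first_page, last_page) pairs, one per output file."""
    if total_pages < 0 or pages_per_chunk < 1:
        raise ValueError("total_pages must be >= 0 and pages_per_chunk >= 1")
    return [
        (start, min(start + pages_per_chunk - 1, total_pages))
        for start in range(1, total_pages + 1, pages_per_chunk)
    ]


print(chunk_ranges(12, 5))  # [(1, 5), (6, 10), (11, 12)]
print(chunk_ranges(3, 1))   # [(1, 1), (2, 2), (3, 3)]
```

Under these assumptions a 12-page document split with the default of 5 pages yields three files, the last one containing two pages.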

🎯 Progress Bars

The tool shows real-time progress as files are created:

Creating PDF files [████████████████████] 100% (8/8 files) 00:00:15

  • File-based progress: Tracks each output file completion
  • ETA display: Shows estimated time remaining
  • Percentage complete: Visual progress indicator
  • Disable option: Use --no-progress for scripting
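To make the file-based progress line concrete, here is a small standard-library sketch that renders a similar bar. It is not the tool's own rendering code: `render_progress` is a hypothetical function, and the timing column from the sample output is left out.

```python
def render_progress(done: int, total: int, width: int = 20) -> str:
    """Render a file-count progress line similar to the sample above."""
    if total <= 0:
        raise ValueError("total must be positive")
    done = max(0, min(done, total))   # clamp to the valid range
    filled = width * done // total    # number of filled bar cells
    percent = 100 * done // total
    bar = "█" * filled + " " * (width - filled)
    return f"Creating PDF files [{bar}] {percent}% ({done}/{total} files)"


for finished in range(9):
    # "\r" redraws the same terminal line instead of printing a new one.
    print("\r" + render_progress(finished, 8), end="", flush=True)
print()
```

Counting completed files rather than pages keeps the bar meaningful when chunks differ in size.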

🛠️ Advanced Features

Large File Support

  • Memory-efficient processing for multi-GB files
  • Automatic garbage collection after each chunk
  • Error recovery continues processing if individual pages fail
  • File size warnings for files >100MB
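The size warning and per-chunk garbage collection could be sketched roughly as below. This is an assumption-laden illustration: the threshold treats 1 MB as 1024 × 1024 bytes, and `needs_size_warning`, `write_chunks`, and the `write_chunk` callback are hypothetical names standing in for the tool's internals.

```python
import gc
from typing import Callable, Iterable, Tuple

# Assumption: "100MB" is read as 100 * 1024 * 1024 bytes.
SIZE_WARNING_BYTES = 100 * 1024 * 1024


def needs_size_warning(size_bytes: int, limit: int = SIZE_WARNING_BYTES) -> bool:
    """True when a file is strictly larger than the warning threshold."""
    return size_bytes > limit


def write_chunks(ranges: Iterable[Tuple[int, int]],
                 write_chunk: Callable[[int, int], None]) -> int:
    """Write one chunk at a time and collect garbage after each one."""
    written = 0
    for first, last in ranges:
        write_chunk(first, last)  # hypothetical callback that writes one output file
        written += 1
        gc.collect()              # release per-chunk objects before the next chunk
    return written
```

In practice the size would come from `os.path.getsize(path)`, checked before any pages are read.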

Fallback Methods

If the primary pypdf-based method fails, the tool automatically tries:

  1. pdftk (if installed)
  2. qpdf (if installed)
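The README does not show how the fallbacks are invoked, so the following is only a sketch of one plausible approach: detect which tools are installed with `shutil.which`, then build a page-range extraction command for each. `available_fallbacks` and `fallback_command` are hypothetical names; the `pdftk ... cat ... output` and `qpdf --empty --pages ... --` forms are those tools' standard page-selection syntax.

```python
import shutil
from typing import List


def available_fallbacks() -> List[str]:
    """Return the fallback tools found on PATH, in the order listed above."""
    return [tool for tool in ("pdftk", "qpdf") if shutil.which(tool)]


def fallback_command(tool: str, source: str, first: int, last: int,
                     dest: str) -> List[str]:
    """Build an argv list that extracts pages first..last into dest."""
    page_range = f"{first}-{last}"
    if tool == "pdftk":
        return ["pdftk", source, "cat", page_range, "output", dest]
    if tool == "qpdf":
        # --empty starts from a blank document, so document-level
        # metadata from the source is not carried over.
        return ["qpdf", "--empty", "--pages", source, page_range, "--", dest]
    raise ValueError(f"unsupported fallback tool: {tool}")
```

Each command would then be passed to `subprocess.run`, moving on to the next tool when one exits with a non-zero status.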

Error Handling

  • Graceful degradation for corrupted PDFs
  • Detailed error messages with suggested solutions
  • Partial processing continues even if some pages fail
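Partial processing can be pictured as a loop that records failures instead of aborting. The sketch below assumes a hypothetical `copy_page` callback that raises an exception when a page cannot be read; it is not taken from the tool's source.

```python
from typing import Callable, Iterable, List, Tuple


def copy_pages(page_numbers: Iterable[int],
               copy_page: Callable[[int], None]
               ) -> Tuple[List[int], List[Tuple[int, str]]]:
    """Copy each page, collecting failures rather than stopping at the first."""
    copied: List[int] = []
    failed: List[Tuple[int, str]] = []
    for number in page_numbers:
        try:
            copy_page(number)
        except Exception as exc:  # keep going; report the page and the reason
            failed.append((number, str(exc)))
        else:
            copied.append(number)
    return copied, failed
```

Returning the failed pages with their error text lets the caller print a detailed message once processing has finished.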

📁 Output File Naming

Files are automatically named using the original filename:

Input               Output
document.pdf        document_1.pdf, document_2.pdf, ...
report.pdf          report_1.pdf, report_2.pdf, ...
/path/to/file.pdf   file_1.pdf, file_2.pdf, ...
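The naming rule in the table can be reproduced with `pathlib`. This is a minimal sketch assuming the output name is the input's stem plus an underscore and a 1-based index; `chunk_path` is a hypothetical helper.

```python
from pathlib import Path


def chunk_path(input_pdf: str, index: int,
               output_folder: str = "output_chunks") -> Path:
    """Build the output path for chunk number `index` (1-based)."""
    stem = Path(input_pdf).stem  # drops the directory and the .pdf suffix
    return Path(output_folder) / f"{stem}_{index}.pdf"


print(chunk_path("/path/to/file.pdf", 2).name)  # file_2.pdf
```

Because only the stem is used, the input's directory does not affect the output name, which matches the last row of the table.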

Building Executables

For Developers: Create Your Own Executables

You can build standalone executables for distribution:

Windows

# Clone and setup
git clone https://github.com/jmxt3/pdf-splitter.git
cd pdf-splitter
uv sync

# Build Windows executable
powershell -ExecutionPolicy Bypass -File build_executable.ps1

macOS/Linux

# Clone and setup
git clone https://github.com/jmxt3/pdf-splitter.git
cd pdf-splitter
uv sync

# Build executable
./build_executable.sh

Cross-Platform Python Script

# Build for current platform
python build_executables.py

Output: Executables are created in the release/ folder, along with README files for distribution.

🔧 Installation from Source

For development or latest features:

git clone https://github.com/jmxt3/pdf-splitter.git
cd pdf-splitter
uv sync  # or pip install -e .

📄 License

MIT License - see LICENSE file for details.

🤝 Contributing

Contributions welcome! Please feel free to submit a Pull Request.

🐛 Issues

Found a bug or have a feature request? Please open an issue on GitHub.

📊 Dependencies

  • click: Modern CLI framework
  • pypdf: PDF processing library

🏷️ Version History

  • 0.1.2: Updated PyPI page with complete README content including standalone executables
  • 0.1.1: Added standalone executables for Windows, macOS, and Linux
  • 0.1.0: Initial release with progress bars and robust error handling

📦 Distribution Options

For End Users

  • Standalone Executables: Download from Releases
    • No Python installation required
    • Single file download
    • Works immediately after download

For Developers

  • PyPI Package: pip install pdf-splitter-cli
    • Integrates with Python environments
    • Easy to include in scripts
    • Automatic dependency management
