PDF Splitter CLI

A modern command-line tool to split PDF files into smaller chunks with real-time progress bars and automatic filename generation.

✨ Features

  • 📄 Split PDF files by specified number of pages per chunk
  • 🎯 Real-time progress bars showing file creation progress
  • 📁 Smart filename generation based on original filename
  • 🔢 Sequential numbering (e.g., document_1.pdf, document_2.pdf)
  • 📂 Configurable output folders
  • 🖥️ Modern CLI with rich help and validation
  • 📃 Individual page splitting support
  • 🎨 Colorized output for better user experience
  • 🛠️ Robust error handling with fallback methods (pdftk, qpdf)
  • Memory-efficient processing for large files
  • 🔧 Cross-platform (Windows, macOS, Linux)

🚀 Installation

pip install pdf-splitter-cli

Requirements: Python 3.8+

📖 Quick Start

# Basic usage - split every 5 pages (default)
pdf-splitter document.pdf

# Custom chunk size - split every 10 pages
pdf-splitter document.pdf -p 10

# Custom output folder
pdf-splitter document.pdf -o my_chunks

# Split into individual pages
pdf-splitter document.pdf -p 1

# Disable progress bars (useful for scripts)
pdf-splitter document.pdf --no-progress

📋 Usage

Command Structure

pdf-splitter <input_pdf> [OPTIONS]

Options

  • -p, --pages-per-chunk INTEGER: Pages per output file (default: 5)
  • -o, --output-folder TEXT: Output folder (default: "output_chunks")
  • --no-progress: Disable progress bars
  • --help: Show help message
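
As a rough illustration of how these options and defaults fit together, here is a minimal sketch of an equivalent parser. It uses the standard-library argparse instead of click (which the tool itself depends on), so `build_parser` is a hypothetical stand-in and not the project's actual code; only the flag names and defaults come from the list above.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the real click-based CLI. Only the flags
    # and defaults are taken from the Options list; the rest is assumed.
    parser = argparse.ArgumentParser(prog="pdf-splitter")
    parser.add_argument("input_pdf", help="PDF file to split")
    parser.add_argument("-p", "--pages-per-chunk", type=int, default=5,
                        help="Pages per output file")
    parser.add_argument("-o", "--output-folder", default="output_chunks",
                        help="Output folder")
    parser.add_argument("--no-progress", action="store_true",
                        help="Disable progress bars")
    return parser


args = build_parser().parse_args(["document.pdf", "-p", "10"])
print(args.pages_per_chunk, args.output_folder, args.no_progress)
# prints: 10 output_chunks False
```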

Examples

Basic Splitting

pdf-splitter document.pdf

Output: document_1.pdf, document_2.pdf, etc. in output_chunks/

Custom Page Count

pdf-splitter document.pdf -p 10
pdf-splitter document.pdf --pages-per-chunk 10

Custom Output Folder

pdf-splitter document.pdf -p 3 -o my_output

Individual Pages

pdf-splitter report.pdf -p 1

Output: report_1.pdf, report_2.pdf, etc. (one page each)
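
The chunking shown in these examples can be sketched in Python as follows. This is an assumption about how such a splitter might work, not the tool's actual implementation: `chunk_ranges` and `split_pdf` are hypothetical names, and the pypdf calls (`PdfReader`, `PdfWriter.add_page`, `PdfWriter.write`) are used only because pypdf is listed as a dependency.

```python
from pathlib import Path


def chunk_ranges(total_pages, pages_per_chunk):
    """Return (start, end) page-index pairs, end exclusive, one per output file."""
    if pages_per_chunk < 1:
        raise ValueError("pages_per_chunk must be at least 1")
    return [(start, min(start + pages_per_chunk, total_pages))
            for start in range(0, total_pages, pages_per_chunk)]


def split_pdf(input_pdf, pages_per_chunk=5, output_folder="output_chunks"):
    # pypdf is imported lazily so chunk_ranges() works without it installed.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(input_pdf)
    out_dir = Path(output_folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(input_pdf).stem
    written = []
    ranges = chunk_ranges(len(reader.pages), pages_per_chunk)
    for number, (start, end) in enumerate(ranges, start=1):
        writer = PdfWriter()
        for index in range(start, end):
            writer.add_page(reader.pages[index])
        target = out_dir / f"{stem}_{number}.pdf"
        with open(target, "wb") as handle:
            writer.write(handle)
        written.append(target)
    return written


# A 12-page document split every 5 pages yields three files; the last is short.
print(chunk_ranges(12, 5))  # [(0, 5), (5, 10), (10, 12)]
```

With `-p 1` the same logic produces one range per page, which matches the individual-page example.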

🎯 Progress Bars

The tool shows real-time progress as files are created:

Creating PDF files [████████████████████] 100% (8/8 files) 00:00:15

  • File-based progress: Tracks each output file completion
  • ETA display: Shows estimated time remaining
  • Percentage complete: Visual progress indicator
  • Disable option: Use --no-progress for scripting
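
To make the file-based progress display concrete, the sketch below renders a similar line by hand. It is a simplified stand-in, not the tool's code: `render_progress` is a hypothetical helper, the timing column is omitted, and the real tool may well rely on its CLI framework for this.

```python
def render_progress(done, total, width=20):
    """Format one progress line for `done` of `total` output files."""
    # Guard against total == 0 so an empty job still renders as complete.
    filled = width * done // total if total else width
    percent = 100 * done // total if total else 100
    bar = "█" * filled + " " * (width - filled)
    return f"Creating PDF files [{bar}] {percent}% ({done}/{total} files)"


for done in (2, 8):
    print(render_progress(done, 8))
```

A script that passes --no-progress would simply skip calls like this and print nothing per file.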

🛠️ Advanced Features

Large File Support

  • Memory-efficient processing for multi-GB files
  • Automatic garbage collection after each chunk
  • Error recovery continues processing if individual pages fail
  • File size warnings for files larger than 100 MB
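
The size warning could be implemented with a check along these lines. This is only a sketch: `needs_size_warning` is a hypothetical name, and treating 100 MB as 100 × 1024 × 1024 bytes is an assumption, since the exact threshold the tool uses is not stated.

```python
import os

# Assumed threshold: 100 MB interpreted as 100 * 1024 * 1024 bytes.
SIZE_WARNING_BYTES = 100 * 1024 * 1024


def needs_size_warning(size_bytes, threshold=SIZE_WARNING_BYTES):
    """Return True when a file is larger than the warning threshold."""
    return size_bytes > threshold


def warn_if_large(path):
    # Reads the size from disk; prints a notice instead of failing.
    size = os.path.getsize(path)
    if needs_size_warning(size):
        print(f"Warning: {path} is {size / (1024 * 1024):.0f} MB; this may take a while.")


print(needs_size_warning(150 * 1024 * 1024))  # True
```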

Fallback Methods

If the primary pypdf-based method fails, the tool automatically tries, in order:

  1. pdftk (if installed)
  2. qpdf (if installed)
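
One way such a fallback chain might be wired up is sketched below. The function names are hypothetical and the tool's real invocation may differ; the command lines use the standard `pdftk ... cat ... output ...` and `qpdf --empty --pages ... -- ...` forms for extracting a 1-based, inclusive page range.

```python
import shutil


def fallback_commands(input_pdf, first_page, last_page, output_pdf):
    """Build extraction commands for a 1-based, inclusive page range."""
    page_range = f"{first_page}-{last_page}"
    return {
        "pdftk": ["pdftk", input_pdf, "cat", page_range, "output", output_pdf],
        "qpdf": ["qpdf", "--empty", "--pages", input_pdf, page_range,
                 "--", output_pdf],
    }


def pick_fallback(commands, which=shutil.which):
    """Return the command for the first installed tool, trying pdftk then qpdf."""
    for tool in ("pdftk", "qpdf"):
        if which(tool):
            return commands[tool]
    return None  # Neither tool is on PATH.


commands = fallback_commands("document.pdf", 1, 5, "document_1.pdf")
print(pick_fallback(commands))
```

The selected list can be passed to `subprocess.run(command, check=True)`; the `which` parameter exists only so the lookup can be replaced in tests.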

Error Handling

  • Graceful degradation for corrupted PDFs
  • Detailed error messages with suggested solutions
  • Partial processing continues even if some pages fail
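
The "keep going when a page fails" behaviour can be illustrated with a small, library-independent loop. `process_pages` and `handle_page` are hypothetical names used for this sketch; the actual tool's error handling is not shown in this document.

```python
def process_pages(pages, handle_page):
    """Apply handle_page to every page, collecting failures instead of aborting."""
    succeeded, failed = [], []
    for number, page in enumerate(pages, start=1):
        try:
            succeeded.append(handle_page(page))
        except Exception as error:
            # Record the 1-based page number and message, then continue.
            failed.append((number, str(error)))
    return succeeded, failed


def demo_handler(page):
    if page == "corrupt":
        raise ValueError("unreadable page")
    return page.upper()


done, errors = process_pages(["a", "corrupt", "c"], demo_handler)
print(done, errors)  # ['A', 'C'] [(2, 'unreadable page')]
```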

📁 Output File Naming

Files are automatically named using the original filename:

Input               Output
document.pdf        document_1.pdf, document_2.pdf, ...
report.pdf          report_1.pdf, report_2.pdf, ...
/path/to/file.pdf   file_1.pdf, file_2.pdf, ...
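
The naming rule above amounts to taking the input file's stem and appending a sequential number. A minimal sketch with pathlib, where `chunk_filename` is a hypothetical helper and not part of the tool's documented interface:

```python
from pathlib import Path


def chunk_filename(input_pdf, number, output_folder="output_chunks"):
    """Build the output path for the given 1-based chunk number."""
    # Path.stem drops both the directory part and the .pdf extension.
    stem = Path(input_pdf).stem
    return Path(output_folder) / f"{stem}_{number}.pdf"


print(chunk_filename("/path/to/file.pdf", 2).name)  # file_2.pdf
```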

🔧 Installation from Source

For development or latest features:

git clone https://github.com/jmxt3/pdf-splitter.git
cd pdf-splitter
pip install -e .

📄 License

MIT License - see LICENSE file for details.

🤝 Contributing

Contributions are welcome! Feel free to submit a pull request.

🐛 Issues

Found a bug or have a feature request? Please open an issue on GitHub.

📊 Dependencies

  • click: Modern CLI framework
  • pypdf: PDF processing library

🏷️ Version History

  • 0.1.0: Initial release with progress bars and robust error handling
