PageFuse

Fuse any document, any format. Combine pages from PDFs, Word docs, PowerPoint slides, images, and more into a single output.

Supported Input Formats

| Format | Extensions | Requires LibreOffice |
| --- | --- | --- |
| PDF | .pdf | No |
| Images | .png, .jpg, .jpeg, .tiff | No |
| Word | .docx, .doc | Yes |
| PowerPoint | .pptx, .ppt | Yes |
| OpenDocument | .odt, .odp, .ods | Yes |
| Other | .rtf, .html, .csv, .xlsx, .xls | Yes |

Supported Output Formats

Output format is determined by the file extension in your config or command:

| Format | Extension | Requires LibreOffice | Notes |
| --- | --- | --- | --- |
| PDF | .pdf | No | Default; fast, lossless |
| Image | .png, .jpg, .tiff | No | Single page → file; multi-page → ZIP |
| Word | .docx, .odt | Yes | |
| PowerPoint | .pptx, .odp | Yes | |
| Web | .html | Yes | |
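To check ahead of time whether a given input or output file will need LibreOffice, the two tables above can be folded into a small lookup. This is only a sketch: `needs_libreoffice` is a hypothetical helper, not part of PageFuse, and the extension sets are copied from the tables.

```python
from pathlib import Path

# Extensions copied from the input and output tables above.
NO_LIBREOFFICE = {".pdf", ".png", ".jpg", ".jpeg", ".tiff"}
NEEDS_LIBREOFFICE = {
    ".docx", ".doc", ".pptx", ".ppt", ".odt", ".odp", ".ods",
    ".rtf", ".html", ".csv", ".xlsx", ".xls",
}


def needs_libreoffice(filename: str) -> bool:
    """Return True if the tables list this extension as requiring LibreOffice."""
    ext = Path(filename).suffix.lower()
    if ext in NO_LIBREOFFICE:
        return False
    if ext in NEEDS_LIBREOFFICE:
        return True
    raise ValueError(f"extension not listed in the tables: {ext!r}")


print(needs_libreoffice("board_pack.pdf"))  # False
print(needs_libreoffice("terms.docx"))      # True
```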

Installation

# Linux (recommended — avoids system Python restrictions)
pipx install pagefuse

# macOS
pip install pagefuse

# Windows
pip install pagefuse

# Or inside a virtual environment (any platform)
python3 -m venv venv && source venv/bin/activate    # Linux / macOS
# Windows (cmd): python -m venv venv && venv\Scripts\activate
pip install pagefuse

Linux note: If you see error: externally-managed-environment, use pipx instead of pip. Install pipx with: sudo apt install pipx && pipx ensurepath

Uninstall

pipx uninstall pagefuse   # if installed via pipx
pip uninstall pagefuse    # if installed via pip

LibreOffice is required only for the formats marked "Yes" in the tables above (Office, OpenDocument, RTF, HTML, and spreadsheet formats). PDF and image formats work on all platforms without it. To install LibreOffice:

# Ubuntu / Debian
sudo apt install libreoffice

# macOS
brew install --cask libreoffice

# Windows
# Download from https://www.libreoffice.org/download and add soffice.exe to PATH
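A quick way to confirm that LibreOffice is reachable from the command line is to look for its executable on PATH. The sketch below uses only the Python standard library. How PageFuse itself locates LibreOffice is not documented here, so treat the executable names `soffice` and `libreoffice` as assumptions; `find_soffice` is a hypothetical helper.

```python
import shutil


def find_soffice():
    """Return the path of a LibreOffice executable found on PATH, or None."""
    # Assumption: the binary is called "soffice" or "libreoffice".
    for name in ("soffice", "libreoffice"):
        path = shutil.which(name)
        if path:
            return path
    return None


path = find_soffice()
if path:
    print(f"LibreOffice found at {path}")
else:
    print("LibreOffice not found on PATH; PDF and image formats still work")
```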

Usage

Quick assembly (no config file)

# Assemble into a PDF
pagefuse quick output.pdf cover.pdf:1 terms.docx:all pricing.pdf:1-3 slides.pptx:2,4,6

# Assemble and export as Word document
pagefuse quick output.docx cover.pdf:1 terms.docx:all

# Assemble and export as images (multi-page → output.zip)
pagefuse quick output.png report.pdf:1-3

Each source is file:pages. Omit :pages to include all pages.
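The file:pages convention can be illustrated with a small splitter. This is a sketch of the convention as described, not PageFuse's own parser: `split_source` is a hypothetical name, and the rule of treating the text after the last colon as a page spec only when it looks like one is an assumption (it keeps Windows drive letters such as C: intact).

```python
import re

# Shapes listed under "Page Specification Syntax": all, 5, 1-3, 1,3,5, 1-3,5,7-9
_SPEC = re.compile(r"^(all|\d+(-\d+)?(,\d+(-\d+)?)*)$")


def split_source(arg: str) -> tuple[str, str]:
    """Split 'file:pages' into (file, pages); a missing spec means 'all'."""
    head, sep, tail = arg.rpartition(":")
    if sep and head and _SPEC.match(tail):
        return head, tail
    return arg, "all"


print(split_source("cover.pdf:1"))        # ('cover.pdf', '1')
print(split_source("slides.pptx:2,4,6"))  # ('slides.pptx', '2,4,6')
print(split_source("terms.docx"))         # ('terms.docx', 'all')
```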

Config file assembly

pagefuse assemble board_pack.fuse

Example board_pack.fuse:

# Output format is determined by the extension (.pdf, .docx, .html, .png, …)
# Add multiple output: lines to export to several formats in one run.
output: board_pack.pdf
output: board_pack.docx
output: board_pack_preview.png

# Metadata (title defaults to output filename if omitted)
title:   Q4 Board Pack
author:  Finance Team
subject: Board meeting materials

file: templates/cover_letter.pdf       1
file: reports/financial_data.docx      all
file: slides/main_deck.pptx            1-4
file: reports/charts.pdf               3,5,7
file: templates/signature_page.pdf     1
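To make the layout of the config concrete, here is a minimal reader for it. The rules are inferred from the example alone and are assumptions: lines starting with # are comments, every other line is key: value, and on file: lines the last whitespace-separated token is the page spec (defaulting to all when there is only one token). `parse_fuse` is a hypothetical helper, not PageFuse's parser.

```python
def parse_fuse(text: str) -> dict:
    """Parse an assemble config like the example above into a plain dict."""
    config = {"output": [], "file": [], "meta": {}}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {raw!r}")
        key, value = key.strip(), value.strip()
        if key == "output":
            config["output"].append(value)
        elif key == "file":
            # Assumption: the last whitespace-separated token is the page spec.
            parts = value.rsplit(None, 1)
            path, spec = parts if len(parts) == 2 else (value, "all")
            config["file"].append((path, spec))
        else:
            config["meta"][key] = value  # title, author, subject, ...
    return config


example = """\
# Two outputs in one run
output: board_pack.pdf
output: board_pack.docx
title:   Q4 Board Pack
file: templates/cover_letter.pdf       1
file: slides/main_deck.pptx            1-4
"""
cfg = parse_fuse(example)
print(cfg["output"])  # ['board_pack.pdf', 'board_pack.docx']
print(cfg["file"])    # [('templates/cover_letter.pdf', '1'), ('slides/main_deck.pptx', '1-4')]
print(cfg["meta"])    # {'title': 'Q4 Board Pack'}
```

Under these assumptions a path containing spaces with no page spec would be split incorrectly, so check real configs against PageFuse itself.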

Generate a template config:

pagefuse init                        # assemble config (default) → config.fuse
pagefuse init --split                # split config → config.fuse
pagefuse init --output my.fuse       # custom filename
pagefuse init --split --output split.fuse

Split a document into parts

Inline (no config file):

pagefuse split report.pdf cover.pdf:1 body.pdf:2-10 appendix.pdf:11-20

# Each output can be a different format
pagefuse split report.pdf summary.pdf:1 full.docx:all preview.png:1

Or use a .fuse config file:

pagefuse split split.fuse

Example split.fuse:

source: annual_report.pdf

output: cover.pdf              1
output: executive_summary.pdf  2-5
output: financials.pdf         6-20
output: appendix.docx          21-30
output: cover_preview.png      1

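In the inline form, each output is file:pages; omit :pages to copy all pages. In a .fuse config, the page spec follows the output filename, separated by whitespace.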
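When the split plan is produced by a script, the inline command can be built as an argument list. The sketch below only assembles arguments in the inline form documented above (`pagefuse split source output:pages ...`); `split_command` is a hypothetical helper, and nothing here is part of PageFuse.

```python
import shlex


def split_command(source: str, parts: dict[str, str]) -> list[str]:
    """Build the inline `pagefuse split` argument list from {output: pages}."""
    argv = ["pagefuse", "split", source]
    for output, pages in parts.items():
        argv.append(f"{output}:{pages}")
    return argv


argv = split_command("annual_report.pdf", {
    "cover.pdf": "1",
    "executive_summary.pdf": "2-5",
    "financials.pdf": "6-20",
})
print(shlex.join(argv))
# pagefuse split annual_report.pdf cover.pdf:1 executive_summary.pdf:2-5 financials.pdf:6-20
```

The resulting list can be passed to `subprocess.run(argv, check=True)` without shell quoting concerns.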

Inspect a document

pagefuse info report.pdf
pagefuse info slides.pptx
pagefuse info photo.png

Page Specification Syntax

| Spec | Meaning |
| --- | --- |
| all | Every page |
| 5 | Page 5 only |
| 1-3 | Pages 1 through 3 |
| 1,3,5 | Pages 1, 3, and 5 |
| 1-3,5,7-9 | Mixed ranges and singles |

Page numbers are 1-based.
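The syntax above can be expanded into explicit page numbers with a few lines of Python. This is an illustrative sketch, not PageFuse's implementation: `parse_pages` is a hypothetical name, and rejecting out-of-range or reversed ranges is an assumption, since the behavior for those cases is not stated here.

```python
def parse_pages(spec: str, page_count: int) -> list[int]:
    """Expand a page spec into 1-based page numbers, in the order written."""
    spec = spec.strip()
    if spec == "all":
        return list(range(1, page_count + 1))
    pages = []
    for part in spec.split(","):
        start_text, sep, end_text = part.strip().partition("-")
        start = int(start_text)
        end = int(end_text) if sep else start  # a single number is a 1-page range
        # Assumption: out-of-range or reversed ranges are treated as errors.
        if not 1 <= start <= end <= page_count:
            raise ValueError(f"invalid range {part!r} for a {page_count}-page document")
        pages.extend(range(start, end + 1))
    return pages


print(parse_pages("1-3,5,7-9", 10))  # [1, 2, 3, 5, 7, 8, 9]
print(parse_pages("all", 3))         # [1, 2, 3]
```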

Development

git clone https://github.com/raptorgold14/pagefuse.git
cd pagefuse
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt
pip install -e .

Run tests:

pytest

See examples/ for sample .fuse configs and examples/generate_pdfs.py to regenerate fixture files.

Roadmap

| Phase | Scope | Status |
| --- | --- | --- |
| 1: CLI MVP | PDF, DOCX, PPTX, images, OpenDocument, multi-format output | ✅ Done |
| 2: Validate | Launch, gather feedback, iterate | 🔜 Next |
| 3: Native formats | Pure Python DOCX/PPTX extraction (no LibreOffice dependency) | Planned |
| 4: GUI | Tauri-based GUI wrapping the CLI | Planned |

License

MIT
