Fuse any document, any format — PDF, DOCX, PPTX, images and more
PageFuse
Fuse any document, any format. Combine pages from PDFs, Word docs, PowerPoint slides, images, and more into a single output.
Supported Input Formats
| Format | Extensions | Requires LibreOffice |
|---|---|---|
| PDF | .pdf | No |
| Images | .png, .jpg, .jpeg, .tiff | No |
| Word | .docx, .doc | Yes |
| PowerPoint | .pptx, .ppt | Yes |
| OpenDocument | .odt, .odp, .ods | Yes |
| Other | .rtf, .html, .csv, .xlsx, .xls | Yes |
Supported Output Formats
Output format is determined by the file extension in your config or command:
| Format | Extension | Requires LibreOffice | Notes |
|---|---|---|---|
| PDF | .pdf | No | Default; fast, lossless |
| Image | .png, .jpg, .tiff | No | Single page → file; multi-page → ZIP |
| Word | .docx, .odt | Yes | |
| PowerPoint | .pptx, .odp | Yes | |
| Web | .html | Yes | |
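The extension-to-format mapping in the table above can be pictured as a small lookup. This is only an illustrative sketch built from the table, not PageFuse's actual code; `OUTPUT_FORMATS` and `needs_libreoffice` are hypothetical names.

```python
from pathlib import Path

# Hypothetical lookup copied from the output table:
# extension -> (format name, requires LibreOffice)
OUTPUT_FORMATS = {
    ".pdf": ("PDF", False),
    ".png": ("Image", False),
    ".jpg": ("Image", False),
    ".tiff": ("Image", False),
    ".docx": ("Word", True),
    ".odt": ("Word", True),
    ".pptx": ("PowerPoint", True),
    ".odp": ("PowerPoint", True),
    ".html": ("Web", True),
}


def needs_libreoffice(output_path: str) -> bool:
    """Return True if writing this output would require LibreOffice."""
    ext = Path(output_path).suffix.lower()
    if ext not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output extension: {ext!r}")
    return OUTPUT_FORMATS[ext][1]
```

For example, `needs_libreoffice("board_pack.pdf")` is `False`, while `needs_libreoffice("board_pack.docx")` is `True`.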
Installation
# Linux (recommended — avoids system Python restrictions)
pipx install pagefuse
# macOS
pip install pagefuse
# Windows
pip install pagefuse
# Or inside a virtual environment (any platform)
python3 -m venv venv && source venv/bin/activate   # on Windows, activate with: venv\Scripts\activate
pip install pagefuse
Linux note: If you see error: externally-managed-environment, use pipx instead of pip. Install pipx with: sudo apt install pipx && pipx ensurepath
Uninstall
pipx uninstall pagefuse # if installed via pipx
pip uninstall pagefuse # if installed via pip
LibreOffice is required only for Office/OpenDocument/HTML formats. PDF and image formats work on all platforms without it.
# Ubuntu / Debian
sudo apt install libreoffice
# macOS
brew install --cask libreoffice
# Windows
# Download from https://www.libreoffice.org/download and add soffice.exe to PATH
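To check up front whether LibreOffice is reachable, a script can look for its launcher on PATH. The sketch below uses only the standard library; how PageFuse itself locates LibreOffice is not documented here, and checking the `libreoffice` alias as well as `soffice` is an assumption of this example. `find_libreoffice` is a hypothetical helper name.

```python
import shutil


def find_libreoffice():
    """Return the path of a LibreOffice launcher found on PATH, or None."""
    # "soffice" is the executable named in the install notes; "libreoffice"
    # is a common alias on Linux. Trying both is an assumption of this sketch.
    for name in ("soffice", "libreoffice"):
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    found = find_libreoffice()
    print(found or "LibreOffice not found; only PDF and image formats will work")
```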
Usage
Quick assembly (no config file)
# Assemble into a PDF
pagefuse quick output.pdf cover.pdf:1 terms.docx:all pricing.pdf:1-3 slides.pptx:2,4,6
# Assemble and export as Word document
pagefuse quick output.docx cover.pdf:1 terms.docx:all
# Assemble and export as images (multi-page → output.zip)
pagefuse quick output.png report.pdf:1-3
Each source is file:pages. Omit :pages to include all pages.
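As an illustration of the file:pages convention, here is a minimal sketch of how such an argument could be split. It is not PageFuse's own parser; `split_source` is a hypothetical name, and the rule for telling a page spec apart from a colon inside the path is an assumption.

```python
def split_source(arg: str) -> tuple[str, str]:
    """Split a 'file:pages' argument into (file, pages); pages defaults to 'all'."""
    path, sep, pages = arg.rpartition(":")
    # Treat the suffix as a page spec only if it looks like one; otherwise the
    # colon is assumed to belong to the path (for example a Windows drive letter).
    looks_like_spec = pages == "all" or (
        pages != "" and all(c in "0123456789,-" for c in pages)
    )
    if sep and path and looks_like_spec:
        return path, pages
    return arg, "all"
```

With this sketch, `split_source("pricing.pdf:1-3")` gives `("pricing.pdf", "1-3")` and `split_source("terms.docx")` gives `("terms.docx", "all")`.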
Config file assembly
pagefuse assemble board_pack.fuse
Example board_pack.fuse:
# Output format is determined by the extension (.pdf, .docx, .html, .png, …)
# Add multiple output: lines to export to several formats in one run.
output: board_pack.pdf
output: board_pack.docx
output: board_pack_preview.png
# Metadata (title defaults to output filename if omitted)
title: Q4 Board Pack
author: Finance Team
subject: Board meeting materials
file: templates/cover_letter.pdf 1
file: reports/financial_data.docx all
file: slides/main_deck.pptx 1-4
file: reports/charts.pdf 3,5,7
file: templates/signature_page.pdf 1
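The config format above is line-oriented: comments start with #, and every other line is key: value. The following sketch reads such a file into a dictionary. It is inferred from the example only and is not PageFuse's parser; `parse_fuse` is a hypothetical name, and it assumes the page spec is the last whitespace-separated token on a file: line.

```python
def parse_fuse(text: str) -> dict:
    """Parse assemble-style .fuse text into outputs, metadata, and sources."""
    config = {"outputs": [], "metadata": {}, "sources": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {raw!r}")
        key, value = key.strip(), value.strip()
        if key == "output":
            config["outputs"].append(value)
        elif key == "file":
            # Assumption: the last whitespace-separated token is the page spec.
            path, _, pages = value.rpartition(" ")
            source = (path.strip(), pages) if path else (value, "all")
            config["sources"].append(source)
        else:
            # title, author, subject, and any other key are kept as metadata
            config["metadata"][key] = value
    return config
```

Because the sketch splits on the last space, a path that contains spaces and has no page spec would be misread; a real parser would need a rule for that case.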
Generate a template config:
pagefuse init # assemble config (default) → config.fuse
pagefuse init --split # split config → config.fuse
pagefuse init --output my.fuse # custom filename
pagefuse init --split --output split.fuse
Split a document into parts
Inline (no config file):
pagefuse split report.pdf cover.pdf:1 body.pdf:2-10 appendix.pdf:11-20
# Each output can be a different format
pagefuse split report.pdf summary.pdf:1 full.docx:all preview.png:1
Or use a .fuse config file:
pagefuse split split.fuse
Example split.fuse:
source: annual_report.pdf
output: cover.pdf 1
output: executive_summary.pdf 2-5
output: financials.pdf 6-20
output: appendix.docx 21-30
output: cover_preview.png 1
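In the inline form, each output is file:pages; in a .fuse config, the file and its page spec are separated by a space. For inline outputs, omit :pages to copy all pages.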
Inspect a document
pagefuse info report.pdf
pagefuse info slides.pptx
pagefuse info photo.png
Page Specification Syntax
| Spec | Meaning |
|---|---|
| all | Every page |
| 5 | Page 5 only |
| 1-3 | Pages 1 through 3 |
| 1,3,5 | Pages 1, 3, and 5 |
| 1-3,5,7-9 | Mixed ranges and singles |
Page numbers are 1-based.
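To make the syntax concrete, here is a minimal sketch that expands a page spec into a list of 1-based page numbers. It is not PageFuse's implementation; `parse_pages` is a hypothetical name, and rejecting reversed or out-of-range ranges is an assumption, since the behaviour for those cases is not described here.

```python
def parse_pages(spec: str, page_count: int) -> list[int]:
    """Expand a page spec such as '1-3,5,7-9' into 1-based page numbers."""
    spec = spec.strip()
    if spec == "all":
        return list(range(1, page_count + 1))
    pages = []
    for part in spec.split(","):
        start, sep, end = part.strip().partition("-")
        first = int(start)  # raises ValueError on non-numeric input
        last = int(end) if sep else first
        # Assumption: reject reversed ranges and pages outside the document.
        if not 1 <= first <= last <= page_count:
            raise ValueError(
                f"invalid page range {part!r} for a {page_count}-page document"
            )
        pages.extend(range(first, last + 1))
    return pages
```

For a 10-page document, `parse_pages("1-3,5,7-9", 10)` yields `[1, 2, 3, 5, 7, 8, 9]`.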
Development
git clone https://github.com/raptorgold14/pagefuse.git
cd pagefuse
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt
pip install -e .
Run tests:
pytest
See examples/ for sample .fuse configs and examples/generate_pdfs.py to regenerate fixture files.
Roadmap
| Phase | Scope | Status |
|---|---|---|
| 1 — CLI MVP | PDF, DOCX, PPTX, images, OpenDocument, multi-format output | ✅ Done |
| 2 — Validate | Launch, gather feedback, iterate | 🔜 Next |
| 3 — Native formats | Pure Python DOCX/PPTX extraction (no LibreOffice dep) | Planned |
| 4 — GUI | Tauri-based GUI wrapping the CLI | Planned |
License
MIT