Skip to main content

A Swiss-army knife for everyday PDF workflows – pure-Python, cross-platform, and fully typed.

Project description

pdf-tools

A Swiss-army knife for everyday PDF workflows – pure-Python, cross-platform, and fully typed.


Features

Capability Sub-command / API Notes
Convert images (.png, .jpg, …) or Word (.docx) to PDF pdf-tools convert …
pdf_tools.convert.service.convert_file_to_pdf()
Uses Pillow + img2pdf for images and LibreOffice / unoserver for Word files.
Merge multiple PDFs (or whole folders) pdf-tools merge …
pdf_tools.merge.service.merge_pdfs()
Preserves bookmarks; skips non-PDF inputs with a warning.
Process = Convert Merge in one go pdf-tools process convert-and-merge … Handy for ad-hoc batches of mixed file types.
Watermark (text stamp) pdf-tools watermark add-text …
pdf_tools.watermark.service.add_text_watermark()
PyMuPDF in-place editing; configurable font, colour, opacity, rotation, position.
Async-friendly CLI Built on Typer + custom AsyncTyper Callbacks can be async def – future-proof for parallel work.
Pydantic v2 models File, Files, WatermarkOptions, … JSON-serialisable contracts for easy automation.
Fully typed + Ruff + Mypy + pytest + hypothesis CI fails on lint, type, docs, or test issues.

Installation

pipx install unoserver --system-site-packages 
pip install pdf-tools

External dependency: LibreOffice must be installed and on your $PATH for Word→PDF conversion.

  1. Install unoserver globally using pipx install --system-site-packages (prefered) or sudo -H pip install
  2. use the bundled python that ships with LibreOffice, or
  3. call soffice --headless directly (see Batch listeners below).

CLI Quick Start

# 1. Convert a single Word file
pdf-tools convert file-to-pdf draft.docx

# 2. Convert every image in a folder → PDFs in ./out
pdf-tools convert folder-to-pdfs assets/ --output-dir out/

# 3. Merge selected PDFs
pdf-tools merge pdf-files a.pdf b.pdf c.pdf -o merged.pdf 

# 4. Merge *all* PDFs in a folder
pdf-tools merge pdfs-in-folder scans/ -o merged.pdf 

# 5. One-liner: convert images + docs → merge
pdf-tools process convert-and-merge-pdfs image1.jpg doc1.docx doc2.docx -o final.pdf 

# 6. Add a diagonal red DRAFT watermark on every page
pdf-tools watermark add-text src.pdf stamped.pdf \
    --text "DRAFT" --color "#FF0000" --font-size 72 --opacity 0.2 --rotation 45

Batch LibreOffice Listener (faster)

# spin up a listener for the whole session (Linux)
unoserver --interface 127.0.0.1 --port 2002 &
export LIBRE_PORT=2002   # used by convert helpers
pdf-tools process convert-and-merge-pdfs ...
kill %1                  # when done

Using the Python API

from pathlib import Path
from pdf_tools.convert.service import convert_file_to_pdf
from pdf_tools.merge.service import merge_pdfs
from pdf_tools.watermark.models import WatermarkOptions
from pdf_tools.watermark.service import add_text_watermark

# 1. Convert
img_pdf = convert_file_to_pdf(
    input_path=Path("diagram.png"),
    output_dir=Path("out"),
)

# 2. Merge two PDFs
merge_pdfs(
    input_paths=[Path("intro.pdf"), img_pdf.path],
    output_path=Path("bundle.pdf"),
)

# 3. Watermark (first page only)
opts = WatermarkOptions(text="CONFIDENTIAL", font_size=36, all_pages=False)
add_text_watermark(src=Path("bundle.pdf"), dst=Path("bundle_wm.pdf"), opts=opts)

Spinning up transient LibreOffice listener

from pdf_tools.convert.unoserver_ctx import unoserver_listener
from pdf_tools.process.service import convert_and_merge_pdfs

with unoserver_listener(port=2002):            # starts & auto-kills unoserver
    convert_and_merge_pdfs(
        input_paths=[Path("doc1.docx"), Path("pic.jpg")],
        output_path=Path("package.pdf"),
    )

Development setup

pipx install unoserver --system-site-packages
git clone https://github.com/your-org/pdf-tools.git
cd pdf-tools
poetry install --with dev   # include dev/test dependencies

# Run checks
ruff check .
mypy .
pytest -q

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pdf_toolchest-0.1.13.tar.gz (19.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

pdf_toolchest-0.1.13-py3-none-any.whl (26.9 kB view details)

Uploaded Python 3

File details

Details for the file pdf_toolchest-0.1.13.tar.gz.

File metadata

  • Download URL: pdf_toolchest-0.1.13.tar.gz
  • Upload date:
  • Size: 19.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for pdf_toolchest-0.1.13.tar.gz
Algorithm Hash digest
SHA256 534e1799b19fbdef9afce059e3aee711a0cf9478a79841b9075386bd91f36d7f
MD5 763e4ec641f5e847f43899a982b113db
BLAKE2b-256 2347e6fc0ad0633db4c3eae4cc0fc5ebb350a1d68684ecf8cd9106700a24ad04

See more details on using hashes here.

Provenance

The following attestation bundles were made for pdf_toolchest-0.1.13.tar.gz:

Publisher: publish.yml on aswann45/pdf-tools

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file pdf_toolchest-0.1.13-py3-none-any.whl.

File metadata

  • Download URL: pdf_toolchest-0.1.13-py3-none-any.whl
  • Upload date:
  • Size: 26.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for pdf_toolchest-0.1.13-py3-none-any.whl
Algorithm Hash digest
SHA256 1e301478e4ff0a1a000599a277848dde7a095bf380ebc781357675963a8c4647
MD5 18e616e41593d86e7d8dcf4885e75459
BLAKE2b-256 a345710cf6c555597e9d0f6c1bccdff6eb49d7da02270e1bd8ddda7b10cdde7b

See more details on using hashes here.

Provenance

The following attestation bundles were made for pdf_toolchest-0.1.13-py3-none-any.whl:

Publisher: publish.yml on aswann45/pdf-tools

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page