Python bindings for gopdfsuit - PDF generation, merging, splitting, and more

pypdfsuit

Python bindings for gopdfsuit - a comprehensive PDF library for generation, merging, splitting, form filling, and HTML to PDF/Image conversion.

Features

  • PDF Generation: Create PDFs from structured templates with tables, images, and styled text
  • PDF Merging: Combine multiple PDFs into a single document
  • PDF Splitting: Split PDFs by pages, ranges, or maximum pages per file
  • Form Filling: Fill PDF forms using XFDF data
  • HTML to PDF: Convert HTML content or URLs to PDF documents
  • HTML to Image: Convert HTML content or URLs to images (PNG, JPG, SVG)
  • PDF Redaction: Securely redact sensitive information using coordinates or text search

Installation

From Source

  1. Build the shared library:
cd bindings/python
chmod +x build.sh
./build.sh
  2. Install the Python package:
pip install .

Requirements

  • Python 3.8+
  • Go 1.22+ (for building the shared library)
  • Chrome/Chromium (for HTML to PDF/Image conversion)
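
Before building, it can help to confirm that the tools listed above are reachable. The following is a minimal sketch using only the standard library; the executable names it searches for (`go`, `chromium`, `chromium-browser`, `google-chrome`, `chrome`) are assumptions that vary by platform, and it does not verify the Go version.

```python
import shutil
import sys

# Executable names are assumptions; adjust them for your platform.
CHROME_CANDIDATES = ("chromium", "chromium-browser", "google-chrome", "chrome")


def check_requirements():
    """Return a dict mapping each requirement to True (found) or False."""
    return {
        # The running interpreter must be Python 3.8 or newer.
        "python>=3.8": sys.version_info >= (3, 8),
        # Only checks that a `go` executable is on PATH, not its version.
        "go": shutil.which("go") is not None,
        # Any one of the candidate browser executables is enough.
        "chrome": any(shutil.which(name) for name in CHROME_CANDIDATES),
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

A missing `chrome` entry only matters for the HTML to PDF/Image functions; the other features do not list a browser as a requirement.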

Quick Start

Generate a PDF

from pypdfsuit import generate_pdf, PDFTemplate, Config, Title, Element, Table, Row, Cell

template = PDFTemplate(
    config=Config(page="A4", page_alignment=1),
    title=Title(
        props="Helvetica:24:100:center:0:0:0:0",
        text="My Document"
    ),
    elements=[
        Element(
            type="table",
            table=Table(
                max_columns=2,
                column_widths=[1.0, 1.0],
                rows=[
                    Row(row=[
                        Cell(props="Helvetica:12:100:left:1:1:1:1", text="Name"),
                        Cell(props="Helvetica:12:000:left:1:1:1:1", text="John Doe"),
                    ])
                ]
            )
        )
    ]
)

pdf_bytes = generate_pdf(template)
with open("output.pdf", "wb") as f:
    f.write(pdf_bytes)

Merge PDFs

from pypdfsuit import merge_pdfs

with open("doc1.pdf", "rb") as f1, open("doc2.pdf", "rb") as f2:
    merged = merge_pdfs([f1.read(), f2.read()])

with open("merged.pdf", "wb") as f:
    f.write(merged)

Split a PDF

from pypdfsuit import split_pdf, SplitSpec

with open("document.pdf", "rb") as f:
    pdf_data = f.read()

# Split specific pages
spec = SplitSpec(pages=[1, 3, 5])
parts = split_pdf(pdf_data, spec)

# Or split every 5 pages
spec = SplitSpec(max_per_file=5)
parts = split_pdf(pdf_data, spec)

for i, part in enumerate(parts):
    with open(f"part_{i+1}.pdf", "wb") as f:
        f.write(part)

Convert HTML to PDF

from pypdfsuit import convert_html_to_pdf, HtmlToPDFRequest

# Convert HTML string
request = HtmlToPDFRequest(
    html="<html><body><h1>Hello World</h1></body></html>",
    page_size="A4",
    orientation="Portrait",
)
pdf_bytes = convert_html_to_pdf(request)

# Or convert a URL
request = HtmlToPDFRequest(
    url="https://example.com",
    page_size="Letter",
)
pdf_bytes = convert_html_to_pdf(request)

Fill a PDF Form

from pypdfsuit import fill_pdf_with_xfdf

with open("form.pdf", "rb") as f:
    pdf_data = f.read()
with open("data.xfdf", "rb") as f:
    xfdf_data = f.read()

filled = fill_pdf_with_xfdf(pdf_data, xfdf_data)
with open("filled.pdf", "wb") as f:
    f.write(filled)
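
If the XFDF data is not already on disk, it can be built in memory. The sketch below uses the standard library to serialize a field mapping into the basic XFDF layout (an `xfdf` root in the Adobe XFDF namespace containing `fields`, `field`, and `value` elements). The helper name `build_xfdf` and the field names are hypothetical, and whether this minimal document is sufficient for every form is an assumption.

```python
import xml.etree.ElementTree as ET

XFDF_NS = "http://ns.adobe.com/xfdf/"


def build_xfdf(fields):
    """Serialize a {field_name: value} mapping as XFDF bytes.

    Hypothetical helper, not part of pypdfsuit. Field names must match
    the names defined in the target PDF form.
    """
    # Emit the XFDF namespace as the default namespace (no prefix).
    ET.register_namespace("", XFDF_NS)
    root = ET.Element(f"{{{XFDF_NS}}}xfdf")
    fields_el = ET.SubElement(root, f"{{{XFDF_NS}}}fields")
    for name, value in fields.items():
        field_el = ET.SubElement(fields_el, f"{{{XFDF_NS}}}field", {"name": name})
        value_el = ET.SubElement(field_el, f"{{{XFDF_NS}}}value")
        value_el.text = str(value)
    # With an explicit encoding, tostring returns bytes.
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # "full_name" and "city" are placeholder field names.
    print(build_xfdf({"full_name": "John Doe", "city": "Springfield"}).decode("utf-8"))
```

The returned bytes could then be passed as `xfdf_data` to `fill_pdf_with_xfdf(pdf_data, xfdf_data)` in place of the contents of `data.xfdf`.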

Redact a PDF

from pypdfsuit import apply_redactions_advanced

with open("document.pdf", "rb") as f:
    pdf_data = f.read()

redacted = apply_redactions_advanced(pdf_data, {
    "blocks": [
        {"pageNum": 1, "x": 120, "y": 620, "width": 180, "height": 24}
    ],
    "textSearch": [
        {"text": "Confidential"}
    ],
    "mode": "visual_allowed"
})

with open("redacted.pdf", "wb") as f:
    f.write(redacted)

API Reference

Types

  • PDFTemplate - Main template structure for PDF generation
  • Config - Page configuration (size, orientation, security, etc.)
  • Title - Document title section
  • Table, Row, Cell - Table structure
  • Element - Generic element (table, spacer, image)
  • Image, Spacer - Additional elements
  • SecurityConfig - Encryption settings
  • PDFAConfig - PDF/A compliance settings
  • SignatureConfig - Digital signature settings
  • HtmlToPDFRequest - HTML to PDF conversion options
  • HtmlToImageRequest - HTML to image conversion options
  • SplitSpec - PDF split specification
  • FontInfo - Font information

Functions

  • generate_pdf(template: PDFTemplate) -> bytes
  • get_available_fonts() -> List[FontInfo]
  • merge_pdfs(pdf_files: List[bytes]) -> bytes
  • split_pdf(pdf_data: bytes, spec: SplitSpec) -> List[bytes]
  • parse_page_spec(spec: str, total_pages: int = 0) -> List[int]
  • fill_pdf_with_xfdf(pdf_data: bytes, xfdf_data: bytes) -> bytes
  • convert_html_to_pdf(request: HtmlToPDFRequest) -> bytes
  • convert_html_to_image(request: HtmlToImageRequest) -> bytes
  • get_page_info(pdf_data: bytes) -> dict
  • extract_text_positions(pdf_data: bytes, page_num: int) -> list[dict]
  • find_text_occurrences(pdf_data: bytes, text: str) -> list[dict]
  • apply_redactions(pdf_data: bytes, redactions: list[dict]) -> bytes
  • apply_redactions_advanced(pdf_data: bytes, options: dict) -> bytes

Props String Format

The props string format for cells and titles is:

FontName:FontSize:StyleCode:Alignment:BorderLeft:BorderRight:BorderTop:BorderBottom
  • FontName: Helvetica, Courier, Times-Roman, etc.
  • FontSize: Integer size in points
  • StyleCode: Three digits, each 1 (on) or 0 (off), for bold, italic, and underline in that order. e.g., "100" = bold only
  • Alignment: left, center, right
  • BorderLeft, BorderRight, BorderTop, BorderBottom: 1 = border, 0 = no border

Example: "Helvetica:12:100:center:1:1:1:1" = Helvetica 12pt, bold, centered, all borders
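
Writing props strings by hand is easy to get wrong, so a small helper can assemble and check them. The sketch below follows the format described above; `make_props` and `parse_props` are hypothetical names, not part of pypdfsuit, and the alignment check assumes that left, center, and right are the only accepted values.

```python
def make_props(font="Helvetica", size=12, bold=False, italic=False,
               underline=False, align="left", borders=(1, 1, 1, 1)):
    """Build a props string; borders are (left, right, top, bottom)."""
    if align not in ("left", "center", "right"):
        raise ValueError(f"unsupported alignment: {align!r}")
    if len(borders) != 4:
        raise ValueError("borders must have exactly four entries")
    # StyleCode: one digit each for bold, italic, underline.
    style = f"{int(bool(bold))}{int(bool(italic))}{int(bool(underline))}"
    border_part = ":".join("1" if b else "0" for b in borders)
    return f"{font}:{size}:{style}:{align}:{border_part}"


def parse_props(props):
    """Split a props string back into a dict of its eight fields."""
    parts = props.split(":")
    if len(parts) != 8:
        raise ValueError(f"expected 8 colon-separated fields, got {len(parts)}")
    font, size, style, align = parts[:4]
    if len(style) != 3 or set(style) - {"0", "1"}:
        raise ValueError(f"invalid style code: {style!r}")
    return {
        "font": font,
        "size": int(size),
        "bold": style[0] == "1",
        "italic": style[1] == "1",
        "underline": style[2] == "1",
        "align": align,
        "borders": tuple(int(b) for b in parts[4:]),
    }


if __name__ == "__main__":
    header = make_props(bold=True, align="center")
    print(header)
    print(parse_props(header))
```

The result of `make_props` can be used anywhere a props string is expected, for example `Cell(props=make_props(bold=True), text="Name")`.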

License

MIT License - see LICENSE for details.
