Python bindings for gopdfsuit - PDF generation, merging, splitting, and more
pypdfsuit
Python bindings for gopdfsuit - a comprehensive PDF library for generation, merging, splitting, form filling, and HTML to PDF/Image conversion.
Features
- PDF Generation: Create PDFs from structured templates with tables, images, and styled text
- PDF Merging: Combine multiple PDFs into a single document
- PDF Splitting: Split PDFs by pages, ranges, or maximum pages per file
- Form Filling: Fill PDF forms using XFDF data
- HTML to PDF: Convert HTML content or URLs to PDF documents
- HTML to Image: Convert HTML content or URLs to images (PNG, JPG, SVG)
- PDF Redaction: Securely redact sensitive information using coordinates or text search
Installation
From Source
- Build the shared library:
    cd bindings/python
    chmod +x build.sh
    ./build.sh
- Install the Python package:
    pip install .
Requirements
- Python 3.8+
- Go 1.22+ (for building the shared library)
- Chrome/Chromium (for HTML to PDF/Image conversion)
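Before building, it can help to confirm that the tools listed above are on the PATH. The following is a minimal sketch of such a preflight check using only the standard library. The browser executable names are assumptions (they vary by platform), the helper `check_requirements` is hypothetical and not part of pypdfsuit, and the sketch only checks that `go` exists, not that it is version 1.22 or newer.

```python
import shutil
import sys

# Candidate browser executable names: assumptions, adjust for your platform.
BROWSER_NAMES = ("chromium", "chromium-browser", "google-chrome", "chrome")


def check_requirements(which=shutil.which, version=sys.version_info):
    """Return a dict mapping each requirement to True (found) or False."""
    return {
        "python>=3.8": tuple(version[:2]) >= (3, 8),
        # Presence only; the Go version itself is not inspected here.
        "go": which("go") is not None,
        "chrome/chromium": any(which(name) is not None for name in BROWSER_NAMES),
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{'ok' if ok else 'MISSING':8} {name}")
```

The `which` and `version` parameters exist only so the check can be exercised without touching the real environment.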
Quick Start
Generate a PDF
from pypdfsuit import generate_pdf, PDFTemplate, Config, Title, Element, Table, Row, Cell

template = PDFTemplate(
    config=Config(page="A4", page_alignment=1),
    title=Title(
        props="Helvetica:24:100:center:0:0:0:0",
        text="My Document"
    ),
    elements=[
        Element(
            type="table",
            table=Table(
                max_columns=2,
                column_widths=[1.0, 1.0],
                rows=[
                    Row(row=[
                        Cell(props="Helvetica:12:100:left:1:1:1:1", text="Name"),
                        Cell(props="Helvetica:12:000:left:1:1:1:1", text="John Doe"),
                    ])
                ]
            )
        )
    ]
)

pdf_bytes = generate_pdf(template)
with open("output.pdf", "wb") as f:
    f.write(pdf_bytes)
Merge PDFs
from pypdfsuit import merge_pdfs

with open("doc1.pdf", "rb") as f1, open("doc2.pdf", "rb") as f2:
    merged = merge_pdfs([f1.read(), f2.read()])

with open("merged.pdf", "wb") as f:
    f.write(merged)
Split a PDF
from pypdfsuit import split_pdf, SplitSpec

with open("document.pdf", "rb") as f:
    pdf_data = f.read()

# Split specific pages
spec = SplitSpec(pages=[1, 3, 5])
parts = split_pdf(pdf_data, spec)

# Or split every 5 pages
spec = SplitSpec(max_per_file=5)
parts = split_pdf(pdf_data, spec)

for i, part in enumerate(parts):
    with open(f"part_{i+1}.pdf", "wb") as f:
        f.write(part)
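To make the `max_per_file` option concrete, here is a small pure-Python sketch of how a document's pages could be grouped when splitting every N pages. It illustrates the grouping only and is not the library's implementation. It assumes 1-based page numbers and that the last file receives the remainder; `group_pages` is a hypothetical helper.

```python
def group_pages(total_pages, max_per_file):
    """Group 1-based page numbers into consecutive chunks of at most max_per_file."""
    if max_per_file < 1:
        raise ValueError("max_per_file must be at least 1")
    pages = list(range(1, total_pages + 1))
    # Step through the page list in strides of max_per_file.
    return [pages[i:i + max_per_file] for i in range(0, total_pages, max_per_file)]


# A 12-page document split every 5 pages yields three groups.
print(group_pages(12, 5))  # [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12]]
```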
Convert HTML to PDF
from pypdfsuit import convert_html_to_pdf, HtmlToPDFRequest

# Convert HTML string
request = HtmlToPDFRequest(
    html="<html><body><h1>Hello World</h1></body></html>",
    page_size="A4",
    orientation="Portrait",
)
pdf_bytes = convert_html_to_pdf(request)

# Or convert a URL
request = HtmlToPDFRequest(
    url="https://example.com",
    page_size="Letter",
)
pdf_bytes = convert_html_to_pdf(request)
Fill a PDF Form
from pypdfsuit import fill_pdf_with_xfdf

with open("form.pdf", "rb") as f:
    pdf_data = f.read()

with open("data.xfdf", "rb") as f:
    xfdf_data = f.read()

filled = fill_pdf_with_xfdf(pdf_data, xfdf_data)
with open("filled.pdf", "wb") as f:
    f.write(filled)
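If you do not already have a `data.xfdf` file, the XFDF payload can be produced in Python. The sketch below builds a basic XFDF document (an `xfdf` root in the Adobe XFDF namespace containing `fields`, `field`, and `value` elements) with the standard library. The helper `build_xfdf` and the field names are hypothetical, and whether this exact layout covers every form pypdfsuit can fill is an assumption.

```python
import xml.etree.ElementTree as ET

XFDF_NS = "http://ns.adobe.com/xfdf/"


def build_xfdf(values):
    """Serialize a {field_name: value} mapping to XFDF bytes."""
    # Emit the XFDF namespace as the default namespace (no prefix).
    ET.register_namespace("", XFDF_NS)
    root = ET.Element(f"{{{XFDF_NS}}}xfdf")
    fields = ET.SubElement(root, f"{{{XFDF_NS}}}fields")
    for name, value in values.items():
        field = ET.SubElement(fields, f"{{{XFDF_NS}}}field", {"name": name})
        ET.SubElement(field, f"{{{XFDF_NS}}}value").text = str(value)
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


# Field names here are placeholders; use the names defined in your form.
xfdf_data = build_xfdf({"full_name": "John Doe", "age": 42})
```

The resulting bytes can be passed as the second argument to `fill_pdf_with_xfdf` in place of the file contents read above.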
Redact a PDF
from pypdfsuit import apply_redactions_advanced

with open("document.pdf", "rb") as f:
    pdf_data = f.read()

redacted = apply_redactions_advanced(pdf_data, {
    "blocks": [
        {"pageNum": 1, "x": 120, "y": 620, "width": 180, "height": 24}
    ],
    "textSearch": [
        {"text": "Confidential"}
    ],
    "mode": "visual_allowed"
})

with open("redacted.pdf", "wb") as f:
    f.write(redacted)
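When redactions come from several sources, assembling the options dictionary by hand gets error-prone. The following sketch wraps the keys shown in the example above (`blocks`, `textSearch`, `mode`, `pageNum`, `x`, `y`, `width`, `height`) in two small helpers. Both helpers are hypothetical, the positive-size check is an assumption about what a sensible block looks like, and only the `"visual_allowed"` mode from the example is known here.

```python
def make_block(page_num, x, y, width, height):
    """Build one coordinate-based redaction block."""
    # Assumption: a block with zero or negative size is a caller mistake.
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return {"pageNum": page_num, "x": x, "y": y, "width": width, "height": height}


def build_redaction_options(blocks=(), search_terms=(), mode="visual_allowed"):
    """Combine blocks and search terms into one options dict."""
    options = {"mode": mode}
    if blocks:
        options["blocks"] = list(blocks)
    if search_terms:
        options["textSearch"] = [{"text": term} for term in search_terms]
    return options


options = build_redaction_options(
    blocks=[make_block(1, 120, 620, 180, 24)],
    search_terms=["Confidential"],
)
```

The resulting `options` value is equal to the literal dictionary passed to `apply_redactions_advanced` in the example above.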
API Reference
Types
- PDFTemplate - Main template structure for PDF generation
- Config - Page configuration (size, orientation, security, etc.)
- Title - Document title section
- Table, Row, Cell - Table structure
- Element - Generic element (table, spacer, image)
- Image, Spacer - Additional elements
- SecurityConfig - Encryption settings
- PDFAConfig - PDF/A compliance settings
- SignatureConfig - Digital signature settings
- HtmlToPDFRequest - HTML to PDF conversion options
- HtmlToImageRequest - HTML to image conversion options
- SplitSpec - PDF split specification
- FontInfo - Font information
Functions
- generate_pdf(template: PDFTemplate) -> bytes
- get_available_fonts() -> List[FontInfo]
- merge_pdfs(pdf_files: List[bytes]) -> bytes
- split_pdf(pdf_data: bytes, spec: SplitSpec) -> List[bytes]
- parse_page_spec(spec: str, total_pages: int = 0) -> List[int]
- fill_pdf_with_xfdf(pdf_data: bytes, xfdf_data: bytes) -> bytes
- convert_html_to_pdf(request: HtmlToPDFRequest) -> bytes
- convert_html_to_image(request: HtmlToImageRequest) -> bytes
- get_page_info(pdf_data: bytes) -> dict
- extract_text_positions(pdf_data: bytes, page_num: int) -> list[dict]
- find_text_occurrences(pdf_data: bytes, text: str) -> list[dict]
- apply_redactions(pdf_data: bytes, redactions: list[dict]) -> bytes
- apply_redactions_advanced(pdf_data: bytes, options: dict) -> bytes
Props String Format
The props string format for cells and titles is:
FontName:FontSize:StyleCode:Alignment:BorderLeft:BorderRight:BorderTop:BorderBottom
- FontName: Helvetica, Courier, Times-Roman, etc.
- FontSize: Integer size in points
- StyleCode: 3 digits for bold(1/0), italic(1/0), underline(1/0). e.g., "100" = bold only
- Alignment: left, center, right
- Borders: 1 = border, 0 = no border
Example: "Helvetica:12:100:center:1:1:1:1" = Helvetica 12pt, bold, centered, all borders
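Writing props strings by hand is easy to get wrong, so a small builder can help. The sketch below follows the field order described above (font, size, style code, alignment, then left, right, top, and bottom borders). `make_props` is a hypothetical helper, not part of pypdfsuit, and it accepts only the three alignments listed above.

```python
def make_props(font="Helvetica", size=12, bold=False, italic=False,
               underline=False, align="left", borders=(0, 0, 0, 0)):
    """Build a props string: Font:Size:Style:Align:Left:Right:Top:Bottom."""
    if align not in ("left", "center", "right"):
        raise ValueError(f"unsupported alignment: {align!r}")
    if len(borders) != 4:
        raise ValueError("borders must be (left, right, top, bottom)")
    # Style code: one digit each for bold, italic, underline.
    style = f"{int(bool(bold))}{int(bool(italic))}{int(bool(underline))}"
    border_part = ":".join("1" if b else "0" for b in borders)
    return f"{font}:{size}:{style}:{align}:{border_part}"


# Reproduces the example string from the format description.
print(make_props("Helvetica", 12, bold=True, align="center", borders=(1, 1, 1, 1)))
# Helvetica:12:100:center:1:1:1:1
```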
License
MIT License - see LICENSE for details.
File details
Details for the file pypdfsuit-5.0.0.tar.gz.
File metadata
- Download URL: pypdfsuit-5.0.0.tar.gz
- Upload date:
- Size: 8.8 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ecc7a84e3d8cbe4ca4792d33da018e9f2d01498969a8d466b6cb23e8f1659c3f |
| MD5 | d6be4d0fec22b922bc6a43e89da6161a |
| BLAKE2b-256 | b9f1ac1060c32070de3a8e10704a80115e0f5522dab03c3aee08fccdbab2069e |
File details
Details for the file pypdfsuit-5.0.0-cp312-cp312-win_amd64.whl.
File metadata
- Download URL: pypdfsuit-5.0.0-cp312-cp312-win_amd64.whl
- Upload date:
- Size: 8.6 MB
- Tags: CPython 3.12, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f06a88a07cbd58dcd40a1b8bd9acc86dfa7023d868ce3d691fd611083453de04 |
| MD5 | 335d05674cf38ec8ba4b55cb233ac557 |
| BLAKE2b-256 | 8d29746f326346866ede487b5e11858a9021273e6d557ff9f831514108fb2398 |
File details
Details for the file pypdfsuit-5.0.0-cp312-cp312-macosx_15_0_universal2.whl.
File metadata
- Download URL: pypdfsuit-5.0.0-cp312-cp312-macosx_15_0_universal2.whl
- Upload date:
- Size: 4.5 MB
- Tags: CPython 3.12, macOS 15.0+ universal2 (ARM64, x86-64)
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | eab1e0c52f89642361131bd0daedf95647e3abb49e57f07a43943016c7d4abda |
| MD5 | 121e4e85078c4e7d6ffeb62834b3fc97 |
| BLAKE2b-256 | beab386bb898e3b1f7b7ee84cd9bace86370243bd2a5b1835efe8acad4f8335c |