
A toolkit for rendering HTML to PDF using Chrome headless.


CPDFKit

wkhtmltopdf and pdf_kit have been fun for a while, but it is time to grow up and render your HTML to PDF with a real browser engine.

CPDFKit is a Python toolkit for rendering PDF documents using Chromium or Google Chrome in headless mode. It provides capabilities to generate PDFs from URLs, file paths, or directly from HTML content, with customizable options such as paper size, margins, and orientation.

It does not use Chrome's debugging port or Selenium. Instead, it calls the browser binary directly through a subprocess.
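
To illustrate what calling the binary directly can look like, here is a minimal sketch using only the standard library. It is not taken from the CPDFKit source: the function names build_print_command and render are hypothetical, and the exact flags CPDFKit passes are an assumption. The flags shown (--headless, --no-sandbox, --print-to-pdf) are standard Chrome command-line switches.

```python
import subprocess
from typing import List


def build_print_command(browser: str, target: str, output_path: str) -> List[str]:
    """Assemble an argument list asking headless Chrome to print a page to PDF.

    Hypothetical helper for illustration; CPDFKit's real command may differ.
    """
    return [
        browser,
        "--headless",                     # run without a visible window
        "--no-sandbox",                   # matches the sandbox note under Features
        f"--print-to-pdf={output_path}",  # where Chrome writes the PDF
        target,                           # URL or file:// URI to render
    ]


def render(browser: str, target: str, output_path: str, timeout: float = 60.0) -> None:
    """Run the browser once; raises CalledProcessError on a non-zero exit code."""
    command = build_print_command(browser, target, output_path)
    # Passing a list (not a shell string) avoids shell interpretation of the target.
    subprocess.run(command, check=True, timeout=timeout)
```

Because no debugging port is opened, each render is a single short-lived process, which keeps the approach simple at the cost of starting a browser per document.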

Features

  • Render PDFs from URLs or file paths: Easily generate PDFs from web pages or local HTML files.
  • Direct HTML to PDF conversion: Convert HTML strings directly into PDF documents.
  • Customizable paper sizes: Supports standard paper sizes such as A4 and A3, as well as custom dimensions.
  • Adjustable margins: Set top, bottom, left, and right margins.
  • Landscape or Portrait orientation: Generate PDFs in your preferred orientation.
  • JavaScript delay execution: Add a delay before rendering to ensure JavaScript-heavy pages load completely.
  • Security: Chrome runs with sandboxing disabled for operational compatibility, so paths and URLs are sanitized before being passed to the browser to prevent common security issues.
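
The path and URL sanitization mentioned above is not documented in detail here, so the following is only a sketch of the kind of check such a step might perform. The function classify_target and the allowed scheme set are assumptions for illustration, not CPDFKit's actual rules.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse

# Assumption: only web URLs are accepted as remote targets.
ALLOWED_URL_SCHEMES = {"http", "https"}


def classify_target(value: str) -> Optional[str]:
    """Return "url", "path", or None when the input should be rejected."""
    value = value.strip()
    # A leading dash could be read by the browser as a command-line switch.
    if not value or value.startswith("-"):
        return None
    parsed = urlparse(value)
    if parsed.scheme in ALLOWED_URL_SCHEMES:
        # Require a host so that inputs like "https://" are rejected.
        return "url" if parsed.netloc else None
    # Anything else is only accepted if it is an existing local file.
    if Path(value).is_file():
        return "path"
    return None
```

With these rules, "https://example.com" is classified as a URL, while "javascript:alert(1)" and "--remote-debugging-port=9222" are rejected.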

Installation

This toolkit requires a local installation of Google Chrome or Chromium. Ensure Chrome or Chromium is accessible in your system PATH or specify the path when using the toolkit.
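
If you are unsure whether a browser is reachable on your PATH, a quick check with the standard library can help. This is a sketch, not part of CPDFKit: find_browser is a hypothetical helper, and the candidate executable names are common ones that vary by platform and installer.

```python
import shutil
from typing import Iterable, Optional

# Assumed executable names; adjust for your system.
CANDIDATE_NAMES = (
    "google-chrome",
    "google-chrome-stable",
    "chromium",
    "chromium-browser",
    "chrome",
)


def find_browser(candidates: Iterable[str] = CANDIDATE_NAMES) -> Optional[str]:
    """Return the full path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    print(find_browser() or "No Chrome or Chromium executable found on PATH")
```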

From GitHub

  1. Clone the repository:
    git clone https://github.com/codingcowde/cpdfkit.git
    
  2. Navigate to the cloned directory:
    cd cpdfkit
    

Via pip

pip install cpdfkit

Usage

Basic Example

Here is a simple example of how to generate a PDF from a URL and save it to a file:

from cpdfkit import generate_pdf

# Generate PDF from a URL and save it to 'output.pdf'
generate_pdf(
    url_or_path="https://example.com",
    output_path="output.pdf",
    format="A4",
    margin_top=1,
    margin_bottom=1,
    margin_left=1,
    margin_right=1,
    js_delay=2,
    landscape=False
)
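
After a run like the one above, it can be worth confirming that the output really is a PDF, since a failed render may leave a missing or empty file. The sketch below relies only on the fact that PDF files begin with the bytes %PDF-. The helper names are hypothetical and not part of CPDFKit.

```python
from pathlib import Path

PDF_MAGIC = b"%PDF-"  # every PDF file starts with this signature


def looks_like_pdf(data: bytes) -> bool:
    """Check whether a byte string starts with the PDF signature."""
    return data.startswith(PDF_MAGIC)


def is_pdf_file(path: str) -> bool:
    """Check whether a file exists and starts with the PDF signature."""
    file_path = Path(path)
    if not file_path.is_file():
        return False
    with file_path.open("rb") as handle:
        return looks_like_pdf(handle.read(len(PDF_MAGIC)))


if __name__ == "__main__":
    print(is_pdf_file("output.pdf"))
```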

HTML to PDF

You can also render PDFs directly from HTML strings:

from cpdfkit import generate_pdf

html_content = """
<html>
<head><title>Sample PDF</title></head>
<body><h1>Welcome to PDF Rendering</h1><p>This is a simple HTML to PDF conversion example.</p></body>
</html>
"""

pdf_data = generate_pdf(
    html_string=html_content,
    format="A4",
    margin_top=0.5,
    margin_bottom=0.5,
    margin_left=0.5,
    margin_right=0.5,
    js_delay=0,
    landscape=True
)

# Save the PDF data to a file
with open("output_from_html.pdf", "wb") as file:
    file.write(pdf_data)
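
Headless Chrome renders a URL, so an HTML string has to reach the browser in some addressable form. One common approach, shown here as a sketch and not as a description of how CPDFKit does it internally, is to write the string to a temporary file and hand the browser a file:// URI. The helper html_to_file_uri is hypothetical.

```python
import tempfile
from pathlib import Path


def html_to_file_uri(html: str, directory: str) -> str:
    """Write html to a new .html file in directory and return its file:// URI."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".html", dir=directory, delete=False, encoding="utf-8"
    ) as handle:
        handle.write(html)
    # as_uri() requires an absolute path, so resolve it first.
    return Path(handle.name).resolve().as_uri()


if __name__ == "__main__":
    # The directory and the file inside it are removed when the block exits,
    # so the browser must finish rendering before then.
    with tempfile.TemporaryDirectory() as workdir:
        print(html_to_file_uri("<h1>Hello</h1>", workdir))
```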

Configuration

You can customize the behavior by specifying various parameters:

  • format: Paper size (e.g., "A4", "Letter").
  • margin_top, margin_bottom, margin_left, margin_right: Margins in inches.
  • js_delay: Time in seconds to wait before rendering the page, useful for waiting on JavaScript execution.
  • landscape: Set to True for landscape orientation or False for portrait.
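
Since margins are given in inches, it can help to convert from millimetres and to see how orientation affects the page dimensions. The sketch below is illustrative only: the size table and helper names are assumptions and are not taken from CPDFKit, which may support a different set of format names.

```python
from typing import Dict, Tuple

MM_PER_INCH = 25.4

# Portrait dimensions in millimetres (width, height); illustrative subset.
PAPER_SIZES_MM: Dict[str, Tuple[float, float]] = {
    "A3": (297.0, 420.0),
    "A4": (210.0, 297.0),
    "Letter": (215.9, 279.4),
}


def mm_to_inches(mm: float) -> float:
    """Convert millimetres to inches."""
    return mm / MM_PER_INCH


def page_size_inches(paper: str, landscape: bool = False) -> Tuple[float, float]:
    """Return (width, height) in inches; landscape swaps the two."""
    width_mm, height_mm = PAPER_SIZES_MM[paper]
    if landscape:
        width_mm, height_mm = height_mm, width_mm
    return mm_to_inches(width_mm), mm_to_inches(height_mm)
```

For example, a 10 mm margin corresponds to roughly 0.39 inches, and A4 in portrait is roughly 8.27 by 11.69 inches.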

Contributing

Contributions are welcome! Please read the contributing guide located in CONTRIBUTING.md.

License

This project is licensed under the MIT License - see the LICENSE file for details.
