Enterprise PDF SDK — render, extract, annotate, sign, and validate PDFs. Pure Rust, zero system dependencies.

pdfluent

Enterprise PDF SDK for Python — built on a pure-Rust stack, zero system dependencies.

Render pages, extract text, fill forms, annotate, redact, encrypt, merge, and validate PDF/A — all from a single pip install.

Installation

pip install pdfluent

# Optional extras
pip install pdfluent[pillow]   # PIL Image support
pip install pdfluent[numpy]    # NumPy array support

Requires Python ≥ 3.8. Pre-built wheels are available for Linux (x86_64, aarch64), macOS (x86_64, arm64), and Windows (x86_64).
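
The optional extras only matter for code paths that convert rendered pages to PIL images or NumPy arrays. As a minimal sketch, the standard library can report which of them are importable in the current environment. The helper name `available_extras` is hypothetical and not part of pdfluent.

```python
from importlib.util import find_spec


def available_extras():
    """Report which optional dependencies can be imported in this environment."""
    # Map each extra name to the top-level module it installs.
    # Pillow is imported as "PIL"; NumPy as "numpy".
    modules = {"pillow": "PIL", "numpy": "numpy"}
    return {extra: find_spec(module) is not None for extra, module in modules.items()}


if __name__ == "__main__":
    for extra, present in available_extras().items():
        print(f"{extra}: {'installed' if present else 'missing'}")
```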

Quick Start

from pdfluent import Document

# Open, inspect, render
with Document("invoice.pdf") as doc:
    print(f"{doc.page_count} pages — {doc.metadata.title}")

    img = doc[0].render(dpi=150)
    img.save("page_0.png")          # requires Pillow

# Extract text
doc = Document("report.pdf")
for page in doc:
    print(page.extract_text())

# Fill a form field and save
doc = Document("form.pdf")
doc.set_form_field("Name", "Jane Doe")
doc.save("form_filled.pdf")

# Search-and-redact
doc = Document("contract.pdf")
report = doc.redact_text("Confidential")
print(f"Redacted {report.areas_redacted} areas on {report.pages_affected} pages")
doc.save("contract_redacted.pdf")

# PDF/A validation
from pdfluent import validate_pdfa

report = validate_pdfa("archive.pdf")
if report.is_compliant:
    print(f"✓ {report.pdfa_level} compliant")
else:
    for issue in report.issues:
        print(f"[{issue.severity}] {issue.rule}: {issue.message}")

# Merge PDFs
from pdfluent import merge_pdfs
merge_pdfs(["a.pdf", "b.pdf", "c.pdf"], "merged.pdf")

# Encrypt / decrypt
doc = Document("sensitive.pdf")
doc.encrypt("sensitive_enc.pdf", password="s3cr3t")

from pdfluent import decrypt_pdf
decrypt_pdf("sensitive_enc.pdf", "sensitive_dec.pdf", password="s3cr3t")
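
The PDF/A validation example above prints issues one by one. For longer reports it may be easier to group them by severity first. The sketch below relies only on the attribute names shown in the Quick Start (`severity`, `rule`, `message`); the helper `group_issues_by_severity`, the stand-in objects, and the severity values "error" and "warning" are assumptions made for illustration.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_issues_by_severity(issues):
    """Group issue objects by their `severity` attribute, preserving order."""
    grouped = defaultdict(list)
    for issue in issues:
        grouped[issue.severity].append(f"{issue.rule}: {issue.message}")
    return dict(grouped)


# Stand-in objects with the same attribute names as the Quick Start example;
# the values are made up for illustration.
sample_issues = [
    SimpleNamespace(severity="error", rule="rule-a", message="first problem"),
    SimpleNamespace(severity="warning", rule="rule-b", message="second problem"),
    SimpleNamespace(severity="error", rule="rule-c", message="third problem"),
]

for severity, lines in group_issues_by_severity(sample_issues).items():
    print(f"{severity} ({len(lines)})")
    for line in lines:
        print(f"  {line}")
```

With a real report, the same helper would presumably be called as `group_issues_by_severity(report.issues)`.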

Features

| Feature | Description |
|---|---|
| Render | Pages to RGBA pixels, PIL Images, or NumPy arrays at any DPI |
| Text extraction | Plain text or structured TextBlock/TextSpan with position |
| Text search | Find pages containing a query string |
| Forms (AcroForm) | Read and fill text, checkbox, and dropdown fields |
| Annotations | Read existing annotations; add highlights and free-text notes |
| Redaction | Search-and-redact: black-box all occurrences of a string |
| Encryption | AES-256 (PDF 2.0) encrypt/decrypt with user + owner passwords |
| Merge / split | Merge multiple PDFs; split into individual pages (via page slicing) |
| PDF/A validation | Validate against PDF/A-1B, 2B, 3B with issue-level reporting |
| Metadata | Read title, author, subject, keywords, creator, producer |
| Bookmarks | Traverse the document outline tree |
| Thumbnails | Fast downscaled preview images |

API Overview

Document(source, password=None)

Opens a PDF from a file path (str) or raw bytes.

doc = Document("file.pdf")                      # from path
with open("file.pdf", "rb") as f:               # from bytes; closes the file handle
    doc = Document(f.read())
doc = Document("encrypted.pdf", password="pw")  # password-protected file

Properties: page_count, metadata, bookmarks

Methods: render_all(dpi), search(query), extract_text(page_num), save(path), get_form_fields(), set_form_field(name, value), get_annotations(page), add_annotation(page, type, rect, content), redact_text(term, page=None), encrypt(path, password), decrypt(path, password)

Protocols: len(doc), doc[0], for page in doc, with Document(...) as doc
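
Because a document is iterable and each page exposes `extract_text()`, whole-document extraction can be written against that protocol alone. The sketch below is a hypothetical helper (`extract_all_text` is not part of pdfluent) and uses a stand-in `FakePage` class so that it runs without a real PDF.

```python
def extract_all_text(doc, separator="\f"):
    """Join the text of every page of a document-like object.

    `doc` only needs to be iterable and yield objects with an
    `extract_text()` method, as the Document protocols describe.
    """
    return separator.join(page.extract_text() for page in doc)


class FakePage:
    """Stand-in page used only to exercise the helper without a real PDF."""

    def __init__(self, text):
        self._text = text

    def extract_text(self):
        return self._text


fake_doc = [FakePage("first page"), FakePage("second page")]
combined = extract_all_text(fake_doc, separator="\n---\n")
print(combined)
```

With pdfluent installed, the call would presumably look like `extract_all_text(Document("report.pdf"))`. The default form-feed separator is a design choice of this sketch, not library behavior.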

Page

Properties: index, width, height, rotation, geometry

Methods: render(dpi, width, height, background), thumbnail(max_dimension), extract_text(), extract_text_blocks()
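
When choosing a `dpi` for `render`, it helps to estimate the resulting raster size up front. The sketch below assumes that page `width` and `height` are reported in PDF points (1/72 inch), which this README does not state explicitly; the helper `pixel_size` is hypothetical, and the renderer's actual rounding may differ.

```python
POINTS_PER_INCH = 72  # default PDF user-space unit: 1 point = 1/72 inch


def pixel_size(width_pt, height_pt, dpi):
    """Estimate the raster size in pixels for a page measured in points."""
    scale = dpi / POINTS_PER_INCH
    return round(width_pt * scale), round(height_pt * scale)


# US Letter is 612 x 792 points (8.5 x 11 inches).
print(pixel_size(612, 792, 150))  # (1275, 1650)
```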

RenderedImage

Properties: width, height, pixels (raw RGBA bytes)

Methods: to_pil(), to_numpy(), save(path)
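
Raw RGBA bytes can be inspected without Pillow or NumPy. The sketch below assumes the buffer is tightly packed in row-major order with 4 bytes per pixel and no row padding; the README does not confirm that layout for `pixels`, so treat it as an assumption. The helper `pixel_at` and the synthetic buffer are illustrative only.

```python
def pixel_at(pixels, width, x, y):
    """Return the (R, G, B, A) tuple at column x, row y of a packed RGBA buffer."""
    offset = (y * width + x) * 4  # 4 bytes per pixel, rows stored back to back
    return tuple(pixels[offset:offset + 4])


# Synthetic 2x2 image: red, green on the top row; blue, white on the bottom row.
buffer = bytes([
    255, 0, 0, 255,     0, 255, 0, 255,
    0, 0, 255, 255,     255, 255, 255, 255,
])

print(pixel_at(buffer, width=2, x=1, y=0))  # (0, 255, 0, 255)
print(pixel_at(buffer, width=2, x=0, y=1))  # (0, 0, 255, 255)
```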

Module-level functions

| Function | Description |
|---|---|
| open_pdf(path, password=None) | Alias for Document(path, password) |
| merge_pdfs(paths, output) | Merge a list of PDFs |
| validate_pdfa(path) → ComplianceReport | Run PDF/A validation |
| decrypt_pdf(input, output, password) | Decrypt to a new file |
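
Since `merge_pdfs` takes a list of paths, the merge order is whatever order the list has. A small sketch for building that list deterministically from a directory follows; `collect_pdfs` is a hypothetical helper, and the demo uses empty placeholder files rather than real PDFs.

```python
from pathlib import Path
import tempfile


def collect_pdfs(directory):
    """Return the PDF files in `directory` as strings, sorted by file name."""
    return [str(p) for p in sorted(Path(directory).glob("*.pdf"))]


# Demonstrate on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.pdf", "a.pdf", "notes.txt"):
        (Path(tmp) / name).touch()
    paths = collect_pdfs(tmp)
    print([Path(p).name for p in paths])  # ['a.pdf', 'b.pdf']
```

The resulting list could then be passed on as `merge_pdfs(paths, "merged.pdf")`, assuming string paths are accepted as in the Quick Start.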

Comparison

pdfluent pypdf pdfminer pdfplumber pikepdf
Rendering ✓ (via pdfminer)
Text extraction
Form fill
Redaction
Encryption ✓ (AES-256)
PDF/A validation
Native deps none none none none libqpdf
Language Rust Python Python Python C++

Building from Source

Requires a Rust toolchain and maturin.

pip install maturin
# Source available to licensed customers — see https://pdfluent.com/pricing
maturin develop --release          # install in current venv
maturin build --release            # build wheel in ./dist/

License

PDFluent is proprietary, commercial software distributed under the PDFluent Commercial License.

Pre-built wheels (pip install pdfluent) are available for evaluation. Production use requires a valid license. See the LICENSE file included in this distribution for full terms.
