Enterprise PDF SDK — render, extract, annotate, sign, and validate PDFs. Pure Rust, zero system dependencies.
pdfluent
Enterprise PDF SDK for Python — built on a pure-Rust stack, zero system dependencies.
Render pages, extract text, fill forms, annotate, redact, encrypt, merge, and validate PDF/A — all from a single pip install.
Installation
pip install pdfluent
# Optional extras
pip install "pdfluent[pillow]"  # PIL Image support (quoted so shells like zsh do not glob the brackets)
pip install "pdfluent[numpy]"   # NumPy array support
Requires Python ≥ 3.8. Pre-built wheels are available for Linux (x86_64, aarch64), macOS (x86_64, arm64), and Windows (x86_64).
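Because the Pillow and NumPy integrations are optional extras, a script may want to check which of them are importable before relying on image or array output. The following is a minimal sketch using only the standard library; the helper name `available_extras` is hypothetical and not part of pdfluent.

```python
from importlib.util import find_spec


def available_extras() -> dict:
    """Report which optional integrations are importable in this environment."""
    # Pillow is imported as "PIL"; NumPy is imported as "numpy".
    return {
        "pillow": find_spec("PIL") is not None,
        "numpy": find_spec("numpy") is not None,
    }


print(available_extras())
```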
Quick Start
from pdfluent import Document
# Open, inspect, render
with Document("invoice.pdf") as doc:
    print(f"{doc.page_count} pages — {doc.metadata.title}")
    img = doc[0].render(dpi=150)
    img.save("page_0.png")  # requires Pillow
# Extract text
doc = Document("report.pdf")
for page in doc:
    print(page.extract_text())
# Fill a form field and save
doc = Document("form.pdf")
doc.set_form_field("Name", "Jane Doe")
doc.save("form_filled.pdf")
# Search-and-redact
doc = Document("contract.pdf")
report = doc.redact_text("Confidential")
print(f"Redacted {report.areas_redacted} areas on {report.pages_affected} pages")
doc.save("contract_redacted.pdf")
# PDF/A validation
from pdfluent import validate_pdfa
report = validate_pdfa("archive.pdf")
if report.is_compliant:
    print(f"✓ {report.pdfa_level} compliant")
else:
    for issue in report.issues:
        print(f"[{issue.severity}] {issue.rule}: {issue.message}")
# Merge PDFs
from pdfluent import merge_pdfs
merge_pdfs(["a.pdf", "b.pdf", "c.pdf"], "merged.pdf")
# Encrypt / decrypt
doc = Document("sensitive.pdf")
doc.encrypt("sensitive_enc.pdf", password="s3cr3t")
from pdfluent import decrypt_pdf
decrypt_pdf("sensitive_enc.pdf", "sensitive_dec.pdf", password="s3cr3t")
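The search-and-redact snippet above handles a single term. When several terms need redacting, the call can be wrapped in a small loop, as in the sketch below. It relies only on `redact_text(term)` and the `areas_redacted` field shown in the Quick Start; the helper `redact_terms` and the `_FakeDoc` stand-in are hypothetical and exist only so the example runs without a PDF or a pdfluent install.

```python
from types import SimpleNamespace


def redact_terms(doc, terms):
    """Redact each term in turn and return {term: areas_redacted}."""
    results = {}
    for term in terms:
        report = doc.redact_text(term)
        results[term] = report.areas_redacted
    return results


class _FakeDoc:
    """Stand-in for a pdfluent Document (hypothetical, for demonstration only)."""

    def __init__(self, text):
        self.text = text

    def redact_text(self, term, page=None):
        # Count plain substring matches to imitate a redaction report.
        hits = self.text.count(term)
        return SimpleNamespace(areas_redacted=hits, pages_affected=1 if hits else 0)


demo = _FakeDoc("Confidential draft. Confidential pricing. Internal only.")
counts = redact_terms(demo, ["Confidential", "Internal", "Secret"])
print(counts)
```

Only `areas_redacted` is collected per term here: summing `pages_affected` across terms would count a page once for every term that appears on it.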
Features
| Feature | Description |
|---|---|
| Render | Pages to RGBA pixels, PIL Images, or NumPy arrays at any DPI |
| Text extraction | Plain text or structured TextBlock/TextSpan with position |
| Text search | Find pages containing a query string |
| Forms (AcroForm) | Read and fill text, checkbox, and dropdown fields |
| Annotations | Read existing annotations; add highlights and free-text notes |
| Redaction | Search-and-redact: black-box all occurrences of a string |
| Encryption | AES-256 (PDF 2.0) encrypt/decrypt with user + owner passwords |
| Merge / split | Merge multiple PDFs; split into individual pages (via page slicing) |
| PDF/A validation | Validate against PDF/A-1B, 2B, 3B with issue-level reporting |
| Metadata | Read title, author, subject, keywords, creator, producer |
| Bookmarks | Traverse the document outline tree |
| Thumbnails | Fast downscaled preview images |
API Overview
Document(source, password=None)
Opens a PDF from a file path (str) or raw bytes.
doc = Document("file.pdf") # from path
doc = Document(open("file.pdf","rb").read()) # from bytes
doc = Document("encrypted.pdf", password="pw")
Properties: page_count, metadata, bookmarks
Methods: render_all(dpi), search(query), extract_text(page_num), save(path),
get_form_fields(), set_form_field(name, value), get_annotations(page),
add_annotation(page, type, rect, content), redact_text(term, page=None),
encrypt(path, password), decrypt(path, password)
Protocols: len(doc), doc[0], for page in doc, with Document(...) as doc
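For readers less familiar with Python's protocols, the four forms listed above correspond to the special methods `__len__`, `__getitem__`, `__iter__`, and `__enter__`/`__exit__`. The toy class below illustrates that mapping; it is not pdfluent's implementation, and the name `PageList` is hypothetical.

```python
class PageList:
    """Toy container showing the four protocols (not pdfluent code)."""

    def __init__(self, pages):
        self._pages = list(pages)
        self.closed = False

    def __len__(self):                 # len(doc)
        return len(self._pages)

    def __getitem__(self, index):      # doc[0]
        return self._pages[index]

    def __iter__(self):                # for page in doc
        return iter(self._pages)

    def __enter__(self):               # with ... as doc
        return self

    def __exit__(self, exc_type, exc, tb):
        self.closed = True             # mark the container as released on exit
        return False                   # do not swallow exceptions


with PageList(["cover", "body", "index"]) as doc:
    print(len(doc), doc[0], [page for page in doc])
print(doc.closed)
```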
Page
Properties: index, width, height, rotation, geometry
Methods: render(dpi, width, height, background), thumbnail(max_dimension),
extract_text(), extract_text_blocks()
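PDF page sizes are conventionally expressed in points (1/72 inch). Assuming `width` and `height` are reported in points, which this README does not state explicitly, the pixel size of a render at a given DPI can be estimated as follows. The helper `pixel_size` is hypothetical, and the renderer's actual rounding may differ.

```python
POINTS_PER_INCH = 72  # default PDF user-space unit is 1/72 inch


def pixel_size(width_pt: float, height_pt: float, dpi: float) -> tuple:
    """Estimate rendered pixel dimensions for a page measured in points."""
    scale = dpi / POINTS_PER_INCH
    return round(width_pt * scale), round(height_pt * scale)


# US Letter is 612 x 792 points.
print(pixel_size(612, 792, 150))
```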
RenderedImage
Properties: width, height, pixels (raw RGBA bytes)
Methods: to_pil(), to_numpy(), save(path)
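When neither Pillow nor NumPy is installed, the raw `pixels` buffer can still be inspected directly. The sketch below assumes the buffer is tightly packed, row-major, with four bytes per pixel and no row padding; the README says only "raw RGBA bytes", so that layout is an assumption. The helper `pixel_at` is hypothetical.

```python
def pixel_at(pixels: bytes, width: int, height: int, x: int, y: int) -> tuple:
    """Return (r, g, b, a) at column x, row y of a packed RGBA buffer."""
    if len(pixels) != width * height * 4:
        raise ValueError("buffer size does not match width * height * 4")
    if not (0 <= x < width and 0 <= y < height):
        raise IndexError("pixel coordinates out of range")
    offset = (y * width + x) * 4
    return tuple(pixels[offset:offset + 4])


# Synthetic 2x2 image: red, green on the first row; blue, white on the second.
buf = bytes([
    255, 0, 0, 255,     0, 255, 0, 255,
    0, 0, 255, 255,     255, 255, 255, 255,
])
print(pixel_at(buf, 2, 2, 0, 1))
```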
Module-level functions
| Function | Description |
|---|---|
| open_pdf(path, password=None) | Alias for Document(path) |
| merge_pdfs(paths, output) | Merge a list of PDFs |
| validate_pdfa(path) → ComplianceReport | Run PDF/A validation |
| decrypt_pdf(input, output, password) | Decrypt to a new file |
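A report returned by `validate_pdfa` can be condensed into a one-line summary for logs or CI output. The sketch below uses only the fields shown in the Quick Start (`is_compliant`, `pdfa_level`, `issues`, `severity`); the helper `summarize` is hypothetical, and the stand-in report, including its severity and rule values, is made up so the example runs without pdfluent.

```python
from collections import Counter
from types import SimpleNamespace


def summarize(report) -> str:
    """One-line summary of a PDF/A report, with issue counts per severity."""
    if report.is_compliant:
        return f"{report.pdfa_level}: compliant"
    by_severity = Counter(issue.severity for issue in report.issues)
    parts = ", ".join(f"{n} {sev}" for sev, n in sorted(by_severity.items()))
    return f"not compliant ({parts})"


# Hypothetical stand-in report; severities and rule names are invented.
fake = SimpleNamespace(
    is_compliant=False,
    pdfa_level=None,
    issues=[
        SimpleNamespace(severity="error", rule="rule-a", message="..."),
        SimpleNamespace(severity="warning", rule="rule-b", message="..."),
        SimpleNamespace(severity="error", rule="rule-c", message="..."),
    ],
)
print(summarize(fake))
```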
Comparison
| | pdfluent | pypdf | pdfminer | pdfplumber | pikepdf |
|---|---|---|---|---|---|
| Rendering | ✓ | – | – | ✓ (via pdfminer) | – |
| Text extraction | ✓ | ✓ | ✓ | ✓ | – |
| Form fill | ✓ | ✓ | – | – | ✓ |
| Redaction | ✓ | – | – | – | ✓ |
| Encryption | ✓ (AES-256) | ✓ | – | – | ✓ |
| PDF/A validation | ✓ | – | – | – | – |
| Native deps | none | none | none | none | libqpdf |
| Language | Rust | Python | Python | Python | C++ |
Building from Source
Requires a Rust toolchain and maturin.
pip install maturin
# Source available to licensed customers — see https://pdfluent.com/pricing
maturin develop --release # install in current venv
maturin build --release # build wheel in ./dist/
License
PDFluent is proprietary, commercial software distributed under the PDFluent Commercial License.
- Pricing & licensing: https://pdfluent.com/pricing
- Commercial enquiries: hello@pdfluent.com
Pre-built wheels (pip install pdfluent) are available for evaluation. Production use requires a valid license. See the LICENSE file included in this distribution for full terms.