
Python SDK for the Dokmatiq DocGen document generation API


Dokmatiq DocGen Python SDK


Python SDK for the Dokmatiq DocGen document generation API. Generate PDF, DOCX, and ODT documents from HTML/Markdown with templates, e-invoicing (ZUGFeRD/XRechnung), digital signatures, and more.

Installation

pip install dokmatiq-docgen

Quick Start

from docgen import DocGen

dg = DocGen(api_key="dk_live_xxx")

# One-liner: HTML to PDF
pdf = dg.html_to_pdf("<h1>Hello World</h1>")

# Markdown to PDF
pdf = dg.markdown_to_pdf("# Report\n\n**Summary**: ...")
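The calls above hand back the generated document. The following is a minimal sketch for writing it to disk, assuming the return value is raw `bytes` (the README does not state the return type). `save_pdf` is a hypothetical helper, not part of the SDK.

```python
from pathlib import Path


def save_pdf(data: bytes, target: Path) -> Path:
    """Write PDF bytes to `target`, refusing data that does not look like a PDF.

    Hypothetical helper: assumes the SDK call returned raw bytes.
    """
    # PDF files start with the magic bytes "%PDF-".
    if not data.startswith(b"%PDF-"):
        raise ValueError("data does not start with a PDF header")
    # Create missing parent directories so the write cannot fail on them.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target
```

Usage would look like `save_pdf(pdf, Path("out/hello.pdf"))`. The header check is a cheap guard against saving an error page or an empty response under a `.pdf` name.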

Builder Pattern

For complex documents, use the fluent builder:

from docgen import DocGen, ColumnDef, TableData, TextAlignment

with DocGen(api_key="dk_live_xxx") as dg:
    pdf = (dg.document()
        .html("<h1>Rechnung {{nr}}</h1>")
        .template("invoice.odt")
        .field("nr", "RE-2026-001")
        .field("datum", "12.04.2026")
        .table("positionen", TableData(
            columns=[
                ColumnDef("Artikel", width=80),
                ColumnDef("Preis", width=30, alignment=TextAlignment.RIGHT),
            ],
            rows=[["Widget", "9.99 €"], ["Gadget", "24.99 €"]],
        ))
        .qr_code("payment", "BCD\n002\n1\nSCT\n...")
        .watermark("ENTWURF")
        .as_pdf()
        .generate())
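The string passed to `.qr_code(...)` above starts like an EPC QR code payload (SEPA credit transfer). The sketch below shows how such a payload is commonly assembled, following the field order of the EPC069-12 guidelines as I understand them. `epc_qr_payload` is a hypothetical helper, the amount is illustrative (the sum of the two table rows), and the README does not confirm which fields the API expects beyond the prefix shown.

```python
from decimal import Decimal


def epc_qr_payload(name: str, iban: str, amount: Decimal,
                   bic: str = "", remittance: str = "") -> str:
    """Build a newline-separated EPC QR payload (hypothetical helper)."""
    lines = [
        "BCD",                  # service tag
        "002",                  # version (BIC is optional in version 002)
        "1",                    # character set: 1 = UTF-8
        "SCT",                  # identification: SEPA credit transfer
        bic,                    # BIC of the beneficiary bank
        name,                   # beneficiary name
        iban.replace(" ", ""),  # IBAN without spaces
        f"EUR{amount:.2f}",     # amount with currency prefix
        "",                     # purpose code (left empty here)
        "",                     # structured remittance reference (left empty)
        remittance,             # unstructured remittance text
    ]
    return "\n".join(lines)


payload = epc_qr_payload(
    name="ACME GmbH",
    iban="DE89 3704 0044 0532 0130 00",
    amount=Decimal("34.98"),  # assumption: 9.99 + 24.99 from the table above
    bic="COBADEFFXXX",
    remittance="RE-2026-001",
)
```

Check the field list against the current EPC guidelines before relying on it in production.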

E-Invoicing (ZUGFeRD / XRechnung)

from docgen import DocGen, InvoiceUnit

with DocGen(api_key="dk_live_xxx") as dg:
    invoice = (dg.invoice()
        .number("RE-2026-001")
        .date("2026-04-12")
        .seller(name="ACME GmbH", street="Musterstr. 1", zip="10115", city="Berlin", vat_id="DE123456789")
        .buyer(name="Kunde AG", street="Kundenweg 5", zip="20095", city="Hamburg")
        .item("Beratung", quantity=8, unit=InvoiceUnit.HOUR, unit_price=120.0)
        .item("Reisekosten", unit_price=250.0)
        .bank(iban="DE89370400440532013000", bic="COBADEFFXXX", holder="ACME GmbH")
        .payment_terms("Zahlbar innerhalb 14 Tagen")
        .build())

    pdf = (dg.document()
        .html("<h1>Rechnung RE-2026-001</h1>")
        .template("invoice.odt")
        .invoice(invoice)
        .as_pdf()
        .generate())
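As a sanity check for the line items above, the net total can be worked out locally. This sketch assumes that an item without `quantity` counts as quantity 1 (the README does not say so) and leaves VAT out, since no rate is given. `Decimal` avoids binary floating-point rounding on money values.

```python
from decimal import Decimal

# (description, quantity, unit price) taken from the invoice example above.
items = [
    ("Beratung", Decimal("8"), Decimal("120.00")),
    ("Reisekosten", Decimal("1"), Decimal("250.00")),  # quantity 1 is an assumption
]


def net_total(line_items) -> Decimal:
    """Sum quantity * unit price over all line items."""
    return sum((qty * price for _, qty, price in line_items), Decimal("0"))


print(net_total(items))  # 1210.00
```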

PDF Operations

from pathlib import Path
from docgen import DocGen

with DocGen(api_key="dk_live_xxx") as dg:
    # Merge PDFs
    merged = dg.merge_pdfs([Path("part1.pdf"), Path("part2.pdf")])

    # Fill form fields
    filled = dg.fill_form(Path("form.pdf"), {"name": "Max", "date": "12.04.2026"})

    # Sign PDF
    signed = dg.sign_pdf(Path("doc.pdf"), "my-cert.p12", "password123")

    # Extract text
    text = dg.pdf_tools.extract_text(Path("document.pdf"))

    # Preview page as image
    png = dg.preview.preview_page(Path("document.pdf"), page=1, dpi=300)

Async Support

import asyncio
from docgen import AsyncDocGen

async def main():
    async with AsyncDocGen(api_key="dk_live_xxx") as dg:
        # Parallel generation
        pdfs = await asyncio.gather(
            dg.html_to_pdf("<h1>Doc A</h1>"),
            dg.html_to_pdf("<h1>Doc B</h1>"),
        )

asyncio.run(main())
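`asyncio.gather` returns results in the order the awaitables were passed, not the order in which they finish. The sketch below shows this with a stand-in coroutine (`fake_render` is hypothetical and makes no API call), so the `pdfs` list above can be indexed by position.

```python
import asyncio


async def fake_render(name: str, delay: float) -> str:
    """Stand-in for a document call; sleeps instead of contacting the API."""
    await asyncio.sleep(delay)
    return f"{name}.pdf"


async def main() -> list[str]:
    # doc_a finishes last but still comes first in the result list.
    return await asyncio.gather(
        fake_render("doc_a", 0.02),
        fake_render("doc_b", 0.01),
    )


results = asyncio.run(main())
print(results)  # ['doc_a.pdf', 'doc_b.pdf']
```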

Async Jobs

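from docgen import DocGen

with DocGen(api_key="dk_live_xxx") as dg:
    # `request` is a document request prepared beforehand (not shown here)
    # Submit async job
    job = dg.documents.generate_async(request)

    # Poll until complete; poll_interval and timeout are in seconds
    pdf = dg.documents.wait_for_job(job.job_id, poll_interval=2.0, timeout=120.0)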
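To illustrate what `poll_interval` and `timeout` mean in a polling loop, here is a generic sketch. It is not the SDK's implementation: `poll_until` and its `fetch` callback are hypothetical, and the real `wait_for_job` may differ in how it handles failures or the final poll.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until(fetch: Callable[[], Optional[T]], poll_interval: float,
               timeout: float, sleep=time.sleep, clock=time.monotonic) -> T:
    """Call `fetch` until it returns a non-None result or `timeout` elapses."""
    deadline = clock() + timeout
    while True:
        result = fetch()
        if result is not None:
            return result
        # Give up if the next poll would start after the deadline.
        if clock() + poll_interval > deadline:
            raise TimeoutError(f"no result within {timeout} seconds")
        sleep(poll_interval)
```

With `poll_interval=2.0` and `timeout=120.0`, a loop of this shape would check at most about 60 times before raising `TimeoutError`.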

Configuration

from docgen import DocGen, RetryPolicy

dg = DocGen(
    api_key="dk_live_xxx",
    base_url="https://api.dokmatiq.com",
    timeout=60.0,
    retry=RetryPolicy(
        max_retries=5,
        initial_delay=1.0,
        backoff_multiplier=2.0,
    ),
    validate_mode="strict",  # "strict" | "warn" | None
)
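To get a feel for the retry settings above, the sketch below computes the waits a plain exponential backoff would produce. It assumes the delay before retry n is `initial_delay * backoff_multiplier ** n` with no jitter and no upper cap; the README does not document the exact schedule, and `backoff_delays` is a hypothetical helper.

```python
def backoff_delays(max_retries: int, initial_delay: float,
                   backoff_multiplier: float) -> list[float]:
    """Return the wait in seconds before each retry under plain exponential backoff."""
    return [initial_delay * backoff_multiplier ** attempt
            for attempt in range(max_retries)]


delays = backoff_delays(max_retries=5, initial_delay=1.0, backoff_multiplier=2.0)
print(delays)       # [1.0, 2.0, 4.0, 8.0, 16.0]
print(sum(delays))  # 31.0
```

Under these assumptions, a call that fails every time would spend 31 seconds waiting, on top of up to six request timeouts.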

Error Handling

from docgen import DocGen, ValidationError, RateLimitError, AuthenticationError

dg = DocGen(api_key="dk_live_xxx")

try:
    pdf = dg.html_to_pdf("<h1>Hello</h1>")
except AuthenticationError:
    print("Invalid API key")
except ValidationError as e:
    print(f"Validation failed: {e}")
    print(f"Field errors: {e.field_errors}")
    print(f"Hint: {e.hint}")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
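Instead of only printing `retry_after`, a caller may want to wait and try again. The sketch below shows one way to do that. It uses a stand-in exception class so it runs without the SDK; `DemoRateLimitError` and `call_with_rate_limit_retry` are hypothetical names, and whether the SDK's own `RetryPolicy` already retries rate-limited calls is not stated in this README.

```python
import time


class DemoRateLimitError(Exception):
    """Stand-in for the SDK's RateLimitError; only `retry_after` is modeled."""

    def __init__(self, retry_after: float):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def call_with_rate_limit_retry(func, max_attempts: int = 3,
                               exc_type=DemoRateLimitError, sleep=time.sleep):
    """Call `func`, sleeping for `retry_after` seconds between rate-limited attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except exc_type as exc:
            # Re-raise once the attempt budget is used up.
            if attempt == max_attempts:
                raise
            sleep(exc.retry_after)
```

With the real SDK, one would pass `exc_type=RateLimitError` and something like `lambda: dg.html_to_pdf("<h1>Hello</h1>")` as `func`.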

Webhook Verification

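from docgen import verify_webhook

# In your webhook handler:
# `request` is your web framework's incoming request object,
# `secret` is the webhook secret you configured for the callback.
payload = verify_webhook(request.body, request.headers["X-DocGen-Signature"], secret)
print(f"Job {payload.job_id} completed: {payload.status}")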
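For readers unfamiliar with signed webhooks, the sketch below shows how signature checks of this kind typically work. It assumes a hex-encoded HMAC-SHA256 over the raw request body, which is a common convention but is not confirmed by this README; the scheme `verify_webhook` actually uses may differ. `sign` and `is_valid_signature` are hypothetical helpers.

```python
import hashlib
import hmac


def sign(body: bytes, secret: str) -> str:
    """Return the hex HMAC-SHA256 of `body` (assumed scheme, see lead-in)."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def is_valid_signature(body: bytes, signature: str, secret: str) -> bool:
    """Compare the expected and received signatures in constant time."""
    return hmac.compare_digest(sign(body, secret), signature)
```

The constant-time comparison matters: a plain `==` can leak how many leading characters matched through timing differences.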

Excel Workbooks

from docgen import DocGen

with DocGen(api_key="dk_live_xxx") as dg:
    # Generate XLSX from structured data
    xlsx = dg.excel.generate({
        "sheets": [{
            "name": "Sales",
            "columns": [
                {"header": "Month", "width": 15},
                {"header": "Revenue", "width": 12, "format": "#,##0.00 €"},
            ],
            "rows": [
                {"values": ["January", 42500.0]},
                {"values": ["February", 38900.0]},
            ],
        }],
    })

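    # CSV → XLSX (`csv_content` holds CSV data loaded beforehand, not shown here)
    from_csv = dg.excel.from_csv(csv_content)

    # XLSX → JSON (`excel_base64` holds a base64-encoded workbook, not shown here)
    data = dg.excel.to_json(excel_base64)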
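If the source data is CSV but column widths or formats need tuning, the structured payload shown above can be built locally first. This sketch converts CSV text into the same `sheets` / `columns` / `rows` shape; `csv_to_workbook` is a hypothetical helper, and the fixed column width and the numeric coercion are illustrative assumptions rather than API requirements.

```python
import csv
import io


def _coerce(cell: str):
    """Turn numeric-looking cells into floats, leave everything else as text."""
    try:
        return float(cell)
    except ValueError:
        return cell


def csv_to_workbook(csv_text: str, sheet_name: str = "Sheet1",
                    width: int = 15) -> dict:
    """Build a workbook payload in the shape used by the example above."""
    table = list(csv.reader(io.StringIO(csv_text)))
    if not table:
        raise ValueError("CSV input is empty")
    header, *rows = table
    return {
        "sheets": [{
            "name": sheet_name,
            "columns": [{"header": h, "width": width} for h in header],
            "rows": [{"values": [_coerce(c) for c in row]} for row in rows],
        }],
    }


workbook = csv_to_workbook("Month,Revenue\nJanuary,42500.0\nFebruary,38900.0\n", "Sales")
```

The resulting dictionary could then be passed to `dg.excel.generate(workbook)`.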

Receipt Recognition (AI)

Extract structured data from receipts, tickets, and invoices using AI vision:

from pathlib import Path
from docgen import DocGen

with DocGen(api_key="dk_live_xxx") as dg:
    # Extract data from a receipt image
    file_bytes = Path("kassenbeleg.jpg").read_bytes()
    result = dg.receipts.extract(file_bytes, "kassenbeleg.jpg", "image/jpeg")
    print(result["receiptData"]["totalAmount"])   # 42.50
    print(result["receiptData"]["receiptType"])    # "cash_receipt"
    print(result["receiptData"]["skr03Account"])   # "4650"

    # Export as DATEV-compatible CSV (named to avoid shadowing the stdlib `csv` module)
    datev_csv = dg.receipts.export_csv([result["receiptData"]])

    # Async extraction with webhook
    job = dg.receipts.extract_async(
        file_bytes, "beleg.jpg", "image/jpeg",
        callback_url="https://my-app.com/webhooks/receipts",
        callback_secret="my-secret",
    )

Note: Requires AI processing consent in the Developer Portal settings (GDPR).
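The extract calls above take the file content, a file name, and a MIME type. A small sketch for deriving all three from a path with the standard library follows; `receipt_upload_args` is a hypothetical helper, and which MIME types the API accepts is not listed in this README.

```python
import mimetypes
from pathlib import Path


def receipt_upload_args(path: Path) -> tuple[bytes, str, str]:
    """Return (file bytes, file name, MIME type) for a receipt file."""
    # Guess the MIME type from the file extension, e.g. ".jpg" -> "image/jpeg".
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None:
        raise ValueError(f"cannot determine MIME type for {path.name}")
    return path.read_bytes(), path.name, mime_type
```

With this helper, the call becomes `dg.receipts.extract(*receipt_upload_args(Path("kassenbeleg.jpg")))`.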

Sub-Clients

Client access    Description
dg.documents     Document generation, compose, async jobs
dg.templates     Template upload, list, delete
dg.fonts         Font upload, list, delete
dg.pdf_forms     Form field inspection and filling
dg.signatures    Certificate management, sign, verify
dg.pdf_tools     Merge, split, metadata, PDF/A, rotate
dg.preview       Page rendering as images
dg.zugferd       ZUGFeRD embed, extract, validate
dg.xrechnung     XRechnung generate, parse, validate, transform
dg.excel         Excel workbook generation and conversion
dg.receipts      AI-powered receipt/ticket extraction and export

Requirements

  • Python 3.10+
  • httpx >= 0.27

License

MIT
