Dokmatiq DocGen Python SDK

Python SDK for the Dokmatiq DocGen document generation API. Generate PDF, DOCX, and ODT documents from HTML/Markdown with templates, e-invoicing (ZUGFeRD/XRechnung), digital signatures, and more.

Installation

pip install dokmatiq-docgen

Quick Start

from docgen import DocGen

dg = DocGen(api_key="dk_live_xxx")

# One-liner: HTML to PDF
pdf = dg.html_to_pdf("<h1>Hello World</h1>")

# Markdown to PDF
pdf = dg.markdown_to_pdf("# Report\n\n**Summary**: ...")
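The snippets in this README hard-code a placeholder key for brevity. A minimal sketch of reading the key from the environment instead is shown here; the variable name `DOCGEN_API_KEY` and the helper `load_api_key` are assumptions made for this example and are not defined by the SDK.

```python
import os
from collections.abc import Mapping


def load_api_key(env: Mapping[str, str] = os.environ, name: str = "DOCGEN_API_KEY") -> str:
    """Return the API key stored under `name`, failing loudly if it is missing."""
    # `name` is a hypothetical variable name chosen for this sketch.
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable to your API key")
    return key


# Usage (requires the SDK and a real key):
#   dg = DocGen(api_key=load_api_key())
```

Passing the mapping as a parameter keeps the helper easy to test without touching the real environment.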

Builder Pattern

For complex documents, use the fluent builder:

from docgen import DocGen, ColumnDef, TableData, TextAlignment

with DocGen(api_key="dk_live_xxx") as dg:
    pdf = (dg.document()
        .html("<h1>Rechnung {{nr}}</h1>")    # "Rechnung" = invoice
        .template("invoice.odt")
        .field("nr", "RE-2026-001")
        .field("datum", "12.04.2026")         # "datum" = date
        .table("positionen", TableData(       # "positionen" = line items
            columns=[
                ColumnDef("Artikel", width=80),    # "Artikel" = item
                ColumnDef("Preis", width=30, alignment=TextAlignment.RIGHT),  # "Preis" = price
            ],
            rows=[["Widget", "9.99 €"], ["Gadget", "24.99 €"]],
        ))
        .qr_code("payment", "BCD\n002\n1\nSCT\n...")
        .watermark("ENTWURF")                 # "ENTWURF" = draft
        .as_pdf()
        .generate())
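The QR code payload in the builder example begins with `BCD`, `002`, `1`, `SCT`, which matches the header of an EPC payment QR code for a SEPA credit transfer. The following is a sketch of assembling such a payload, based on my reading of the EPC QR code guidelines; the helper `epc_qr_payload` is hypothetical, is not part of the SDK, and should be checked against the specification before production use.

```python
def epc_qr_payload(name: str, iban: str, amount_eur: float,
                   remittance: str = "", bic: str = "") -> str:
    """Build an EPC QR code payload (version 002, UTF-8, SEPA credit transfer)."""
    lines = [
        "BCD",                    # service tag
        "002",                    # version (BIC is optional in version 002)
        "1",                      # character set: 1 = UTF-8
        "SCT",                    # identification: SEPA credit transfer
        bic,                      # BIC of the beneficiary bank (may be empty)
        name,                     # beneficiary name
        iban.replace(" ", ""),    # beneficiary IBAN without spaces
        f"EUR{amount_eur:.2f}",   # amount with two decimals
        "",                       # purpose code (unused in this sketch)
        "",                       # structured reference (unused in this sketch)
        remittance,               # unstructured remittance text
    ]
    return "\n".join(lines)


payload = epc_qr_payload(
    name="ACME GmbH",
    iban="DE89 3704 0044 0532 0130 00",
    amount_eur=34.98,
    remittance="RE-2026-001",
    bic="COBADEFFXXX",
)
```

The resulting string could then be passed as the second argument to `.qr_code("payment", payload)`.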

E-Invoicing (ZUGFeRD / XRechnung)

from docgen import DocGen, InvoiceUnit

with DocGen(api_key="dk_live_xxx") as dg:
    invoice = (dg.invoice()
        .number("RE-2026-001")
        .date("2026-04-12")
        .seller(name="ACME GmbH", street="Musterstr. 1", zip="10115",
                city="Berlin", vat_id="DE123456789")
        .buyer(name="Kunde AG", street="Kundenweg 5", zip="20095", city="Hamburg")
        .item("Beratung", quantity=8, unit=InvoiceUnit.HOUR, unit_price=120.0)  # "Beratung" = consulting
        .item("Reisekosten", unit_price=250.0)  # "Reisekosten" = travel expenses
        .bank(iban="DE89370400440532013000", bic="COBADEFFXXX", holder="ACME GmbH")
        .payment_terms("Zahlbar innerhalb 14 Tagen")  # "payable within 14 days"
        .build())

    pdf = (dg.document()
        .html("<h1>Rechnung RE-2026-001</h1>")
        .template("invoice.odt")
        .invoice(invoice)
        .as_pdf()
        .generate())

PDF Operations

from pathlib import Path
from docgen import DocGen

with DocGen(api_key="dk_live_xxx") as dg:
    # Merge PDFs
    merged = dg.merge_pdfs([Path("part1.pdf"), Path("part2.pdf")])

    # Fill form fields
    filled = dg.fill_form(Path("form.pdf"), {"name": "Max", "date": "12.04.2026"})

    # Sign PDF
    signed = dg.sign_pdf(Path("doc.pdf"), "my-cert.p12", "password123")

    # Extract text
    text = dg.pdf_tools.extract_text(Path("document.pdf"))

    # Preview page as image
    png = dg.preview.preview_page(Path("document.pdf"), page=1, dpi=300)

Async Support

import asyncio
from docgen import AsyncDocGen

async def main():
    async with AsyncDocGen(api_key="dk_live_xxx") as dg:
        # Parallel generation
        pdfs = await asyncio.gather(
            dg.html_to_pdf("<h1>Doc A</h1>"),
            dg.html_to_pdf("<h1>Doc B</h1>"),
        )

asyncio.run(main())
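`asyncio.gather` starts every coroutine at once, which may be more parallelism than you want for a large batch. One common way to cap the number of calls in flight is an `asyncio.Semaphore`, sketched below. `gather_limited` is a hypothetical helper, and `fake_render` merely stands in for a rendering coroutine such as `dg.html_to_pdf` (assumed here to return bytes) so that the example runs without the SDK.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def gather_limited(
    render: Callable[[str], Awaitable[bytes]],
    sources: list[str],
    limit: int = 4,
) -> list[bytes]:
    """Run `render` over `sources` with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(source: str) -> bytes:
        async with semaphore:
            return await render(source)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(run_one(s) for s in sources))


async def fake_render(html: str) -> bytes:
    """Stand-in for a real rendering coroutine; returns the input as bytes."""
    await asyncio.sleep(0)
    return html.encode("utf-8")


results = asyncio.run(gather_limited(fake_render, ["<h1>A</h1>", "<h1>B</h1>"], limit=1))
```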

Async Jobs

from docgen import DocGen

with DocGen(api_key="dk_live_xxx") as dg:
    # Submit async job; `request` is a document request prepared beforehand
    job = dg.documents.generate_async(request)

    # Poll until complete (with timeout)
    pdf = dg.documents.wait_for_job(job.job_id, poll_interval=2.0, timeout=120.0)
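To make the `poll_interval` and `timeout` parameters concrete, here is a generic polling loop. It is an illustration of the pattern only, not the SDK's implementation of `wait_for_job`; the helper `poll_until` and its `check`, `sleep`, and `clock` parameters are hypothetical.

```python
import time
from collections.abc import Callable
from typing import Optional, TypeVar

T = TypeVar("T")


def poll_until(
    check: Callable[[], Optional[T]],
    poll_interval: float = 2.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call `check` every `poll_interval` seconds until it returns a non-None value."""
    deadline = clock() + timeout
    while True:
        result = check()
        if result is not None:
            return result
        # Give up if the next poll would start after the deadline.
        if clock() + poll_interval > deadline:
            raise TimeoutError(f"No result within {timeout} seconds")
        sleep(poll_interval)
```

Using a monotonic clock keeps the deadline stable even if the system time changes while waiting.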

Configuration

from docgen import DocGen, RetryPolicy

dg = DocGen(
    api_key="dk_live_xxx",
    base_url="https://api.dokmatiq.com",
    timeout=60.0,
    retry=RetryPolicy(
        max_retries=5,
        initial_delay=1.0,
        backoff_multiplier=2.0,
    ),
    validate_mode="strict",  # "strict" | "warn" | None
)
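The README does not spell out how `RetryPolicy` spaces its retries. Assuming the usual exponential backoff reading (the delay before retry n is `initial_delay * backoff_multiplier ** n`, with no jitter or cap), the configuration above would produce the schedule computed below. `backoff_delays` is a hypothetical helper for illustration.

```python
def backoff_delays(max_retries: int, initial_delay: float, backoff_multiplier: float) -> list[float]:
    """Return the wait in seconds before each retry under plain exponential backoff."""
    # Assumption: no jitter and no maximum delay are applied.
    return [initial_delay * backoff_multiplier ** attempt for attempt in range(max_retries)]


delays = backoff_delays(max_retries=5, initial_delay=1.0, backoff_multiplier=2.0)
total_wait = sum(delays)
```

Under these assumptions a request that fails every time would wait 31 seconds in total before giving up, which is worth comparing against the 60-second `timeout`.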

Error Handling

from docgen import DocGen, ValidationError, RateLimitError, AuthenticationError

dg = DocGen(api_key="dk_live_xxx")

try:
    pdf = dg.html_to_pdf("<h1>Hello</h1>")
except AuthenticationError:
    print("Invalid API key")
except ValidationError as e:
    print(f"Validation failed: {e}")
    print(f"Field errors: {e.field_errors}")
    print(f"Hint: {e.hint}")
except RateLimitError as e:
    print(f"Rate limited. Retry after {e.retry_after}s")
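If you would rather wait and retry than report a rate limit, the `retry_after` value can drive a small wrapper. This is a sketch only: the README does not say whether the built-in `RetryPolicy` already handles rate limits, `call_with_rate_limit_retry` is a hypothetical helper, and the `RateLimitError` class below is a stand-in so that the example runs without the SDK.

```python
import time
from collections.abc import Callable
from typing import TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Stand-in for docgen.RateLimitError so the sketch runs without the SDK."""

    def __init__(self, retry_after: float) -> None:
        super().__init__(f"retry after {retry_after}s")
        self.retry_after = retry_after


def call_with_rate_limit_retry(
    call: Callable[[], T],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, sleeping for `retry_after` seconds after each rate-limit error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except RateLimitError as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            sleep(exc.retry_after)
    raise ValueError("max_attempts must be at least 1")


# Usage with the real client (not executed here):
#   pdf = call_with_rate_limit_retry(lambda: dg.html_to_pdf("<h1>Hello</h1>"))
```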

Webhook Verification

from docgen import verify_webhook

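# In your webhook handler (`request` is your framework's request object,
# `secret` is the callback secret you registered for the job):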
payload = verify_webhook(request.body, request.headers["X-DocGen-Signature"], secret)
print(f"Job {payload.job_id} completed: {payload.status}")
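The README does not document the signature scheme behind `verify_webhook`. Many webhook systems sign the raw request body with HMAC-SHA256 and send the hex digest in a header; the sketch below shows that general technique under exactly this assumption. `sign_body` and `signature_matches` are hypothetical helpers, not SDK functions, and the real scheme may differ.

```python
import hashlib
import hmac


def sign_body(body: bytes, secret: str) -> str:
    """Return the hex HMAC-SHA256 of `body` keyed with `secret` (assumed scheme)."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def signature_matches(body: bytes, signature: str, secret: str) -> bool:
    """Check a received signature against the one computed locally."""
    expected = sign_body(body, secret)
    # compare_digest runs in constant time, so it does not leak where the values differ.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


body = b'{"job_id": "j1", "status": "completed"}'
signature = sign_body(body, "my-secret")
```

Whatever the actual scheme, verify against the raw bytes of the request body rather than a re-serialized copy, since re-serialization can change whitespace or key order.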

Excel Workbooks

from docgen import DocGen

with DocGen(api_key="dk_live_xxx") as dg:
    # Generate XLSX from structured data
    xlsx = dg.excel.generate({
        "sheets": [{
            "name": "Sales",
            "columns": [
                {"header": "Month", "width": 15},
                {"header": "Revenue", "width": 12, "format": "#,##0.00 €"},
            ],
            "rows": [
                {"values": ["January", 42500.0]},
                {"values": ["February", 38900.0]},
            ],
        }],
    })

    # CSV → XLSX (`csv_content` holds the CSV data to convert)
    from_csv = dg.excel.from_csv(csv_content)

    # XLSX → JSON (`excel_base64` holds a Base64-encoded workbook)
    data = dg.excel.to_json(excel_base64)
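Application data often arrives as a list of dicts rather than in the `sheets` layout shown above. A minimal sketch of converting one into the other follows; `sheet_from_records` is a hypothetical helper, and it only uses the keys that appear in the example (`name`, `columns`, `header`, `width`, `rows`, `values`).

```python
def sheet_from_records(name: str, records: list[dict[str, object]], width: int = 15) -> dict:
    """Build one sheet entry in the layout shown above from a list of dicts."""
    # Column order follows the key order of the first record.
    headers = list(records[0]) if records else []
    return {
        "name": name,
        "columns": [{"header": header, "width": width} for header in headers],
        "rows": [{"values": [record.get(header) for header in headers]} for record in records],
    }


records = [
    {"Month": "January", "Revenue": 42500.0},
    {"Month": "February", "Revenue": 38900.0},
]
workbook = {"sheets": [sheet_from_records("Sales", records)]}

# Usage with the real client (not executed here):
#   xlsx = dg.excel.generate(workbook)
```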

Receipt Recognition (AI)

Extract structured data from receipts, tickets, and invoices using AI vision:

from pathlib import Path
from docgen import DocGen

with DocGen(api_key="dk_live_xxx") as dg:
    # Extract data from a receipt image
    file_bytes = Path("kassenbeleg.jpg").read_bytes()
    result = dg.receipts.extract(file_bytes, "kassenbeleg.jpg", "image/jpeg")
    print(result["receiptData"]["totalAmount"])   # 42.50
    print(result["receiptData"]["receiptType"])    # "cash_receipt"
    print(result["receiptData"]["skr03Account"])   # "4650"

    # Export as DATEV-compatible CSV (named so it does not shadow the stdlib `csv` module)
    datev_csv = dg.receipts.export_csv([result["receiptData"]])

    # Async extraction with webhook
    job = dg.receipts.extract_async(
        file_bytes, "beleg.jpg", "image/jpeg",
        callback_url="https://my-app.com/webhooks/receipts",
        callback_secret="my-secret",
    )

Note: Receipt recognition requires AI processing consent (GDPR), which you grant in the Developer Portal settings.

Sub-Clients

Client access    Description
dg.documents     Document generation, compose, async jobs
dg.templates     Template upload, list, delete
dg.fonts         Font upload, list, delete
dg.pdf_forms     Form field inspection and filling
dg.signatures    Certificate management, sign, verify
dg.pdf_tools     Merge, split, metadata, PDF/A, rotate
dg.preview       Page rendering as images
dg.zugferd       ZUGFeRD embed, extract, validate
dg.xrechnung     XRechnung generate, parse, validate, transform
dg.excel         Excel workbook generation and conversion
dg.receipts      AI-powered receipt/ticket extraction and export

Requirements

  • Python 3.10+
  • httpx >= 0.27

License

MIT
