Official Python SDK for the PDFCanon API

PDFCanon Python SDK

Official Python SDK for the PDFCanon API — normalize, sanitize, and validate PDF files at scale. No third-party dependencies; requires Python 3.9+.

Requirements

Python 3.9 or later
No external dependencies (uses only the standard library)

Installation

pip install pdfcanon

Authentication

Obtain an API key from the PDFCanon portal. Pass it directly or set the PDFCANON_API_KEY environment variable:

export PDFCANON_API_KEY="pdfn_your_key_here"

The SDK sends the key as X-Api-Key: pdfn_… in every request. If you call the REST API directly, use the same header:

curl -H "X-Api-Key: pdfn_your_key_here" https://api.pdfcanon.com/api/submissions
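
To make the header concrete, here is a minimal standard-library sketch that builds (but does not send) a request carrying the key. The helper name `build_request` is hypothetical and not part of the SDK; the endpoint is the one from the curl example above, and the environment-variable fallback mirrors the behaviour described for the client.

```python
import os
import urllib.request

API_URL = "https://api.pdfcanon.com/api/submissions"  # endpoint from the curl example


def build_request(api_key=None):
    """Build (but do not send) a GET request carrying the X-Api-Key header.

    Hypothetical helper for illustration; not part of the pdfcanon SDK.
    """
    key = api_key or os.environ.get("PDFCANON_API_KEY")
    if not key:
        raise ValueError("Pass api_key or set PDFCANON_API_KEY")
    return urllib.request.Request(API_URL, headers={"X-Api-Key": key})


req = build_request("pdfn_your_key_here")
# urllib stores header names capitalized, so the lookup key is "X-api-key".
print(req.get_header("X-api-key"))
```

HTTP header names are case-insensitive, so the capitalization urllib applies does not change what the server receives.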

Quickstart (Synchronous)

import pdfcanon

# API key is read from PDFCANON_API_KEY if not passed
client = pdfcanon.Client(api_key="pdfn_your_key_here")

with open("/path/to/document.pdf", "rb") as f:
    response = client.normalize(f, file_name="document.pdf")

print(f"Status: {response.status}")
print(f"Submission ID: {response.submission_id}")

Quickstart (Async)

import asyncio
import pdfcanon

async def main():
    client = pdfcanon.AsyncClient(api_key="pdfn_your_key_here")

    # The built-in open() is synchronous, so use a plain "with" block
    with open("/path/to/document.pdf", "rb") as f:
        response = await client.normalize(f, file_name="document.pdf")

    print(f"Status: {response.status}")
    print(f"Submission ID: {response.submission_id}")

asyncio.run(main())

Async / Poll Flow

Large PDFs are processed asynchronously. Use wait_for_completion to poll until done:

import pdfcanon

client = pdfcanon.Client()  # reads PDFCANON_API_KEY from environment

with open("/path/to/document.pdf", "rb") as f:
    initial = client.normalize(f, file_name="document.pdf")

# Poll until processing completes (up to 120 seconds)
result = client.wait_for_completion(initial.submission_id, timeout=120.0)

if result.status == "SUCCESS":
    # Download the normalized PDF
    pdf_bytes = client.download_artifact(result.normalized.sha256)
    with open("/path/to/normalized.pdf", "wb") as out:
        out.write(pdf_bytes)

    print(f"Original size:   {result.original.size_bytes:,} bytes")
    print(f"Normalized size: {result.normalized.size_bytes:,} bytes")
    print(f"JavaScript removed: {result.security.javascript_removed}")
else:
    print(f"Failed: [{result.failure.code}] {result.failure.message}")
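
The polling idea behind a helper like wait_for_completion can be sketched with a plain loop. This is not the SDK's implementation: `poll_until_done`, the `fetch_status` callable, and the fixed `interval` are assumptions made for illustration, and treating PENDING and IN_PROGRESS as the only non-terminal states is likewise an assumption.

```python
import time


def poll_until_done(fetch_status, timeout=120.0, interval=2.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Call fetch_status() until it returns a terminal status or time runs out.

    Hypothetical helper: fetch_status is any zero-argument callable returning
    a status string. PENDING and IN_PROGRESS are treated as non-terminal.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status not in ("PENDING", "IN_PROGRESS"):
            return status
        # Give up if the next wait would run past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"still {status} after {timeout} seconds")
        sleep(interval)


# Simulated submission that finishes on the third check.
statuses = iter(["PENDING", "IN_PROGRESS", "SUCCESS"])
print(poll_until_done(lambda: next(statuses), interval=0.01))
```

Using a monotonic clock keeps the deadline stable even if the system time is adjusted while waiting.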

Webhook Flow

For production use, register a webhook endpoint instead of polling:

import pdfcanon

client = pdfcanon.Client()  # reads PDFCANON_API_KEY from environment

with open("/path/to/document.pdf", "rb") as f:
    response = client.normalize(
        f,
        file_name="document.pdf",
        webhook_url="https://your-app.example.com/webhooks/pdfcanon",
        remove_annotations=True,
        idempotency_key="unique-key-per-document",
    )
# Returns a response with status PENDING or IN_PROGRESS;
# webhook fires when processing completes.
print(f"Queued with submission ID: {response.submission_id}")
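
One way to produce a stable idempotency_key is to derive it from the file's bytes, so that retrying the same upload reuses the same key. This is an assumption about how you might choose keys, not a documented PDFCanon requirement, and `content_idempotency_key` is a hypothetical helper.

```python
import hashlib
import io


def content_idempotency_key(fileobj, chunk_size=65536):
    """Return the SHA-256 hex digest of a binary file object's contents.

    Hypothetical helper; rewinds the file so it can still be uploaded.
    """
    digest = hashlib.sha256()
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)
    fileobj.seek(0)
    return digest.hexdigest()


sample = io.BytesIO(b"%PDF-1.7 example bytes")
key = content_idempotency_key(sample)
print(len(key))  # a SHA-256 hex digest is 64 characters long
```

If the same file may be submitted with different options, consider mixing those options into the hash so distinct requests do not share a key.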

Webhook Signature Verification

Verify incoming webhook signatures in your web framework:

# Flask example
from flask import Flask, request, abort
from pdfcanon.webhooks import verify_signature, InvalidSignatureError
import os

app = Flask(__name__)
WEBHOOK_SECRET = os.environ["PDFCANON_WEBHOOK_SECRET"]

@app.post("/webhooks/pdfcanon")
def handle_pdfcanon_webhook():
    raw_body = request.get_data(as_text=True)
    signature = request.headers.get("X-PDFCanon-Signature", "")

    try:
        verify_signature(raw_body, signature, WEBHOOK_SECRET)
    except InvalidSignatureError:
        abort(401, "Invalid webhook signature")

    event = request.get_json(force=True)
    event_type = event.get("event_type")

    if event_type == "pdf.normalized":
        sha256 = event["normalized_sha256"]
        print(f"PDF ready: {sha256}")
        # Download and store the normalized PDF...
    elif event_type == "pdf.failed":
        print(f"PDF failed: {event['failure']['message']}")

    return {"ok": True}
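
For intuition about what a check like verify_signature does, here is a generic HMAC-SHA256 sketch. The actual PDFCanon signature scheme (algorithm, encoding, header format) is not described here, so every detail below is an assumption; in real code, rely on pdfcanon.webhooks.verify_signature.

```python
import hashlib
import hmac


def sign(body: bytes, secret: str) -> str:
    """Hex HMAC-SHA256 of the raw body (assumed scheme, for illustration)."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def is_valid(body: bytes, signature: str, secret: str) -> bool:
    """Compare in constant time to avoid leaking matches through timing."""
    expected = sign(body, secret).encode("utf-8")
    return hmac.compare_digest(expected, signature.encode("utf-8"))


body = b'{"event_type": "pdf.normalized"}'
good = sign(body, "example-secret")
print(is_valid(body, good, "example-secret"))         # True
print(is_valid(body + b" ", good, "example-secret"))  # False
```

Whatever the scheme, the signature should be checked against the request body exactly as received, before any JSON parsing or re-serialization.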

Configuration

import pdfcanon

client = pdfcanon.Client(
    api_key="pdfn_your_key_here",
    base_url="https://api.pdfcanon.com/api",  # Default
    connect_timeout=5.0,                      # Seconds; default: 5
    read_timeout=120.0,                       # Seconds; default: 120
    max_retries=3,                            # Default: 3
)

Error Handling

import pdfcanon
from pdfcanon import (
    AuthenticationError,
    PolicyRejectionError,
    RateLimitError,
    ToolchainError,
    NetworkError,
    PDFCanonError,
)

client = pdfcanon.Client()  # reads PDFCANON_API_KEY from environment

try:
    with open("/path/to/document.pdf", "rb") as f:
        result = client.normalize(f)
except AuthenticationError:
    print("Invalid API key or expired token")
except PolicyRejectionError as e:
    # 422: the PDF violates intake policy (encrypted, too large, etc.)
    print(f"PDF rejected: {e}")
except RateLimitError as e:
    # 429: monthly quota or rate limit exceeded
    print(f"Rate limited. Retry after {e.retry_after} seconds")
except ToolchainError as e:
    # 5xx: server-side processing failure
    print(f"Server error: {e}")
except NetworkError as e:
    # Timeout, DNS failure, etc.
    print(f"Network error: {e}")
except PDFCanonError as e:
    # Base class — catch all SDK errors
    print(f"Unexpected SDK error: {e}")
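
A common pattern on top of these exceptions is to wait and retry when rate limited. The sketch below uses a stand-in RateLimitError class so that it runs without the SDK; the only attribute taken from the example above is retry_after. `call_with_retry` and its parameters are hypothetical.

```python
import time


class RateLimitError(Exception):
    """Stand-in for pdfcanon.RateLimitError so the sketch runs on its own."""

    def __init__(self, retry_after):
        super().__init__(f"retry after {retry_after}s")
        self.retry_after = retry_after


def call_with_retry(func, max_attempts=3, sleep=time.sleep):
    """Call func(); on RateLimitError wait retry_after seconds and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimitError as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            sleep(exc.retry_after)


attempts = []


def flaky():
    """Fail twice with a rate-limit error, then succeed."""
    attempts.append(1)
    if len(attempts) < 3:
        raise RateLimitError(retry_after=0.01)
    return "ok"


print(call_with_retry(flaky))
```

The client's own max_retries setting may already cover some of this; whether it applies to 429 responses is not stated here, so check before layering a second retry loop on top.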

Error Reference

Exception	HTTP Status	When
`AuthenticationError`	401	Invalid or missing API key
`PolicyRejectionError`	422	PDF rejected by intake policy
`RateLimitError`	429	Monthly quota or rate limit exceeded
`ToolchainError`	5xx	Server-side processing failure
`NetworkError`	—	Network / timeout error
`PDFCanonError`	—	Base class for all SDK errors
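
The table can be read as a status-to-exception mapping. The function below mirrors it using exception names as strings so that it runs without the SDK; how the SDK performs this mapping internally is not shown here, and the fallback to the base class for unlisted statuses is an assumption.

```python
def exception_name_for(status: int) -> str:
    """Map an HTTP status to the exception name listed in the table above."""
    if status == 401:
        return "AuthenticationError"
    if status == 422:
        return "PolicyRejectionError"
    if status == 429:
        return "RateLimitError"
    if 500 <= status <= 599:
        return "ToolchainError"
    return "PDFCanonError"  # assumption: unlisted statuses use the base class


for code in (401, 422, 429, 503):
    print(code, exception_name_for(code))
```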

Models Reference

Model	Key Fields
`NormalizeResponse`	`status`, `submission_id`, `original`, `normalized`, `security`, `validation`, `warnings`, `failure`
`OriginalInfo`	`sha256`, `size_bytes`
`NormalizedInfo`	`sha256`, `size_bytes`, `pdf_version`, `linearized`, `download_url`
`SecurityInfo`	`javascript_removed`, `open_actions_removed`, `embedded_files_removed`, ...
`ValidationInfo`	`xref_rebuilt`, `object_streams_regenerated`, `pdfa_compliant`, ...
`WarningInfo`	`code`, `message`
`FailureInfo`	`code`, `message`, `stage`

Project details

This version: 1.0.1 (Apr 28, 2026)

Download files

Source Distribution: pdfcanon-1.0.1.tar.gz (18.8 kB), uploaded Apr 28, 2026, Source

Built Distribution: pdfcanon-1.0.1-py3-none-any.whl (18.3 kB), uploaded Apr 28, 2026, Python 3

File details

Details for the file pdfcanon-1.0.1.tar.gz.

File metadata

Download URL: pdfcanon-1.0.1.tar.gz
Upload date: Apr 28, 2026
Size: 18.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.6

File hashes

Hashes for pdfcanon-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`14252b2e8c7f560690fbfc18389429d8f8de30dc74ac5b401d82df5182629344`
MD5	`6695f1725b4be609177641a890108fd8`
BLAKE2b-256	`488316a1c4a2afd82ab859a95790e89692b98030d4a70b5d6262468316ea00cf`

File details

Details for the file pdfcanon-1.0.1-py3-none-any.whl.

File metadata

Download URL: pdfcanon-1.0.1-py3-none-any.whl
Upload date: Apr 28, 2026
Size: 18.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.6

File hashes

Hashes for pdfcanon-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3c7f622b434475b9bd05387e8b17502643f09b1ba8fbaa3554394ff1e2bb6f9b`
MD5	`3358e6ded7cbb068cd2800cee34b08a9`
BLAKE2b-256	`acab411c98dab329f4943c0454f0af0e6697cd088b568e20585ca542569b2c61`
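
The published digests can be checked locally after download. The following is a minimal sketch using only hashlib; `sha256_of` is a hypothetical helper, and the file path is wherever you saved the download.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Expected digest copied from the SHA256 row above for the wheel file.
EXPECTED = "3c7f622b434475b9bd05387e8b17502643f09b1ba8fbaa3554394ff1e2bb6f9b"

# Usage (uncomment after downloading the wheel into the current directory):
# assert sha256_of("pdfcanon-1.0.1-py3-none-any.whl") == EXPECTED
```

Reading in chunks keeps memory use flat regardless of file size.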
