Extend Python Library


The Extend Python library provides convenient, typed access to the Extend API — enabling you to parse, extract, classify, split, and edit documents with a few lines of code.

Installation

pip install extend-ai

Requires Python 3.8+

Quick start

Parse any document in a few lines:

from extend_ai import Extend

client = Extend(token="YOUR_API_KEY")

result = client.parse(file={"url": "https://example.com/invoice.pdf"})

for chunk in result.output.chunks:
    print(chunk.content)

client.parse is synchronous — it sends the file, waits for processing, and returns a fully populated ParseRun with parsed chunks ready to use. The same pattern works for every capability:

# Extract structured data
extract_run = client.extract(
    file={"url": "https://example.com/invoice.pdf"},
    extractor={"id": "ex_abc123"},
)

# Classify a document
classify_run = client.classify(
    file={"url": "https://example.com/document.pdf"},
    classifier={"id": "cls_abc123"},
)

# Split a multi-document file
split_run = client.split(
    file={"url": "https://example.com/packet.pdf"},
    splitter={"id": "spl_abc123"},
)

# Edit a PDF with instructions
edit_run = client.edit(
    file={"url": "https://example.com/form.pdf"},
    config={"instructions": "Fill out the applicant name as Jane Doe"},
)

Note: The synchronous methods above have a 5-minute timeout and are best suited for onboarding and testing. For production workloads, use polling helpers or webhooks instead.

Polling helpers

Every run resource exposes a create_and_poll() method that creates the run and automatically polls until it reaches a terminal state (PROCESSED, FAILED, or CANCELLED):

from extend_ai import Extend

client = Extend(token="YOUR_API_KEY")

result = client.extract_runs.create_and_poll(
    file={"url": "https://example.com/invoice.pdf"},
    extractor={"id": "ex_abc123"},
)

if result.status == "PROCESSED":
    print(result.output)
else:
    print(f"Failed: {result.failure_message}")

This works across all run types:

parse_run     = client.parse_runs.create_and_poll(file={"url": "..."})
extract_run   = client.extract_runs.create_and_poll(file={"url": "..."}, extractor={"id": "..."})
classify_run  = client.classify_runs.create_and_poll(file={"url": "..."}, classifier={"id": "..."})
split_run     = client.split_runs.create_and_poll(file={"url": "..."}, splitter={"id": "..."})
workflow_run  = client.workflow_runs.create_and_poll(file={"url": "..."}, workflow={"id": "..."})
edit_run      = client.edit_runs.create_and_poll(file={"url": "..."})

Custom polling options

from extend_ai import Extend, PollingOptions

result = client.extract_runs.create_and_poll(
    file={"url": "https://example.com/invoice.pdf"},
    extractor={"id": "ex_abc123"},
    polling_options=PollingOptions(
        max_wait_ms=300_000,       # 5 minute timeout (default: no timeout)
        initial_delay_ms=1_000,    # start with 1s delay (default)
        max_delay_ms=60_000,       # cap at 60s delay (default: 30s)
    ),
)
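With these options, the delay between polls grows from initial_delay_ms up to the max_delay_ms cap until max_wait_ms elapses. The sketch below is my own illustration of such a schedule, assuming exponential doubling (a common choice), not the SDK's actual internals:

```python
def backoff_schedule(initial_delay_ms=1_000, max_delay_ms=30_000,
                     max_wait_ms=300_000, factor=2.0):
    """Yield poll delays: exponential growth capped at max_delay_ms,
    stopping once the cumulative wait would exceed max_wait_ms."""
    delay, elapsed = initial_delay_ms, 0
    while elapsed + delay <= max_wait_ms:
        yield delay
        elapsed += delay
        delay = min(int(delay * factor), max_delay_ms)

delays = list(backoff_schedule(max_wait_ms=10_000))
# delays of 1s, 2s, 4s fit within a 10s budget
```

Front-loading short delays keeps fast runs responsive, while the cap prevents hammering the API on long-running documents.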

Running workflows

Workflows chain multiple processing steps (extraction, classification, splitting, etc.) into a single pipeline. Run a workflow by passing a workflow ID and a file:

result = client.workflow_runs.create_and_poll(
    file={"url": "https://example.com/invoice.pdf"},
    workflow={"id": "workflow_abc123"},
)

print(result.status)  # "PROCESSED"

for step_run in result.step_runs or []:
    print(step_run.step.type)   # "EXTRACT", "CLASSIFY", etc.
    print(step_run.result)

Webhook verification

Verify and parse incoming webhook events using the built-in utilities. Known event types are returned as typed Pydantic models; unknown or future event types fall back to a plain dict so your handler keeps working without SDK updates.

from extend_ai import Extend

client = Extend(token="YOUR_API_KEY")

def handle_webhook(request):
    event = client.webhooks.verify_and_parse(
        body=request.body.decode(),
        headers=dict(request.headers),
        signing_secret="wss_your_signing_secret",
    )

    # Works for both typed model and dict fallback
    event_type = getattr(event, "event_type", None) or event.get("eventType")
    payload = getattr(event, "payload", None) or event.get("payload")

    match event_type:
        case "extract_run.processed":
            run_id = getattr(payload, "id", None) or payload.get("id")
            print(f"Extraction complete: {run_id}")
        case "workflow_run.completed":
            run_id = getattr(payload, "id", None) or payload.get("id")
            print(f"Workflow complete: {run_id}")
        case _:
            print(f"Received event: {event_type}")

Manual verification & parsing

# Verify signature without parsing
is_valid = client.webhooks.verify(body, headers, signing_secret)

# Parse without verification (not recommended for production)
event = client.webhooks.parse(body)
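Webhook verification schemes like this are typically an HMAC-SHA256 of the raw request body keyed by the signing secret, compared in constant time against a signature header. The SDK's verify handles Extend's exact header names and encoding; the sketch below only illustrates the general pattern, and the hex encoding is an assumption:

```python
import hashlib
import hmac

def verify_signature(body: str, signature: str, signing_secret: str) -> bool:
    """Illustrative HMAC-SHA256 check; the real header name and
    encoding are defined by Extend, not by this sketch."""
    expected = hmac.new(
        signing_secret.encode(), body.encode(), hashlib.sha256
    ).hexdigest()
    # compare_digest avoids leaking information via timing side channels
    return hmac.compare_digest(expected, signature)

secret = "wss_example"
body = '{"eventType": "extract_run.processed"}'
sig = hmac.new(secret.encode(), body.encode(), hashlib.sha256).hexdigest()
assert verify_signature(body, sig, secret)
```

Always verify against the raw, undecoded body bytes your framework received; re-serializing parsed JSON can change whitespace and break the signature.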

Signed URL payloads

For large payloads, Extend may send a signed URL instead of the full payload. Use allow_signed_url=True, then check and fetch when needed:

event = client.webhooks.verify_and_parse(
    body=body,
    headers=headers,
    signing_secret=signing_secret,
    allow_signed_url=True,
)

if client.webhooks.is_signed_url_event(event):
    full_event = client.webhooks.fetch_signed_payload_sync(event)
    # full_event is typed or dict; use getattr(..., None) or .get() as in the example above
else:
    # Normal inline payload — handle event directly
    ...

Async support

Every method has an async counterpart via AsyncExtend:

import asyncio
from extend_ai import AsyncExtend

client = AsyncExtend(token="YOUR_API_KEY")

async def main():
    result = await client.parse(file={"url": "https://example.com/invoice.pdf"})

    for chunk in result.output.chunks:
        print(chunk.content)

asyncio.run(main())

Async polling works the same way:

result = await client.extract_runs.create_and_poll(
    file={"url": "https://example.com/invoice.pdf"},
    extractor={"id": "ex_abc123"},
)

Exception handling

The SDK raises typed exceptions for API errors:

from extend_ai.core.api_error import ApiError

try:
    result = client.parse(file={"url": "https://example.com/invoice.pdf"})
except ApiError as e:
    print(e.status_code)  # 400, 401, 404, 429, etc.
    print(e.body)

Specific error classes are available for fine-grained handling:

from extend_ai.errors import (
    BadRequestError,         # 400
    UnauthorizedError,       # 401
    PaymentRequiredError,    # 402
    ForbiddenError,          # 403
    NotFoundError,           # 404
    UnprocessableEntityError, # 422
    TooManyRequestsError,    # 429
    InternalServerError,     # 500
)

Polling timeout

When create_and_poll() exceeds its timeout, a PollingTimeoutError is raised:

from extend_ai import PollingOptions, PollingTimeoutError

try:
    result = client.extract_runs.create_and_poll(
        file={"url": "..."},
        extractor={"id": "..."},
        polling_options=PollingOptions(max_wait_ms=60_000),
    )
except PollingTimeoutError as e:
    print(f"Timed out after {e.elapsed_ms}ms (limit: {e.max_wait_ms}ms)")

Pagination

List endpoints return paginated results using next_page_token:

# First page
response = client.extract_runs.list(max_page_size=10)

for run in response.data:
    print(f"{run.id}: {run.status}")

# Next page
if response.next_page_token:
    next_page = client.extract_runs.list(
        max_page_size=10,
        next_page_token=response.next_page_token,
    )
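The token-based scheme above extends naturally to a loop that walks every page. A hedged sketch, where list_page stands in for a call like client.extract_runs.list and the dict shape is assumed for illustration:

```python
def iter_all(list_page, max_page_size=10):
    """Yield items from every page, following next_page_token until exhausted."""
    token = None
    while True:
        page = list_page(max_page_size=max_page_size, next_page_token=token)
        yield from page["data"]
        token = page.get("next_page_token")
        if not token:
            break

# Stub pages standing in for the API, purely for illustration:
pages = {
    None: {"data": [1, 2], "next_page_token": "t1"},
    "t1": {"data": [3], "next_page_token": None},
}
items = list(iter_all(lambda max_page_size, next_page_token: pages[next_page_token]))
# items == [1, 2, 3]
```

Making this a generator means callers can stop early without fetching pages they never consume.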

Environments

The SDK defaults to the US production environment. Other regions are available:

from extend_ai import Extend, ExtendEnvironment

# US (default)
client = Extend(token="YOUR_API_KEY")

# US2 (HIPAA)
client = Extend(token="YOUR_API_KEY", environment=ExtendEnvironment.PRODUCTION_US2)

# EU
client = Extend(token="YOUR_API_KEY", environment=ExtendEnvironment.PRODUCTION_EU1)

# Custom base URL
client = Extend(token="YOUR_API_KEY", base_url="https://custom-api.example.com")

Advanced

Retries

The SDK automatically retries failed requests with exponential backoff. Retries are triggered for:

  • 408 Timeout
  • 429 Too Many Requests
  • 5xx Server Errors

# Override retries for a single request
client.extract_runs.create(..., request_options={"max_retries": 0})
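The policy above can be approximated as: retry on 408, 429, and 5xx with exponentially growing sleeps, and return anything else immediately. A minimal sketch under those assumptions, not the SDK's actual implementation:

```python
import time

RETRYABLE = {408, 429} | set(range(500, 600))

def with_retries(send, max_retries=2, base_delay=0.5):
    """Call send() until it returns a non-retryable status or retries run out."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if status not in RETRYABLE or attempt == max_retries:
            return status, body
        time.sleep(base_delay * (2 ** attempt))  # exponential backoff

# Simulated endpoint: fails once with 429, then succeeds
responses = iter([(429, None), (200, "ok")])
status, body = with_retries(lambda: next(responses), base_delay=0)
# status == 200, body == "ok"
```

Note that 4xx errors other than 408/429 are returned immediately, since repeating a bad request will not make it valid.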

Timeouts

The default timeout is 300 seconds. Override globally or per-request:

# Global timeout
client = Extend(token="YOUR_API_KEY", timeout=30.0)

# Per-request timeout
client.extract_runs.create(..., request_options={"timeout_in_seconds": 60})

Custom headers

client = Extend(
    token="YOUR_API_KEY",
    headers={"X-Custom-Header": "value"},
)

Custom HTTP client

Pass a pre-configured httpx.Client for full control over transport:

import httpx
from extend_ai import Extend

client = Extend(
    token="YOUR_API_KEY",
    httpx_client=httpx.Client(
        proxy="http://my.test.proxy.example.com",
        transport=httpx.HTTPTransport(local_address="0.0.0.0"),
    ),
)

Raw responses

Access the underlying HTTP response for any request:

raw_response = client.with_raw_response.parse(file={"url": "https://example.com/invoice.pdf"})

print(raw_response.status_code)
print(raw_response.headers)
print(raw_response.data)  # ParseRun

Documentation

Full API reference documentation is available at docs.extend.ai.

A complete SDK reference is available in reference.md.

Contributing

While we value open-source contributions to this SDK, this library is generated programmatically. Additions made directly to this library would have to be moved over to our generation code; otherwise they would be overwritten by the next generated release. Feel free to open a PR as a proof of concept, but know that we will not be able to merge it as-is. We suggest opening an issue first to discuss your idea with us!

On the other hand, contributions to the README are always very welcome!

