Retab official Python library

Intended Audience
- Science/Research
License
- OSI Approved :: MIT License
Operating System
- MacOS
- POSIX :: Linux
Programming Language
- Python :: 3

Project description

Retab Python SDK

Official Python SDK for Retab document extraction.

Installation

pip install retab

The client reads RETAB_API_KEY from the environment by default.
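Because the key comes from the environment, a missing variable is easy to overlook. The sketch below is a small fail-fast check that can run before the client is constructed. `require_api_key` is a hypothetical helper written for this example, not part of the SDK.

```python
import os


def require_api_key(env_var: str = "RETAB_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; the SDK does not ship it.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"{env_var} is not set; export it before creating the client."
        )
    return api_key


# Possible usage (requires the retab package):
#   from retab import Retab
#   client = Retab(api_key=require_api_key())
```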

Quick Start

import os

from retab import Retab

client = Retab(api_key=os.environ["RETAB_API_KEY"])

invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "invoice_date": {"type": "string"},
        "total_amount": {"type": "number"},
    },
    "required": ["invoice_number", "total_amount"],
}

result = client.documents.extract(
    json_schema=invoice_schema,
    document="invoice.pdf",
    model="retab-micro",
)

print(result.data)
print(result.text)
print(result.likelihoods)
print(result.extraction_id)

documents.extract(...) returns a RetabParsedChatCompletion.

- result.data is the parsed structured output
- result.text is the raw JSON string
- result.likelihoods mirrors the extracted structure with confidence signals
- result.extraction_id can be used with the extractions API later
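Since result.likelihoods mirrors the shape of the extracted data, one way to use it is to walk the structure and flag fields that fall below a confidence threshold. This is a minimal sketch. It assumes the likelihoods are nested dicts and lists whose leaves are numbers between 0 and 1, which the text above does not spell out. `low_confidence_paths` and the 0.8 default are hypothetical.

```python
from typing import Any


def low_confidence_paths(
    likelihoods: Any, threshold: float = 0.8, prefix: str = ""
) -> list[str]:
    """Return dotted paths of leaves whose likelihood is below the threshold.

    Assumes nested dicts/lists with numeric leaves; other leaf types are skipped.
    """
    paths: list[str] = []
    if isinstance(likelihoods, dict):
        for key, value in likelihoods.items():
            child = f"{prefix}.{key}" if prefix else str(key)
            paths.extend(low_confidence_paths(value, threshold, child))
    elif isinstance(likelihoods, list):
        for index, value in enumerate(likelihoods):
            paths.extend(
                low_confidence_paths(value, threshold, f"{prefix}[{index}]")
            )
    elif isinstance(likelihoods, (int, float)) and not isinstance(likelihoods, bool):
        if likelihoods < threshold:
            paths.append(prefix)
    return paths


# Possible usage: low_confidence_paths(result.likelihoods, threshold=0.9)
```

Fields returned by a check like this are natural candidates for manual review.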

What `extract` Accepts

json_schema can be:

- a Python dict
- a path to a JSON schema file

document can be:

- a local file path
- a file-like object
- a URL
- MIMEData

Useful extraction options:

- n_consensus: run multiple passes and reconcile the result
- image_resolution_dpi: control image rendering quality for vision models
- metadata: attach your own tags for later filtering
- additional_messages: add extra instructions or context after the document content
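The sketch below shows one way to combine these inputs: it saves a schema dict to disk so its path can be passed as json_schema, then collects the optional arguments in one place. `write_schema_file` and `build_extract_kwargs` are hypothetical helpers, and the option values (n_consensus=3, the batch_id tag) are illustrative assumptions.

```python
import json
from pathlib import Path


def write_schema_file(schema: dict, directory: str) -> str:
    """Save a JSON schema dict to disk and return the file path as a string."""
    path = Path(directory) / "invoice_schema.json"
    path.write_text(json.dumps(schema, indent=2), encoding="utf-8")
    return str(path)


def build_extract_kwargs(schema_path: str, document: str) -> dict:
    """Collect keyword arguments for client.documents.extract(...)."""
    return {
        "json_schema": schema_path,  # a file path instead of a dict
        "document": document,        # local path; a URL would also be accepted
        "model": "retab-micro",
        "n_consensus": 3,            # illustrative value
        "metadata": {"batch_id": "march-2026"},
    }


# Possible usage (requires the retab package and an API key):
#   kwargs = build_extract_kwargs(write_schema_file(invoice_schema, "."), "invoice.pdf")
#   result = client.documents.extract(**kwargs)
```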

Async Extraction

import asyncio
import os

from retab import AsyncRetab


async def main() -> None:
    client = AsyncRetab(api_key=os.environ["RETAB_API_KEY"])

    async with client:
        result = await client.documents.extract(
            json_schema={
                "type": "object",
                "properties": {
                    "booking_reference": {"type": "string"},
                    "guest_name": {"type": "string"},
                },
            },
            document="booking-confirmation.pdf",
            model="retab-micro",
        )

    print(result.data)


# Start an event loop and run the coroutine.
asyncio.run(main())
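The async client is most useful when several documents are processed at once. The following is a sketch of bounded concurrent extraction using asyncio.gather and a semaphore. `extract_many` and the `max_concurrency` default are hypothetical. The only SDK call it relies on is the awaitable client.documents.extract(...) shown above.

```python
import asyncio
from typing import Any, Sequence


async def extract_many(
    client: Any,
    documents: Sequence[str],
    json_schema: dict,
    max_concurrency: int = 4,
) -> list[Any]:
    """Extract several documents concurrently, preserving input order."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def extract_one(document: str) -> Any:
        # The semaphore caps how many requests are in flight at once.
        async with semaphore:
            return await client.documents.extract(
                json_schema=json_schema,
                document=document,
                model="retab-micro",
            )

    return await asyncio.gather(*(extract_one(doc) for doc in documents))


# Possible usage inside a coroutine, with an AsyncRetab client:
#   results = await extract_many(client, ["a.pdf", "b.pdf"], schema)
```

asyncio.gather returns results in the same order as the inputs, so each result can be matched to its document by position.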

Streaming Extraction

extract_stream(...) yields partial RetabParsedChatCompletion objects as the JSON fills in.

from retab import Retab

client = Retab()

with client.documents.extract_stream(
    json_schema={
        "type": "object",
        "properties": {
            "invoice_number": {"type": "string"},
            "total_amount": {"type": "number"},
        },
    },
    document="invoice.pdf",
    model="retab-micro",
) as stream:
    for partial in stream:
        print(partial.data)

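For async code, use the same pattern with an AsyncRetab client inside a coroutine (invoice_schema is the dict from Quick Start):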

async with client.documents.extract_stream(
    json_schema=invoice_schema,
    document="invoice.pdf",
    model="retab-micro",
) as stream:
    async for partial in stream:
        print(partial.data)
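Partial objects arrive while the JSON is still filling in, so a caller often wants only the most complete value once the stream ends. Here is a minimal sketch of that pattern. It assumes each partial exposes a data attribute that may be None early on. `final_data` is a hypothetical helper.

```python
from typing import Any, Iterable, Optional


def final_data(partials: Iterable[Any]) -> Optional[Any]:
    """Consume a stream of partial results and return the last non-None data."""
    latest: Optional[Any] = None
    for partial in partials:
        data = getattr(partial, "data", None)
        if data is not None:
            latest = data
    return latest


# Possible usage with the sync client:
#   with client.documents.extract_stream(...) as stream:
#       data = final_data(stream)
```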

Adding Context with `additional_messages`

additional_messages accepts the same message structure exercised in the SDK's tests: plain text messages, system or developer guidance, and multipart content.

result = client.documents.extract(
    json_schema=invoice_schema,
    document="invoice.pdf",
    model="retab-micro",
    additional_messages=[
        {
            "role": "developer",
            "content": "Extract values exactly as written. Do not normalize vendor names.",
        },
        {
            "role": "user",
            "content": "Focus on invoice number, invoice date, and total amount due.",
        },
    ],
)
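When the same guidance is reused across many calls, the message list can be built by a small function. The sketch below produces only the role/content shape shown above. `build_context_messages` is a hypothetical helper, and multipart content is left out because its exact shape is not shown here.

```python
def build_context_messages(guidance: str, focus_fields: list[str]) -> list[dict]:
    """Build an additional_messages list from guidance text and field names."""
    messages = [{"role": "developer", "content": guidance}]
    if focus_fields:
        # One user message naming the fields that matter most.
        messages.append(
            {"role": "user", "content": "Focus on " + ", ".join(focus_fields) + "."}
        )
    return messages


# Possible usage:
#   client.documents.extract(
#       ...,
#       additional_messages=build_context_messages(
#           "Extract values exactly as written.",
#           ["invoice number", "total amount due"],
#       ),
#   )
```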

Working with Stored Extractions

Every extraction can be retrieved later through client.extractions.

result = client.documents.extract(
    json_schema=invoice_schema,
    document="invoice.pdf",
    model="retab-micro",
    metadata={"batch_id": "march-2026"},
)

stored = client.extractions.get(result.extraction_id)
print(stored.predictions)

page_sources = client.extractions.sources(result.extraction_id)
print(page_sources.sources)

recent = client.extractions.list(limit=20, metadata={"batch_id": "march-2026"})
for item in recent.items:
    print(item.id, item.file.filename)

client.extractions.download(...) returns a pre-signed download URL for jsonl, csv, or xlsx exports.
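Metadata tags make it straightforward to gather everything that belongs to one batch. The sketch below wraps the list call shown above in a hypothetical helper, `filenames_for_batch`, which maps extraction IDs to filenames. It reads only the first page of results, since pagination is not covered here.

```python
from typing import Any


def filenames_for_batch(client: Any, batch_id: str, limit: int = 20) -> dict[str, str]:
    """Map extraction id -> source filename for one metadata batch tag."""
    page = client.extractions.list(limit=limit, metadata={"batch_id": batch_id})
    return {item.id: item.file.filename for item in page.items}


# Possible usage:
#   for extraction_id, filename in filenames_for_batch(client, "march-2026").items():
#       print(extraction_id, filename)
```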

Workflows

The Python SDK also supports workflow discovery, execution, and step inspection.

from pathlib import Path

from retab import Retab

client = Retab()

workflow = client.workflows.get_entities("wf_abc123")
document_start_id = workflow.start_nodes[0].id

run = client.workflows.runs.create(
    workflow_id=workflow.workflow.id,
    documents={document_start_id: Path("invoice.pdf")},
)

run = client.workflows.runs.wait_for_completion(run.id, poll_interval_seconds=1.0)
run.raise_for_status()

print(run.output)

step = client.workflows.runs.steps.get(run.id, "extract-node-id")
print(step.extracted_data)

Useful workflow helpers:

- client.workflows.get_entities(workflow_id) returns the workflow graph and exposes .start_nodes and .start_json_nodes
- client.workflows.runs.wait_for_completion(run.id) polls until the run reaches completed, error, or cancelled
- client.workflows.runs.steps.get(run.id, node_id) returns typed handle inputs and outputs
- client.workflows.runs.steps.get_all(run) fetches step outputs for every node in one call
- client.workflows.blocks.* and client.workflows.edges.* let you create or update workflow graphs from code
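To make the polling behaviour concrete, here is a generic sketch of a loop that checks a status until it reaches one of the three terminal states named above. It is not the SDK's implementation. `poll_until_terminal`, the `fetch_status` callable, and the timeout are assumptions made for illustration.

```python
import time
from typing import Callable

# Terminal states named in the helper description above.
TERMINAL_STATUSES = {"completed", "error", "cancelled"}


def poll_until_terminal(
    fetch_status: Callable[[], str],
    poll_interval_seconds: float = 1.0,
    timeout_seconds: float = 60.0,
) -> str:
    """Call fetch_status repeatedly until it returns a terminal status."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still '{status}' after {timeout_seconds}s")
        time.sleep(poll_interval_seconds)
```

In application code, wait_for_completion is the simpler choice; a hand-written loop like this mainly helps when extra logic is needed between polls.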

Notes

- n_consensus=1 is the fastest option
- higher n_consensus usually improves robustness on noisy or ambiguous documents
- if schema validation fails, result.choices[0].message.parsed may be None
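To give some intuition for why extra passes help, the sketch below reconciles several extraction passes with a field-by-field majority vote. It is only an illustration of the general idea: how Retab actually reconciles passes is not described here, and `majority_vote` is a hypothetical function that assumes flat dicts with hashable values.

```python
from collections import Counter
from typing import Any


def majority_vote(passes: list[dict[str, Any]]) -> dict[str, Any]:
    """Pick, for every field, the value that most passes agree on.

    Ties go to the value seen first. Assumes flat dicts with hashable values.
    """
    # Collect field names in first-seen order.
    fields: list[str] = []
    for result in passes:
        for key in result:
            if key not in fields:
                fields.append(key)

    reconciled: dict[str, Any] = {}
    for field in fields:
        values = [result[field] for result in passes if field in result]
        reconciled[field] = Counter(values).most_common(1)[0][0]
    return reconciled
```

A single misread value in one pass is outvoted by the others, which is the intuition behind higher n_consensus being more robust on noisy documents.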

Release history

0.0.142

May 19, 2026

0.0.141

May 15, 2026

0.0.140

May 13, 2026

0.0.139

May 13, 2026

0.0.138

May 12, 2026

0.0.137

May 11, 2026

0.0.136

May 10, 2026

0.0.135

May 5, 2026

0.0.134

May 1, 2026

0.0.133

Apr 30, 2026

0.0.132

Apr 24, 2026

0.0.131

Apr 24, 2026

0.0.130

Apr 24, 2026

0.0.129

Apr 24, 2026

0.0.128

Apr 23, 2026

0.0.127

Apr 23, 2026

0.0.126

Apr 22, 2026

0.0.125

Apr 20, 2026

0.0.124

Apr 20, 2026

0.0.123

Apr 18, 2026

0.0.122

Apr 18, 2026

0.0.121

Apr 17, 2026

0.0.120

Apr 17, 2026

0.0.119

Apr 17, 2026

0.0.118

Apr 16, 2026

0.0.117

Apr 16, 2026

0.0.116

Apr 16, 2026

0.0.115

Apr 16, 2026

0.0.114

Mar 25, 2026

0.0.113

Mar 23, 2026

0.0.112

Mar 21, 2026

0.0.111

Mar 21, 2026

This version

0.0.110

Mar 20, 2026

0.0.109

Mar 15, 2026

0.0.108

Mar 13, 2026

0.0.107

Mar 12, 2026

0.0.106

Mar 12, 2026

0.0.105

Mar 12, 2026

0.0.104

Mar 9, 2026

0.0.103

Mar 9, 2026

0.0.102

Mar 3, 2026

0.0.101

Feb 26, 2026

0.0.100

Feb 23, 2026

0.0.99

Feb 22, 2026

0.0.98

Feb 21, 2026

0.0.97

Feb 20, 2026

0.0.96

Feb 4, 2026

0.0.95

Feb 1, 2026

0.0.94

Feb 1, 2026

0.0.93

Feb 1, 2026

0.0.92

Feb 1, 2026

0.0.91

Jan 27, 2026

0.0.90

Jan 22, 2026

0.0.89

Jan 15, 2026

0.0.88

Jan 14, 2026

0.0.87

Jan 13, 2026

0.0.86

Jan 13, 2026

0.0.85

Jan 2, 2026

0.0.84

Jan 2, 2026

0.0.83

Jan 2, 2026

0.0.82

Jan 2, 2026

0.0.81

Jan 2, 2026

0.0.80

Dec 30, 2025

0.0.79

Dec 21, 2025

0.0.78

Dec 21, 2025

0.0.77

Dec 16, 2025

0.0.76

Dec 11, 2025

0.0.75

Dec 11, 2025

0.0.74

Dec 10, 2025

0.0.73

Dec 10, 2025

0.0.72

Nov 28, 2025

0.0.71

Nov 28, 2025

0.0.70

Nov 27, 2025

0.0.69

Nov 26, 2025

0.0.68

Oct 29, 2025

0.0.67

Oct 28, 2025

0.0.66

Oct 27, 2025

0.0.64

Sep 30, 2025

0.0.63

Sep 11, 2025

0.0.62

Sep 2, 2025

0.0.61

Aug 30, 2025

0.0.60

Aug 28, 2025

0.0.59

Aug 26, 2025

0.0.58

Aug 17, 2025

0.0.57

Aug 16, 2025

0.0.55

Aug 15, 2025

0.0.54

Aug 15, 2025

0.0.51

Aug 14, 2025

0.0.49

Jul 22, 2025

0.0.47

Jul 22, 2025

0.0.46

Jul 21, 2025

0.0.45

Jul 17, 2025

0.0.44

Jul 16, 2025

0.0.43

Jul 11, 2025

0.0.42

Jun 28, 2025

0.0.41

Jun 26, 2025

0.0.40

Jun 26, 2025

0.0.39

Jun 26, 2025

0.0.38

Jun 24, 2025

0.0.37

Jun 24, 2025

0.0.36

Jun 11, 2025

0.0.35

Jun 11, 2025

Download files

Download the file for your platform.

Source Distribution

retab-0.0.110.tar.gz (135.3 kB)

Uploaded Mar 20, 2026 Source

Built Distribution

retab-0.0.110-py3-none-any.whl (153.9 kB)

Uploaded Mar 20, 2026 Python 3

File details

Details for the file retab-0.0.110.tar.gz.

File metadata

Download URL: retab-0.0.110.tar.gz
Upload date: Mar 20, 2026
Size: 135.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for retab-0.0.110.tar.gz
Algorithm	Hash digest
SHA256	`a0ef7b4a48d68f60249111ea374d0f3a726015689f927e737002e17f5fb5733a`
MD5	`b52d038e5234976ae94f96089dfc2025`
BLAKE2b-256	`c2627907c63bb693b8976e84d83c231207d6c0836a8c2d29591968b905b47614`


File details

Details for the file retab-0.0.110-py3-none-any.whl.

File metadata

Download URL: retab-0.0.110-py3-none-any.whl
Upload date: Mar 20, 2026
Size: 153.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for retab-0.0.110-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2133f4af4b03ad43de89783e75e701a8b56db125183e3737adb55c87011e0ae4`
MD5	`04291ac2a8b3fa2ef25bba1f8b9bc45a`
BLAKE2b-256	`7cf275f4101d58907dc1dfa7bdd58aeaaa9b8971e8ebf958c5bc815e89030bf7`
