Client library for LLM Whisperer

LLMWhisperer Python Client

LLMs are powerful, but their output is only as good as the input you provide. LLMWhisperer presents data from complex documents, whatever their design or format, to LLMs in a form they can best understand. Its features include Layout Preserving Mode, automatic switching between native text and OCR modes, and representation of radio buttons and checkboxes in PDF forms as raw text. You can extract raw text from complex PDF documents or images without worrying about whether the source is a native text document, a scanned image, or a photo taken on a smartphone. Raw text extracted from invoices, purchase orders, bank statements, and similar documents in Layout Preserving Mode is well suited to structured data extraction with LLMs.

Refer to the client documentation for more information: LLMWhisperer Client Documentation

Features

  • Easy to use Pythonic interface.
  • Handles all the HTTP requests and responses for you.
  • Raises Python exceptions for API errors.

Installation

You can install the LLMWhisperer Python Client using pip:

pip install llmwhisperer-client

Usage

First, import LLMWhispererClient and the exception class used in the examples below from the client module:

from unstract.llmwhisperer.client import LLMWhispererClient, LLMWhispererClientException

Then, create an instance of the LLMWhispererClient:

client = LLMWhispererClient(base_url="https://llmwhisperer-api.unstract.com/v1", api_key="your_api_key")

Now, you can use the client to interact with the LLMWhisperer API:

# Get usage info
usage_info = client.get_usage_info()

# Process a document
# Extracted text is available in the 'extracted_text' field of the result
whisper = client.whisper(file_path="path_to_your_file")

# Get the status of a whisper operation
# The hash is available in the 'whisper_hash' field of the result of the whisper operation
whisper_hash = whisper["whisper_hash"]
status = client.whisper_status(whisper_hash)

# Retrieve the result of a whisper operation
# whisper_hash is available in the 'whisper_hash' field of the result of the whisper operation
whisper = client.whisper_retrieve(whisper_hash)

Error Handling

The client raises LLMWhispererClientException for API errors:

try:
    result = client.whisper_retrieve("invalid_hash")
except LLMWhispererClientException as e:
    print(f"Error: {e.message}, Status Code: {e.status_code}")
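When the same handler also catches other exception types, the message and status code can be read defensively. The helper below is a minimal sketch, not part of the client: `describe_error` is a hypothetical name, and it only assumes that the exception may carry the `message` and `status_code` attributes used in the example above.

```python
def describe_error(exc: Exception) -> str:
    """Format an exception for logging.

    Hypothetical helper (not part of llmwhisperer-client). It reads the
    'message' and 'status_code' attributes if present and falls back to
    str(exc) for exceptions that do not have them.
    """
    message = getattr(exc, "message", None) or str(exc)
    status_code = getattr(exc, "status_code", None)
    if status_code is None:
        return f"Error: {message}"
    return f"Error: {message} (status code {status_code})"
```

Inside an `except LLMWhispererClientException as e:` block, `print(describe_error(e))` would then replace the hand-written f-string.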

Simple use case with defaults

client = LLMWhispererClient()
try:
    result = client.whisper(file_path="sample_files/restaurant_invoice_photo.pdf")
    extracted_text = result["extracted_text"]
    print(extracted_text)
except LLMWhispererClientException as e:
    print(e)

Simple use case with more options set

This example forces text processing and extracts text from the first two pages only.

client = LLMWhispererClient()
try:
    result = client.whisper(
        file_path="sample_files/credit_card.pdf",
        processing_mode="text",
        force_text_processing=True,
        pages_to_extract="1,2",
    )
    extracted_text = result["extracted_text"]
    print(extracted_text)
except LLMWhispererClientException as e:
    print(e)
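The `pages_to_extract` value above is a comma-separated string of page numbers. If the pages are held in a Python list, a small helper can build that string. This is a sketch under stated assumptions: `pages_spec` is a hypothetical name, not part of the client, and it only produces the plain comma-separated form shown in the example.

```python
from typing import Iterable


def pages_spec(pages: Iterable[int]) -> str:
    """Build a comma-separated page string such as "1,2".

    Hypothetical helper (not part of llmwhisperer-client). Duplicates are
    dropped and pages are sorted; page numbers are assumed to start at 1,
    as in the "first two pages" example.
    """
    unique = sorted(set(pages))
    if any(page < 1 for page in unique):
        raise ValueError("page numbers start at 1")
    return ",".join(str(page) for page in unique)
```

The result can be passed directly, for example `pages_to_extract=pages_spec([1, 2])`.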

Extraction with timeout set

The platform has a hard timeout of 200 seconds. If a document takes longer than that to convert (large documents, for example), the platform switches to async extraction and returns a hash. You can then use the client to check the status of the extraction and retrieve the result. The timeout is specified in seconds and can also be set by the caller.

import time

client = LLMWhispererClient()
try:
    result = client.whisper(
        file_path="sample_files/credit_card.pdf",
        pages_to_extract="1,2",
        timeout=2,
    )
    if result["status_code"] == 202:
        print("Timeout occurred. Whisper request accepted.")
        print(f"Whisper hash: {result['whisper-hash']}")
        while True:
            print("Polling for whisper status...")
            status = client.whisper_status(whisper_hash=result["whisper-hash"])
            if status["status"] == "processing":
                print("STATUS: processing...")
            elif status["status"] == "delivered":
                print("STATUS: Already delivered!")
                break
            elif status["status"] == "unknown":
                print("STATUS: unknown...")
                break
            elif status["status"] == "processed":
                print("STATUS: processed!")
                print("Let's retrieve the result of the extraction...")
                resultx = client.whisper_retrieve(
                    whisper_hash=result["whisper-hash"]
                )
                print(resultx["extracted_text"])
                break
            time.sleep(2)
except LLMWhispererClientException as e:
    print(e)
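The loop above polls until the status is no longer "processing", but it has no upper bound on how long it waits. The sketch below factors the polling into a reusable function with a maximum wait. It is an illustration, not part of the client: `wait_for_status` and its parameters are hypothetical names, and the only status value it relies on is the "processing" string from the example above.

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],
    interval: float = 2.0,
    max_wait: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns something other than "processing".

    Hypothetical helper (not part of llmwhisperer-client). Raises
    TimeoutError if the status is still "processing" after roughly
    max_wait seconds of sleeping. The sleep function is injectable so
    the helper can be tested without real delays.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status != "processing":
            return status
        if waited >= max_wait:
            raise TimeoutError(f"still processing after {waited:.0f} seconds")
        sleep(interval)
        waited += interval
```

With the client from the example, the status callable could be `lambda: client.whisper_status(whisper_hash=result["whisper-hash"])["status"]`; a return value of "processed" means the result is ready for `whisper_retrieve`.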

Questions and Feedback

On Slack, join great conversations around LLMs, their ecosystem and leveraging them to automate the previously unautomatable!

LLMWhisperer Playground: Test drive LLMWhisperer with your own documents. No sign up needed!

LLMWhisperer developer documentation and playground: Learn more about LLMWhisperer and its API.
