A Python framework for multi-modal document understanding with generative AI

Rhubarb

Rhubarb is a lightweight Python framework that makes it easy to build document understanding applications using multi-modal Large Language Models (LLMs) and embedding models. Rhubarb is built from the ground up to work with Amazon Bedrock. It supports multiple foundation models for document processing, including Anthropic Claude V3 multi-modal language models and Amazon Nova models, and it uses the Amazon Titan Multi-modal Embedding model for embeddings.

What can I do with Rhubarb?

Visit Rhubarb documentation.

Rhubarb can perform multiple document processing tasks, such as:

  • ✅ Document Q&A
  • ✅ Streaming chat with documents (Q&A)
  • ✅ Document Summarization
    • 🚀 Page level summaries
    • 🚀 Full summaries
    • 🚀 Summaries of specific pages
    • 🚀 Streaming Summaries
  • ✅ Structured data extraction
  • ✅ Extraction Schema creation assistance
  • ✅ Named entity recognition (NER)
    • 🚀 With 50 built-in common entities
  • ✅ PII recognition with built-in entities
  • ✅ Figure and image understanding from documents
    • 🚀 Explain charts, graphs, and figures
    • 🚀 Perform table reasoning (as figures)
  • ✅ Large document processing with sliding window approach
  • ✅ Document Classification with vector sampling using multi-modal embedding models
  • ✅ Logs token usage to help keep track of costs

Rhubarb comes with built-in system prompts that make it easy to use for a number of different document understanding use cases. You can customize Rhubarb by passing in your own system prompts. It also supports output generation that conforms to an exact JSON schema, which makes it easy to integrate into downstream applications.
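To make the schema-based output idea concrete, here is a minimal sketch of how a downstream application might check a JSON response against a schema before using it. The schema, the field names (`employee_name`, `employee_id`), and the helper `check_against_schema` are hypothetical and are not part of Rhubarb. The checker only covers required fields and top-level types, so a full validator such as the `jsonschema` package would be a better fit for real use.

```python
import json

# Hypothetical schema, for illustration only (not taken from Rhubarb).
employee_schema = {
    "type": "object",
    "properties": {
        "employee_name": {"type": "string"},
        "employee_id": {"type": "integer"},
    },
    "required": ["employee_name", "employee_id"],
}

# Mapping from JSON Schema type names to Python types.
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def check_against_schema(payload: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    # Every required field must be present.
    for name in schema.get("required", []):
        if name not in payload:
            problems.append(f"missing required field: {name}")
    # Every present field with a declared type must match it.
    for name, spec in schema.get("properties", {}).items():
        if name not in payload:
            continue
        type_name = spec.get("type")
        expected = _JSON_TYPES.get(type_name)
        if expected is None:
            continue
        value = payload[name]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        is_bool_as_number = isinstance(value, bool) and type_name in ("integer", "number")
        if is_bool_as_number or not isinstance(value, expected):
            problems.append(f"wrong type for field: {name}")
    return problems


# A stand-in for a JSON response returned by a model.
response_text = '{"employee_name": "Martha Rivera", "employee_id": 42}'
print(check_against_schema(json.loads(response_text), employee_schema))
```

Returning a list of problems instead of raising on the first one lets the caller log everything that is wrong with a response in a single pass.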

  • Supports PDF, TIFF, PNG, JPG, and DOCX files (support for Excel, PowerPoint, CSV, WebP, and EML files is coming soon)
  • Performs document to image conversion internally to work with the multi-modal models
  • Works on local files or files stored in S3
  • Supports specifying page numbers for multi-page documents
  • Supports chat-history based chat for documents
  • Supports streaming and non-streaming mode
  • Supports Converse API
  • Supports Cross-Region Inference

Installation

Start by installing Rhubarb using pip.

pip install pyrhubarb

Usage

Create a boto3 session.

import boto3
session = boto3.Session()

Call Rhubarb

Local file

from rhubarb import DocAnalysis

da = DocAnalysis(file_path="./path/to/doc/doc.pdf", 
                 boto3_session=session)
resp = da.run(message="What is the employee's name?")
print(resp)  # in a notebook, a bare `resp` on the last line displays the same result

With file in Amazon S3

from rhubarb import DocAnalysis

da = DocAnalysis(file_path="s3://path/to/doc/doc.pdf", 
                 boto3_session=session)
resp = da.run(message="What is the employee's name?")
print(resp)  # in a notebook, a bare `resp` on the last line displays the same result
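In an S3 URI, the first component after `s3://` is the bucket name and the remainder is the object key. The sketch below shows that decomposition using only the standard library. The helper `split_s3_uri` is hypothetical, and how Rhubarb itself parses the path is an assumption here, not something the README states.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # netloc holds the bucket; the path (minus its leading slash) is the key.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, key = split_s3_uri("s3://path/to/doc/doc.pdf")
print(bucket)  # path
print(key)     # to/doc/doc.pdf
```

So in the placeholder path used above, `path` stands for your bucket name and `to/doc/doc.pdf` for the object key.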

Large Document Processing

Rhubarb supports processing documents with more than 20 pages using a sliding window approach. This feature is particularly useful when working with Claude models, which can process only 20 pages at a time.

To enable this feature, set sliding_window_overlap to a value between 1 and 10 when creating a DocAnalysis object:

doc_analysis = DocAnalysis(
    file_path="path/to/large-document.pdf",
    boto3_session=session,
    sliding_window_overlap=2     # Number of pages to overlap between windows (1-10)
)

When the sliding window approach is enabled, Rhubarb will:

  1. Break the document into chunks of 20 pages
  2. Process each chunk separately
  3. Combine the results from all chunks
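As a rough illustration of the chunking step, the sketch below computes page ranges for a 20-page window in which consecutive windows share `overlap` pages. This is one plausible reading of the description above, not Rhubarb's actual implementation. The function name `sliding_windows` and its parameters are hypothetical.

```python
def sliding_windows(total_pages: int, window_size: int = 20, overlap: int = 2) -> list[tuple[int, int]]:
    """Return inclusive, 1-based (first_page, last_page) ranges covering the document."""
    if total_pages < 1:
        raise ValueError("total_pages must be at least 1")
    if not 0 <= overlap < window_size:
        raise ValueError("overlap must be non-negative and smaller than window_size")

    step = window_size - overlap  # how far each new window advances
    windows = []
    start = 1
    while True:
        end = min(start + window_size - 1, total_pages)
        windows.append((start, end))
        if end == total_pages:
            break  # the last page is covered, so stop
        start += step
    return windows


# A 45-page document with a 2-page overlap yields three windows.
print(sliding_windows(45, window_size=20, overlap=2))
# [(1, 20), (19, 38), (37, 45)]
```

With this layout, pages 19-20 and 37-38 are seen by two windows each, which gives the model some shared context across chunk boundaries at the cost of processing those pages twice.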

Note: The sliding window technique is not yet supported for document classification. When using classification with large documents, only the first 20 pages will be considered.

For more details, see the Large Document Processing Cookbook.

For more usage examples, see the cookbooks.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.
