Amazon Textract Overlay tools

Textract-Overlayer

amazon-textract-overlayer provides functions to help overlay bounding boxes on documents.

Install

> python -m pip install amazon-textract-overlayer

Make sure your environment is set up with AWS credentials, supplied through configuration files, environment variables, or an attached role (https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html).

Samples

The primary method provided is get_bounding_boxes, which returns bounding boxes for the Textract_Types passed in. It is mostly taken from the amazon-textract command in the amazon-textract-helper package.

The following example returns the bounding boxes for the WORD and CELL types.

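from PIL import Image
from textractcaller.t_call import Textract_Features, Textract_Types, call_textract
from textractoverlayer.t_overlay import DocumentDimensions, get_bounding_boxes

# input_document is assumed here to be the path to a local image file
image = Image.open(input_document)
# CELL blocks are only returned when the TABLES feature is requested
features = [Textract_Features.TABLES]
doc = call_textract(input_document=input_document, features=features)

# image is a PIL.Image.Image; its size in pixels is used as the document dimensions
document_dimension: DocumentDimensions = DocumentDimensions(doc_width=image.size[0], doc_height=image.size[1])
overlay = [Textract_Types.WORD, Textract_Types.CELL]

bounding_box_list = get_bounding_boxes(textract_json=doc, document_dimensions=document_dimension, overlay_features=overlay)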
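
To illustrate what the document dimensions are for: Textract reports each block's bounding box as ratios of the page width and height (Left, Top, Width, Height), so pixel coordinates come from scaling those ratios by the image size. The following is a minimal sketch of that conversion under those assumptions, not the library's implementation; the function name to_pixel_box and the returned tuple layout are hypothetical.

```python
from typing import Dict, Tuple


def to_pixel_box(geometry: Dict[str, float], doc_width: int, doc_height: int) -> Tuple[int, int, int, int]:
    """Scale a Textract-style relative bounding box to pixel coordinates.

    Hypothetical helper for illustration; `geometry` mirrors the
    Geometry.BoundingBox entry of a Textract block.
    """
    xmin = geometry["Left"] * doc_width
    ymin = geometry["Top"] * doc_height
    xmax = (geometry["Left"] + geometry["Width"]) * doc_width
    ymax = (geometry["Top"] + geometry["Height"]) * doc_height
    return round(xmin), round(ymin), round(xmax), round(ymax)


# a box spanning the middle half of the width on a 1000x800 image
box = to_pixel_box({"Left": 0.25, "Top": 0.5, "Width": 0.5, "Height": 0.25}, doc_width=1000, doc_height=800)
print(box)  # (250, 400, 750, 600)
```

The resulting tuple has the same (xmin, ymin, xmax, ymax) ordering that PIL's ImageDraw.rectangle accepts.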

The actual drawing of bounding boxes onto images is done in the amazon-textract command from the amazon-textract-helper package and looks like this:

from PIL import Image, ImageDraw

image = Image.open(input_document)
rgb_im = image.convert('RGB')
draw = ImageDraw.Draw(rgb_im)

# check the impl in amazon-textract-helper for ways to associate different colors to types
for bbox in bounding_box_list:
    draw.rectangle(xy=[bbox.xmin, bbox.ymin, bbox.xmax, bbox.ymax], outline=(128, 128, 0), width=2)

rgb_im.show()
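
The comment in the loop above mentions associating different colors with types. One way to do that is a lookup table keyed by type name with a fallback color. This is a sketch only: the colors, the COLOR_BY_TYPE table, and the color_for helper are hypothetical, and it assumes each bounding box exposes its type (for example through an attribute such as box_type), which should be checked against the installed version.

```python
from typing import Dict, Tuple

RGB = Tuple[int, int, int]

# hypothetical color choices, keyed by Textract block type name
COLOR_BY_TYPE: Dict[str, RGB] = {
    "WORD": (0, 0, 255),
    "LINE": (0, 128, 0),
    "CELL": (255, 165, 0),
}
DEFAULT_COLOR: RGB = (128, 128, 0)  # the outline color used in the loop above


def color_for(type_name: str) -> RGB:
    """Return the outline color for a block type, or the default if unmapped."""
    return COLOR_BY_TYPE.get(type_name.upper(), DEFAULT_COLOR)


print(color_for("WORD"))
print(color_for("KEY"))
```

In the drawing loop, the result could then be passed as the outline argument, for example outline=color_for(bbox.box_type.name) if box_type turns out to be a Textract_Types enum member.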
