Amazon Textract Overlay tools
Textract-Overlayer
amazon-textract-overlayer provides functions to help overlay bounding boxes on documents.
Install
> python -m pip install amazon-textract-overlayer
Make sure your environment is set up with AWS credentials through configuration files, environment variables, or an attached role (https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html).
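As a quick sanity check before calling Textract, the sketch below looks for the most common credential sources using only the standard library. The function name find_credential_hints is hypothetical and not part of this package. It only reports where credentials appear to be configured: it does not validate them and cannot detect an attached role.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional


def find_credential_hints(env: Optional[Mapping[str, str]] = None,
                          home: Optional[Path] = None) -> List[str]:
    """Return the places where AWS credentials appear to be configured.

    Hypothetical helper for illustration; it does not check that the
    credentials are valid and does not detect attached roles.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    hints = []
    # Static keys supplied directly through environment variables.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        hints.append("environment variables")
    # A named profile selected through the environment.
    if env.get("AWS_PROFILE"):
        hints.append("AWS_PROFILE")
    # Shared configuration files, as written by `aws configure`.
    for name in ("credentials", "config"):
        if (home / ".aws" / name).is_file():
            hints.append(f"~/.aws/{name}")
    return hints


if __name__ == "__main__":
    found = find_credential_hints()
    print("Credential hints:", ", ".join(found) if found else "none found")
```

An empty result does not necessarily mean the call will fail, since a role attached to the compute environment is resolved at request time.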
Samples
The primary method provided is get_bounding_boxes, which returns bounding boxes for the Textract_Types passed in. It is mostly taken from the amazon-textract command in the amazon-textract-helper package.
The following returns the bounding boxes for the WORD and CELL types.
from PIL import Image
from textractoverlayer.t_overlay import DocumentDimensions, get_bounding_boxes
from textractcaller.t_call import Textract_Features, Textract_Types, call_textract
# input_document: path to a local image file, defined by the caller
# CELL blocks are only returned when table analysis (TABLES) is requested
features = [Textract_Features.TABLES]
doc = call_textract(input_document=input_document, features=features)
# image is a PIL.Image.Image in this case
image = Image.open(input_document)
document_dimension: DocumentDimensions = DocumentDimensions(doc_width=image.size[0], doc_height=image.size[1])
overlay = [Textract_Types.WORD, Textract_Types.CELL]
bounding_box_list = get_bounding_boxes(textract_json=doc, document_dimensions=document_dimension, overlay_features=overlay)
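To show why the document dimensions are needed: Textract reports each block's BoundingBox as Left, Top, Width, and Height ratios of the page, so they have to be scaled by the image size to get pixel coordinates. The following is a minimal sketch of that scaling, not the package's implementation. PixelBox and to_pixel_box are hypothetical names, and the sample block is hand-written.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PixelBox:
    """Hypothetical stand-in for a pixel-space bounding box."""
    xmin: int
    ymin: int
    xmax: int
    ymax: int


def to_pixel_box(block: Dict, doc_width: int, doc_height: int) -> PixelBox:
    """Scale a Textract block's ratio-based BoundingBox to pixel coordinates."""
    box = block["Geometry"]["BoundingBox"]
    xmin = round(box["Left"] * doc_width)
    ymin = round(box["Top"] * doc_height)
    xmax = round((box["Left"] + box["Width"]) * doc_width)
    ymax = round((box["Top"] + box["Height"]) * doc_height)
    return PixelBox(xmin, ymin, xmax, ymax)


# A hand-written block in the shape Textract returns (values are made up).
sample_block = {
    "BlockType": "WORD",
    "Geometry": {"BoundingBox": {"Left": 0.25, "Top": 0.5, "Width": 0.5, "Height": 0.25}},
}
pixel_box = to_pixel_box(sample_block, doc_width=1000, doc_height=800)
print(pixel_box)  # PixelBox(xmin=250, ymin=400, xmax=750, ymax=600)
```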
The actual drawing of bounding boxes onto images is in the amazon-textract command from the amazon-textract-helper package and looks like this:
from PIL import Image, ImageDraw
image = Image.open(input_document)
rgb_im = image.convert('RGB')
draw = ImageDraw.Draw(rgb_im)
# see the implementation in amazon-textract-helper for ways to associate different colors with types
for bbox in bounding_box_list:
    draw.rectangle(xy=[bbox.xmin, bbox.ymin, bbox.xmax, bbox.ymax], outline=(128, 128, 0), width=2)
rgb_im.show()
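One possible way to use a different color per type is a lookup table keyed by the block type name, with a fallback for anything unlisted. This is a sketch under assumptions, not the amazon-textract-helper implementation. TypedBox, TYPE_COLORS, and draw_typed_boxes are hypothetical, and the attribute that carries the type on the package's own bounding box objects is not shown in this README, so the sketch uses its own stand-in class.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

Color = Tuple[int, int, int]

# Hypothetical color choices, keyed by Textract block type name.
TYPE_COLORS = {
    "WORD": (128, 128, 0),
    "LINE": (0, 128, 128),
    "CELL": (128, 0, 128),
}
DEFAULT_COLOR: Color = (255, 0, 0)


@dataclass
class TypedBox:
    """Hypothetical stand-in: a pixel box tagged with its block type name."""
    kind: str
    xmin: int
    ymin: int
    xmax: int
    ymax: int


def draw_typed_boxes(draw, boxes: Iterable[TypedBox], width: int = 2) -> None:
    """Outline each box on `draw` (e.g. a PIL ImageDraw) in its type's color."""
    for box in boxes:
        # Fall back to DEFAULT_COLOR for types without an explicit entry.
        color = TYPE_COLORS.get(box.kind, DEFAULT_COLOR)
        draw.rectangle(xy=[box.xmin, box.ymin, box.xmax, box.ymax], outline=color, width=width)
```

With Pillow, this would be called as draw_typed_boxes(ImageDraw.Draw(rgb_im), boxes). The function only relies on the rectangle method, so any object exposing that method works.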