The main standards for the Latis Document AI project
DocumentAI-std
DocumentAI-std is a Python library designed to facilitate and standardize document analysis and processing tasks. It offers functionality for handling document elements, performing optical character recognition (OCR), and managing document datasets.
Installation
To install DocumentAI-std from PyPI, run:
pip install DocumentAI-std
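After installing, it can be useful to confirm that the distribution is visible to the current Python environment. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and is not part of DocumentAI-std.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "DocumentAI-std" is the distribution name used in the pip command above.
    version = installed_version("DocumentAI-std")
    print(version if version else "DocumentAI-std is not installed")
```

Checking the distribution metadata avoids importing the package itself, so the check still works if an optional dependency is missing.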
Example of Usage
Here's an example demonstrating how to use the Wildreceipt dataset:
from DocumentAI_std.datasets import Wildreceipt
# Define train and test sets
train_set = Wildreceipt(
    train=True,
    img_folder="/path/to/train/images/",
    label_path="/path/to/train/annotations.txt",
)
test_set = Wildreceipt(
    train=False,
    img_folder="/path/to/test/images/",
    label_path="/path/to/test/annotations.txt",
)
# Assert the number of data samples in train and test sets
assert len(train_set.data) == 1267
assert len(test_set.data) == 472
In the above example:
- We import the Wildreceipt dataset from the DocumentAI_std library.
- We create train and test dataset instances, specifying the paths to image folders and annotation files.
- We assert that the number of data samples in the train and test sets matches the expected counts.
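Because the example depends on placeholder paths, a small pre-flight check can make path mistakes easier to spot before a dataset is constructed. This is a minimal sketch using only `pathlib`; the helper `missing_inputs` is hypothetical and not part of DocumentAI_std, and it makes no assumption about how the library itself validates its arguments.

```python
from pathlib import Path


def missing_inputs(img_folder, label_path):
    """Return a list of problems found; the list is empty if both paths look usable."""
    problems = []
    if not Path(img_folder).is_dir():
        problems.append(f"image folder not found: {img_folder}")
    if not Path(label_path).is_file():
        problems.append(f"annotation file not found: {label_path}")
    return problems


if __name__ == "__main__":
    # Placeholder paths copied from the example above; replace with real locations.
    for problem in missing_inputs(
        "/path/to/train/images/", "/path/to/train/annotations.txt"
    ):
        print(problem)
```

Returning a list rather than raising on the first problem lets both paths be reported in one pass.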
Download files
Source Distribution
documentai_std-0.2.8.dev1.tar.gz (16.5 kB)
Built Distribution
DocumentAI_std-0.2.8.dev1-py3-none-any.whl
Hashes for documentai_std-0.2.8.dev1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | cb9bb98a7204ceaa3c373dd930dff7f3a88db59b28a41db64417c75a26880684
MD5 | b54d76150d3ecd6bdf579e52bcdf8cb5
BLAKE2b-256 | 269c0de0f59ca09b1c76780ec23b9a51c2a54d3d6f511216bcbd5c43c6072ba6
Hashes for DocumentAI_std-0.2.8.dev1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | d423cfecfaeb7c33b2ce84e2697dfdd9e9d6904d386c51c55a392cf1a3c78f38
MD5 | 2bb00b58d4b4a61a5be8f26f3020fca1
BLAKE2b-256 | cc0d9f95f5c04ce7b7a458e63ee6e88dcf6e9edcc39cbd223fc57d6239e43a4d
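The published digests can be used to check a manually downloaded file. The following is a minimal sketch, assuming the source archive has been saved in the current directory under its listed name; the helpers `sha256_of` and `matches_expected` are hypothetical, and only the standard library's `hashlib` is used.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for documentai_std-0.2.8.dev1.tar.gz
EXPECTED_SHA256 = "cb9bb98a7204ceaa3c373dd930dff7f3a88db59b28a41db64417c75a26880684"


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's digest with an expected hex string, ignoring case and padding."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the archive was saved.
    archive = Path("documentai_std-0.2.8.dev1.tar.gz")
    if archive.exists():
        print("OK" if matches_expected(archive, EXPECTED_SHA256) else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

Reading in chunks keeps memory use constant regardless of file size. The same approach applies to the wheel by substituting its file name and SHA256 digest.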