akaOCR Package Tools
Project description
akaOCR
Description
This package is compatible with akaOCR and provides an OCR pipeline (text detection, text recognition, and text rotation) built on ONNX-format models, which can make CPU and GPU inference up to 2x faster. The code is based on this awesome repo.
Features
Feature 1: Text Detection.
from akaocr import BoxEngine
import cv2
img_path = "path/to/image.jpg"
image = cv2.imread(img_path)
# side_len: minimum inference image size
box_engine = BoxEngine(model_path=None,
                       side_len=None,
                       conf_thres=0.5,
                       mask_thes=0.4,
                       unclip_ratio=2.0,
                       max_candidates=1000,
                       device="cpu")  # device: "cpu" or "gpu"
# inference for one image
results = box_engine(image) # [np.array([4 points], dtype=np.float32),...]
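The detector returns one 4-point array per text region, and later steps often want those regions in reading order. The following is a minimal sketch, assuming each box is a (4, 2) `np.float32` array as described in the comment above. `sort_boxes_reading_order`, its `row_tol` parameter, and `make_box` are hypothetical helpers for illustration, not part of akaocr.

```python
import numpy as np


def sort_boxes_reading_order(boxes, row_tol=10.0):
    """Sort 4-point boxes top-to-bottom, then left-to-right within a row.

    Hypothetical helper (not part of akaocr). Boxes whose top edges lie
    within `row_tol` pixels of the first box in a row share that row.
    """
    # Order by top edge (smallest y), breaking ties by left edge (smallest x).
    ordered = sorted(
        boxes, key=lambda b: (float(b[:, 1].min()), float(b[:, 0].min()))
    )

    rows, current = [], []
    for box in ordered:
        top = float(box[:, 1].min())
        # Start a new row when this box sits clearly below the row's first box.
        if current and top - float(current[0][:, 1].min()) > row_tol:
            rows.append(current)
            current = []
        current.append(box)
    if current:
        rows.append(current)

    # Within each row, read left to right.
    result = []
    for row in rows:
        result.extend(sorted(row, key=lambda b: float(b[:, 0].min())))
    return result


def make_box(x, y, w=50, h=20):
    """Build an axis-aligned 4-point box (made-up sample data)."""
    return np.array(
        [[x, y], [x + w, y], [x + w, y + h], [x, y + h]], dtype=np.float32
    )


demo_boxes = [make_box(0, 50), make_box(100, 0), make_box(0, 3)]
sorted_boxes = sort_boxes_reading_order(demo_boxes)
top_left_corners = [tuple(b[0].tolist()) for b in sorted_boxes]
```

The row tolerance is a judgment call: a value near half the typical text height tends to keep slightly skewed lines together, but it should be tuned per document type.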
Feature 2: Text Recognition.
from akaocr import TextEngine
import cv2
img_path = "path/to/cropped_image.jpg"
cropped_image = cv2.imread(img_path)
text_engine = TextEngine(model_path=None,
                         vocab_path=None,
                         use_space_char=True,
                         batch_sizes=32,
                         model_shape=[3, 48, 320],
                         max_wh_ratio=None,
                         device="cpu")  # device: "cpu" or "gpu"
# inference for one or more images
results = text_engine(cropped_image) # [(text, conf),...]
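Since each recognition result carries a confidence score, a common follow-up is to drop low-confidence strings before using the text. This is a minimal sketch that assumes only the documented `[(text, conf), ...]` result shape. `filter_by_confidence` and the 0.5 cutoff are hypothetical, and the sample values are made up.

```python
def filter_by_confidence(results, min_conf=0.5):
    """Keep only (text, conf) pairs whose confidence is at least min_conf.

    Hypothetical helper (not part of akaocr).
    """
    return [(text, conf) for text, conf in results if conf >= min_conf]


# Made-up sample in the documented [(text, conf), ...] shape.
sample_results = [("INVOICE", 0.98), ("T0tal", 0.41), ("2024-01-15", 0.87)]
kept = filter_by_confidence(sample_results, min_conf=0.5)
```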
Feature 3: Text Rotation.
from akaocr import ClsEngine
import cv2
img_path = "path/to/cropped_image.jpg"
cropped_image = cv2.imread(img_path)
rotate_engine = ClsEngine(model_path=None,
                          conf_thres=0.75,
                          device="cpu")  # device: "cpu" or "gpu"
# classify input image as 0 or 180 degrees
# inference for one or more images
results = rotate_engine(cropped_image) # [(label (0|180), conf),...]
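The classifier only reports an orientation, so the crops still have to be turned upright before recognition. The sketch below shows one way to do that, assuming the results are `(label, conf)` pairs aligned one-to-one with the input images and that the label is 0 or 180 (as an int or a numeric string). `correct_rotation` is a hypothetical helper, not part of akaocr.

```python
import numpy as np


def correct_rotation(images, cls_results, conf_thres=0.75):
    """Rotate crops classified as upside down (180 degrees) back to upright.

    Hypothetical helper (not part of akaocr). Assumes cls_results is
    [(label, conf), ...] in the same order as images.
    """
    corrected = []
    for image, (label, conf) in zip(images, cls_results):
        if int(label) == 180 and conf >= conf_thres:
            # Two 90-degree turns equal a 180-degree rotation; copy to a
            # contiguous array so OpenCV functions accept the result.
            image = np.ascontiguousarray(np.rot90(image, k=2))
        corrected.append(image)
    return corrected


# Tiny made-up "image" so the effect is easy to inspect.
demo = np.arange(6, dtype=np.uint8).reshape(2, 3)
fixed = correct_rotation(
    [demo, demo, demo],
    [(180, 0.90), (0, 0.99), ("180", 0.50)],  # last one is below the threshold
)
```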
Usage
- Perspective Transform Image.
import cv2
import numpy as np
def transform_image(image, box):
    # Warp the quadrilateral `box` in `image` to an upright rectangle
    assert len(box) == 4, "Shape of points must be 4x2"
    # Output size: the longer of each pair of opposite edges
    img_crop_width = int(
        max(
            np.linalg.norm(box[0] - box[1]),
            np.linalg.norm(box[2] - box[3])))
    img_crop_height = int(
        max(
            np.linalg.norm(box[0] - box[3]),
            np.linalg.norm(box[1] - box[2])))
    # Target corners: top-left, top-right, bottom-right, bottom-left
    pts_std = np.float32([[0, 0],
                          [img_crop_width, 0],
                          [img_crop_width, img_crop_height],
                          [0, img_crop_height]])
    M = cv2.getPerspectiveTransform(box, pts_std)
    dst_img = cv2.warpPerspective(
        image,
        M, (img_crop_width, img_crop_height),
        borderMode=cv2.BORDER_REPLICATE,
        flags=cv2.INTER_CUBIC)
    img_height, img_width = dst_img.shape[0:2]
    # Rotate tall crops by 90 degrees so they are wider than they are high
    if img_height/img_width >= 1.5:
        dst_img = np.rot90(dst_img, k=3)
    return dst_img
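`transform_image` maps `box[0]` through `box[3]` onto the top-left, top-right, bottom-right, and bottom-left corners of the output, so it relies on that point order. If boxes come from another source in arbitrary order, one common heuristic is to order them by coordinate sums and differences. The sketch below illustrates it; `order_points` is a hypothetical helper, not part of akaocr, and the heuristic can pick the wrong corners for boxes rotated by roughly 45 degrees.

```python
import numpy as np


def order_points(pts):
    """Return 4 points ordered top-left, top-right, bottom-right, bottom-left.

    Hypothetical helper (not part of akaocr), using image coordinates
    where y grows downward.
    """
    pts = np.asarray(pts, dtype=np.float32)
    assert pts.shape == (4, 2), "Shape of points must be 4x2"
    s = pts.sum(axis=1)        # x + y
    d = pts[:, 1] - pts[:, 0]  # y - x
    return np.array(
        [
            pts[np.argmin(s)],  # top-left: smallest x + y
            pts[np.argmin(d)],  # top-right: smallest y - x
            pts[np.argmax(s)],  # bottom-right: largest x + y
            pts[np.argmax(d)],  # bottom-left: largest y - x
        ],
        dtype=np.float32,
    )


# Made-up corners of a 100x40 rectangle, deliberately shuffled.
shuffled = [[100, 40], [0, 0], [0, 40], [100, 0]]
box = order_points(shuffled)

# Same crop-size rule as transform_image: the longer of each pair of opposite edges.
crop_w = int(max(np.linalg.norm(box[0] - box[1]), np.linalg.norm(box[2] - box[3])))
crop_h = int(max(np.linalg.norm(box[0] - box[3]), np.linalg.norm(box[1] - box[2])))
```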
- OCR Pipeline.
from akaocr import TextEngine
from akaocr import BoxEngine
import cv2
box_engine = BoxEngine()
text_engine = TextEngine()
# transform_image is the perspective-transform helper defined above
def main():
    img_path = "sample.png"
    org_image = cv2.imread(img_path)
    images = []
    # detect text regions in the full image
    boxes = box_engine(org_image)
    for box in boxes:
        # crop & transform image
        image = transform_image(org_image, box)
        images.append(image)
    # recognize text in all crops: [(text, conf),...]
    texts = text_engine(images)
    print(texts)
if __name__ == '__main__':
    main()
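The pipeline above leaves the boxes and the recognized texts in two separate lists. A minimal sketch of merging them into JSON-friendly records follows. It assumes the recognizer returns one `(text, conf)` pair per crop in the same order as the input, which the source does not state explicitly. `build_records` is a hypothetical helper, and the sample values are made up.

```python
import json

import numpy as np


def build_records(boxes, texts):
    """Pair each detected box with its recognized text and confidence.

    Hypothetical helper (not part of akaocr). Assumes texts[i] belongs
    to boxes[i].
    """
    if len(boxes) != len(texts):
        raise ValueError("boxes and texts must have the same length")
    records = []
    for box, (text, conf) in zip(boxes, texts):
        records.append(
            {
                "text": text,
                "confidence": float(conf),
                # Convert the float32 array to plain Python lists for JSON.
                "box": np.asarray(box).tolist(),
            }
        )
    return records


# Made-up sample data in the documented output shapes.
demo_boxes = [np.array([[0, 0], [10, 0], [10, 5], [0, 5]], dtype=np.float32)]
demo_texts = [("hello", 0.9)]
records = build_records(demo_boxes, demo_texts)
as_json = json.dumps(records)
```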
Note: akaOCR (Transform documents into useful data with AI-based IDP, Intelligent Document Processing) helps make inefficient manual entry a thing of the past and reliable data insights a thing of the present. Details at: https://app.akaocr.io
File details
Details for the file akaocr-2.1.4.tar.gz.
File metadata
- Download URL: akaocr-2.1.4.tar.gz
- Upload date:
- Size: 10.9 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8d801f13364d15753cebeb89d7912941157c06e8df6ec8d230fa267c06318a91 |
| MD5 | a38ae64bdb1eacc97e3dc1f950f9daba |
| BLAKE2b-256 | 998fec00a411544a084f3bb8f24a21d7804f71a39a8090b5a4be5d8d43ef3610 |
File details
Details for the file akaocr-2.1.4-py3-none-any.whl.
File metadata
- Download URL: akaocr-2.1.4-py3-none-any.whl
- Upload date:
- Size: 10.9 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ca6cdb2870fcfe98c34e4bd93367a9f961a8b19b88d717917302a3781c217cf0 |
| MD5 | acfa61e8096b4211b0e9335b774b5ae1 |
| BLAKE2b-256 | f4807d7df02bc98d9d0f18a0e3fa5a91958b345bd84c85a46e9a768eb3c7c509 |
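The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using only the Python standard library. `sha256_of_file` is a hypothetical helper, and the local file path in the usage comment is an assumption about where the download was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns empty bytes at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (assumes the wheel was downloaded to the current directory):
# expected = "ca6cdb2870fcfe98c34e4bd93367a9f961a8b19b88d717917302a3781c217cf0"
# if sha256_of_file("akaocr-2.1.4-py3-none-any.whl") != expected:
#     raise SystemExit("SHA256 mismatch: do not install this file")
```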