A Python library that reorders bounding boxes generated by OCR engines into the correct reading order.
bbox-align
bbox-align is a Python library that reorders bounding boxes generated by OCR engines into logical lines and the correct reading order for downstream document processing tasks, even when documents have folds, irregular spacing, or distortions.
Installation
pip install bbox-align
Prereqs: Python 3.8+
Usage
import bbox_align
vertices = [
[ (0, 15), (10, 15), (10, 25), (0, 25) ], # world
[ (15, 15), (25, 15), (25, 25), (15, 25) ], # :)
[ (0, 0), (10, 0), (10, 10), (0, 10) ], # hello
]
words = ['world', ':)', 'hello']
boundaries = [ (0, 0), (25, 0), (25, 50), (0, 50) ]
lines = bbox_align.process(vertices, boundaries)
for line in lines:
    sentence_list = [words[idx] for idx in line]
    print(' '.join(sentence_list))
'''
$ python3 run.py
hello
world :)
'''
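Some OCR engines report axis-aligned boxes as (x, y, width, height) rather than as four corners. The following is a minimal sketch of a converter, assuming the corner order seen in the example above (top-left, top-right, bottom-right, bottom-left, with y growing downward). `rect_to_vertices` is a hypothetical helper and not part of bbox-align.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def rect_to_vertices(x: float, y: float, width: float, height: float) -> List[Point]:
    """Turn an axis-aligned (x, y, width, height) box into four corner points.

    Assumed corner order: top-left, top-right, bottom-right, bottom-left,
    matching the boxes in the usage example.
    """
    return [
        (x, y),
        (x + width, y),
        (x + width, y + height),
        (x, y + height),
    ]


# The "hello" box from the usage example: origin (0, 0), 10 wide, 10 tall.
hello_box = rect_to_vertices(0, 0, 10, 10)
print(hello_box)  # [(0, 0), (10, 0), (10, 10), (0, 10)]
```

The resulting lists can be collected into `vertices` and passed to `bbox_align.process` as shown above.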
Examples
MOD Pizza
26902 92nd Ave NW
Suite A
Stanwood , WA 98292
Phone 360.205.9680
5/19/2019 4:06:54 PM
Order Id : AABT4HN6ACCM
# 77 - HERE
Draft Beer ( 2 @ 4.97 ) $ 9.94
Fountain Drink ( 1602 ) $ 2.17
MOD Pizza $ 8.67
Mini MOD $ 6.67
Sub Total $ 27.45
Sales Tax $ 2.53
Order Total $ 29.98
Visa $ 29.98
Tip : $ 3.59
Card # : **** ***** 3352
819160
|
|
|
Orange & Rockland 390 West Route 59
Pike County Light & Power Co. Spring Valley NY 10977-5300 Page 1 of 2
Rockland Electric Company 1-877-434-4100 www.oru.com
Your next Meter If you have questions
JOAN SMITH Reading will be about this bill , call
24 ORCHARD LN Feb 27 toll free 1-877-434-4100
-
SMALL CITY NY 19999-0000 or go to www.oru.com
ELECTRIC RESIDENTIAL - DELIVERY BILLING DATE 01/29/09
Jan 29 reading ( Actual ) 61114 BILLING SUMMARY
Dec 30 reading ( Actual ) -60597 ACCOUNT NUMBER
Total Usage KWH 30 Days 517 67890-12345
Last Bill $ 422.80
Delivery Charges Payment - EFT
Basic Service Charge $ 9.09
First 250 KWH 5.6920 each 14.23 01/12/09 -422.80
Next 267 KWH 44880 each 11.98 Service Charges
Energy Cst Adj 517 KWH -0.045000 -23
€ Electric 89.62
RDM Adjustment 517 KWH 0.16400 85 Gas 266.67
SBC / RP5 Cha 517 KWH 0.388970 201
Government surcharges - Delivery .76 TOTAL
Total Delivery Charges $ 38.69 AMOUNT DUE $ 356.29
Avg . Temp This Period 25 F
Total Supplier Charge 50.93 Same Period Last Year 33F
CURRENT ELECTRIC CHARGES $ 89.62
ELECTRIC USAGE : MONTHLY
820
GAS FIRM TRANS RES SPACE HEATING - DELIVERY 615
410
Jan 29 reading ( Actual ) 5004 205
Dec 30 reading ( Actual ) -4846 0
KWH JFMAMJJASON DJ AVG
Total Usage CCF 30 Days 158 2008 2009
* spacings computed differently
Workings
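Two bounding boxes are considered inline if the y-coordinate of one box's vertical center lies within the top and bottom bounds of the other box, i.e. d <= D/2, where d is the vertical distance between the two centers and D is the height of the other box.
To prevent skewed bounding boxes from being treated as inline, the slope difference between the two boxes must also be within 5°.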
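The check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's actual implementation: boxes are assumed to list their corners as top-left, top-right, bottom-right, bottom-left, the slope is taken from the top edge, and all function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Box = List[Point]  # assumed order: top-left, top-right, bottom-right, bottom-left


def vertical_center(box: Box) -> float:
    """Mean y-coordinate of the four corners."""
    return sum(y for _, y in box) / len(box)


def top_edge_angle(box: Box) -> float:
    """Angle of the top edge in degrees, measured from the x-axis."""
    (x1, y1), (x2, y2) = box[0], box[1]
    return math.degrees(math.atan2(y2 - y1, x2 - x1))


def is_inline(a: Box, b: Box, max_angle_diff: float = 5.0) -> bool:
    """True if a's vertical center lies within b's top and bottom bounds and
    the top edges of a and b differ by at most max_angle_diff degrees.

    Angle wrap-around is ignored, which is fine for roughly horizontal text.
    """
    top = min(y for _, y in b)
    bottom = max(y for _, y in b)
    center_inside = top <= vertical_center(a) <= bottom
    similar_slope = abs(top_edge_angle(a) - top_edge_angle(b)) <= max_angle_diff
    return center_inside and similar_slope


hello = [(0, 0), (10, 0), (10, 10), (0, 10)]
world = [(0, 15), (10, 15), (10, 25), (0, 25)]
neighbor = [(15, 15), (25, 15), (25, 25), (15, 25)]
tilted = [(15, 15), (25, 18), (25, 28), (15, 25)]  # top edge rises by about 16.7 degrees

print(is_inline(world, neighbor))  # True: same band, same slope
print(is_inline(hello, world))     # False: center of "hello" is above "world"
print(is_inline(world, tilted))    # False: slopes differ by more than 5 degrees
```

Note that this sketch tests only one direction (the center of `a` against the bounds of `b`); a symmetric check would call it both ways.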
If the document is distorted, the above check can fail. In that case, we find the point of intersection (poi) instead and compute a harmonic mean score from:
- (d1 + d2)/2
- the absolute vertical distance between m1 and m2
This gives a balanced vertical score. A bounding box is inline with the bounding box that yields the lowest score.
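To make the scoring step concrete, here is a small sketch of a line intersection and a harmonic mean score. It assumes that d1 and d2 are distances and that m1 and m2 are points whose y-coordinates are compared; their exact geometric definitions are not spelled out in this section, so the code treats them as plain numbers. All function names are hypothetical and are not taken from bbox-align.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]


def line_intersection(p1: Point, slope1: float, p2: Point, slope2: float) -> Optional[Point]:
    """Intersection of two non-vertical lines, each given by a point and a slope.

    Returns None when the lines are parallel.
    """
    if slope1 == slope2:
        return None
    (x1, y1), (x2, y2) = p1, p2
    # Solve y1 + slope1 * (x - x1) == y2 + slope2 * (x - x2) for x.
    x = (y2 - y1 + slope1 * x1 - slope2 * x2) / (slope1 - slope2)
    y = y1 + slope1 * (x - x1)
    return (x, y)


def harmonic_mean(a: float, b: float) -> float:
    """Harmonic mean of two non-negative numbers (0.0 if both are zero)."""
    if a + b == 0:
        return 0.0
    return 2 * a * b / (a + b)


def vertical_score(d1: float, d2: float, m1_y: float, m2_y: float) -> float:
    """Harmonic mean of (d1 + d2) / 2 and |m1_y - m2_y| (assumed inputs)."""
    return harmonic_mean((d1 + d2) / 2, abs(m1_y - m2_y))


# Two lines crossing at (1, 1).
print(line_intersection((0, 0), 1.0, (0, 2), -1.0))  # (1.0, 1.0)

# Scores for two made-up candidates; the lowest score wins.
scores = {"a": vertical_score(2, 4, 10, 13), "b": vertical_score(1, 1, 10, 11)}
best = min(scores, key=scores.get)
print(scores, best)  # {'a': 3.0, 'b': 1.0} b
```

The harmonic mean is dominated by the smaller of its two inputs, so a candidate scores low when either measure is small.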
Any overlaps within a line are also resolved using the harmonic mean score of the perpendicular distance and the vertical distance between the overlapping bounding boxes.
Debugging
To dig deeper into which bounding boxes are inline, use bbox_align.process_with_meta_info to get the related meta information and pretty-print it with bbox_align.pprint_matrix.
import bbox_align
# For the above example
lines, inlines, passthroughs, pois = bbox_align.process_with_meta_info(vertices, boundaries)
bbox_align.pprint_matrix(matrix=passthroughs, words=words, idxs=[51, 52, 57, 58])
'''
Sales Tax $ 2.53
Sales True True False False
Tax True True False False
$ False False True True
2.53 False False True True
'''
# points of intersections
bbox_align.pprint_matrix(matrix=pois, words=words, idxs=[51, 52, 57, 58])
'''
Sales Tax $ 2.53
Sales None (26.9, 357.2) (204.17, 347.35) (200.76, 347.54)
Tax (26.9, 357.2) None (218.02, 348.89) (215.73, 348.99)
$ (204.17, 347.35) (218.02, 348.89) None (240.37, 351.37)
2.53 (200.76, 347.54) (215.73, 348.99) (240.37, 351.37) None
'''
bbox_align.pprint_matrix(matrix=inlines, words=words, idxs=[51, 52, 57, 58])
'''
Sales Tax $ 2.53
Sales True True False False
Tax True True False True
$ False False True True
2.53 False True True True
'''
# Using DFS, we group these idxs into a line
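The DFS grouping mentioned in the last comment can be illustrated with a short sketch that finds connected components of a boolean inline matrix. This is only an approximation of the idea; `group_into_lines` is a hypothetical name, and the library's own traversal and ordering may differ.

```python
from typing import List


def group_into_lines(inlines: List[List[bool]]) -> List[List[int]]:
    """Group indices into connected components of the inline relation,
    using an iterative depth-first search."""
    n = len(inlines)
    visited = [False] * n
    groups: List[List[int]] = []
    for start in range(n):
        if visited[start]:
            continue
        stack = [start]
        visited[start] = True
        group = []
        while stack:
            node = stack.pop()
            group.append(node)
            for neighbor in range(n):
                if inlines[node][neighbor] and not visited[neighbor]:
                    visited[neighbor] = True
                    stack.append(neighbor)
        # Sorted by index only to keep the output stable; a real reading
        # order would sort by horizontal position instead.
        groups.append(sorted(group))
    return groups


# The inline matrix printed above, in the order Sales, Tax, $, 2.53.
inlines = [
    [True, True, False, False],
    [True, True, False, True],
    [False, False, True, True],
    [False, True, True, True],
]
print(group_into_lines(inlines))  # [[0, 1, 2, 3]]
```

Although "Sales" and "$" are not directly inline, they end up in the same group because "Tax" links to "2.53", which links to "$".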
FAQs
1. Does this library do document layout analysis?
No. Document layout analysis is an upstream task. bbox-align only groups bounding boxes within a layout into logical lines.
2. Does this library compute spacings as well?
Not yet. But with enough requests and interest, I can implement it.