boxdetect is a Python package based on OpenCV which allows you to easily detect rectangular shapes like characters boxes on scanned forms.
Project description
BoxDetect is a Python package based on OpenCV which allows you to easily detect rectangular shapes like character or checkbox boxes on scanned forms.
Main purpose of this library is to provide helpful functions for processing document images like bank forms, applications, etc. and extract regions where character boxes or tick/check boxes are present.
Getting Started
Checkout the examples below and get-started.ipynb notebook which holds end to end examples for using BoxDetect.
Installation
BoxDetect can be installed directly from this repo using pip
:
pip install git+https://github.com/karolzak/boxdetect
or through PyPI
pip install boxdetect
Usage examples
You can use BoxDetect
either by leveraging one of the pre-made pipelines or by treating it as a toolbox to compose your own pipelines that fits your needs perfectly.
Using existing pipelines:
Start with getting the default config and modifying it for your requirements and data:
from boxdetect import config
file_name = 'bank_form1.png'
# important to adjust these values to match the size of boxes on your image
config.min_w, config.max_w = (35,48)
config.min_h, config.max_h = (30,37)
# the more scaling factors the more accurate the results but also it takes more time to processing
# too small scaling factor may cause false positives
# too big scaling factor will take a lot of processing time
config.scaling_factors = [0.4, 0.5, 0.7]
# w/h ratio range for boxes/rectangles filtering
config.wh_ratio_range = (0.5, 1.5)
# num of iterations when running dilation tranformation (to engance the image)
config.dilation_iterations = 1
As a second step simply run:
from boxdetect.pipelines import get_boxes
rects, grouping_rects, image, output_image = get_boxes(
file_name, config=config, plot=False)
Each of the returned elements are rectangular bounding boxes representing grouped character boxes (x, y, w, h)
print(grouping_rects)
OUT:
# (x, y, w, h)
[(276, 276, 1221, 33),
(324, 466, 430, 33),
(384, 884, 442, 33),
(985, 952, 410, 32),
(779, 1052, 156, 33),
(253, 1256, 445, 33)]
plt.figure(figsize=(20,20))
plt.imshow(output_image)
plt.show()
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for boxdetect-0.1.0-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | aa1b592e953b9d09adb26987a53a4c049c01c0ff4ea7232b5b9e16107588d76e |
|
MD5 | 6208797c9f3813b87bacc48b49db2e61 |
|
BLAKE2b-256 | 719043bb5f914350fab6d705ad9881fa39134e897b1b636cdd384325ab7eab10 |