boxdetect is a Python package based on OpenCV which allows you to easily detect rectangular shapes like character boxes on scanned forms.

BoxDetect is a Python package based on OpenCV which allows you to easily detect rectangular shapes like character or checkbox boxes on scanned forms.

The main purpose of this library is to provide helpful functions for processing document images such as bank forms and applications, and for extracting the regions where character boxes or tick/check boxes are present.

Getting Started

Check out the examples below and the get-started.ipynb notebook, which holds end-to-end examples for using BoxDetect.

Installation

BoxDetect can be installed directly from this repo using pip:

pip install git+https://github.com/karolzak/boxdetect

or through PyPI:

pip install boxdetect

Usage examples

You can use BoxDetect either by leveraging one of the pre-made pipelines or by treating it as a toolbox to compose your own pipelines that fit your needs.

Using existing pipelines:

Start by getting the default config and adjusting it to your requirements and data:

from boxdetect import config

file_name = 'bank_form1.png'
# important to adjust these values to match the size of boxes on your image
config.min_w, config.max_w = (35,48)
config.min_h, config.max_h = (30,37)
# the more scaling factors, the more accurate the results, but the longer the processing takes
# too small a scaling factor may cause false positives
# too big a scaling factor will take a lot of processing time
config.scaling_factors = [0.4, 0.5, 0.7]
# w/h ratio range for boxes/rectangles filtering
config.wh_ratio_range = (0.5, 1.5)
# number of iterations when running the dilation transformation (to enhance the image)
config.dilation_iterations = 1
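
To make the role of the size and ratio settings more concrete, here is a minimal sketch of the kind of filtering they describe. This is not BoxDetect's actual implementation: the function filter_rects and the sample rectangles are hypothetical, and treating the bounds as inclusive is an assumption.

```python
def filter_rects(rects, min_w, max_w, min_h, max_h, wh_ratio_range):
    """Keep (x, y, w, h) rectangles whose size and w/h ratio fall within the bounds.

    Hypothetical helper for illustration only; not part of BoxDetect.
    """
    ratio_lo, ratio_hi = wh_ratio_range
    kept = []
    for x, y, w, h in rects:
        if h <= 0:
            continue  # skip degenerate rectangles to avoid division by zero
        size_ok = min_w <= w <= max_w and min_h <= h <= max_h
        ratio_ok = ratio_lo <= w / h <= ratio_hi
        if size_ok and ratio_ok:
            kept.append((x, y, w, h))
    return kept


# Made-up candidate rectangles in (x, y, w, h) form
candidates = [
    (10, 10, 40, 33),   # within the size and ratio bounds
    (60, 10, 400, 33),  # too wide, e.g. a whole field outline
    (10, 60, 36, 12),   # too short
    (80, 60, 47, 31),   # size fits, but w/h is about 1.52, outside (0.5, 1.5)
]

kept = filter_rects(candidates, 35, 48, 30, 37, (0.5, 1.5))
print(kept)
```

Measuring a few boxes on your own scans and setting the ranges slightly wider than the measured sizes is a reasonable starting point.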

As a second step simply run:

from boxdetect.pipelines import get_boxes

rects, grouping_rects, image, output_image = get_boxes(
    file_name, config=config, plot=False)

Each element of the returned grouping_rects is a rectangular bounding box (x, y, w, h) representing a group of character boxes:

print(grouping_rects)

OUT:
# (x, y, w, h)
[(276, 276, 1221, 33),
 (324, 466, 430, 33),
 (384, 884, 442, 33),
 (985, 952, 410, 32),
 (779, 1052, 156, 33),
 (253, 1256, 445, 33)]

To display the output image:

import matplotlib.pyplot as plt

plt.figure(figsize=(20,20))
plt.imshow(output_image)
plt.show()
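
Because the rectangles are plain (x, y, w, h) tuples, they are easy to post-process. The sketch below sorts them into a rough reading order and converts them to corner coordinates. The helpers to_corners and reading_order are hypothetical and not part of BoxDetect; the sample values are taken from the output shown above.

```python
def to_corners(rect):
    """Convert (x, y, w, h) to (left, top, right, bottom)."""
    x, y, w, h = rect
    return (x, y, x + w, y + h)


def reading_order(rects):
    """Sort rectangles top to bottom, then left to right.

    Sorts strictly by (y, x), so boxes on the same visual row with
    slightly different y values may not end up left to right.
    """
    return sorted(rects, key=lambda r: (r[1], r[0]))


# A few of the grouping rectangles from the output above, deliberately shuffled
grouping_rects = [(985, 952, 410, 32), (276, 276, 1221, 33), (324, 466, 430, 33)]

ordered = reading_order(grouping_rects)
for rect in ordered:
    print(rect, "->", to_corners(rect))
```

If image is a NumPy array, as is typical for OpenCV-based code (an assumption here), a region could then be cropped with image[top:bottom, left:right].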
