A cross-platform framework for deep-learning-based text detection, recognition and parsing

deeptext

Getting started

Installation

  • Install using conda for Linux, Mac and Windows (preferred):
conda install -c fcakyon deeptext
  • Install using pip for Linux and Mac:
pip install deeptext

Install tesseract-ocr for text recognition.
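
Before running text recognition, it can help to confirm that Tesseract is reachable from Python. The check below is a minimal sketch, not part of deeptext: the helper name `tesseract_available` is hypothetical, and it assumes the Tesseract executable is installed under its usual name, `tesseract`, and is on the system PATH.

```python
import shutil


def tesseract_available(binary_name="tesseract"):
    """Return True if an executable with this name is found on PATH."""
    # shutil.which returns the full path of the executable, or None if absent.
    return shutil.which(binary_name) is not None


if not tesseract_available():
    print("Tesseract not found on PATH; install tesseract-ocr before running text recognition.")
```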

Basic Usage

# import package
import deeptext

# set image path and export folder directory
image_path = 'idcard.png'
output_dir = 'outputs/'

# apply text detection and export detected regions to output directory
detection_result = deeptext.detect_text(image_path, output_dir)

# apply text recognition to detected texts
recognition_result = deeptext.recognize_text(image_path=detection_result["text_crop_paths"])
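
If you want to confirm that the crops were written before passing them to recognition, a small guard such as the one below may be useful. It is only a sketch: the helper `existing_crop_paths` is hypothetical, and it assumes that `detection_result["text_crop_paths"]` is a list of file paths, which the key name suggests but this README does not state.

```python
import os


def existing_crop_paths(detection_result):
    """Return the crop paths in a detection result that exist on disk."""
    # Assumption: "text_crop_paths" maps to a list of file path strings.
    paths = detection_result.get("text_crop_paths", [])
    return [path for path in paths if os.path.isfile(path)]
```

The filtered list could then be passed as `image_path` to `deeptext.recognize_text` in place of the raw list.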

Advanced Usage

You can pass filter parameters to scrape text only from predefined regions of an image.

# import package
import deeptext

# set image path and export folder directory
image_path = 'idcard.png'
output_dir = 'outputs/'

# define regions that you want to scrape, by quad (box) points
filter_params = {"type": "box",
                 "boxes": [[[0.1460, 0.0395],
                            [0.8417, 0.0535],
                            [0.8412, 0.1099],
                            [0.1455, 0.0959]],
                           [[0.3467, 0.3398],
                            [0.5417, 0.3535],
                            [0.5412, 0.4099],
                            [0.3455, 0.3959]]],
                 "marigin_x": 0.05,
                 "marigin_y": 0.05,
                 "min_intersection_ratio": 0.9}

# or define regions that you want to scrape, by centroids
filter_params = {"type": "centroid",
                 "centers": [[0.44, 0.49],[0.49, 0.08]],
                 "marigin_x": 0.03,
                 "marigin_y": 0.05}

# apply craft text detection in predefined regions and export detected regions to output directory
detection_result = deeptext.detect_text(image_path,
                                        output_dir,
                                        detector="craft",
                                        filter_params=filter_params)

# apply tesseract (eng) text recognition to detected texts
recognition_result = deeptext.recognize_text(image_path=detection_result["text_crop_paths"],
                                             recognizer="tesseract-eng")
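
To make the filter parameters above more concrete, here is one way such region filtering could work. This is an illustrative sketch under stated assumptions, not deeptext's actual implementation: it assumes coordinates are fractions of the image width and height (the values above all lie between 0 and 1), that the margins enlarge each region, and that the intersection ratio is the share of a detected box's area falling inside the region. Quads are simplified to axis-aligned bounds, and all function names are hypothetical.

```python
def quad_to_bounds(quad, margin_x=0.0, margin_y=0.0):
    """Axis-aligned bounds (x0, y0, x1, y1) of a quad, grown by the margins."""
    xs = [point[0] for point in quad]
    ys = [point[1] for point in quad]
    return (min(xs) - margin_x, min(ys) - margin_y,
            max(xs) + margin_x, max(ys) + margin_y)


def intersection_ratio(detected, region):
    """Fraction of the detected box's area that lies inside the region."""
    dx0, dy0, dx1, dy1 = detected
    rx0, ry0, rx1, ry1 = region
    overlap_w = max(0.0, min(dx1, rx1) - max(dx0, rx0))
    overlap_h = max(0.0, min(dy1, ry1) - max(dy0, ry0))
    area = (dx1 - dx0) * (dy1 - dy0)
    return (overlap_w * overlap_h) / area if area > 0 else 0.0


def near_center(detected, center, margin_x, margin_y):
    """True if the detected box's centroid is within the margins of center."""
    cx = (detected[0] + detected[2]) / 2
    cy = (detected[1] + detected[3]) / 2
    return abs(cx - center[0]) <= margin_x and abs(cy - center[1]) <= margin_y


# Box filter: keep a detection if enough of it lies inside the padded region.
region = quad_to_bounds([[0.1460, 0.0395], [0.8417, 0.0535],
                         [0.8412, 0.1099], [0.1455, 0.0959]],
                        margin_x=0.05, margin_y=0.05)
detected_box = (0.20, 0.04, 0.60, 0.10)  # made-up detection for illustration
keep_by_box = intersection_ratio(detected_box, region) >= 0.9

# Centroid filter: keep a detection whose centre is close to a target point.
keep_by_centroid = near_center((0.40, 0.45, 0.50, 0.52), [0.44, 0.49], 0.03, 0.05)
```

Under these assumptions, a larger margin makes a region more forgiving, while a higher minimum intersection ratio makes the box filter stricter.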

Updates

6 April, 2020: Conda package release

3 April, 2020: Tesseract text recognition and positional text scraping support

30 March, 2020: Craft text detector support

TODO

  • Craft text detection (inference)
  • Ctpn text detection (inference)
  • Psenet text detection (inference)
  • Tesseract text recognition (inference)
  • Aster text recognition (training and inference)
  • Moran text recognition (training and inference)
  • Positional text scraping
