A cross-platform framework for deep learning based text detection, recognition and parsing

deeptext

Getting started

Installation

  • Install using conda for Linux, Mac and Windows (preferred):
conda install -c fcakyon deeptext
  • Install using pip for Linux and Mac:
pip install deeptext

Install tesseract-ocr for text recognition.
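
A quick way to confirm that the Tesseract command-line tool is reachable before running recognition is sketched below. It uses only the Python standard library; `tesseract_available` is a hypothetical helper, not part of deeptext, and it assumes the executable is installed under its usual name, `tesseract`.

```python
import shutil


def tesseract_available(binary_name="tesseract"):
    """Return True if an executable with the given name is found on PATH."""
    return shutil.which(binary_name) is not None


if __name__ == "__main__":
    if tesseract_available():
        print("tesseract found at:", shutil.which("tesseract"))
    else:
        print("tesseract not found on PATH; install tesseract-ocr first")
```

This only checks that the binary can be located; it says nothing about which language data files are installed.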

Basic Usage

# import package
import deeptext

# set image path and export folder directory
image_path = 'idcard.png'
output_dir = 'outputs/'

# apply text detection and export detected regions to output directory
detection_result = deeptext.detect_text(image_path, output_dir)

# apply text recognition to detected texts
recognition_result = deeptext.recognize_text(image_path=detection_result["text_crop_paths"])
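
The calls above assume that the input image exists and that the output folder can be written to. A minimal pre-flight check using only the standard library could look like the sketch below. `prepare_paths` is a hypothetical helper, not part of deeptext, and whether deeptext creates the output folder on its own is not stated here.

```python
from pathlib import Path


def prepare_paths(image_path, output_dir):
    """Check that the image exists and make sure the output folder is present.

    Returns both paths as strings so they can be passed on unchanged.
    """
    image = Path(image_path)
    if not image.is_file():
        raise FileNotFoundError(f"input image not found: {image}")

    out = Path(output_dir)
    # create the folder (and any missing parents); do nothing if it exists
    out.mkdir(parents=True, exist_ok=True)
    return str(image), str(out)
```

The two returned strings can then be passed as `image_path` and `output_dir` to `deeptext.detect_text`.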

Advanced Usage

You can pass filter parameters if you want to scrape text only from predefined regions of the image.

# import package
import deeptext

# set image path and export folder directory
image_path = 'idcard.png'
output_dir = 'outputs/'

# define regions that you want to scrape, by quad (box) points
filter_params = {"type": "box",
                 "boxes": [[[0.1460, 0.0395],
                            [0.8417, 0.0535],
                            [0.8412, 0.1099],
                            [0.1455, 0.0959]],
                           [[0.3467, 0.3398],
                            [0.5417, 0.3535],
                            [0.5412, 0.4099],
                            [0.3455, 0.3959]]],
                 "marigin_x": 0.05,
                 "marigin_y": 0.05,
                 "min_intersection_ratio": 0.9}

# or define regions that you want to scrape, by centroids
filter_params = {"type":"centroid",
                 "centers": [[0.44, 0.49],[0.49, 0.08]],
                 "marigin_x": 0.03,
                 "marigin_y": 0.05}

# apply CRAFT text detection in predefined regions and export detected regions to output directory
detection_result = deeptext.detect_text(image_path,
                                         output_dir,
                                         detector="craft",
                                         filter_params=filter_params)

# apply tesseract (eng) text recognition to detected texts
recognition_result = deeptext.recognize_text(image_path=detection_result["text_crop_paths"],
                                             recognizer="tesseract-eng")
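
The box and centroid values in the example all lie between 0 and 1, which suggests they are fractions of the image width and height, not pixel positions. Assuming that reading is correct, and assuming points are given as (x, y) with the origin at the top-left corner, regions measured in pixels could be converted as sketched below. `to_relative_quad` and `quad_centroid` are hypothetical helpers, not part of deeptext, and the pixel values are made up for illustration.

```python
def to_relative_quad(quad_px, image_width, image_height):
    """Convert four (x, y) pixel points to fractions of the image size."""
    return [[round(x / image_width, 4), round(y / image_height, 4)]
            for x, y in quad_px]


def quad_centroid(quad):
    """Return the mean of the quad's corner points as [x, y]."""
    xs = [point[0] for point in quad]
    ys = [point[1] for point in quad]
    return [round(sum(xs) / len(xs), 4), round(sum(ys) / len(ys), 4)]


# hypothetical region measured on a 1000 x 500 pixel image
quad_px = [(146, 20), (842, 27), (841, 55), (146, 48)]
box = to_relative_quad(quad_px, image_width=1000, image_height=500)

# box-style filter, reusing the keys shown in the example above
box_filter = {"type": "box",
              "boxes": [box],
              "marigin_x": 0.05,
              "marigin_y": 0.05,
              "min_intersection_ratio": 0.9}

# centroid-style filter for the same region
centroid_filter = {"type": "centroid",
                   "centers": [quad_centroid(box)],
                   "marigin_x": 0.03,
                   "marigin_y": 0.05}
```

Either dictionary can then be passed as `filter_params`. The key names, including the `marigin_x` spelling, are copied from the example above.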

Updates

6 April, 2020: Conda package release

3 April, 2020: Tesseract text recognition and positional text scraping support

30 March, 2020: CRAFT text detector support

TODO

  • CRAFT text detection (inference)
  • CTPN text detection (inference)
  • PSENet text detection (inference)
  • Tesseract text recognition (inference)
  • ASTER text recognition (training and inference)
  • MORAN text recognition (training and inference)
  • Positional text scraping
