text extractor from images
pytextractor
Python OCR using Tesseract with the EAST OpenCV text detector.
Uses the EAST OpenCV text detector together with pytesseract to extract text (the default) or numbers from images.
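As a rough illustration of the text versus numbers modes, the sketch below builds a Tesseract configuration string and hands it to pytesseract. The helper names `build_tesseract_config` and `ocr_region`, the `--psm 7` page segmentation mode, and the digit whitelist are assumptions chosen for illustration; they are not taken from pytextractor's source, which may configure Tesseract differently.

```python
def build_tesseract_config(numbers_only: bool = False, psm: int = 7) -> str:
    """Return a Tesseract config string.

    Hypothetical helper, not part of pytextractor's API.
    psm 7 tells Tesseract to treat the input as a single line of text,
    which is an assumed choice for small cropped regions.
    """
    parts = [f"--psm {psm}"]
    if numbers_only:
        # Restrict recognition to digits (assumed approach to "numbers only").
        parts.append("-c tessedit_char_whitelist=0123456789")
    return " ".join(parts)


def ocr_region(image, numbers_only: bool = False) -> str:
    """Run Tesseract on an already-cropped region (NumPy array or PIL image)."""
    # Imported lazily so the config helper can be used without pytesseract installed.
    import pytesseract

    config = build_tesseract_config(numbers_only)
    return pytesseract.image_to_string(image, config=config).strip()
```

In this sketch, each region found by the text detector would be cropped from the image and passed to `ocr_region`, with `numbers_only=True` when only digits are wanted.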
Usage main
usage: text_detection.py [-h] [--east EAST] [-c CONFIDENCE] [-w WIDTH]
                         [-e HEIGHT] [-d] [-n] [-p PERCENTAGE] [-b MIN_BOXES]
                         [-i MAX_ITERATIONS]
                         images [images ...]

Text/Number extractor from image

positional arguments:
  images                path(s) to input image(s)

optional arguments:
  -h, --help            show this help message and exit
  --east EAST           path to input EAST text detector
  -c CONFIDENCE, --confidence CONFIDENCE
                        minimum probability required to inspect a region
  -w WIDTH, --width WIDTH
                        resized image width (should be multiple of 32)
  -e HEIGHT, --height HEIGHT
                        resized image height (should be multiple of 32)
  -d, --display         Display bounding boxes
  -n, --numbers         Detect only numbers
  -p PERCENTAGE, --percentage PERCENTAGE
                        Expand/shrink detected bound box
  -b MIN_BOXES, --min-boxes MIN_BOXES
                        minimum number of detected boxes to return
  -i MAX_ITERATIONS, --max-iterations MAX_ITERATIONS
                        max number of iterations finding min_boxes
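For readers who want to see how the options above fit together, here is a minimal argparse sketch that mirrors the help text. It is not the project's actual `text_detection.py`: every default value is an assumption, and the `multiple_of_32` validator is a hypothetical helper that is stricter than the help text, which only says the sizes "should be" multiples of 32.

```python
import argparse


def multiple_of_32(value: str) -> int:
    """Parse a positive integer and reject values that are not multiples of 32."""
    number = int(value)
    if number <= 0 or number % 32 != 0:
        raise argparse.ArgumentTypeError(f"{value} is not a positive multiple of 32")
    return number


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the same flags as the usage text (defaults are assumed)."""
    parser = argparse.ArgumentParser(
        prog="text_detection.py",
        description="Text/Number extractor from image",
    )
    parser.add_argument("images", nargs="+", help="path(s) to input image(s)")
    parser.add_argument("--east", help="path to input EAST text detector")
    parser.add_argument("-c", "--confidence", type=float, default=0.5,
                        help="minimum probability required to inspect a region")
    parser.add_argument("-w", "--width", type=multiple_of_32, default=320,
                        help="resized image width (should be multiple of 32)")
    parser.add_argument("-e", "--height", type=multiple_of_32, default=320,
                        help="resized image height (should be multiple of 32)")
    parser.add_argument("-d", "--display", action="store_true",
                        help="Display bounding boxes")
    parser.add_argument("-n", "--numbers", action="store_true",
                        help="Detect only numbers")
    parser.add_argument("-p", "--percentage", type=float, default=0.0,
                        help="Expand/shrink detected bound box")
    parser.add_argument("-b", "--min-boxes", type=int, default=1,
                        help="minimum number of detected boxes to return")
    parser.add_argument("-i", "--max-iterations", type=int, default=20,
                        help="max number of iterations finding min_boxes")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

With this sketch, `build_parser().parse_args(["-n", "-w", "640", "receipt.png"])` yields a namespace with `numbers=True`, `width=640`, and `images=["receipt.png"]`, where `receipt.png` is just a placeholder file name.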
Usage lib
from pytextractor import pytextractor
extractor = pytextractor.PyTextractor()
Running tests
python setup.py test
Make sure Tesseract is installed first:
brew install tesseract (macOS) or apt-get install tesseract-ocr (Debian/Ubuntu)
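A quick way to confirm that the Tesseract binary is reachable before running the tests is sketched below. The function names `find_tesseract` and `tesseract_version` are hypothetical helpers, not part of pytextractor, and the sketch assumes the executable is called `tesseract` and lives on your PATH.

```python
import shutil
import subprocess


def find_tesseract(binary: str = "tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(binary)


def tesseract_version(binary: str = "tesseract"):
    """Return the first line of `tesseract --version`, or None if the binary is missing."""
    path = find_tesseract(binary)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Some Tesseract builds print the version to stderr instead of stdout.
    output = result.stdout or result.stderr
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    version = tesseract_version()
    if version is None:
        print("Tesseract not found; install it before running the tests.")
    else:
        print(f"Found {version}")
```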