text extractor from images
pytextractor
Python OCR using Tesseract with the EAST OpenCV text detector
Uses the EAST OpenCV text detector together with pytesseract to extract text (the default) or numbers from images.
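As a rough illustration of the text versus numbers distinction, the sketch below shows one way pytesseract could be pointed at a cropped region in either mode. The helper names `tesseract_config` and `read_region` are hypothetical, and the specific Tesseract flags (`--oem 1`, `--psm 7`, and the digit whitelist) are assumptions about a reasonable setup, not necessarily what pytextractor uses internally.

```python
def tesseract_config(numbers_only: bool = False) -> str:
    """Build a Tesseract config string (hypothetical helper).

    --oem 1 selects the LSTM engine and --psm 7 treats the region as a
    single line of text. Both values are assumptions for this sketch.
    """
    config = "--oem 1 --psm 7"
    if numbers_only:
        # Restrict recognition to digits; whitelist support with the LSTM
        # engine depends on the installed Tesseract version.
        config += " -c tessedit_char_whitelist=0123456789"
    return config


def read_region(region, numbers_only: bool = False) -> str:
    """OCR one cropped region (a PIL image or NumPy array)."""
    # Imported lazily so the config helper works without pytesseract installed.
    import pytesseract

    text = pytesseract.image_to_string(region, config=tesseract_config(numbers_only))
    return text.strip()
```

In a pipeline of this kind, `read_region` would be called once per box returned by the text detector.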
Usage main
usage: text_detector [-h] [--east EAST] [-c CONFIDENCE] [-w WIDTH] [-e HEIGHT] [-d] [-n] [-p PERCENTAGE]
                     [-b MIN_BOXES] [-i MAX_ITERATIONS]
                     images [images ...]

Text/Number extractor from image

positional arguments:
  images                path(s) to input image(s)

options:
  -h, --help            show this help message and exit
  --east EAST           path to input EAST text detector
  -c CONFIDENCE, --confidence CONFIDENCE
                        minimum probability required to inspect a region
  -w WIDTH, --width WIDTH
                        resized image width (should be multiple of 32)
  -e HEIGHT, --height HEIGHT
                        resized image height (should be multiple of 32)
  -d, --display         Display bounding boxes
  -n, --numbers         Detect only numbers
  -p PERCENTAGE, --percentage PERCENTAGE
                        Expand/shrink detected bound box
  -b MIN_BOXES, --min-boxes MIN_BOXES
                        minimum number of detected boxes to return
  -i MAX_ITERATIONS, --max-iterations MAX_ITERATIONS
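Two of the options above are easier to picture with a little arithmetic: the width and height should be multiples of 32, and the percentage option grows or shrinks each detected box. The sketch below is one plausible interpretation, not the package's actual code. The function names, the `(start_x, start_y, end_x, end_y)` box layout, and the choice to pad each side by the given percentage of the box size are all assumptions.

```python
def snap_to_multiple_of_32(value: int) -> int:
    """Round to the nearest multiple of 32 (halves round up), minimum 32."""
    return max(32, (value + 16) // 32 * 32)


def expand_box(box, percentage, image_size):
    """Grow (positive percentage) or shrink (negative) a box, clamped to the image.

    box is (start_x, start_y, end_x, end_y); image_size is (width, height).
    Each side moves by `percentage` percent of the box's width or height.
    """
    start_x, start_y, end_x, end_y = box
    image_w, image_h = image_size
    dx = int((end_x - start_x) * percentage / 100)
    dy = int((end_y - start_y) * percentage / 100)
    return (
        max(0, start_x - dx),
        max(0, start_y - dy),
        min(image_w, end_x + dx),
        min(image_h, end_y + dy),
    )
```

Under these assumptions, a requested width of 300 would be snapped to 288, and a 100 x 50 box expanded by 10 percent would gain 10 pixels on the left and right and 5 pixels on the top and bottom, unless it hits the image border.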
Installation & usage
brew install tesseract
pipx install pytextractor
text_detector
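Once installed, the `text_detector` command can also be driven from a script. The sketch below builds a command line using only flags listed in the help output above and runs it with `subprocess`. The helper names and the image filename are hypothetical, and because the format of the command's output is not documented here, the sketch simply returns standard output unparsed.

```python
import shutil
import subprocess


def build_command(images, numbers=False, display=False, confidence=None):
    """Assemble a text_detector command line from documented flags."""
    if not images:
        raise ValueError("at least one image path is required")
    command = ["text_detector"]
    if numbers:
        command.append("--numbers")
    if display:
        command.append("--display")
    if confidence is not None:
        command += ["--confidence", str(confidence)]
    # Positional image paths go last, matching the usage line.
    command += [str(path) for path in images]
    return command


def run_text_detector(images, **options):
    """Run the CLI and return its raw standard output."""
    if shutil.which("text_detector") is None:
        raise RuntimeError("text_detector is not on PATH; install pytextractor first")
    result = subprocess.run(
        build_command(images, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, `run_text_detector(["receipt.png"], numbers=True)` would invoke `text_detector --numbers receipt.png`.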
Usage lib
from pytextractor import pytextractor
extractor = pytextractor.PyTextractor()
Running tests
brew install tesseract
python -m venv .venv --prompt .
. ./.venv/bin/activate
pip install ".[dev]"
pytest -s tests
Download files
Source Distribution
pytextractor-2.0.0.tar.gz (764.1 kB)
Built Distribution
Hashes for pytextractor-2.0.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 67226c479832d366b2a877be2c44ba095b4ca46b4f5a92fb53c24b4b87d66cc1
MD5 | f2091232cd4960e567e77334475bdc3f
BLAKE2b-256 | 5038520dcb009ea4416b67af1fef2320c3be9ebcc9f60ceb9377e665ff2d07dc