Pixelwise binarization with selectional auto-encoders in Keras
Binarization
Binarization for document images
Examples
Introduction
This tool performs document image binarization (i.e. it transforms colour or grayscale images into black-and-white pixels) for OCR, using multiple trained models.
The method is based on Calvo-Zaragoza and Gallego (2018), "A selectional auto-encoder approach for document image binarization".
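To illustrate what "pixelwise" means here, the following is a minimal sketch of the final decision step only, assuming the network yields one foreground score in [0, 1] per pixel. The function name `binarize_scores`, the threshold of 0.5, and the convention that foreground is written as 0 (black) are assumptions made for illustration; this is not the tool's actual code.

```python
def binarize_scores(scores, threshold=0.5):
    """Turn per-pixel foreground scores into a black-and-white image.

    scores: list of rows, each a list of floats in [0, 1] (assumed model output).
    threshold: assumed cut-off; scores at or above it count as foreground (ink).
    Returns rows of 0 (black, foreground) and 255 (white, background).
    """
    return [[0 if s >= threshold else 255 for s in row] for row in scores]


# A 2x2 toy "score map": two confident ink pixels, two background pixels.
toy_scores = [[0.9, 0.2],
              [0.4, 0.7]]
print(binarize_scores(toy_scores))  # [[0, 255], [255, 0]]
```

Each pixel is decided independently of its neighbours at this stage; any context comes from the network that produced the scores.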
Installation
Clone the repository, enter it, and run:
pip install .
Models
Pre-trained models can be downloaded from here:
https://qurator-data.de/sbb_binarization/
Usage
sbb_binarize \
--patches \
-m <directory with models> \
<input image> \
<output image>
Note: In virtually all cases, the --patches flag will improve results.
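For more than a handful of images, the command above can be wrapped in a small script. The sketch below only assembles and runs the documented command line (`sbb_binarize`, `--patches`, `-m`); the helper names `build_command` and `binarize_directory`, the `*.png` pattern, and the `dry_run` switch are hypothetical and not part of the package.

```python
import subprocess
from pathlib import Path


def build_command(model_dir, input_image, output_image, patches=True):
    """Assemble the sbb_binarize call from the Usage section as an argument list."""
    cmd = ["sbb_binarize"]
    if patches:
        cmd.append("--patches")
    cmd += ["-m", str(model_dir), str(input_image), str(output_image)]
    return cmd


def binarize_directory(model_dir, input_dir, output_dir, pattern="*.png", dry_run=False):
    """Hypothetical helper: call sbb_binarize once per image matching `pattern`.

    Returns the list of commands; with dry_run=True nothing is executed.
    """
    output_dir = Path(output_dir)
    commands = []
    for image in sorted(Path(input_dir).glob(pattern)):
        cmd = build_command(model_dir, image, output_dir / image.name)
        commands.append(cmd)
        if not dry_run:
            output_dir.mkdir(parents=True, exist_ok=True)
            # check=True raises CalledProcessError if sbb_binarize fails.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `binarize_directory("models", "scans", "scans_bin", dry_run=True)` returns the commands without running them, which is a cheap way to check paths before a long run. Note that each call starts a new process, so the models are loaded again for every image.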
To use the OCR-D interface:
ocrd-sbb-binarize --overwrite -I INPUT_FILE_GRP -O OCR-D-IMG-BIN -P model "/var/lib/sbb_binarization"