Pixelwise binarization with selectional auto-encoders in Keras

Binarization

Binarization for document images

Introduction

This tool performs document image binarization (i.e., it converts colour or grayscale pixels to black-and-white pixels) for OCR, using multiple trained models.

The method is based on Calvo-Zaragoza and Gallego (2018), "A selectional auto-encoder approach for document image binarization".
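
To make the input/output relationship concrete, here is a minimal sketch of what binarization produces, using a fixed global threshold on grayscale values. This is deliberately not the selectional auto-encoder used by this tool; it is a much simpler stand-in, and the function name `binarize_global` and the default threshold of 128 are assumptions made for illustration only.

```python
from typing import List

# A grayscale image as rows of intensities in the range 0..255.
Gray = List[List[int]]


def binarize_global(image: Gray, threshold: int = 128) -> Gray:
    """Map every pixel to 0 (black) or 255 (white) using one fixed threshold.

    Illustrative stand-in only: the tool itself uses trained models,
    not a global threshold.
    """
    return [[0 if px < threshold else 255 for px in row] for row in image]


if __name__ == "__main__":
    page = [[12, 200, 190], [30, 25, 220]]
    print(binarize_global(page))
```

A fixed threshold tends to fail on stained or unevenly lit pages, which is the kind of input that motivates a learned, per-pixel approach.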

Installation

Clone the repository, enter it, and run:

pip install .
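
After installation, a quick way to confirm that the command-line entry point is reachable is to look it up on the PATH. The sketch below assumes the install places `sbb_binarize` in a directory that is on the PATH of the current environment; the helper name `find_cli` is hypothetical.

```python
import shutil
from typing import Optional


def find_cli(name: str = "sbb_binarize") -> Optional[str]:
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_cli()
    print(path if path else "sbb_binarize not found on PATH")
```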

Models

Pre-trained models can be downloaded from here:

https://qurator-data.de/sbb_binarization/

Usage

sbb_binarize \
  --patches \
  -m <directory with models> \
  <input image> \
  <output image>

Note: In virtually all cases, the --patches flag will improve results.
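
The command above handles one image per call. To process a whole directory, it can be wrapped in a short script. The following is a sketch under stated assumptions: the helper names `build_command` and `binarize_directory`, the `*.png` pattern, and the directory layout are all hypothetical; only the `sbb_binarize` command and its `--patches` and `-m` options come from the usage shown above.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(model_dir: str, src: Path, dst: Path, patches: bool = True) -> List[str]:
    """Assemble the sbb_binarize argument list for a single image."""
    cmd = ["sbb_binarize"]
    if patches:
        cmd.append("--patches")
    cmd += ["-m", model_dir, str(src), str(dst)]
    return cmd


def binarize_directory(model_dir: str, in_dir: str, out_dir: str,
                       pattern: str = "*.png") -> None:
    """Run sbb_binarize once per matching image (hypothetical layout)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(in_dir).glob(pattern)):
        # check=True raises CalledProcessError if the tool exits non-zero.
        subprocess.run(build_command(model_dir, src, out / src.name), check=True)
```

For example, `binarize_directory("models", "scans", "binarized")` would write one output file per input image, keeping the original file names. Each image is processed in a separate process, so this favours simplicity over throughput.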

To use the OCR-D interface:

ocrd-sbb-binarize --overwrite -I INPUT_FILE_GRP -O OCR-D-IMG-BIN -P model "/var/lib/sbb_binarization"
