
Pixelwise binarization with selectional auto-encoders in Keras


Binarization

Binarization for document images

Examples

Introduction

This tool performs document image binarization (i.e. it transforms colour or grayscale pixels into black-and-white pixels) for OCR, using multiple trained models.

The method is based on Calvo-Zaragoza and Gallego (2018), "A selectional auto-encoder approach for document image binarization".
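To make the input and output of binarization concrete, here is a minimal, dependency-free sketch. It uses a fixed global threshold, which is a deliberately simple stand-in and not the selectional auto-encoder method this tool is based on. The function names, the threshold of 128 and the tiny sample "page" are all assumptions chosen for illustration.

```python
# Toy illustration only: a fixed global threshold, NOT the learned
# selectional auto-encoder approach that this tool is based on.

def to_gray(rgb):
    """Convert an (R, G, B) pixel to a luminance value in 0..255 (BT.601 weights)."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def binarize(image, threshold=128):
    """Map every RGB pixel to 0 (black) or 255 (white)."""
    return [
        [255 if to_gray(pixel) >= threshold else 0 for pixel in row]
        for row in image
    ]


# A hypothetical 2x2 "page": bright paper pixels and dark ink pixels.
page = [
    [(250, 245, 230), (40, 35, 30)],
    [(120, 110, 100), (200, 190, 180)],
]
result = binarize(page)  # [[255, 0], [0, 255]]
```

A single global threshold like this tends to break down on stained or unevenly lit pages, which is the kind of input that motivates trained models.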

Installation

Clone the repository, enter it and run:

pip install .

Models

Pre-trained models can be downloaded from here:

https://qurator-data.de/sbb_binarization/

Usage

sbb_binarize \
  --patches \
  -m <directory with models> \
  <input image> \
  <output image>

Note: In virtually all cases, the --patches flag will improve results.
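For more than a handful of images, the command above can be driven from a small script. The following is a sketch under stated assumptions: it only uses the `--patches` and `-m` options shown above, while the helper names `build_command` and `binarize_directory`, the `*.png` pattern and the `dry_run` switch are hypothetical and not part of the package.

```python
import subprocess
from pathlib import Path


def build_command(model_dir, input_image, output_image, patches=True):
    """Assemble the sbb_binarize argument list shown in the usage section."""
    cmd = ["sbb_binarize"]
    if patches:
        cmd.append("--patches")
    cmd += ["-m", str(model_dir), str(input_image), str(output_image)]
    return cmd


def binarize_directory(model_dir, input_dir, output_dir,
                       pattern="*.png", dry_run=False):
    """Call sbb_binarize once per matching image; return the commands used.

    The glob pattern and the flat directory layout are assumptions.
    With dry_run=True nothing is executed or created.
    """
    output_dir = Path(output_dir)
    commands = []
    for image in sorted(Path(input_dir).glob(pattern)):
        cmd = build_command(model_dir, image, output_dir / image.name)
        commands.append(cmd)
        if not dry_run:
            output_dir.mkdir(parents=True, exist_ok=True)
            # check=True raises CalledProcessError if the tool fails.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `binarize_directory("models", "scans", "binarized", dry_run=True)` returns the commands without running them, which is a convenient way to check paths first. Each image is processed in a separate process, so any start-up cost of the tool is paid once per image.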

To use the OCR-D interface:

ocrd-sbb-binarize --overwrite -I INPUT_FILE_GRP -O OCR-D-IMG-BIN -P model "/var/lib/sbb_binarization"

