Pixelwise binarization with selectional auto-encoders in Keras

sbb_binarization

Document Image Binarization using pre-trained models

Installation

Python versions 3.7-3.10 are currently supported.

You can either install via

pip install sbb-binarization

or clone the repository, enter it and install (editable) with

git clone git@github.com:qurator-spk/sbb_binarization.git
cd sbb_binarization; pip install -e .

Models

Pre-trained models can be downloaded from the locations below. The models and a model card are also available on Hugging Face (🤗).

Version Format Download
2021-03-09 SavedModel https://github.com/qurator-spk/sbb_binarization/releases/download/v0.0.11/saved_model_2021_03_09.zip
2021-03-09 HDF5 https://qurator-data.de/sbb_binarization/2021-03-09/models.tar.gz
2020-01-16 SavedModel https://github.com/qurator-spk/sbb_binarization/releases/download/v0.0.11/saved_model_2020_01_16.zip
2020-01-16 HDF5 https://qurator-data.de/sbb_binarization/2020-01-16/models.tar.gz
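
As a rough sketch, fetching one of the SavedModel archives listed above could look like the following. The helper names `model_url` and `fetch_model` and the default target directory `models` are placeholders for illustration, not part of sbb_binarization. The sketch assumes `curl` and `unzip` are installed, and it makes no assumption about the directory layout inside the archive, so inspect the unpacked result before passing a path to `-m`.

```shell
# Hypothetical helpers (not part of sbb_binarization).

# Build the release URL for a SavedModel version such as 2021-03-09.
# The URL pattern is taken from the table above (release v0.0.11).
model_url() {
  stamp=$(printf '%s' "$1" | tr '-' '_')
  printf '%s\n' "https://github.com/qurator-spk/sbb_binarization/releases/download/v0.0.11/saved_model_${stamp}.zip"
}

# Download the archive and unpack it into a target directory (default: models).
fetch_model() {
  version=$1
  target=${2:-models}
  mkdir -p "$target" &&
    curl -L --fail -o "$target/saved_model.zip" "$(model_url "$version")" &&
    unzip -o "$target/saved_model.zip" -d "$target"
}

# Usage (performs a network download):
#   fetch_model 2021-03-09 models
```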

With OCR-D, you can use the Resource Manager to deploy models, e.g.

ocrd resmgr download ocrd-sbb-binarize "*"

Usage

sbb_binarize \
  -m <path to directory containing model files> \
  <input image> \
  <output image>

Note: the output image MUST use either .tif or .png as file extension to produce a binary image. Input images can also be JPEG.
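
One way to guard against an unsuitable output name is a small wrapper that checks the extension before calling the tool. This is only a sketch: `has_binary_ext` and `safe_binarize` are hypothetical names, and the check is deliberately strict, accepting only the lowercase `.tif` and `.png` spellings named in the note. Whether other spellings would also work is not stated here.

```shell
# Hypothetical helper: succeed only if the path ends in .tif or .png.
has_binary_ext() {
  case "$1" in
    *.tif|*.png) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper that refuses to run with an unsuitable output name.
# $1 = model directory, $2 = input image, $3 = output image
safe_binarize() {
  if ! has_binary_ext "$3"; then
    echo "output must end in .tif or .png: $3" >&2
    return 2
  fi
  sbb_binarize -m "$1" "$2" "$3"
}

# Usage:
#   safe_binarize /path/to/model/ scan.jpg scan-bin.png
```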

Images containing a lot of border noise (black pixels) should be cropped beforehand to improve the quality of results.

Example

sbb_binarize -m /path/to/model/ myimage.tif myimage-bin.tif
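
To process a whole directory, the single-image call above can be wrapped in a loop. The following is a minimal sketch under a few assumptions: `binarize_dir` is a hypothetical helper name, inputs are taken to be `.tif` files, and the `-bin.tif` output suffix simply mirrors the example above.

```shell
# Hypothetical helper: binarize every .tif file in a directory.
# $1 = model directory, $2 = input directory, $3 = output directory
binarize_dir() {
  model=$1
  in_dir=$2
  out_dir=$3
  mkdir -p "$out_dir" || return 1
  for img in "$in_dir"/*.tif; do
    [ -e "$img" ] || continue              # skip when the glob matches nothing
    name=$(basename "$img" .tif)
    # One sbb_binarize call per image; stop at the first failure.
    sbb_binarize -m "$model" "$img" "$out_dir/$name-bin.tif" || return 1
  done
}

# Usage:
#   binarize_dir /path/to/model/ scans/ scans-bin/
```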

To use the OCR-D interface:

ocrd-sbb-binarize -I INPUT_FILE_GRP -O OCR-D-IMG-BIN -P model default

Testing

For simple smoke tests, the following will

  • download models

  • download test data

  • run the OCR-D wrapper (on page and region level):

      make models
      make test
    
