Pixelwise binarization with selectional auto-encoders in Keras

sbb_binarization

Document Image Binarization using pre-trained models

Examples

Installation

Python versions 3.7-3.10 are currently supported.
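If you want to check an interpreter against this range before installing, something like the following sketch may help. It reads "3.7-3.10" as inclusive of every 3.10.x release, which is an assumption; the function name is hypothetical and not part of sbb_binarization.

```python
import sys

# Range taken from the sentence above: 3.7 up to and including 3.10 (assumed inclusive).
MIN_SUPPORTED = (3, 7)
MAX_SUPPORTED = (3, 10)


def is_supported(version=None) -> bool:
    """Return True if a (major, minor, ...) version tuple lies in the supported range."""
    if version is None:
        version = sys.version_info  # default: the running interpreter
    major_minor = tuple(version[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED
```

Calling `is_supported()` without arguments checks the interpreter that runs the script.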

You can either install via

pip install sbb-binarization

or clone the repository, enter it and install (editable) with

git clone git@github.com:qurator-spk/sbb_binarization.git
cd sbb_binarization; pip install -e .

Models

Pre-trained models can be downloaded from the locations below. We also provide the models and a model card on Hugging Face (🤗).

Version     Format      Download
2021-03-09  SavedModel  https://github.com/qurator-spk/sbb_binarization/releases/download/v0.0.11/saved_model_2021_03_09.zip
2021-03-09  HDF5        https://qurator-data.de/sbb_binarization/2021-03-09/models.tar.gz
2020-01-16  SavedModel  https://github.com/qurator-spk/sbb_binarization/releases/download/v0.0.11/saved_model_2020_01_16.zip
2020-01-16  HDF5        https://qurator-data.de/sbb_binarization/2020-01-16/models.tar.gz
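As a rough sketch, one of the archives listed above could be fetched and unpacked with the Python standard library alone. The helper names are hypothetical and not part of sbb_binarization. The internal layout of the archives is not described here, so inspect the unpacked directory before pointing the tool at it.

```python
import shutil
import urllib.request
from pathlib import Path

# URL copied from the list above (2021-03-09, SavedModel format).
MODEL_URL = (
    "https://github.com/qurator-spk/sbb_binarization/releases/download/"
    "v0.0.11/saved_model_2021_03_09.zip"
)


def archive_name(url: str) -> str:
    """Return the file name at the end of a download URL."""
    return url.rstrip("/").rsplit("/", 1)[-1]


def fetch_and_unpack(url: str, target_dir: str) -> Path:
    """Download the archive at `url` and unpack it below `target_dir`."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    archive = target / archive_name(url)
    urllib.request.urlretrieve(url, str(archive))  # network access happens here
    shutil.unpack_archive(str(archive), str(target))  # handles .zip and .tar.gz
    return target
```

For example, `fetch_and_unpack(MODEL_URL, "models")` would place the unpacked files under `models/`.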

With OCR-D, you can use the Resource Manager to deploy models, e.g.

ocrd resmgr download ocrd-sbb-binarize "*"

Usage

sbb_binarize \
  -m <path to directory containing model files> \
  <input image> \
  <output image>

Note: the output image MUST use either .tif or .png as file extension to produce a binary image. Input images can also be JPEG.
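A script that wraps the command could check the output name before starting a run. The sketch below accepts only the two extensions named in the note and matches them case-sensitively. Whether other spellings such as `.TIF` or `.tiff` also work is not stated here, so they are rejected. The function name is hypothetical.

```python
from pathlib import Path

# Extensions named in the note above; anything else is treated as unsafe.
BINARY_OUTPUT_EXTENSIONS = {".tif", ".png"}


def is_binary_output_name(path: str) -> bool:
    """Return True if `path` ends in one of the extensions listed in the note."""
    return Path(path).suffix in BINARY_OUTPUT_EXTENSIONS
```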

Images containing a lot of border noise (black pixels) should be cropped beforehand to improve the quality of results.
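One possible way to find a crop box is to trim rows and columns from each edge for as long as they are almost entirely dark. The sketch below is a simple heuristic and not something sbb_binarization provides. The function names and both thresholds are assumptions chosen for illustration, and loading the image into a grid of grayscale values is left out.

```python
from typing import Iterable, List, Tuple


def mostly_dark(values: Iterable[int], dark_below: int, min_fraction: float) -> bool:
    """True if at least `min_fraction` of the values are darker than `dark_below`."""
    values = list(values)
    dark = sum(1 for v in values if v < dark_below)
    return dark >= min_fraction * len(values)


def column(gray: List[List[int]], x: int, top: int, bottom: int) -> List[int]:
    """Return the pixel values of column `x` between rows `top` and `bottom`."""
    return [row[x] for row in gray[top:bottom]]


def border_crop_box(gray: List[List[int]], dark_below: int = 64,
                    min_fraction: float = 0.9) -> Tuple[int, int, int, int]:
    """Return (top, bottom, left, right) with dark border rows and columns trimmed.

    `gray` is a row-major grid of 0-255 values; `bottom` and `right` are exclusive.
    """
    top, bottom = 0, len(gray)
    left, right = 0, len(gray[0])
    # Trim dark rows from the top and bottom edges.
    while top < bottom and mostly_dark(gray[top], dark_below, min_fraction):
        top += 1
    while bottom > top and mostly_dark(gray[bottom - 1], dark_below, min_fraction):
        bottom -= 1
    # Trim dark columns from the left and right edges, looking only at kept rows.
    while left < right and mostly_dark(column(gray, left, top, bottom),
                                       dark_below, min_fraction):
        left += 1
    while right > left and mostly_dark(column(gray, right - 1, top, bottom),
                                       dark_below, min_fraction):
        right -= 1
    return top, bottom, left, right
```

The returned box can then be applied with whatever image library loaded the grid. Dark content that touches the page edge would be trimmed as well, so check the result on a few pages first.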

Example

sbb_binarize -m /path/to/model/ myimage.tif myimage-bin.tif
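To process a whole directory, the same command can be called once per image from a small script. This is a minimal sketch that uses only the command-line form shown above. The function names, the `*.tif` pattern and the `-bin.tif` output suffix are assumptions made for illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(model_dir: str, image_in: str, image_out: str) -> List[str]:
    """Assemble the sbb_binarize call in the form shown in the example above."""
    return ["sbb_binarize", "-m", model_dir, image_in, image_out]


def binarize_directory(model_dir: str, input_dir: str, output_dir: str,
                       pattern: str = "*.tif") -> None:
    """Run sbb_binarize once for every image in `input_dir` that matches `pattern`."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    for image in sorted(Path(input_dir).glob(pattern)):
        target = out / (image.stem + "-bin.tif")  # .tif keeps the output binary
        # check=True raises CalledProcessError if the tool exits with an error.
        subprocess.run(build_command(model_dir, str(image), str(target)), check=True)
```

Each call starts a new process and therefore loads the model again, so this is convenient rather than fast.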

To use the OCR-D interface:

ocrd-sbb-binarize -I INPUT_FILE_GRP -O OCR-D-IMG-BIN -P model default

Testing

For simple smoke tests, the following will

  • download models
  • download test data
  • run the OCR-D wrapper (on page and region level):

      make models
      make test
