pixel-classifier based page segmentation

Page segmentation module for OCR-D

Introduction

This module implements a page segmentation algorithm based on a Fully Convolutional Network (FCN). The FCN assigns a class to each pixel of a binary page image. The resulting label map is then segmented into regions, one class at a time, using XY cuts.

Requirements

  • For GPU support: CUDA and cuDNN
  • Other requirements are installed via the Makefile / pip; see requirements.txt in the repository root.

Installation

To use GPU support, set the environment variable TENSORFLOW_GPU to a nonempty value; otherwise leave it unset. Then:

make deps

to install dependencies and

make install

to install the package.

Both steps install Python packages via pip, so you may want to activate a virtualenv before installing.

Usage

ocrd-pc-segmentation follows the OCR-D CLI conventions.

It expects a binary page image and produces region entries in the PageXML file.
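As a rough illustration of what a call might look like, the sketch below assembles a command line using the common OCR-D processor options (`-m` for the METS file, `-I` and `-O` for input and output file groups, `-p` for the parameter file). The helper `build_command`, the file group names, and the file names are hypothetical; check `ocrd-pc-segmentation --help` for the options your installed version actually accepts.

```python
import shlex
from typing import List, Optional


def build_command(mets: str, input_grp: str, output_grp: str,
                  params: Optional[str] = None) -> List[str]:
    """Assemble an argument list for the processor (hypothetical helper)."""
    argv = ["ocrd-pc-segmentation", "-m", mets, "-I", input_grp, "-O", output_grp]
    if params is not None:
        argv += ["-p", params]
    return argv


# File group and file names below are placeholders, not project defaults.
argv = build_command("mets.xml", "OCR-D-IMG-BIN", "OCR-D-SEG-REGION", "params.json")
print(" ".join(shlex.quote(a) for a in argv))
```

The resulting list can be passed to `subprocess.run(argv, check=True)` from inside an OCR-D workspace; the input file group should contain the binarized page images.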

Configuration

The following parameters are recognized in the JSON parameter file:

  • overwrite_regions: remove previously existing text regions.
  • xheight: height of the character "x" in pixels, as used during training.
  • model: path to the pixel classifier model. The special values __DEFAULT__ and __LEGACY__ load the bundled default model and the previous default model, respectively.
  • gpu_allow_growth: required for GPU use with some graphics cards (set to true if you get CUDNN_INTERNAL_ERROR).
  • resize_height: scale the pixel classifier output down to this height before postprocessing. Independent of the training and of the model used. This is a performance / quality tradeoff; the default is 300.
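For orientation, the following sketch generates a parameter file containing the keys listed above. Only `resize_height` (300) and the special model value `__DEFAULT__` come from this description; the remaining values, in particular `xheight`, are placeholders chosen for illustration and should be adjusted to your model.

```python
import json

# Keys as listed above; values marked "placeholder" are assumptions.
params = {
    "overwrite_regions": True,    # placeholder: drop existing text regions
    "xheight": 8,                 # placeholder: must match the training x-height
    "model": "__DEFAULT__",       # bundled default model
    "gpu_allow_growth": False,    # set to True on CUDNN_INTERNAL_ERROR
    "resize_height": 300,         # documented default
}

text = json.dumps(params, indent=2)
print(text)

# To save it for use as the JSON parameter file:
# with open("params.json", "w", encoding="utf-8") as fh:
#     fh.write(text)
```

Lowering `resize_height` should speed up postprocessing at the cost of coarser region boundaries, as the tradeoff note above suggests.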

Testing

There is a simple CLI test that runs the tool on a single image from the assets repository:

make test-cli

Training

To train models for the pixel classifier, see its README.
