Pixel-classifier-based page segmentation

Page segmentation module for OCR-D

Introduction

This module implements a page segmentation algorithm based on a Fully Convolutional Network (FCN). The FCN classifies each pixel of a binary page image. The resulting label map is then segmented into regions, one class at a time, using XY cuts.
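
To illustrate the second stage, here is a minimal sketch of a recursive XY cut applied per class to a label map. It is not this module's implementation: the function names (`xy_cut`, `segment_per_class`), the `min_gap` threshold, and the convention that class 0 is background are all assumptions made for the example.

```python
import numpy as np


def xy_cut(mask, min_gap=2, top=0, left=0):
    """Recursively split a boolean mask at empty horizontal/vertical gaps.

    Returns boxes as (top, left, bottom, right), bottom/right exclusive.
    `min_gap` (hypothetical parameter) is the minimum number of empty
    rows or columns required for a cut.
    """
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return []  # nothing set in this mask

    # Crop to the bounding box of the set pixels.
    r0, c0 = rows[0], cols[0]
    mask = mask[r0:rows[-1] + 1, c0:cols[-1] + 1]
    top, left = top + r0, left + c0

    # axis=1 gives the row profile (horizontal cut),
    # axis=0 gives the column profile (vertical cut).
    for axis in (1, 0):
        occupied = np.flatnonzero(mask.any(axis=axis))
        empty = np.diff(occupied) - 1  # empty lines between occupied ones
        wide = np.flatnonzero(empty >= min_gap)
        if wide.size:
            end_first = occupied[wide[0]] + 1
            start_second = occupied[wide[0] + 1]
            if axis == 1:
                first = xy_cut(mask[:end_first], min_gap, top, left)
                second = xy_cut(mask[start_second:], min_gap,
                                top + start_second, left)
            else:
                first = xy_cut(mask[:, :end_first], min_gap, top, left)
                second = xy_cut(mask[:, start_second:], min_gap,
                                top, left + start_second)
            return first + second

    # No gap wide enough in either direction: this is one region.
    h, w = mask.shape
    return [(int(top), int(left), int(top + h), int(left + w))]


def segment_per_class(labels, background=0, min_gap=2):
    """Run the XY cut separately for every non-background class."""
    return {
        int(c): xy_cut(labels == c, min_gap)
        for c in np.unique(labels)
        if c != background
    }


# Toy label map: one wide block on top, two blocks side by side below.
labels = np.zeros((10, 10), dtype=int)
labels[1:4, 1:9] = 1
labels[6:9, 1:4] = 1
labels[6:9, 6:9] = 1
regions = segment_per_class(labels)
```

Here `regions[1]` holds three boxes, one per block. A real pipeline would typically also filter out very small boxes and map the coordinates back to the original image scale.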

Requirements

  • For GPU support: CUDA and cuDNN
  • Other requirements are installed via the Makefile / pip; see requirements.txt in the repository root.

Installation

To use GPU support, set the environment variable TENSORFLOW_GPU before installing; otherwise leave it unset. Then run:

make dep

to install dependencies and

make install

to install the package.

Both steps install Python packages via pip, so you may want to activate a virtualenv before installing.

Usage

ocrd-pc-segmentation follows the OCR-D CLI conventions.

It expects a binary page image and produces region entries in the PageXML file.
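
As a rough sketch of what a call might look like, the snippet below assembles a command line using the generic OCR-D processor options `-m` (METS file), `-I` (input file group), `-O` (output file group) and `-p` (parameter file). The file group names, the file names, and the helper `build_command` are assumptions for illustration; check `ocrd-pc-segmentation --help` for the options your installed version accepts.

```python
import shlex


def build_command(mets, input_grp, output_grp, param_file=None):
    """Assemble an argument list for the processor (hypothetical helper)."""
    cmd = ["ocrd-pc-segmentation", "-m", mets, "-I", input_grp, "-O", output_grp]
    if param_file is not None:
        cmd += ["-p", param_file]
    return cmd


# File group and file names below are placeholders, not values from this project.
cmd = build_command("mets.xml", "OCR-D-IMG-BIN", "OCR-D-SEG-BLOCK", "params.json")
print(shlex.join(cmd))

# To actually run it (requires the package to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.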

Configuration

The following parameters are recognized in the JSON parameter file:

  • overwrite_regions: remove previously existing text regions.
  • xheight: height of the character "x" in pixels, as used during training.
  • model: path to the pixel classifier model.
  • gpu_allow_growth: required for GPU use with some graphics cards (set to true if you get CUDNN_INTERNAL_ERROR).
  • resize_height: scale down the pixel classifier output to this height before postprocessing. Independent of the training and of the model used. This is a performance / quality tradeoff; defaults to 300.
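
For illustration, a parameter file with these keys could be generated as follows. Only the key names and the default of 300 for resize_height come from the list above; the value types, the xheight value, and the model path are placeholders assumed for this sketch.

```python
import json

# All values except resize_height are illustrative placeholders.
params = {
    "overwrite_regions": True,      # assumed to be a boolean flag
    "xheight": 8,                   # placeholder; use the value from training
    "model": "/path/to/model",      # placeholder path
    "gpu_allow_growth": True,       # set to true on CUDNN_INTERNAL_ERROR
    "resize_height": 300,           # documented default
}

params_json = json.dumps(params, indent=2)
print(params_json)

# To save it for use as a parameter file:
#   with open("params.json", "w") as f:
#       f.write(params_json)
```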

Testing

There is a simple CLI test that runs the tool on a single image from the assets repository:

make test-cli

Training

To train models for the pixel classifier, see its README.

Files for ocrd-pc-segmentation, version 0.1.3:

  • ocrd_pc_segmentation-0.1.3-py3-none-any.whl (7.6 MB), file type: Wheel, Python version: py3
  • ocrd_pc_segmentation-0.1.3.tar.gz (7.5 MB), file type: Source