pixel-classifier based page segmentation
page-segmentation module for OCR-D
This module implements a page segmentation algorithm based on a Fully Convolutional Network (FCN). The FCN creates a classification for each pixel in a binary image. This result is then segmented per class using XY cuts.
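The per-class XY-cut step described above can be illustrated with a minimal pure-Python sketch. This is a generic recursive XY cut over one class mask, not the module's actual implementation; the names mask_for_class, xy_cut and min_gap are hypothetical and chosen only for this example.

```python
def mask_for_class(labels, cls):
    """Turn a per-pixel label map into a 0/1 mask for one class."""
    return [[1 if value == cls else 0 for value in row] for row in labels]


def xy_cut(mask, min_gap=1):
    """Recursive XY cut over a 0/1 mask (list of equally long rows).

    Returns boxes as (top, left, bottom, right), bottom/right exclusive.
    min_gap is an assumed tuning knob: the smallest run of empty rows or
    columns that is accepted as a cut.
    """
    if min_gap < 1:
        raise ValueError("min_gap must be at least 1")
    boxes = []

    def trim(top, left, bottom, right):
        # Shrink the box to the foreground it contains; None if it is empty.
        rows = [r for r in range(top, bottom) if any(mask[r][left:right])]
        if not rows:
            return None
        cols = [c for c in range(left, right)
                if any(mask[r][c] for r in range(top, bottom))]
        return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1

    def widest_gap(profile):
        # Longest run of empty positions in a projection profile.
        best_start, best_len, run_start = 0, 0, None
        for i, filled in enumerate(profile + [True]):  # sentinel ends last run
            if not filled:
                if run_start is None:
                    run_start = i
            elif run_start is not None:
                if i - run_start > best_len:
                    best_start, best_len = run_start, i - run_start
                run_start = None
        return best_start, best_len

    def recurse(top, left, bottom, right):
        box = trim(top, left, bottom, right)
        if box is None:
            return
        top, left, bottom, right = box
        row_profile = [any(mask[r][left:right]) for r in range(top, bottom)]
        col_profile = [any(mask[r][c] for r in range(top, bottom))
                       for c in range(left, right)]
        y_start, y_len = widest_gap(row_profile)
        x_start, x_len = widest_gap(col_profile)
        if max(y_len, x_len) < min_gap:
            boxes.append(box)  # no usable gap left: this is one region
            return
        if y_len >= x_len:
            # Horizontal cut: split into an upper and a lower part.
            recurse(top, left, top + y_start, right)
            recurse(top + y_start + y_len, left, bottom, right)
        else:
            # Vertical cut: split into a left and a right part.
            recurse(top, left, bottom, left + x_start)
            recurse(top, left + x_start + x_len, bottom, right)

    if mask and mask[0]:
        recurse(0, 0, len(mask), len(mask[0]))
    return boxes
```

Calling xy_cut(mask_for_class(labels, cls)) once per class yields one list of boxes per class. Raising min_gap merges regions separated only by narrow whitespace, at the cost of coarser segmentation.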
- For GPU support: CUDA and cuDNN
- Other requirements are installed via Makefile / pip, see requirements.txt in the repository root.
If you want to use GPU support, set the environment variable
otherwise leave it unset. Then:
to install dependencies and
to install the package.
Both are Python packages installed via pip, so you may want to activate a virtualenv before installing.
ocrd-pc-segmentation follows the ocrd CLI.
It expects a binary page image and produces region entries in the PageXML file.
The following parameters are recognized in the JSON parameter file:
- overwrite_regions: remove previously existing text regions
- xheight: height of the character "x" in pixels used during training
- model: pixel-classifier model path
- gpu_allow_growth: required for GPU use with some graphics cards (set to true if you get CUDNN_INTERNAL_ERROR)
- resize_height: scale down the pixel-classifier output to this height before postprocessing. Independent of the training and of the model used. (performance / quality tradeoff, defaults to 300)
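As a rough sketch of how such a parameter file could be produced and passed to the tool, the Python snippet below writes the JSON and assembles a command line without running it. The parameter names come from the list above. All values except the documented resize_height default of 300 are placeholders, the helper names write_params and build_command are hypothetical, and the options -m, -I, -O and -p are assumed here to be the general OCR-D processor options for the METS file, input file group, output file group and parameter file.

```python
import json
from pathlib import Path

# Parameter names as listed above. Values are placeholder assumptions,
# except resize_height, whose default of 300 is documented.
EXAMPLE_PARAMS = {
    "overwrite_regions": True,
    "xheight": 8,              # assumed value, depends on the trained model
    "model": "path/to/model",  # hypothetical path, replace with a real model
    "gpu_allow_growth": False,
    "resize_height": 300,
}


def write_params(path, **overrides):
    """Write a JSON parameter file and return the parameters written."""
    unknown = set(overrides) - set(EXAMPLE_PARAMS)
    if unknown:
        raise ValueError("unknown parameters: " + ", ".join(sorted(unknown)))
    params = dict(EXAMPLE_PARAMS, **overrides)
    Path(path).write_text(json.dumps(params, indent=2), encoding="utf-8")
    return params


def build_command(mets, input_grp, output_grp, params_path):
    """Assemble (but do not run) a processor call as an argument list."""
    return [
        "ocrd-pc-segmentation",
        "-m", str(mets),         # METS file of the workspace
        "-I", input_grp,         # file group holding the binarized pages
        "-O", output_grp,        # file group receiving the PageXML output
        "-p", str(params_path),  # JSON parameter file written above
    ]
```

The resulting list can be handed to subprocess.run once the file group names match those in your workspace.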
There is a simple CLI test that runs the tool on a single image from the assets repository.
To train models for the pixel classifier, see its README.