pixel-classifier based page segmentation
Page-segmentation module for OCR-D
Introduction
This module implements a page segmentation algorithm based on a Fully Convolutional Network (FCN). The FCN creates a classification for each pixel in a binary image. This result is then segmented per class using XY cuts.
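To illustrate the second step, here is a minimal sketch of a recursive XY cut applied per class to a pixel label map. It is not this module's actual implementation: the helper names (`_segments`, `xy_cut`), the `min_gap` threshold and the class ids 1 and 2 are assumptions made for the example.

```python
import numpy as np

def _segments(profile, min_gap):
    """Split a 1-D projection profile into foreground segments that are
    separated by at least `min_gap` consecutive empty positions."""
    segments = []
    start = None    # first index of the segment being built
    last_fg = None  # most recent foreground index
    for i, value in enumerate(profile):
        if value > 0:
            if start is None:
                start = i
            elif i - last_fg - 1 >= min_gap:
                # The empty run is wide enough: close the segment, open a new one.
                segments.append((start, last_fg + 1))
                start = i
            last_fg = i
    if start is not None:
        segments.append((start, last_fg + 1))
    return segments

def xy_cut(mask, min_gap=2, offset=(0, 0)):
    """Return bounding boxes (y0, x0, y1, x1) of the blocks in a boolean mask."""
    if not mask.any():
        return []
    oy, ox = offset
    # Try horizontal cuts first (gaps between rows) ...
    rows = _segments(mask.sum(axis=1), min_gap)
    if len(rows) > 1:
        boxes = []
        for y0, y1 in rows:
            boxes += xy_cut(mask[y0:y1], min_gap, (oy + y0, ox))
        return boxes
    # ... then vertical cuts (gaps between columns).
    cols = _segments(mask.sum(axis=0), min_gap)
    if len(cols) > 1:
        boxes = []
        for x0, x1 in cols:
            boxes += xy_cut(mask[:, x0:x1], min_gap, (oy, ox + x0))
        return boxes
    # No further cut possible: emit the tight bounding box.
    (y0, y1), (x0, x1) = rows[0], cols[0]
    return [(oy + y0, ox + x0, oy + y1, ox + x1)]

# Toy label map standing in for the FCN output (class ids are hypothetical).
labels = np.zeros((8, 10), dtype=np.uint8)
labels[1:3, 1:9] = 1  # class 1 block at the top
labels[5:7, 1:4] = 1  # class 1 block at the bottom left
labels[5:7, 6:9] = 2  # class 2 block at the bottom right

# Segment each class separately, as described above.
regions = {c: xy_cut(labels == c) for c in (1, 2)}
print(regions)
```

Each class is cut on its own mask, so regions of different classes that sit side by side do not block each other's cuts.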
Requirements
- For GPU support: CUDA and cuDNN
- Other requirements are installed via Makefile / pip, see requirements.txt in the repository root.
Installation
If you want to use GPU support, set the environment variable TENSORFLOW_GPU to a nonempty value; otherwise leave it unset. Then run:
make deps
to install dependencies and
make install
to install the package.
Both targets install Python packages via pip, so you may want to activate a virtualenv before installing.
Usage
ocrd-pc-segmentation follows the OCR-D CLI.
It expects a binary page image and produces region entries in the PageXML file.
Configuration
The following parameters are recognized in the JSON parameter file:
- overwrite_regions: remove previously existing text regions.
- xheight: height of the character "x" in pixels used during training.
- model: pixel-classifier model path. The special values __DEFAULT__ and __LEGACY__ load the bundled default model or the previous default model, respectively.
- gpu_allow_growth: required for GPU use with some graphics cards (set to true if you get CUDNN_INTERNAL_ERROR).
- resize_height: scale down the pixel-classifier output to this height before postprocessing. Independent of training and of the model used. This is a performance / quality tradeoff and defaults to 300.
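As a rough sketch, a parameter file using these keys could be generated as follows. Only the `resize_height` default of 300 and the special model value `__DEFAULT__` come from the description above; the other values, the file name `params.json` and the file group names `OCR-D-BIN` and `OCR-D-SEG-REGION` are assumptions chosen for illustration. The `-I`, `-O` and `-p` options are the input file group, output file group and parameter options of the general OCR-D CLI.

```python
import json

# Parameter names as listed above; values are illustrative unless noted.
params = {
    "overwrite_regions": True,   # assumption: drop existing text regions
    "xheight": 8,                # assumption: example x-height in pixels
    "model": "__DEFAULT__",      # special value: bundled default model
    "gpu_allow_growth": False,   # set to True on CUDNN_INTERNAL_ERROR
    "resize_height": 300,        # documented default
}

# Serialized form; save this text as e.g. params.json (hypothetical name).
params_json = json.dumps(params, indent=2)
print(params_json)

# Hypothetical invocation inside an OCR-D workspace; the file group names
# are placeholders, not values prescribed by this module.
command = [
    "ocrd-pc-segmentation",
    "-I", "OCR-D-BIN",
    "-O", "OCR-D-SEG-REGION",
    "-p", "params.json",
]
print(" ".join(command))
```

The command is only printed, not executed, because running it requires an OCR-D workspace containing binarized page images.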
Testing
There is a simple CLI test that runs the tool on a single image from the assets repository:
make test-cli
Training
To train models for the pixel classifier, see its README