pixel-classifier based page segmentation
Page segmentation module for OCR-D
Introduction
This module implements a page segmentation algorithm based on a Fully Convolutional Network (FCN). The FCN creates a classification for each pixel in a binary image. This result is then segmented per class using XY cuts.
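The per-class XY-cut step can be illustrated with a small, dependency-free sketch. It is not the module's actual implementation: the function names, the `min_gap` parameter, and the class ids are assumptions made for illustration. The sketch takes a per-pixel label map, builds one binary mask per class, and recursively splits each mask along empty rows or columns of its projection profiles until no further cut is possible.

```python
def occupied_spans(profile, min_gap):
    """Return end-exclusive (start, stop) spans of nonzero profile entries,
    bridging empty gaps narrower than min_gap."""
    spans = []
    start = None
    for i, value in enumerate(list(profile) + [0]):  # sentinel closes a trailing run
        if value and start is None:
            start = i
        elif not value and start is not None:
            if spans and start - spans[-1][1] < min_gap:
                spans[-1] = (spans[-1][0], i)  # gap too narrow: merge with previous span
            else:
                spans.append((start, i))
            start = None
    return spans


def xy_cut(mask, min_gap=2):
    """Recursive XY cut on a binary mask (list of rows of 0/1).

    Returns end-exclusive bounding boxes (y0, y1, x0, x1).
    min_gap is a hypothetical tuning parameter for this sketch.
    """
    boxes = []

    def recurse(y0, y1, x0, x1):
        row_profile = [sum(mask[y][x0:x1]) for y in range(y0, y1)]
        col_profile = [sum(mask[y][x] for y in range(y0, y1)) for x in range(x0, x1)]
        rows = occupied_spans(row_profile, min_gap)
        cols = occupied_spans(col_profile, min_gap)
        if not rows:
            return  # window contains no foreground pixels
        if len(rows) == 1 and len(cols) == 1:
            (ra, rb), (ca, cb) = rows[0], cols[0]
            boxes.append((y0 + ra, y0 + rb, x0 + ca, x0 + cb))
        elif len(rows) > 1:
            for a, b in rows:  # horizontal cuts first
                recurse(y0 + a, y0 + b, x0, x1)
        else:
            for a, b in cols:  # then vertical cuts
                recurse(y0, y1, x0 + a, x0 + b)

    if mask:
        recurse(0, len(mask), 0, len(mask[0]))
    return boxes


def segment_per_class(label_map, class_ids, min_gap=2):
    """Run xy_cut separately on the mask of each class id."""
    return {
        c: xy_cut([[int(v == c) for v in row] for row in label_map], min_gap)
        for c in class_ids
    }


# Toy label map: class 1 = text, class 2 = image (ids are illustrative only).
label_map = [[0] * 12 for _ in range(10)]
for y in range(1, 4):
    for x in range(1, 11):
        label_map[y][x] = 1
for y in range(6, 9):
    for x in range(1, 5):
        label_map[y][x] = 1
    for x in range(7, 11):
        label_map[y][x] = 2

regions = segment_per_class(label_map, [1, 2])
```

In this toy example, `regions[1]` holds two text boxes and `regions[2]` one image box. A real implementation would typically operate on NumPy arrays and apply further filtering of small regions.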
Requirements
- For GPU support: CUDA and cuDNN
- Other requirements are installed via Makefile / pip; see requirements.txt in the repository root.
Installation
If you want to use GPU support, set the environment variable TENSORFLOW_GPU
to a nonempty value, otherwise leave it unset. Then:
make deps
to install dependencies and
make install
to install the package.
Both steps install Python packages via pip, so you may want to activate a virtualenv before installing.
Usage
ocrd-pc-segmentation follows the OCR-D CLI conventions.
It expects a binary page image and produces region entries in the PageXML file.
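As a rough illustration, a call could be assembled as below. The sketch relies on the standard OCR-D processor options (`-m` for the METS file, `-I` and `-O` for input and output file groups, `-p` for the parameter file). The file group names, the `params.json` path, and the helper `build_command` are assumptions for this example, not names defined by this module.

```python
import shlex


def build_command(mets, input_grp, output_grp, params=None):
    """Assemble an argv list for ocrd-pc-segmentation (hypothetical helper)."""
    argv = ["ocrd-pc-segmentation", "-m", mets, "-I", input_grp, "-O", output_grp]
    if params is not None:
        argv += ["-p", params]  # JSON parameter file, optional
    return argv


# File group names below are illustrative; use the groups present in your METS.
argv = build_command("mets.xml", "OCR-D-IMG-BIN", "OCR-D-SEG-BLOCK", "params.json")
print(shlex.join(argv))

# To actually run the processor inside an OCR-D workspace:
#   import subprocess
#   subprocess.run(argv, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.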
Configuration
The following parameters are recognized in the JSON parameter file:
- overwrite_regions: remove previously existing text regions
- xheight: height of the character "x" in pixels used during training
- model: pixel-classifier model path. The special values __DEFAULT__ and __LEGACY__ load the bundled default model or the previous default model, respectively.
- gpu_allow_growth: required for GPU use with some graphics cards (set to true if you get CUDNN_INTERNAL_ERROR)
- resize_height: scale down the pixel-classifier output to this height before postprocessing. Independent of training and of the model used (performance / quality tradeoff; defaults to 300).
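A parameter file using these keys could be generated as in the following sketch. The concrete values are placeholders: the `xheight` value and the value types (booleans for `overwrite_regions` and `gpu_allow_growth`, integers for `xheight` and `resize_height`) are assumptions; only the 300 default for `resize_height` and the special model name `__DEFAULT__` come from the list above.

```python
import json

# Illustrative parameter set; adjust the values to your model and data.
params = {
    "overwrite_regions": True,   # assumed boolean: drop existing text regions
    "xheight": 8,                # assumed value: must match the model's training x-height
    "model": "__DEFAULT__",      # bundled default model
    "gpu_allow_growth": False,   # set to True on CUDNN_INTERNAL_ERROR
    "resize_height": 300,        # documented default
}


def write_params(path, values):
    """Write the parameter dictionary as a JSON file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(values, handle, indent=2)


# Example: write_params("params.json", params)
```

The resulting file can then be passed to the processor as its JSON parameter file.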
Testing
There is a simple CLI test that runs the tool on a single image from the assets repository:
make test-cli
Training
To train models for the pixel classifier, see its README.