OCR4All Pixel Classifier
Requirements
Python dependencies are specified in requirements.txt / setup.py.
You must install the package via pip with either ocr4all_pixel_classifier[tf_cpu] to use the CPU version of TensorFlow or ocr4all_pixel_classifier[tf_gpu] to use the GPU (CUDA) version of TensorFlow. For the latter, your system should be set up with CUDA 9 and CuDNN 7.
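For example, the CPU variant can be installed like this (substitute tf_gpu for tf_cpu to get the CUDA build; the quotes keep the shell from interpreting the brackets):
pip install "ocr4all_pixel_classifier[tf_cpu]"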
Usage
For training and direct usage, install ocr4all-pixel-classifier-frontend. This package only contains the library code.
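Assuming the frontend is published under that name on PyPI, it can be installed the same way:
pip install ocr4all-pixel-classifier-frontend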
Pixel classifier
Classification
To run a model on some input images, use ocr4all-pixel-classifier predict:
ocr4all-pixel-classifier predict --load PATH_TO_MODEL \
--output OUTPUT_PATH \
--binary PATH_TO_BINARY_IMAGES \
--images PATH_TO_SOURCE_IMAGES \
--norm PATH_TO_NORMALIZATIONS
(ocr4all-pixel-classifier is an alias for ocr4all-pixel-classifier predict)
This will create three folders at the output path:
color: the classification as a color image, with pixel color corresponding to the class for that pixel
inverted: inverted binary image with classification of foreground pixels only (i.e. background is black, foreground is white or class color)
overlay: classification image layered transparently over the original image
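For illustration only: with a single input image named page_0001.png (a hypothetical file name), the output path would end up containing files roughly like
OUTPUT_PATH/color/page_0001.png
OUTPUT_PATH/inverted/page_0001.png
OUTPUT_PATH/overlay/page_0001.png
The exact file names depend on the input images.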
Training
For training, you first have to create dataset files. A dataset file is a JSON file containing three arrays, for train, test and evaluation data (also called train/validation/test in other publications). The JSON file uses the following format:
{
"train": [
//datasets here
],
"test": [
//datasets here
],
"eval": [
//datasets here
]
}
A dataset describes a single input image and consists of several paths: the original image, a binarized version and the mask (pixel color corresponds to class). Furthermore, the line height of the page in pixels must be specified:
{
"binary_path": "/path/to/image/binary/filename.bin.png",
"image_path": "/path/to/image/color/filename.jpg",
"mask_path": "/path/to/image/mask/filename_MASK.png",
"line_height_px": 18
}
The generation of dataset files can be automated using ocr4all-pixel-classifier create-dataset-file. Refer to the command's --help output for further information.
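For instance, to list the available options:
ocr4all-pixel-classifier create-dataset-file --help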
To start the training:
ocr4all-pixel-classifier train \
--train DATASET_FILE.json --test DATASET_FILE.json --eval DATASET_FILE.json \
--output MODEL_TARGET_PATH \
--n_iter 5000
The parameters --train, --test and --eval may be followed by any number of dataset files or patterns (shell globbing).
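For example, dataset files can be listed explicitly or matched with glob patterns (the file names below are hypothetical):
ocr4all-pixel-classifier train \
--train datasets/train_*.json \
--test datasets/test_a.json datasets/test_b.json \
--eval datasets/eval_*.json \
--output MODEL_TARGET_PATH \
--n_iter 5000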
Refer to ocr4all-pixel-classifier train --help for further parameters that affect the training procedure.
You can combine several dataset files into a split file. The format of the split file is:
{
"label": "name of split",
"train": [
"/path/to/dataset1.json",
"/path/to/dataset2.json",
...
],
"test": [
//dataset paths here
],
"eval": [
//dataset paths here
]
}
To use a split file, add the --split_file parameter.
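A training run with a split file might then look like this (assuming --split_file is passed to the train command; the file name is hypothetical):
ocr4all-pixel-classifier train \
--split_file MY_SPLIT.json \
--output MODEL_TARGET_PATH \
--n_iter 5000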
Examples
See the examples for dataset generation and training
ocr4all-pixel-classifier compute-image-normalizations / ocrd_compute_normalizations
Calculate image normalizations, i.e. scaling factors based on average line height.
Required arguments:
--input_dir: location of images
--output_dir: target location of norm files
Optional arguments:
--average_all: Average height over all images
--inverse
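A typical invocation (directory names are placeholders) looks like:
ocr4all-pixel-classifier compute-image-normalizations \
--input_dir PATH_TO_SOURCE_IMAGES \
--output_dir PATH_TO_NORMALIZATIONS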