# OCR4All Pixel Classifier
## Requirements

Python dependencies are specified in `requirements.txt` / `setup.py`.

You must install the package via pip as either `ocr4all_pixel_classifier[tf_cpu]` to use the CPU version of TensorFlow, or `ocr4all_pixel_classifier[tf_gpu]` to use the GPU (CUDA) version. For the latter, your system should be set up with CUDA 9 and cuDNN 7.
## Usage

### Pixel classifier

#### Classification

To run a model on some input images, use `ocr4all-pixel-classifier predict`:
```shell
ocr4all-pixel-classifier predict --load PATH_TO_MODEL \
    --output OUTPUT_PATH \
    --binary PATH_TO_BINARY_IMAGES \
    --images PATH_TO_SOURCE_IMAGES \
    --norm PATH_TO_NORMALIZATIONS
```

(`ocr4all-pixel-classifier` is an alias for `ocr4all-pixel-classifier predict`)
This will create three folders at the output path:

- `color`: the classification as a color image, with each pixel's color corresponding to its class
- `inverted`: inverted binary image with classification of foreground pixels only (i.e. background is black, foreground is white or class color)
- `overlay`: classification image layered transparently over the original image
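The `overlay` output can be pictured as simple alpha compositing. As a rough illustration only (this is not the tool's actual implementation, and `blend_pixel` is a hypothetical helper), blending a class color over an original pixel could look like:

```python
def blend_pixel(original, class_color, alpha=0.5):
    """Alpha-composite a class color over an original RGB pixel.

    original, class_color: (r, g, b) tuples with values in 0..255.
    alpha: opacity of the classification layer (0 = invisible, 1 = opaque).
    """
    return tuple(
        round(alpha * c + (1 - alpha) * o)
        for o, c in zip(original, class_color)
    )

# Blend a red class color over a mid-gray page pixel at 50% opacity.
blended = blend_pixel((128, 128, 128), (255, 0, 0), alpha=0.5)
```

Applied per pixel, this keeps the original page readable while the class colors remain visible on top.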
#### Training

For training, you first have to create dataset files. A dataset file is a JSON file containing three arrays, for train, test, and evaluation data (also called train/validation/test in other publications). The JSON file uses the following format:
```json
{
    "train": [
        //datasets here
    ],
    "test": [
        //datasets here
    ],
    "eval": [
        //datasets here
    ]
}
```
A dataset describes a single input image and consists of several paths: the original image, a binarized version, and the mask (whose pixel colors correspond to the classes). Furthermore, the line height of the page in pixels must be specified:
```json
{
    "binary_path": "/path/to/image/binary/filename.bin.png",
    "image_path": "/path/to/image/color/filename.jpg",
    "mask_path": "/path/to/image/mask/filename_MASK.png",
    "line_height_px": 18
}
```
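A dataset file with this structure can also be assembled in plain Python. The following is a minimal sketch under stated assumptions: `make_dataset_file` and `REQUIRED_KEYS` are hypothetical names, the paths are placeholders from the example above, and the `create-dataset-file` command described below is the supported way to do this:

```python
import json

# Keys every dataset entry must carry, per the format above (assumption:
# no further keys are required).
REQUIRED_KEYS = {"binary_path", "image_path", "mask_path", "line_height_px"}

def make_dataset_file(train, test, eval_, path):
    """Write a dataset file with the train/test/eval arrays described above."""
    for entry in [*train, *test, *eval_]:
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"dataset entry missing keys: {missing}")
    with open(path, "w") as f:
        json.dump({"train": train, "test": test, "eval": eval_}, f, indent=2)

entry = {
    "binary_path": "/path/to/image/binary/filename.bin.png",
    "image_path": "/path/to/image/color/filename.jpg",
    "mask_path": "/path/to/image/mask/filename_MASK.png",
    "line_height_px": 18,
}
make_dataset_file([entry], [], [], "dataset.json")
```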
The generation of dataset files can be automated using `ocr4all-pixel-classifier create-dataset-file`. Refer to the command's `--help` output for further information.
To start the training:

```shell
ocr4all-pixel-classifier train \
    --train DATASET_FILE.json --test DATASET_FILE.json --eval DATASET_FILE.json \
    --output MODEL_TARGET_PATH \
    --n_iter 5000
```
The parameters `--train`, `--test`, and `--eval` may be followed by any number of dataset files or patterns (shell globbing).

Refer to `ocr4all-pixel-classifier train --help` for further parameters affecting the training procedure.
You can combine several dataset files into a split file. The format of the split file is:
```json
{
    "label": "name of split",
    "train": [
        "/path/to/dataset1.json",
        "/path/to/dataset2.json",
        ...
    ],
    "test": [
        //dataset paths here
    ],
    "eval": [
        //dataset paths here
    ]
}
```
To use a split file, add the `--split_file` parameter.
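To illustrate how a split file ties the dataset files together, here is a hedged sketch of resolving one split group into its dataset entries (`resolve_split_group` is a hypothetical helper for illustration; the CLI does this for you when you pass `--split_file`):

```python
import json

def resolve_split_group(split_file, group):
    """Load a split file and return the dataset entries of one group
    ("train", "test" or "eval"): read each referenced dataset file and
    concatenate its array for that group."""
    with open(split_file) as f:
        split = json.load(f)
    entries = []
    for dataset_path in split[group]:
        with open(dataset_path) as f:
            entries.extend(json.load(f)[group])
    return entries
```

This reflects the nesting of the two formats: a split file points at dataset files, and each dataset file holds the actual image entries.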
### `ocr4all-pixel-classifier compute-image-normalizations` / `ocrd_compute_normalizations`

Calculate image normalizations, i.e. scaling factors based on average line height.

Required arguments:

- `--input_dir`: location of images
- `--output_dir`: target location of norm files

Optional arguments:

- `--average_all`: average height over all images
- `--inverse`
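As a rough sketch of the idea only (an assumption about the computation, not the tool's exact formula), a scaling factor derived from average line height could look like the following; the target height of 18 px and the interpretation of `--inverse` as flipping the ratio are both hypothetical:

```python
def normalization_factor(avg_line_height_px, target_line_height_px=18,
                         inverse=False):
    """Scale factor mapping a page's average line height to a target height.

    inverse flips the ratio (a hypothetical reading of the --inverse flag).
    """
    factor = target_line_height_px / avg_line_height_px
    return 1 / factor if inverse else factor

# A page whose lines average 36 px would be scaled by 0.5 to reach 18 px lines.
factor = normalization_factor(36)
```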