OCR4All Pixel Classifier
Requirements
Python dependencies are specified in requirements.txt / setup.py.
You must install the package via pip as either ocr4all_pixel_classifier[tf_cpu] to use the CPU version of TensorFlow or ocr4all_pixel_classifier[tf_gpu] to use the GPU (CUDA) version of TensorFlow. For the latter, your system should be set up with CUDA 9 and cuDNN 7.
Usage
Pixel classifier
Classification
To run a model on some input images, use ocr4all-pixel-classifier predict:
ocr4all-pixel-classifier predict --load PATH_TO_MODEL \
--output OUTPUT_PATH \
--binary PATH_TO_BINARY_IMAGES \
--images PATH_TO_SOURCE_IMAGES \
--norm PATH_TO_NORMALIZATIONS
(ocr4all-pixel-classifier is an alias for ocr4all-pixel-classifier predict)
This will create three folders at the output path:
- color: the classification as a color image, with pixel color corresponding to the class for that pixel
- inverted: inverted binary image with classification of foreground pixels only (i.e. background is black, foreground is white or class color)
- overlay: classification image layered transparently over the original image
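The prediction call can also be assembled from a script. The following is a minimal sketch, assuming each flag takes a single path as in the command above; the helper names build_predict_command and expected_output_dirs are hypothetical and not part of the package.

```python
import shlex
from pathlib import Path


def build_predict_command(model, output, binary, images, norm):
    """Assemble the argument list for `ocr4all-pixel-classifier predict`.

    Only the flags shown in the README command are used. Assumption: each
    flag is given exactly one path.
    """
    return [
        "ocr4all-pixel-classifier", "predict",
        "--load", str(model),
        "--output", str(output),
        "--binary", str(binary),
        "--images", str(images),
        "--norm", str(norm),
    ]


def expected_output_dirs(output):
    """Return the three folders the README says appear at the output path."""
    return [Path(output) / name for name in ("color", "inverted", "overlay")]


# Placeholder paths, for illustration only.
cmd = build_predict_command("model", "out", "binary", "images", "norm")
print(shlex.join(cmd))
```

With the package installed, the list could be handed to subprocess.run(cmd, check=True), after which the folders returned by expected_output_dirs should exist.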
Training
For training, you first have to create dataset files. A dataset file is a JSON file containing three arrays, for train, test and evaluation data (also called train/validation/test in other publications). The JSON file uses the following format:
{
"train": [
//datasets here
],
"test": [
//datasets here
],
"eval": [
//datasets here
]
}
A dataset describes a single input image and consists of several paths: the original image, a binarized version and the mask (pixel color corresponds to class). Furthermore, the line height of the page in pixels must be specified:
{
"binary_path": "/path/to/image/binary/filename.bin.png",
"image_path": "/path/to/image/color/filename.jpg",
"mask_path": "/path/to/image/mask/filename_MASK.png",
"line_height_px": 18
}
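For small collections, a dataset file in this format can also be written by hand from a script. The sketch below only mirrors the JSON layout described above; the helper names make_entry and make_dataset_file are hypothetical, and the paths are the placeholder paths from the example.

```python
import json


def make_entry(binary_path, image_path, mask_path, line_height_px):
    """Build one dataset entry with the four keys from the README."""
    return {
        "binary_path": binary_path,
        "image_path": image_path,
        "mask_path": mask_path,
        "line_height_px": line_height_px,
    }


def make_dataset_file(train, test, evaluation):
    """Group entries into the three arrays of a dataset file."""
    return {"train": list(train), "test": list(test), "eval": list(evaluation)}


entry = make_entry(
    "/path/to/image/binary/filename.bin.png",
    "/path/to/image/color/filename.jpg",
    "/path/to/image/mask/filename_MASK.png",
    18,
)
dataset = make_dataset_file(train=[entry], test=[], evaluation=[])

# Serialize; write this string to e.g. DATASET_FILE.json.
dataset_json = json.dumps(dataset, indent=2)
print(dataset_json)
```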
The generation of dataset files can be automated using ocr4all-pixel-classifier create-dataset-file. Refer to the command's --help output for further information.
To start the training:
ocr4all-pixel-classifier train \
--train DATASET_FILE.json --test DATASET_FILE.json --eval DATASET_FILE.json \
--output MODEL_TARGET_PATH \
--n_iter 5000
The parameters --train, --test and --eval may be followed by any number of dataset files or patterns (shell globbing).
Refer to ocr4all-pixel-classifier train --help for further parameters that affect the training procedure.
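When training is launched from a script rather than a shell, several dataset files per parameter can be passed as separate arguments. This is a minimal sketch using only the flags from the training command above; build_train_command and the file names are hypothetical.

```python
import shlex


def build_train_command(train, test, evaluation, output, n_iter):
    """Assemble the argument list for `ocr4all-pixel-classifier train`.

    Each of train/test/evaluation is a list of dataset file paths.
    """
    cmd = ["ocr4all-pixel-classifier", "train"]
    cmd += ["--train", *map(str, train)]
    cmd += ["--test", *map(str, test)]
    cmd += ["--eval", *map(str, evaluation)]
    cmd += ["--output", str(output), "--n_iter", str(n_iter)]
    return cmd


# Placeholder file names, for illustration only.
train_cmd = build_train_command(
    train=["a.json", "b.json"],
    test=["c.json"],
    evaluation=["d.json"],
    output="model_out",
    n_iter=5000,
)
print(shlex.join(train_cmd))
```

Note that a pattern such as datasets/*.json is not expanded by a shell when the list is passed to subprocess.run, so it is safest to expand patterns with glob.glob before building the command.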
You can combine several dataset files into a split file. The format of the split file is:
{
"label": "name of split",
"train": [
"/path/to/dataset1.json",
"/path/to/dataset2.json",
...
],
"test": [
//dataset paths here
],
"eval": [
//dataset paths here
]
}
To use a split file, add the --split_file parameter.
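A split file can be generated in the same way as a dataset file. The sketch below follows the format shown above; make_split_file and the example label are hypothetical, and the paths are the placeholder paths from the example.

```python
import json


def make_split_file(label, train, test, evaluation):
    """Build a split file: a label plus three lists of dataset file paths."""
    return {
        "label": label,
        "train": [str(p) for p in train],
        "test": [str(p) for p in test],
        "eval": [str(p) for p in evaluation],
    }


split = make_split_file(
    "name of split",
    train=["/path/to/dataset1.json", "/path/to/dataset2.json"],
    test=[],
    evaluation=[],
)

# Serialize; write this string to a file and pass it via --split_file.
split_json = json.dumps(split, indent=2)
print(split_json)
```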
Examples
See the examples for dataset generation and training
ocr4all-pixel-classifier compute-image-normalizations / ocrd_compute_normalizations
Calculate image normalizations, i.e. scaling factors based on average line height.
Required arguments:
- --input_dir: location of images
- --output_dir: target location of norm files
Optional arguments:
- --average_all: Average height over all images
- --inverse
Versioning
Major and minor versions are usually identical to the ones of the used
ocr4all-pixel-classifier
library. As this package is a frontend package and not
intended for use as library, no guarantees are made regarding API stability
between versions. Use ocr4all-pixel-classifier
in this case.