OCR4All Pixel Classifier

Requirements

Python dependencies are specified in requirements.txt / setup.py.

The package is tested with TensorFlow 2.0 through 2.5. To use a GPU, set up your system with the CUDA and cuDNN versions that match your TensorFlow version. If you use a TensorFlow version older than 2.1, you additionally have to replace the tensorflow package with tensorflow-gpu manually.

Usage

For training and direct usage, install ocr4all-pixel-classifier-frontend. This package only contains the library code.

Pixel classifier

Classification

To run a model on some input images, use ocr4all-pixel-classifier predict:

ocr4all-pixel-classifier predict --load PATH_TO_MODEL \
	--output OUTPUT_PATH \
	--binary PATH_TO_BINARY_IMAGES \
	--images PATH_TO_SOURCE_IMAGES \
	--norm PATH_TO_NORMALIZATIONS

(ocr4all-pixel-classifier is an alias for ocr4all-pixel-classifier predict)

This will create three folders at the output path:

  • color: the classification as a color image, where each pixel's color corresponds to the class of that pixel
  • inverted: inverted binary image with classification of foreground pixels only (i.e. background is black, foreground is white or class color)
  • overlay: classification image layered transparently over the original image
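
If you want to drive the prediction from a script, the call above can be assembled in Python. The following is a minimal sketch, not part of this package: the helper names build_predict_command and missing_output_dirs are hypothetical, and only the flags and the folder names come from this README.

```python
import shlex
from pathlib import Path

# Result folder names, taken from the list above.
OUTPUT_SUBDIRS = ("color", "inverted", "overlay")


def build_predict_command(model, output, binary, images, norm):
    """Return the predict invocation as an argument list (hypothetical helper)."""
    return [
        "ocr4all-pixel-classifier", "predict",
        "--load", str(model),
        "--output", str(output),
        "--binary", str(binary),
        "--images", str(images),
        "--norm", str(norm),
    ]


def missing_output_dirs(output):
    """List the expected result folders that do not exist under `output`."""
    return [name for name in OUTPUT_SUBDIRS if not (Path(output) / name).is_dir()]


if __name__ == "__main__":
    cmd = build_predict_command(
        "PATH_TO_MODEL", "OUTPUT_PATH", "PATH_TO_BINARY_IMAGES",
        "PATH_TO_SOURCE_IMAGES", "PATH_TO_NORMALIZATIONS",
    )
    # Print the command for inspection; to run it, pass `cmd` to
    # subprocess.run(cmd, check=True) with real paths filled in.
    print(shlex.join(cmd))
```

After a run, missing_output_dirs("OUTPUT_PATH") should return an empty list if all three folders were created.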

Training

For training, you first have to create dataset files. A dataset file is a JSON file containing three arrays: train, test and eval. These correspond, in order, to what other publications call train, validation and test data. The JSON file uses the following format:

{
	"train": [
		//datasets here
	],
	"test": [
		//datasets here
	],
	"eval": [
		//datasets here
	]
}

A dataset describes a single input image and consists of several paths: the original image, a binarized version and the mask (pixel color corresponds to class). Furthermore, the line height of the page in pixels must be specified:

{
	"binary_path": "/path/to/image/binary/filename.bin.png",
	"image_path":  "/path/to/image/color/filename.jpg",
	"mask_path":  "/path/to/image/mask/filename_MASK.png",
	"line_height_px": 18
}

The generation of dataset files can be automated using ocr4all-pixel-classifier create-dataset-file. Refer to the command's --help output for further information.
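
If you would rather assemble a dataset file yourself, the structure described above is easy to produce with the standard json module. This is a minimal sketch under assumptions: the helper names are hypothetical, and the binary/color/mask directory layout simply mirrors the example paths above, so adapt it to your own data.

```python
import json
from pathlib import Path


def make_entry(stem, root, line_height_px):
    """Build one dataset entry for the image named `stem`.

    Assumes the directory layout and file suffixes of the example above.
    """
    root = Path(root)
    return {
        "binary_path": str(root / "binary" / f"{stem}.bin.png"),
        "image_path": str(root / "color" / f"{stem}.jpg"),
        "mask_path": str(root / "mask" / f"{stem}_MASK.png"),
        "line_height_px": line_height_px,
    }


def make_dataset(train, test, eval_):
    """Combine three lists of entries into the dataset file structure."""
    return {"train": list(train), "test": list(test), "eval": list(eval_)}


def write_dataset_file(path, dataset):
    """Write the dataset structure to `path` as JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(dataset, fh, indent="\t")


if __name__ == "__main__":
    entry = make_entry("filename", "/path/to/image", 18)
    print(json.dumps(make_dataset([entry], [], []), indent="\t"))
```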

To start the training:

ocr4all-pixel-classifier train \
    --train DATASET_FILE.json --test DATASET_FILE.json --eval DATASET_FILE.json \
    --output MODEL_TARGET_PATH \
    --n_iter 5000

The parameters --train, --test and --eval may be followed by any number of dataset files or patterns (shell globbing).

Refer to ocr4all-pixel-classifier train --help for further parameters that control the training procedure.

You can combine several dataset files into a split file. The format of the split file is:

{
	"label": "name of split",
	"train": [
		"/path/to/dataset1.json",
		"/path/to/dataset2.json",
		...
	],
	"test": [
		//dataset paths here
	],
	"eval": [
		//dataset paths here
	]
}

To use a split file, add the --split_file parameter.
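
A split file can be generated and sanity-checked in the same way. The sketch below is illustrative only: make_split and check_split are hypothetical helpers, and the check assumes that all four keys shown in the format above are expected, which this README does not state explicitly.

```python
import json


def make_split(label, train, test, eval_):
    """Return a split-file structure: a label plus three lists of dataset file paths."""
    return {
        "label": label,
        "train": [str(p) for p in train],
        "test": [str(p) for p in test],
        "eval": [str(p) for p in eval_],
    }


def check_split(split):
    """Raise ValueError if a key from the format above is missing or has the wrong type."""
    if not isinstance(split.get("label"), str):
        raise ValueError("'label' must be a string")
    for key in ("train", "test", "eval"):
        value = split.get(key)
        if not isinstance(value, list) or not all(isinstance(p, str) for p in value):
            raise ValueError(f"'{key}' must be a list of dataset file paths")
    return split


if __name__ == "__main__":
    split = make_split(
        "name of split",
        ["/path/to/dataset1.json", "/path/to/dataset2.json"],
        [],
        [],
    )
    print(json.dumps(check_split(split), indent="\t"))
```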

Examples

See the examples for dataset generation and training

ocr4all-pixel-classifier compute-image-normalizations / ocrd_compute_normalizations

Calculate image normalizations, i.e. scaling factors based on average line height.

Required arguments:

  • --input_dir: location of images
  • --output_dir: target location of norm files

Optional arguments:

  • --average_all: Average height over all images
  • --inverse
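
To illustrate the idea of a scaling factor based on line height, here is a small sketch. It is an assumption-laden illustration, not the tool's actual computation: the formula target_height / line_height, the target_height parameter and the function name are all hypothetical, and the format of the norm files is not covered. The average_all switch only mimics the averaging described for --average_all.

```python
def scale_factors(line_heights, target_height, average_all=False):
    """Map image name -> scaling factor, assumed here to be target_height / line_height.

    line_heights: dict of image name -> measured line height in pixels.
    With average_all=True, every image uses the mean line height over all images.
    """
    if not line_heights:
        return {}
    if average_all:
        mean = sum(line_heights.values()) / len(line_heights)
        return {name: target_height / mean for name in line_heights}
    return {name: target_height / height for name, height in line_heights.items()}


if __name__ == "__main__":
    heights = {"page1.png": 10, "page2.png": 30}
    print(scale_factors(heights, target_height=20))
    print(scale_factors(heights, target_height=20, average_all=True))
```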
