Kraken bindings
ocrd_kraken
OCR-D wrapper for the Kraken OCR engine
Introduction
This package offers OCR-D compliant workspace processors for (some of) the functionality of Kraken.
(Each processor is a parameterizable step in a configurable workflow of the OCR-D functional model. There are usually various alternative processor implementations for each step. Data is represented with METS and PAGE.)
It includes image preprocessing (binarization), layout analysis (region and line+baseline segmentation), and text recognition.
Installation
With Docker
This is the best option if you want to run the software in a container.
You need to have Docker installed:
docker pull ocrd/kraken
To run with Docker:
docker run --rm \
-v path/to/workspaces:/data \
-v path/to/models:/usr/local/share/ocrd-resources \
ocrd/kraken ocrd-kraken-recognize ...
# or ocrd-kraken-segment or ocrd-kraken-binarize
Native, from PyPI
This is the best option if you want to use the stable, released version.
pip install ocrd_kraken
Native, from git
Use this option if you want to change the source code or install the latest, unpublished changes.
We strongly recommend using venv.
git clone https://github.com/OCR-D/ocrd_kraken
cd ocrd_kraken
sudo make deps-ubuntu # or manually from git or via ocrd_all
make deps # or pip install -r requirements.txt
make install # or pip install .
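As a minimal sketch of the venv recommendation, the environment could be created and activated before running the `make` steps above. The directory name `venv` is an arbitrary choice, not something the project prescribes.

```shell
# Create an isolated Python environment in ./venv (name is an arbitrary choice)
python3 -m venv venv

# Activate it for the current shell session
. venv/bin/activate

# Show which interpreter is now first on PATH; it should live inside ./venv
command -v python
```

Once the environment is active, `make deps` and `make install` install into it rather than into the system Python.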
Models
Kraken uses data-driven (neural) models for segmentation and recognition, but comes with no pretrained "official" models.
There is a public repository of community-provided models, which can also be queried and downloaded from via the standalone kraken CLI. (See the Kraken docs for details.)
For the OCR-D wrapper, since all OCR-D processors must resolve file/data resources in a standardized way, there is a general mechanism for managing models, i.e. installing and using them by name. We currently manage our own list of recommended models (without delegating to the above repo).
Models always use the filename suffix .mlmodel, but are just loaded by their basename.
See the OCR-D model guide and ocrd resmgr --help.
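A hedged sketch of managing models by name with the OCR-D resource manager follows. It assumes the `list-available` and `download` subcommands of `ocrd resmgr` as provided by OCR-D core; check `ocrd resmgr --help` for the options your version supports. The helper function names are hypothetical, and no specific model name is assumed.

```shell
# Processor whose models we want to manage
# (ocrd-kraken-segment uses the same mechanism)
PROCESSOR=ocrd-kraken-recognize

# List the models the resource manager knows about for this processor
list_models() {
    ocrd resmgr list-available -e "$PROCESSOR"
}

# Download one model by name
# $1: model name, taken from the output of list_models
install_model() {
    ocrd resmgr download "$PROCESSOR" "$1"
}
```

Call `list_models` first, then pass one of the listed names to `install_model`.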
Usage
For details, see docstrings in the individual processors and ocrd-tool.json descriptions, or simply --help.
Available OCR-D processors are:
- ocrd-kraken-binarize (nlbin – not recommended)
  - adds AlternativeImage files (per page, region or line) to the output fileGrp
- ocrd-kraken-segment (all-in-one segmentation – recommended for handwriting and prints with simple layouts, or as pure line segmentation)
  - adds TextRegions to Page (if level-of-operation=page) or TableRegions (if table)
  - adds TextLines (with Baseline) to TextRegions (for all level-of-operation)
  - masks existing segments during detection (unless overwrite_segments)
- ocrd-kraken-recognize (benefits from annotated Baselines, falls back to center-normalized bboxes)
  - adds Words to TextLines
  - adds Glyphs to Words
  - adds TextEquiv (removing existing TextEquiv if overwrite_text)
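The processors listed above can be chained in a workspace. The following is a minimal sketch that segments and then recognizes. It assumes the standard OCR-D command-line options `-I` (input fileGrp), `-O` (output fileGrp) and `-P` (parameter), and that it is run from a workspace directory containing `mets.xml`. The fileGrp names, the `run` helper and the `MODEL_NAME.mlmodel` placeholder are hypothetical, and the `model` parameter name is an assumption; consult each processor's `--help` for the actual parameters.

```shell
# Print commands by default; set DRY_RUN=0 to actually execute them
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Placeholder: replace with the name of an installed recognition model
MODEL="${MODEL:-MODEL_NAME.mlmodel}"

# Step 1: detect regions and lines (with baselines) on each page
run ocrd-kraken-segment -I OCR-D-IMG -O OCR-D-SEG -P level-of-operation page

# Step 2: recognize text on the detected lines
run ocrd-kraken-recognize -I OCR-D-SEG -O OCR-D-OCR -P model "$MODEL"
```

With the default `DRY_RUN=1` the script only echoes the two commands, which is a cheap way to check the fileGrp chain before processing real data.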
Testing
make test
This downloads test data from https://github.com/OCR-D/assets under repo/assets, and runs some basic tests of the Python API.
Set PYTEST_ARGS="-s --verbose" to see log output (-s) and individual test results (--verbose).
Hashes for ocrd_kraken-0.4.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 944243f787574774a36490289ab3f9075ffc4c1712eece0e58d59dd383817689
MD5 | ce9eb4cf0ece67839473dc04878b8dba
BLAKE2b-256 | b72b6fd30e7209f0e61ef4e6e0e333dee71fd9c196501af78d3d91a49d3d5ac5