An extensible viewer for OCR-D workspaces
OCR-D Browser
An extensible viewer for OCR-D mets.xml files
Installation on Ubuntu 18.04
sudo make deps-ubuntu
pip install browse-ocrd
Usage
browse-ocrd ./path/to/mets.xml # or open interactively
Features
- Browse fileGrps and pages, arranging views next to each other for comparison
- Show original or derived images (AlternativeImage on any level of the structural hierarchy)
- Show multiple images at once for different pages (horizontally) or different segments (vertically), zooming freely
- Show raw PAGE-XML with syntax highlighting, open with PageViewer
- Show concatenated PAGE-XML text annotation
- Show rendered HTML comparison from dinglehopper evaluations
Configuration
Configuration file locations
At startup, the following directories are searched for a config file named ocrd-browser.conf:
# directories and their default values under Ubuntu 20.04
GLib.get_system_config_dirs() # '/etc/xdg/xdg-ubuntu/ocrd-browser.conf', '/etc/xdg/ocrd-browser.conf'
GLib.get_user_config_dir() # '/home/jk/.config/ocrd-browser.conf'
os.getcwd() # './ocrd-browser.conf'
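The search order above can be approximated without GLib. The following is a minimal sketch, not the browser's actual code: it assumes a Linux system where GLib derives these directories from the XDG environment variables (XDG_CONFIG_DIRS, falling back to /etc/xdg, and XDG_CONFIG_HOME, falling back to ~/.config). The function name candidate_config_paths and its parameters are hypothetical.

```python
import os
from pathlib import Path

CONFIG_NAME = "ocrd-browser.conf"


def candidate_config_paths(env=None, cwd=None, home=None):
    """Return config file candidates in search order: system dirs, user dir, cwd.

    Hypothetical helper: mimics the GLib calls listed above using the
    XDG environment variables (an assumption, valid on typical Linux setups).
    """
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)
    home = Path.home() if home is None else Path(home)

    # XDG_CONFIG_DIRS is a colon-separated list; fall back to /etc/xdg if unset or empty
    system_dirs = [d for d in env.get("XDG_CONFIG_DIRS", "").split(":") if d] or ["/etc/xdg"]
    # XDG_CONFIG_HOME falls back to ~/.config if unset or empty
    user_dir = env.get("XDG_CONFIG_HOME") or str(home / ".config")

    dirs = [Path(d) for d in system_dirs] + [Path(user_dir), cwd]
    return [d / CONFIG_NAME for d in dirs]


if __name__ == "__main__":
    for path in candidate_config_paths():
        print(path, "(found)" if path.is_file() else "(missing)")
```

Whether the browser stops at the first file found or merges all of them is not stated here. If merging is wanted, passing the whole list to configparser.ConfigParser.read() skips missing files and lets later files override earlier ones.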
Configuration file syntax
The ocrd-browser.conf file is an ini-file with the following keys:
[FileGroups]
# Preferred fileGrp names for thumbnail display in the Page Browser
# Comma separated list of regular expressions
preferredImages = OCR-D-IMG, OCR-D-IMG.*, ORIGINAL
# Each Tool has a section header [Tool XYZ]
# At the moment the only defined tool is "PageViewer"
[Tool PageViewer]
# (ba)sh commandline to execute with placeholders
commandline = /usr/bin/java -jar /home/jk/bin/JPageViewer/JPageViewer.jar --resolve-dir {workspace.directory} {file.path.absolute}
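To illustrate how such a file can be read, here is a minimal sketch using Python's standard configparser module. It is not the browser's own loader: the helper names parse_settings and pick_file_group are hypothetical, and matching each expression against the whole fileGrp name (re.fullmatch) is an assumption.

```python
import configparser
import re

SAMPLE = """
[FileGroups]
# Comma separated list of regular expressions
preferredImages = OCR-D-IMG, OCR-D-IMG.*, ORIGINAL

[Tool PageViewer]
commandline = /usr/bin/java -jar /home/jk/bin/JPageViewer/JPageViewer.jar --resolve-dir {workspace.directory} {file.path.absolute}
"""


def parse_settings(text):
    """Return (compiled preferredImages patterns, {tool name: commandline})."""
    # interpolation=None keeps the commandline value untouched by configparser
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the case of keys such as preferredImages
    parser.read_string(text)

    raw = parser.get("FileGroups", "preferredImages", fallback="")
    patterns = [re.compile(p.strip()) for p in raw.split(",") if p.strip()]

    tools = {}
    for section in parser.sections():
        # Tool sections are named "Tool XYZ"
        if section.startswith("Tool "):
            tools[section[len("Tool "):]] = parser.get(section, "commandline")
    return patterns, tools


def pick_file_group(patterns, names):
    """Return the first fileGrp name matching the most preferred pattern, or None."""
    for pattern in patterns:
        for name in names:
            if pattern.fullmatch(name):  # assumption: whole-name match
                return name
    return None


if __name__ == "__main__":
    patterns, tools = parse_settings(SAMPLE)
    print(pick_file_group(patterns, ["OCR-D-SEG", "OCR-D-IMG-BIN", "ORIGINAL"]))
    print(tools["PageViewer"])
```

Earlier patterns win over later ones in this sketch, so the list order expresses preference.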
The commandline string will be used as a Python format string with the keyword arguments:
- workspace: The current ocrd.Workspace, all properties get shell escaped (by shlex.quote) automatically.
- file: The current ocrd_models.OcrdFile, all properties get shell escaped (by shlex.quote) automatically. Also, there is an additional property path with the properties absolute and relative, so {file.path.absolute} will be replaced by the shell quoted absolute path of the file.
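The placeholder expansion described above can be sketched as follows. This is only an illustration under stated assumptions, not the browser's implementation: the ShellQuoted wrapper and build_command function are hypothetical, and types.SimpleNamespace objects stand in for the real ocrd.Workspace and ocrd_models.OcrdFile.

```python
import shlex
from types import SimpleNamespace


class ShellQuoted:
    """Hypothetical wrapper: attribute lookups from a format string come back shell quoted."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        value = getattr(self._target, name)
        if isinstance(value, SimpleNamespace):
            # Nested stand-in object such as file.path: keep wrapping
            return ShellQuoted(value)
        return shlex.quote(str(value))


def build_command(template, workspace, file):
    """Expand {workspace.*} and {file.*} placeholders with shell-quoted values."""
    return template.format(workspace=ShellQuoted(workspace), file=ShellQuoted(file))


if __name__ == "__main__":
    # Stand-ins for the real workspace and file objects (assumed attribute names
    # taken from the placeholders shown above)
    workspace = SimpleNamespace(directory="/data/my workspace")
    file = SimpleNamespace(
        path=SimpleNamespace(
            absolute="/data/my workspace/OCR-D-GT/page 1.xml",
            relative="OCR-D-GT/page 1.xml",
        )
    )
    template = "viewer --resolve-dir {workspace.directory} {file.path.absolute}"
    print(build_command(template, workspace, file))
```

Because every value is quoted before substitution, paths containing spaces or shell metacharacters reach the tool as single arguments, so the placeholders in the commandline should not be wrapped in additional quotes.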
Note: You can get PRImA's PageViewer at GitHub.