OCR-D wrapper for Document Image Skew Estimation using Adaptive Radial Projection

ocrd_jdeskew

Introduction

This offers an OCR-D compliant workspace processor for jdeskew.

Installation

Create and activate a virtual environment as usual.

To install this module along with its dependencies, run the following from the root of a source checkout:

pip install .

Usage

OCR-D processor interface ocrd-jdeskew

To be used with PAGE-XML documents in an OCR-D annotation workflow.

Usage: ocrd-jdeskew [OPTIONS]

  Deskew pages / regions with jdeskew

  > Deskew the regions of the workspace.

  > Open and deserialise PAGE input files and their respective images,
  > then iterate over the element hierarchy down to the requested
  > ``level-of-operation``.

  > Next, for each segment, crop an image according to its layout
  > annotation (via coordinates into the higher-level image, or from an
  > existing alternative image), and determine the optimal deskewing
  > angle for it (up to ``maxskew``). Annotate the angle in the page or
  > region.

  > Derotate the image, and add the new image file to the workspace
  > along with the output fileGrp, and using a file ID with suffix
  > ``.IMG-DESKEW`` along with further identification of the segment.

  > Produce a new output file by serialising the resulting hierarchy.

Options for processing:
  -m, --mets URL-PATH             URL or file path of METS to process [./mets.xml]
  -w, --working-dir PATH          Working directory of local workspace [dirname(URL-PATH)]
  -I, --input-file-grp USE        File group(s) used as input
  -O, --output-file-grp USE       File group(s) used as output
  -g, --page-id ID                Physical page ID(s) to process instead of full document []
  --overwrite                     Remove existing output pages/images
                                  (with "--page-id", remove only those)
  --profile                       Enable profiling
  --profile-file PROF-PATH        Write cProfile stats to PROF-PATH. Implies "--profile"
  -p, --parameter JSON-PATH       Parameters, either verbatim JSON string
                                  or JSON file path
  -P, --param-override KEY VAL    Override a single JSON object key-value pair,
                                  taking precedence over "--parameter"
  -l, --log-level [OFF|ERROR|WARN|INFO|DEBUG|TRACE]
                                  Override log level globally [INFO]

Options for Processing Worker server:
  --queue                         The RabbitMQ server address in format
                                  "amqp://{user}:{pass}@{host}:{port}/{vhost}"
                                  [amqp://admin:admin@localhost:5672]
  --database                      The MongoDB server address in format
                                  "mongodb://{host}:{port}"
                                  [mongodb://localhost:27018]
  --type                          type of processing: either "worker" or "server"

Options for information:
  -C, --show-resource RESNAME     Dump the content of processor resource RESNAME
  -L, --list-resources            List names of processor resources
  -J, --dump-json                 Dump tool description as JSON
  -D, --dump-module-dir           Show the 'module' resource location path for this processor
  -h, --help                      Show this message
  -V, --version                   Show version

Parameters:
   "maxskew" [number]
    modulus of maximum skewing angle (in degrees) to detect
   "level-of-operation" [string - "page"]
    PAGE XML hierarchy level granularity to annotate orientation and
    images for
    Possible values: ["page", "region"]
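
As an illustration of how the options and parameters above fit together, here is a minimal Python sketch that validates the two documented parameters and assembles an `ocrd-jdeskew` command line. The helper names `build_params` and `build_command` and the file group names `OCR-D-SEG` and `OCR-D-DESKEW` are hypothetical, chosen only for this example. Rejecting a negative `maxskew` is an assumption based on the description "modulus of maximum skewing angle"; the processor itself may validate differently.

```python
import json
import shlex

# Values listed under "level-of-operation" in the help text above.
ALLOWED_LEVELS = ("page", "region")


def build_params(maxskew=None, level="page"):
    """Return a parameter dict limited to the two documented keys."""
    if level not in ALLOWED_LEVELS:
        raise ValueError(f"level-of-operation must be one of {ALLOWED_LEVELS}")
    params = {"level-of-operation": level}
    if maxskew is not None:
        if isinstance(maxskew, bool) or not isinstance(maxskew, (int, float)):
            raise TypeError("maxskew must be a number")
        # Assumption: a "modulus" is a magnitude, so negative values are rejected here.
        if maxskew < 0:
            raise ValueError("maxskew must be non-negative")
        params["maxskew"] = maxskew
    return params


def build_command(input_grp, output_grp, params, mets="mets.xml", page_id=None):
    """Assemble the argument list using only flags shown in the help text."""
    cmd = [
        "ocrd-jdeskew",
        "-m", mets,
        "-I", input_grp,
        "-O", output_grp,
        "-p", json.dumps(params, sort_keys=True),  # verbatim JSON string
    ]
    if page_id:
        cmd += ["-g", page_id]  # passed through unchanged
    return cmd


# Hypothetical file group names, for illustration only.
params = build_params(maxskew=5, level="region")
cmd = build_command("OCR-D-SEG", "OCR-D-DESKEW", params)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell inside an OCR-D workspace, or the list can be passed to `subprocess.run(cmd, check=True)`, provided `ocrd-jdeskew` is installed and on the `PATH`.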
