OCR-D framework

Project description

Collection of OCR-related Python tools and wrappers from the OCR-D team.


To bootstrap the tool, the following Ubuntu packages must be installed:

  • Python (python)
  • pip (python-pip)

To install system-wide:

make deps-ubuntu deps install

To develop, install into a virtualenv:

pip install virtualenv
virtualenv --no-site-packages venv
source venv/bin/activate
make deps install


pyocrd installs an executable, ocrd, that can be used to invoke the processors directly (ocrd process) or to start (development) web services (ocrd server).

TODO: Update docs here.


# List available processors
ocrd process

# Region-segment with tesserocr all files in METS INPUT fileGrp
ocrd process -m /path/to/mets.xml segment-region/tesserocr

# Chain multiple processors
ocrd process -m /path/to/mets.xml characterize/exif segment-line/tesserocr recognize/tesserocr
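When the chain is assembled by a script rather than typed by hand, building the argument list programmatically can help. The following is a minimal sketch, assuming only the command-line shape shown above (ocrd process -m METS PROCESSOR...). The helper name build_process_command is hypothetical and not part of pyocrd.

```python
import shlex


def build_process_command(mets_path, processors):
    """Return the argv list for `ocrd process -m METS PROCESSOR...`.

    Hypothetical helper: it only mirrors the command-line shape shown
    in the examples above and does not call into pyocrd itself.
    """
    # A bare `ocrd process` only lists the available processors, so this
    # sketch insists on at least one processor name.
    if not processors:
        raise ValueError("at least one processor is required")
    return ["ocrd", "process", "-m", mets_path, *processors]


cmd = build_process_command(
    "/path/to/mets.xml",
    ["characterize/exif", "segment-line/tesserocr", "recognize/tesserocr"],
)
# Print the shell-quoted command line for inspection.
print(shlex.join(cmd))
# To run it (requires ocrd on PATH): subprocess.run(cmd, check=True)
```

Passing the processors as a list keeps their order explicit, which matters because they are applied in the sequence given.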

# Start a processor web service at port 6543
ocrd server process -p 6543
http PUT localhost:6543/characterize url==http://server/path/to/mets.xml
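The http command above appears to be HTTPie, where url==... appends a query-string parameter to the request URL. As a rough sketch under that assumption, the same PUT request could be prepared with Python's standard library. The function name build_characterize_request is hypothetical, and the shape of the server's response is not documented here, so the sketch stops short of interpreting it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_characterize_request(mets_url, host="localhost", port=6543):
    """Prepare (but do not send) a PUT request to the characterize endpoint.

    Hypothetical helper: the endpoint path and the `url` parameter are
    taken from the HTTPie example above; nothing else is assumed.
    """
    # urlencode percent-encodes the METS URL so it is safe in a query string.
    query = urlencode({"url": mets_url})
    return Request(f"http://{host}:{port}/characterize?{query}", method="PUT")


req = build_characterize_request("http://server/path/to/mets.xml")
print(req.get_method(), req.full_url)
# To send it once `ocrd server process -p 6543` is running:
#   from urllib.request import urlopen
#   urlopen(req)
```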


Download assets: make assets

Test with local files: make test

Test with a local asset server:
  • Start the asset server: make asset-server
  • make test OCRD_BASEURL='http://localhost:5001/'

Test with remote assets:
  • make test OCRD_BASEURL=''

Filename                        Size       File type   Python version
ocrd-0.13.3-py2-none-any.whl    134.9 kB   Wheel       py2
ocrd-0.13.3-py3-none-any.whl    134.9 kB   Wheel       py3
ocrd-0.13.3.tar.gz              70.9 kB    Source      None
