PDF Thumbnail generation, OCR indexing and extra views integrated with plone.app.async
This package provides some nice integrations for PDF-heavy web sites.
OCR requires Ghostscript and Tesseract to be installed. Just use your package manager to install these packages:
# sudo apt-get install ghostscript tesseract-ocr
Note that this will install tesseract 2, not tesseract 3.
This package requires an svn checkout of tesseract version 3.00 or 3.01 with the hocr configuration in place. Take a look at this thread to find out how to configure hocr: http://ubuntuforums.org/showthread.php?t=1647350
In addition, you’ll need exactimage and pdftk installed:
# sudo apt-get install exactimage pdftk libtiff-tools
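Once the packages are installed, it can be handy to confirm that the command-line tools are actually on the PATH. The following is only a minimal sketch and is not part of this package. The executable names (gs, tesseract, hocr2pdf, pdftk, tiffcp) are assumptions about what each package ships and may differ on your distribution.

```python
import shutil

# Assumed executable name for each package; adjust for your distribution.
REQUIRED_TOOLS = {
    "ghostscript": "gs",
    "tesseract-ocr": "tesseract",
    "exactimage": "hocr2pdf",
    "pdftk": "pdftk",
    "libtiff-tools": "tiffcp",
}


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the sorted package names whose executable is not on PATH."""
    return sorted(pkg for pkg, exe in tools.items() if which(exe) is None)


def report(missing):
    """Build a one-line, human-readable summary of the check."""
    if missing:
        return "Missing packages: " + ", ".join(missing)
    return "All required tools were found."
```

Calling report(missing_tools()) returns a short summary that you can print before starting the instance.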
To use an older tesseract version instead of the latest one, you will have to add this to your instance declaration:
environment-vars += AUTHORIZE_OLD_TESSERACT_VERSION true
You can convert all PDFs at once by calling the URL @@queue-up-all.
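If you would rather trigger the view from a script than from a browser, something like the sketch below could work. It is an illustration under assumptions, not a documented interface of this package: the site URL and credentials are hypothetical, and it assumes the site accepts HTTP Basic authentication and that the user has sufficient permissions to call the view.

```python
import base64
import urllib.request


def queue_up_all_url(site_url):
    """Append the @@queue-up-all view name to a site URL."""
    return site_url.rstrip("/") + "/@@queue-up-all"


def build_request(site_url, username, password):
    """Prepare a request for the view with a Basic auth header (assumed)."""
    credentials = "{}:{}".format(username, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(queue_up_all_url(site_url))
    request.add_header("Authorization", "Basic " + token)
    return request


def trigger_conversion(site_url, username, password):
    """Call the view and return the HTTP status code of the response."""
    request = build_request(site_url, username, password)
    with urllib.request.urlopen(request) as response:
        return response.status
```

For example, trigger_conversion("http://localhost:8080/Plone", "admin", "secret") would call the view on a hypothetical local site; replace the URL and credentials with your own.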