collective.documentviewer

Integrates DocumentCloud's document viewer into Plone.

Introduction

This package integrates DocumentCloud’s viewer and PDF processing into Plone.

Example viewer: https://www.documentcloud.org/documents/19864-goldman-sachs-internal-emails

Features

very nice document viewer
OCR
Searchable on OCR text
works with many different document types
plone.app.async integration with task monitor
lots of configuration options
PDF Album view for displaying groups of PDFs

Works with

Besides displaying PDFs, it will also display:

Word
Excel
Powerpoint
HTML
RTF

Install requirements

Docsplit: http://documentcloud.github.com/docsplit/
GraphicsMagick
ghostscript (version 9.0 preferred)
Poppler
tesseract (optional)
pdftk (optional)
OpenOffice or LibreOffice (optional, for doc, excel, ppt, etc. types)
md5 or md5sum command line tool
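
Before installing, it can help to check which of these tools are already on the PATH. The following is a minimal sketch, not part of the package. The executable names (`gm` for GraphicsMagick, `gs` for Ghostscript, `pdftotext` for Poppler, `soffice` for OpenOffice/LibreOffice) are assumptions based on how these tools are commonly packaged, and `find_missing` is a hypothetical helper name.

```python
import shutil

# Executable names are assumptions based on common packaging;
# adjust them for your platform.
REQUIRED = {
    "Docsplit": ["docsplit"],
    "GraphicsMagick": ["gm"],
    "Ghostscript": ["gs"],
    "Poppler": ["pdftotext"],
    "md5 tool": ["md5", "md5sum"],
}
OPTIONAL = {
    "tesseract": ["tesseract"],
    "pdftk": ["pdftk"],
    "OpenOffice/LibreOffice": ["soffice"],
}


def find_missing(tools, which=shutil.which):
    """Return sorted tool names for which no candidate executable is on PATH."""
    return sorted(
        name
        for name, candidates in tools.items()
        if not any(which(exe) for exe in candidates)
    )


if __name__ == "__main__":
    print("Missing required tools:", find_missing(REQUIRED) or "none")
    print("Missing optional tools:", find_missing(OPTIONAL) or "none")
```

The `which` parameter exists only so the lookup can be replaced in tests. In normal use the default `shutil.which` is enough.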

Async Integration

It is highly recommended to install and configure plone.app.async in combination with this package. Doing so runs all PDF conversion processes asynchronously, so users are not kept waiting when they save files.

Settings

The product can be configured via a control panel item Document Viewer Settings.

Some interesting configuration options:

Storage Type: If you want to serve your files from the Amazon cloud, this option lets you store the data in flat files that can be synced to another server.
Storage Location: Where on the server to store the files.
OCR: Use tesseract to scan the document for text. This process can be slow, so if your PDFs do not need OCR, you may disable it.
Auto Select Layout: For PDF files added to the site, automatically select the document viewer display.
Auto Convert: When PDF files are added or modified, automatically convert them.
Auto layout file types: Types that should automatically be converted to document viewer.

Dexterity support

To use it with your own Dexterity content type, edit the FTI in ZMI/portal_types/yourtype to add “documentviewer” to the available view methods, and set the primary field in the schema, for example:

<field name="myfile" marshal:primary="true"
       type="plone.namedfile.field.NamedBlobFile">
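
To double-check that a schema really marks a primary field, a small standalone script can parse the model XML and list the fields flagged as primary. This is only a sketch: `primary_fields` and `EXAMPLE_MODEL` are hypothetical names, the example model is made up for illustration, and the namespace URIs are assumed to be the usual plone.supermodel ones.

```python
import xml.etree.ElementTree as ET

# Assumed to be the standard plone.supermodel namespaces.
SCHEMA_NS = "http://namespaces.plone.org/supermodel/schema"
MARSHAL_NS = "http://namespaces.plone.org/supermodel/marshal"

# Hypothetical model, used only for illustration.
EXAMPLE_MODEL = """\
<model xmlns="http://namespaces.plone.org/supermodel/schema"
       xmlns:marshal="http://namespaces.plone.org/supermodel/marshal">
  <schema>
    <field name="myfile" marshal:primary="true"
           type="plone.namedfile.field.NamedBlobFile">
      <title>My file</title>
    </field>
    <field name="notes" type="zope.schema.Text">
      <title>Notes</title>
    </field>
  </schema>
</model>
"""


def primary_fields(model_xml):
    """Return the names of all fields carrying marshal:primary="true"."""
    root = ET.fromstring(model_xml)
    primary_attr = "{%s}primary" % MARSHAL_NS
    return [
        field.get("name")
        for field in root.iter("{%s}field" % SCHEMA_NS)
        if field.get(primary_attr, "").lower() == "true"
    ]


if __name__ == "__main__":
    print(primary_fields(EXAMPLE_MODEL))
```

An empty result would suggest the primary field marker is missing from the schema.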

File storage integration

If you choose to use basic file storage instead of zodb blob storage, there are a few things you’ll want to keep in mind.

Use nginx to serve the files from the file system. This might require you to install a local nginx on the Plone server just for serving file storage. You can get creative with how your file storage is used, though.
Plone’s delete operation can be interrupted, and deleting a file on the operating system cannot be done within a transaction, so no files are ever deleted automatically. However, there is an action you can call from a cron task to clean up your file storage directory. Just call the URL http://zeoinstace/plone/@@dvcleanup-filestorage.
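
The cleanup call can be scripted so that cron only has to run one file. The sketch below is an illustration under assumptions: it supposes the view needs an authenticated user and that the site accepts HTTP Basic authentication, and the site URL, credentials and function names are hypothetical placeholders.

```python
import base64
import urllib.request

CLEANUP_VIEW = "@@dvcleanup-filestorage"


def build_cleanup_request(site_url, username, password):
    """Build a request for the cleanup view with a Basic auth header.

    Whether authentication is needed, and in which form, depends on
    your site; Basic auth is an assumption here.
    """
    url = site_url.rstrip("/") + "/" + CLEANUP_VIEW
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": "Basic " + token})


def run_cleanup(site_url, username, password, timeout=300):
    """Call the cleanup view and return the HTTP status code."""
    request = build_cleanup_request(site_url, username, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Placeholder values; replace with your own site URL and account.
    print(run_cleanup("http://localhost:8080/plone", "admin", "secret"))
```

Keeping the request construction separate from the network call makes it possible to inspect the URL and headers without contacting the server.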

Upgrading from page turner

If you currently have page turner installed, this project will supersede it. Your page turner views will still work, but no future files added to the site will be converted to page turner.

To convert a single existing view, click the Document Viewer Convert button that appears on every page turner enabled file. This manually converts it from page turner to document viewer.

To convert all existing views, go to portal_setup in the ZMI, open upgrades, select collective.documentviewer, and click to show old upgrades. There should be an upgrade-all step to run.

Installation on CentOS/Red Hat

Special instructions for CentOS have been contributed by Eric Tyrer. You can find them in the project's GitHub repository.

Installation

If you run into an error like the following on a Linux (Ubuntu/Debian) machine:

/var/lib/gems/1.9.1/gems/docsplit-0.7.2/lib/docsplit/image_extractor.rb:51:in `exists?': can't convert nil into String (TypeError)
from /var/lib/gems/1.9.1/gems/docsplit-0.7.2/lib/docsplit/image_extractor.rb:51:in `ensure in convert'

This happens because the Ruby docsplit library has trouble accessing the temp folder and removing temp files. Just run the following command:

sudo chmod 1777 /tmp && sudo chmod 1777 /var/tmp

Then retry the conversion of your document.
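
To see whether the temp directories already have the expected permissions before changing anything, a short check like the following may help. It is a sketch only; the function name is hypothetical, and it simply compares the permission bits against the 1777 mode used in the command above.

```python
import os
import stat

EXPECTED_MODE = 0o1777  # world-writable with the sticky bit set


def has_sticky_world_writable_mode(path):
    """Return True if the permission bits of path are exactly 1777."""
    return stat.S_IMODE(os.stat(path).st_mode) == EXPECTED_MODE


if __name__ == "__main__":
    for path in ("/tmp", "/var/tmp"):
        if not os.path.isdir(path):
            print(path, "does not exist")
        elif has_sticky_world_writable_mode(path):
            print(path, "ok")
        else:
            print(path, "needs chmod 1777")
```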

TODO

check why there are some errors during async operations:
- ConflictError: database conflict error (oid 0x4d10, class BTrees.IOBTree.IOBucket, serial this txn started with 0x0395f478bc2cb377 2012-04-21 03:36:44.103425, serial currently committed 0x0395f479b09de4cc 2012-04-21 03:37:41.394556)
- ERROR ZODB.Connection Shouldn’t load state for 0x319d when the connection is closed

Changelog

4.0.4 (2016-01-25)

fix celery conversion showing that it is still converting [vangheem]

4.0.3 (2015-09-30)

fix import of namedfile, restores older plone compatibility [vangheem]

4.0.2 (2015-09-30)

fix support for archetypes [vangheem]

4.0.1 (2015-09-28)

add lead image support [vangheem]
be able to use collective.celery for queuing tasks [vangheem]
fix async monitor registration [pilz]

4.0.0 (2015-09-09)

fix Plone 5 compatibility [vangheem]
upgrade jquery.imgareaselect to latest [vangheem]
upgrade document viewer to latest [vangheem]
do not support upgrading from wildcard.pdfpal and wc.pageturner anymore. Use 3.x series [vangheem]

3.0.3 (2015-07-29)

set response header on javascript variable file. Prevents js errors on chrome. [vangheem]

3.0.2 (2014-05-31)

fix bug where it wouldn’t work with collective.geo.* [vangheem]

3.0.1 (2014-05-08)

add german translation [jhb]

3.0a1 (2013-09-03)

Add Dexterity compatibility. To enable it on your content type, you have to define a primary field and add documentviewer in the available view methods, see documentation. [vincentfretin]
Fix: users that can modify can now view info messages and ‘annotations’/’sections’ feature. [thomasdesvenain]
Show contributor fullname if possible. Contributor and organization are in a span. [thomasdesvenain]
Avoid replacing non-ascii characters by (?) during OCR process for non english languages. [thomasdesvenain]
Plain text indexation is fixed for non converted contents. [thomasdesvenain]
When a new release of the document is currently generated, user is notified by a status message. [thomasdesvenain]

2.2.2b3 (2013-05-31)

i18n fixes + french translations [thomasdesvenain]
support to pass a document language to tesseract/docsplit based on a configurable adapter implementing IOCRLanguage [ajung]

2.2.2b2 (2013-05-31)

fix bug when using blob storage and text indexing is disabled [gbastien]

2.2.2b1 (2013-05-31)

only use defaultFactory when supported. For older versions of zope.schema [vangheem]

2.2.2a1 (2013-05-31)

added french translations [gbastien]
added enable_indexation parameter in global and local settings. Fixes: https://github.com/collective/collective.documentviewer/issues/21 [gbastien]
make local settings coherent regarding global settings. Fixes: https://github.com/collective/collective.documentviewer/issues/22 [gbastien]

2.2.1 (2013-03-12)

fix use with latest libreoffice and docsplit. Fixes: https://github.com/collective/collective.documentviewer/issues/11
do not require docsplit to be installed on the plone instance in order to display the viewer. In case the document was converted on another client. [vangheem]

2.2 (2013-02-06)

fix z-index on viewer [damilgra]

2.2b2 (2013-01-10)

fix getSite imports for plone 4.3

2.2b1 (2013-01-06)

switch to using OFS.interfaces.IFolder for folder view [vangheem]
while pdf is converting, show existing if available. [vangheem]
move convert button to actions [vangheem]

2.2a2 (2012-10-01)

another subsite fix [vangheem]

2.2a1 (2012-xx-xx)

test for Plone 4.2 compatibility. [hvelarde]
work with subsites

2.1b2 (2012-06-22)

better handling of moving folders around

2.1b1 (2012-06-22)

be able to obfuscate file paths for file storage

2.0.4 (2012-06-21)

fix cleaning file location
fix potential traversal error for file resources

2.0.3 (2012-06-13)

check for quota set before finding existing jobs.

2.0.2 (2012-06-12)

include contentmenu zcml dependency
upgrade conversion will now try and fix error’d conversions

2.0.1 (2012-05-15)

fixing batching on group view

2.0.1b1 (2012-05-14)

add support for new formats: star office, ps, photoshop, visio, palm

2.0b1 (2012-05-11)

add ability to add annotations and sections

1.5.1 (2012-04-30)

fix security on file resources

1.5.0 (2012-04-29)

no changes

1.5.0b1 (2012-04-27)

be able to move jobs to front of queue
use portal_catalog instead of uid_catalog so security checks apply to resource urls.

1.4.2 (2012-04-24)

no changes, first final release

1.4.1b3 (2012-04-23)

create local catalog and index before syncing db to prevent conflict errors.
add redirect timeout to conversion info page

1.4.1b2 (2012-04-23)

make sure to close open file descriptors
Change “Original Document (PDF)” to “Original Document”
emit event after conversion
only show queue link if manager
convert button should work for files that do not have layout selected yet
use communicate instead of wait with popen in case output is large. Prevents deadlocks.

1.4.1b1 (2012-04-23)

do not assume pdfpal is used along with pageturner on data conversion.
better command runner
track errors better and display them in interface if something happened during conversion
new file storage structure to prevent too many files from being in one directory

1.4b1 (2012-04-21)

fix full screen button when text or pages selected.
be able to customize batch size

1.4a2 (2012-04-20)

make sure to not use files with spaces

1.4a1 (2012-04-20)

be able to detect if pdf already has text in it and do not OCR it if it does.

1.3b2 (2012-04-20)

use jQuery instead of $()

1.3b1 (2012-04-20)

default OCR to being off since it’s pretty slow
better logging when looking for binary files
be able to override width of viewer

1.3a3 (2012-04-20)

fix uninstall [vangheem]

1.3a2 (2012-04-19)

fix async bug if it wasn’t installed [vangheem]

1.3a1 (2012-04-19)

make sure to initialize catalog after db sync for large PDFs. [vangheem]
better integrate with pdfpal and pageturner so it’s easy to upgrade from those products. [vangheem]

1.2a2 (2012-04-19)

fix setting custom quota for async queue [vangheem]
fix group view clear button [vangheem]
add support for alternative md5sum binary [vangheem]

1.2a1 (2012-04-19)

fix full screen page bug [vangheem]
better async integration with quota setting [vangheem]
View async queue for conversions [vangheem]
index ocr data in portal catalog [vangheem]
better pdf group view with search [vangheem]
handle large files better [vangheem]
check if file has already been converted by storing hash of the file to check against. [vangheem]
be able to remove document viewer conversion tasks [vangheem]
add ability to cleanup file storage files for deleted plone File objects. [vangheem]

1.1a1 (2012-04-18)

add pdf folder album view [vangheem]
fix async integration [vangheem]

1.0a2 (2012-04-17)

add control panel icon [vangheem]
fix uninstall procedure [vangheem]
changing image type does not cause existing ones to fail. [vangheem]

1.0a1 (2012-04-17)

Initial release
