A Plone product that generates image thumbnail previews of PDF files stored on ATCT-based objects.

Introduction
============

PdfPeek is a Plone 4 add-on product that uses GNU Ghostscript to generate
image thumbnail previews of PDF files uploaded to ATFile-based content
objects.

This product, when installed in a Plone 4.x site, automatically generates
preview and thumbnail images of each page of uploaded PDF files and stores
them as annotations on the content object containing the PDF file.
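
As a rough illustration of the kind of Ghostscript call involved, the sketch
below builds a ``gs`` command line that renders a single PDF page to a PNG
file. It is not PdfPeek's actual transform code: the helper names, the
``png16m`` output device and the 72 dpi resolution are assumptions chosen for
the example.

```python
import subprocess


def build_gs_command(pdf_path, page, output_path, resolution=72):
    """Return a Ghostscript argument list that renders one page to PNG.

    Hypothetical helper for illustration; not part of collective.pdfpeek.
    """
    return [
        "gs",
        "-q",                # suppress startup messages
        "-dSAFER",           # restrict file system access
        "-dBATCH",           # exit once the input has been processed
        "-dNOPAUSE",         # do not prompt between pages
        "-sDEVICE=png16m",   # 24-bit colour PNG output (assumed device)
        "-r%d" % resolution,
        "-dFirstPage=%d" % page,
        "-dLastPage=%d" % page,
        "-sOutputFile=%s" % output_path,
        pdf_path,
    ]


def render_page(pdf_path, page, output_path):
    """Run Ghostscript; raises CalledProcessError if gs exits non-zero."""
    subprocess.check_call(build_gs_command(pdf_path, page, output_path))
```

Calling ``render_page("report.pdf", 1, "page1.png")`` would write the first
page as a PNG, provided ``gs`` is installed and on the ``$PATH``.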

Image generation from the PDF file is processed asynchronously so that the user
does not have to wait for the images to be created in order to continue using
the site, as the processing of large PDF files can take many minutes to complete.

When a file object is initialized or edited, PdfPeek checks to see if a PDF file
was uploaded. If so, a Ghostscript image conversion job is added to the pdfpeek
job queue. If the file uploaded is not of content type 'application/pdf', an
image removal job is added to the pdfpeek job queue. This job queue is processed
periodically by a cron job or a Zope clock server process. The image conversion
jobs add the IPDF interface to the content object and store the resulting
preview and thumbnail images for each page of the PDF as annotations on the
content object itself. The image removal jobs remove the image annotations and
the IPDF interface from the content object.

If a job fails, it is removed from the processing queue and appended to a list
of failed jobs. If a job succeeds, it is removed from the processing queue and
appended to a list of successfully completed jobs.
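
The queue behaviour described above can be pictured with a small in-memory
sketch. This is not PdfPeek's implementation (the real queue lives inside the
Plone site); the ``JobQueue`` class and ``job_for_upload`` function are
hypothetical names used only to show how jobs move from the pending queue to
either the finished or the failed list.

```python
from collections import deque


def job_for_upload(content_type, convert_job, remove_job):
    """Pick the job to enqueue: convert PDFs, remove images otherwise."""
    if content_type == "application/pdf":
        return convert_job
    return remove_job


class JobQueue(object):
    """Hypothetical in-memory stand-in for the pdfpeek job queue."""

    def __init__(self):
        self.pending = deque()
        self.finished = []
        self.failed = []

    def add(self, job):
        """Append a callable job to the end of the pending queue."""
        self.pending.append(job)

    def process(self):
        """Run every pending job once, recording its outcome."""
        while self.pending:
            job = self.pending.popleft()
            try:
                job()
            except Exception:
                # A failing job leaves the queue and is kept for inspection.
                self.failed.append(job)
            else:
                self.finished.append(job)
```

In this sketch a failed job is not retried automatically, which matches the
description above: it simply ends up in the list of failed jobs.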

PdfPeek ships with an example user interface that is turned on by default. This
UI displays the thumbnail images of each page of the PDF file when a user views
the content object in their browser. This example UI is not fully working yet
and is meant to be just that, an example. I don't claim to be a JavaScript
master.

A custom traverser is available to make it easy to access the images and
previews directly, as well as to build custom views incorporating image
previews of file content.

PdfPeek ships with a configlet that allows the site administrator to adjust the
size of the generated preview and thumbnail images, as well as toggle the
example user interface and default event handlers on and off.

**Requires the GNU Ghostscript gs binary to be available on the $PATH!**

*Tested on POSIX-compliant systems such as Linux and Mac OS X 10.6. Untested on*
*Windows systems.*
*(Wouldn't be surprised if it works, as long as you can install gs.)*
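
A quick way to confirm that requirement before installing is to look the
binary up on the ``$PATH``. The snippet below is a standalone sketch, not part
of PdfPeek; it assumes a Python 3 interpreter (``shutil.which`` is not
available on the Python 2 versions that Plone 4 itself runs on), and
``find_ghostscript`` is a hypothetical helper name.

```python
import shutil


def find_ghostscript(binary="gs"):
    """Return the full path to the Ghostscript binary, or None if missing."""
    return shutil.which(binary)


def ghostscript_status(binary="gs"):
    """Return a short human-readable message about gs availability."""
    path = find_ghostscript(binary)
    if path is None:
        return "%s not found on $PATH" % binary
    return "%s found at %s" % (binary, path)
```

Printing ``ghostscript_status()`` from the shell that starts your Zope instance
tells you whether that environment can see ``gs``.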

*As of version 0.17, Plone 3.x is no longer officially supported.*

* Code repository: https://svn.plone.org/svn/collective/collective.pdfpeek
* Questions and comments to db@davidbrenneman.com
* Report bugs to db@davidbrenneman.com


Usage
=====

The recommended method of using collective.pdfpeek is by installing via
buildout.
PdfPeek uses z3c.autoinclude to load its ZCML, so you don't need a ZCML slug.

Add collective.pdfpeek to the list of eggs in the instance section of your
buildout.cfg like so::

    [instance]
    ...
    eggs =
        ...
        collective.pdfpeek
        ...

Then re-run your buildout like so::

    bin/buildout

For automatic processing of the PdfPeek job queue, a simple cron script using
curl or wget would suffice. It is nice, however, to keep all of the
configuration for a project in your buildout. For this reason, a Zope clock
server process is the recommended way to automatically process the job queue.
You can set one up by adding the following snippet to the [instance] part of
your buildout configuration::

    [instance]
    ...
    zope-conf-additional=
        # process the job queue every 5 seconds
        <clock-server>
            method /Plone/@@pdfpeek.utils/process_conversion_queue
            period 5
            user admin
            password admin
            host localhost
        </clock-server>
    ...

You will have to edit the above snippet to customize the name of the Plone site,
the admin username and password, and the hostname the instance is running on.
You can also adjust the interval, in seconds, at which the queue is processed by
the clock server.
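
If you prefer the cron approach mentioned above, a small script can request the
same queue-processing URL that the clock server calls. The following is only a
sketch: it assumes a Python 3 interpreter on the machine running cron, that the
instance accepts HTTP basic authentication, and that it listens on
``localhost:8080``. The function names are hypothetical.

```python
import base64
import urllib.request

# Path taken from the clock-server snippet, relative to the Plone site URL.
QUEUE_VIEW = "/@@pdfpeek.utils/process_conversion_queue"


def build_request(site_url, user, password):
    """Build an authenticated request for the queue-processing view."""
    url = site_url.rstrip("/") + QUEUE_VIEW
    credentials = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


def trigger_queue(site_url, user, password, timeout=30):
    """Ask the site to process its queue and return the HTTP status code."""
    request = build_request(site_url, user, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A cron job could then run a script that calls
``trigger_queue("http://localhost:8080/Plone", "admin", "admin")`` at whatever
interval suits your site, with the site URL and credentials adjusted as
described above.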

collective.pdfpeek Installation
-------------------------------

To install collective.pdfpeek into the global Python environment (or a virtualenv),
using a traditional Zope 2 instance, you can do this:

* When you're reading this you have probably already run
  ``easy_install collective.pdfpeek``. Find out how to install setuptools
  (and EasyInstall) here:
  http://peak.telecommunity.com/DevCenter/EasyInstall

* If you are using Zope 2.9 (not 2.10), get `pythonproducts`_ and install it
  via::

    python setup.py install --home /path/to/instance

  into your Zope instance.

* Create a file called ``collective.pdfpeek-configure.zcml`` in the
  ``/path/to/instance/etc/package-includes`` directory. The file
  should only contain this::

    <include package="collective.pdfpeek" />

.. _pythonproducts: http://plone.org/products/pythonproducts


Alternatively, if you are using zc.buildout and the plone.recipe.zope2instance
recipe to manage your project, you can do this:

* Add ``collective.pdfpeek`` to the list of eggs to install, e.g.::

    [buildout]
    ...
    eggs =
        ...
        collective.pdfpeek

* Re-run buildout, e.g. with::

    $ ./bin/buildout

You can skip the ZCML slug if you are going to explicitly include the package
from another package's configure.zcml file.

Changelog
=========

0.17 (2010-2-26)
-----------------

- Added wide variety of pdf files to run through the unit tests for the
  ghostscript image transform.
  [dbrenneman]

- Added unit tests for low level ghostscript transform.
  [dbrenneman]

- Refactored transform code to make class and method names make more sense.
  [dbrenneman]

- Updated README, including instructions for configuring the clock server.
  [dbrenneman]

- Added asynchronous processing queue for ghostscript transform jobs.
  [dbrenneman]

- Updated functional doctests to work on Plone 4 with blobfile storage.
  [dbrenneman]

- Updated functional doctests to test transform queue.
  [dbrenneman]

- Updated documentation.
  [dbrenneman]

- Added unit testing harness.
  [dbrenneman]

0.16 (2009-12-12)
-----------------

- Bugfix release.
  [dbrenneman]

0.15 (2009-12-12)
-----------------

- Added configurable preview and thumbnail sizes.
  [claytron]

- reST police! Fixing up the docs so that they might get rendered
  correctly.
  [claytron]

0.13 (2009-11-12)
-----------------

- Refactored transform code to deal with encrypted pdf files better.
  [dbrenneman]

- Made transform code more robust.
  [dbrenneman]

- Added ability to toggle default event handler on and off.
  [dbrenneman]

0.12 (2009-10-25)
-----------------

- Bugfix release.
  [dbrenneman]

0.11 (2009-10-25)
-----------------

- Bugfix release.
  [dbrenneman]

0.10 (2009-10-25)
-----------------

- Added code to check for EOF at the end of the pdf file data string and to
  insert one if it is not there. Fixes many corrupt pdf files.
  [dbrenneman]

0.9 (2009-10-13)
----------------

- Fixed another bug in the transform code to allow functioning with any
  filefield, as long as it is called file.
  [dbrenneman]

0.8 (2009-10-13)
----------------

- Fixed a bug in the transform code to allow functioning with any filefield,
  as long as it is called file.
  [dbrenneman]

0.7 (2009-10-13)
----------------

- Streamlined transform code.
  [dbrenneman]

- Added ability to toggle the pdfpeek viewlet display on and off via configlet.
  [dbrenneman]

0.6 (2009-10-05)
----------------

- Bugfix release.
  [dbrenneman]

0.5 (2009-10-05)
----------------

- Added control panel configlet.
  [dbrenneman]

- Removed unneeded xml files from uninstall profile.
  [dbrenneman]

- Optimized transform.
  [dbrenneman]

- Added storage of image thumbnail along with image, generated with PIL.
  [dbrenneman]

- Changed annotation to store images in a dict instead of a list.
  [dbrenneman]

- Changed event handler to listen on all AT based objects instead of ATFile.
  [dbrenneman]

- Added custom pdfpeek icon for configlet.
  [dbrenneman]

- Added custom traverser to allow easy access to the OFS.Image.Image()
  objects stored on IPDF objects.
  [dbrenneman]

- Modified pdfpeek viewlet code to display images using the custom traverser.
  [dbrenneman]

- Added custom scrollable gallery with tooltips using jQuery Tools to the
  pdfpeek viewlet for display.
  [dbrenneman]

0.4 (2009-10-01)
----------------

- Refactored storage to use OFS.Image.Image() objects instead of storing the
  raw binary data in string format.
  [dbrenneman]

- Refactored event handler object variable name.
  [dbrenneman]

- Removed unneeded files from default GS Ext. profile.
  [dbrenneman]

- Removed unneeded javascript files and associated images and css.
  [dbrenneman]

0.3 - 2009-08-03
----------------

- Fixed parsing of pdf files with multiple pages.
  [piv]

0.1 - Unreleased
----------------

- Initial release
