pp.server - Produce & Publish Server
====================================

``pp.server`` is a Pyramid-based server that implements the
server-side functionality of the Produce & Publish platform. It is also known as
the ``Produce & Publish Server``.

The Produce & Publish Server provides web service APIs for converting
HTML/XML + assets to PDF using one of the following external PDF converters:

- PrinceXML (www.princexml.com, commercial)
- PDFreactor (www.realobjects.com, commercial)
- PhantomJS (free, unsupported)
- Speedata Publisher (www.speedata.de, open-source, experimental support)
- WKHTMLTOPDF (www.wkhtmltopdf.org, open-source, experimental support)
- Vivliostyle Formatter (www.vivliostyle.com, commercial, experimental support)
- Antennahouse 6.2 (www.antennahouse.com, commercial)

There is also experimental support for generating EPUB documents
using ``Calibre`` (www.calibre.org, open-source).

In addition, the Produce & Publish server provides a simple conversion
API for converting format A to B (as supported through LibreOffice
or OpenOffice). The conversion is built on top of ``unoconv``.

The web service provides both synchronous and asynchronous operations.

Requirements
------------

- Python 3.3 or higher, no support for Python 2.x

- the external binaries

  - PrinceXML: ``prince``
  - PDFreactor up to version 7: ``pdfreactor``
  - PDFreactor version 8 or higher: ``pdfreactor8``
  - Unoconv: ``unoconv``
  - Speedata Publisher: ``sp``
  - Calibre: ``ebook-convert``
  - WKHTMLTOPDF: ``wkhtmltopdf``
  - Vivliostyle: ``vivliostyle-formatter``
  - Antennahouse: ``run.sh``

  must be in the ``$PATH``. Please refer to the installation documentation
  of the individual products.

Installation
------------

- create a ``virtualenv`` environment (Python 3.3 or higher), either within your
  current (empty) directory or by letting virtualenv create one for you
  (``easy_install virtualenv`` if ``virtualenv`` is not available on your
  system)::

    virtualenv --no-site-packages .

  or::

    virtualenv --no-site-packages pp.server

- install the Produce & Publish server::

    bin/pip install pp.server

- create a ``server.ini`` configuration file (and change it according to your needs)::

    [DEFAULT]
    debug = true

    [app:main]
    use = egg:pp.server
    reload_templates = true
    debug_authorization = false
    debug_notfound = false

    [server:main]
    use = egg:waitress#main
    host = 127.0.0.1
    port = 6543

- start the server (in foreground)::

    bin/pserve server.ini

- or start it in background::

    bin/pserve server.ini --daemon

Converter requirements
----------------------

For the PDF conversion the related converter binaries or scripts
must be included in the ``$PATH`` of your server.

- ``prince`` for PrinceXML

- ``pdfreactor`` for PDFreactor 7

- ``pdfreactor8`` for PDFreactor 8 or higher

- ``phantomjs`` for PhantomJS

- ``wkhtmltopdf`` for WKHTMLToPDF

- ``ebook-convert`` for Calibre

- ``sp`` for the Speedata Publisher

- ``vivliostyle`` for the Vivliostyle Formatter

- ``antennahouse`` for Antennahouse Formatter

API documentation
-----------------

All API methods are available through a REST API at the
following URL endpoint::

    http://host:port/api/1/<command>

With the default server configuration this translates to::

    http://localhost:6543/api/1/pdf

or::

    http://localhost:6543/api/1/unoconv


PDF conversion API
++++++++++++++++++

Remember that all converters use HTML or XML as input for the conversion. All
input data (HTML/XML, images, stylesheets, fonts etc.) must be stored in a ZIP
archive. The main content file **must** be named ``index.html``.
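
The packaging rule above can be sketched with the Python standard library. The helper name ``build_input_archive`` and the asset name ``styles.css`` are illustrative only, and the sketch assumes that ``index.html`` sits at the top level of the archive:

```python
import io
import zipfile


def build_input_archive(html, assets=None):
    """Return the bytes of a ZIP archive containing ``index.html``.

    ``assets`` maps archive-relative file names (stylesheets, images,
    fonts) to their raw bytes. Placing ``index.html`` at the top level
    is an assumption of this sketch.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The main document must be stored under exactly this name.
        archive.writestr("index.html", html)
        for name, payload in (assets or {}).items():
            archive.writestr(name, payload)
    return buffer.getvalue()


archive_bytes = build_input_archive(
    "<html><body><h1>Hello</h1></body></html>",
    assets={"styles.css": b"h1 { color: navy; }"},
)
```

Building the archive in memory avoids temporary files; the resulting bytes can be sent as the ``file`` parameter described below.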

You have to ``POST`` the data to::

    http://host:port/api/1/pdf

with the following parameters:

- ``file`` - the ZIP archive (``multipart/form-data`` encoding)

- ``converter`` - a string that determines the PDF
  converter to be used (either ``princexml``, ``pdfreactor``, ``phantomjs``, ``vivliostyle``,
  or ``calibre`` for generating EPUB content)

- ``async`` - asynchronous ("1") or synchronous conversion ("0", default)

- ``cmd_options`` - an optional string of command line parameters passed
  as given to the calls of the external converters


Returns:

The API returns its result as a JSON structure with the following key-value
pairs:

- ``status`` - either ``OK`` or ``ERROR``

- ``data`` - the generated PDF file as a base64-encoded byte string

- ``output`` - the conversion transcript (output of the converter run)
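
A minimal client sketch for the synchronous PDF call, using only the standard library. The parameter names (``file``, ``converter``, ``async``) and the reply keys come from the description above; the helper names, the upload file name ``input.zip``, the ``application/zip`` part type and the assumption that an ``ERROR`` reply carries the transcript in ``output`` are illustrative and not confirmed by the server code:

```python
import base64
import json
import urllib.request
import uuid


def encode_multipart(fields, file_name, file_bytes):
    """Encode form fields plus one ``file`` upload as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        f'Content-Type: application/zip\r\n\r\n'.encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def decode_pdf_reply(reply):
    """Return the PDF bytes from a reply, or raise with the transcript."""
    if reply.get("status") != "OK":
        # Assumption: the converter transcript explains the failure.
        raise RuntimeError(reply.get("output", "conversion failed"))
    return base64.b64decode(reply["data"])


def convert_to_pdf(zip_bytes, converter="princexml",
                   url="http://localhost:6543/api/1/pdf"):
    """POST a ZIP archive synchronously and return the generated PDF bytes."""
    body, content_type = encode_multipart(
        {"converter": converter, "async": "0"}, "input.zip", zip_bytes)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST")
    with urllib.request.urlopen(request) as response:
        return decode_pdf_reply(json.loads(response.read().decode("utf-8")))
```

With a server running under the default configuration, ``convert_to_pdf(zip_bytes)`` would return the PDF as bytes that can be written to disk directly.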


Unoconv conversion API
++++++++++++++++++++++

The unoconv web service wraps the OpenOffice/LibreOffice server mode
in order to perform document conversion (mainly used in the Produce & Publish
world for converting DOC(X) documents to HTML/XML).

You have to ``POST`` the data to::

    http://host:port/api/1/unoconv

with the following parameters:

- ``file`` - the source files (``multipart/form-data`` encoding)

- ``async`` - asynchronous ("1") or synchronous conversion ("0", default)

- ``cmd_options`` - an optional string of command line parameters passed
  as given to the ``unoconv`` calls

Returns:

The API returns its result as a JSON structure with the following key-value
pairs:

- ``status`` - either ``OK`` or ``ERROR``

- ``data`` - the converted output files as a ZIP archive (e.g.
  a DOCX file containing images will be converted to an HTML file
  plus the extracted image files)

- ``output`` - the conversion transcript (output of the converter run)

Asynchronous operations
+++++++++++++++++++++++

If you set ``async`` to "1" in the API calls above then both calls
will return a JSON data structure like::

    {"job_id": <some id>}

The ``job_id`` can be used to poll the Produce & Publish Server
in order to retrieve the conversion result asynchronously.

The poll API is provided through the URL::

    http://host:port/api/1/poll/<job_id>

If the conversion is still pending the API will return the JSON
document::

    {"done": false}

If the conversion has finished, the PDF- or unoconv-specific JSON
document is returned (same format as for the synchronous
API calls). In addition, the key-value pair ``"done": true`` is included
in the JSON reply.
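
The polling cycle described above can be sketched as follows. The helper names, the polling interval and the attempt limit are illustrative choices, and the sketch assumes that the poll URL answers plain ``GET`` requests:

```python
import json
import time
import urllib.request


def fetch_job_status(job_id, base_url="http://localhost:6543"):
    """GET the poll URL for ``job_id`` and return the decoded JSON reply."""
    with urllib.request.urlopen(f"{base_url}/api/1/poll/{job_id}") as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_result(job_id, fetch=fetch_job_status, interval=2.0, max_attempts=30):
    """Poll until the reply reports ``done`` and return that final reply."""
    for attempt in range(max_attempts):
        if attempt:
            # Pause between polls, but not before the first one.
            time.sleep(interval)
        reply = fetch(job_id)
        if reply.get("done"):
            return reply
    raise TimeoutError(f"job {job_id} still pending after {max_attempts} polls")
```

Passing ``fetch`` as a parameter keeps the loop independent of the transport, so it can be exercised without a running server.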

Introspection API methods
+++++++++++++++++++++++++

Produce & Publish server version::

    http://host:port/api/version

returns::

    {"version": "0.3.2", "module": "pp.server"}

Installed/available converters::

    http://host:port/api/converters

returns::

    {"unoconv": true, "pdfreactor": true, "phantomjs": false, "calibre": true, "princexml": true}

Versions of the installed converters::

    http://host:port/api/converter-versions

returns::

    {"princexml": "Version x.y", "pdfreactor": "Version a.b.c", ...}
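
A client can use the ``/api/converters`` reply to pick a converter before submitting a job. The following sketch parses the sample reply shown above; the function name ``available_converters`` is illustrative:

```python
import json


def available_converters(reply_text):
    """Return the sorted names of converters reported as available."""
    flags = json.loads(reply_text)
    return sorted(name for name, installed in flags.items() if installed)


sample = ('{"unoconv": true, "pdfreactor": true, "phantomjs": false, '
          '"calibre": true, "princexml": true}')
print(available_converters(sample))
# prints ['calibre', 'pdfreactor', 'princexml', 'unoconv']
```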


Other API methods
+++++++++++++++++

Cleanup of the queue directory (removes conversion data older than one day)::

    http://host:port/api/cleanup

returns::

    {"directories_removed": 22}

Authorization support
---------------------

The ``pp.server`` implementation provides a simple and optional authorization
mechanism by accepting a ``pp-token`` header from the client. To
enable authorization support on the server side, configure the
authenticator method and the authorization token in your ``.ini`` file::

    [app:main]
    use = egg:pp.server
    ...
    pp.authenticator = token_auth
    pp.authentication_token = 12345

The ``token_auth`` string refers to a method in ``pp.server.authorization``
which is a simple authorization method (for the beta phase) supporting only one
token for now. The token is configured through the ``pp.authentication_token``
value.

Any client sending an HTTP request to the ``pp.server`` server instance is required
to send an HTTP header for authorization (if enabled on the server)::

    pp-token: <value of token>

For example, with the token configured above::

    pp-token: 12345
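
On the client side the header can be attached as in the following sketch, which uses ``urllib.request`` from the standard library. The helper name ``authorized_request`` is illustrative; the URL and token are the example values from this document:

```python
import urllib.request


def authorized_request(url, token, data=None):
    """Build a request carrying the ``pp-token`` authorization header."""
    request = urllib.request.Request(url, data=data)
    # urllib stores the name as "Pp-token"; HTTP header names are
    # case-insensitive, so the server sees the same header.
    request.add_header("pp-token", token)
    return request


request = authorized_request("http://localhost:6543/api/converters", "12345")
```

The request object can then be passed to ``urllib.request.urlopen`` like any unauthenticated request.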


Advanced installation issues
----------------------------

Installation of PDFreactor using zc.buildout
++++++++++++++++++++++++++++++++++++++++++++

- https://bitbucket.org/ajung/pp.server/raw/master/pdfreactor.cfg

Installation of PrinceXML using zc.buildout
+++++++++++++++++++++++++++++++++++++++++++

- https://bitbucket.org/ajung/pp.server/raw/master/princexml.cfg

Production setup
++++++++++++++++

``pserve`` and ``celeryd`` can be started automatically and
controlled using ``Circus``. See the following
configuration file:

- https://bitbucket.org/ajung/pp.server/raw/master/circus-app.ini

Source code
-----------

https://bitbucket.org/ajung/pp.server

Bug tracker
-----------

https://bitbucket.org/ajung/pp.server/issues

Support
-------

Support for Produce & Publish Server is currently only available on a project
basis.

License
-------
``pp.server`` is published under the GNU General Public License, version 2 (GPL 2).

Contact
-------

| ZOPYX
| Hundskapfklinge 33
| D-72074 Tuebingen, Germany
| info@zopyx.com
| www.zopyx.com
| www.produce-and-publish.info


0.7.10 (2016/06/01)
-------------------
- updated to Pyramid 1.7

0.7.7 (2016/01/24)
------------------
- updated support for latest Vivliostyle formatter
- added support for Antennahouse Formatter

0.7.6 (2015/11/30)
------------------
- support for PDFreactor 8

0.7.5 (2015/11/18)
------------------
- fixed race condition while creating directories

0.7.4 (2015/11/14)
------------------
- support for nested uploaded ZIP files

0.7.3 (2015/11/14)
------------------
- support for Vivliostyle Formatter

0.7.2 (2015/04/20)
------------------
- merged https://bitbucket.org/ajung/pp.server/pull-request/1/
(improper check for wkhtmltopdf)
- merged https://bitbucket.org/ajung/pp.server/pull-request/2/
(fix for async operations)

0.7.1 (2015/03/13)
------------------
- unicode fix in runcmd()

0.7.0 (2015/02/15)
------------------

- 0.6.x was badly packaged
- changed repo structure

0.6.1 (2015/02/02)
------------------
- add /api/converter-versions to webservice API

0.6.0 (2015/01/26)
------------------
- dropped Python 2.X support, Python 3.3 or higher
is now a mandatory requirement

0.5.5 (2015/01/23)
------------------
- UTF8 handling fix

0.5.3 (2014/11/23)
------------------
- support for WKHTMLTOPDF

0.5.2 (2014/11/19)
------------------
- support for Speedata Publisher

0.5.1 (2014/10/12)
------------------
- improved error handling

0.5.0 (2014/10/12)
------------------
- official Python 3.3/3.4 support

0.4.7 (25.09.2014)
------------------
- fixed documentation bug

0.4.6 (22.08.2014)
------------------
- removed PDFreactor --addlog option

0.4.5 (22.08.2014)
------------------
- added supplementary commandline options to pdfreactor commandline call

0.4.4 (24.01.2014)
------------------
- minor typos fixed

0.4.3 (20.01.2014)
------------------
- implemented automatic queue cleanup after one day

0.4.2 (18.01.2014)
------------------
- URL fix in index.pt related to virtual hosting

0.4.1 (13.01.2014)
------------------
- show Python version and converters on index.pt
- authorization support added

0.4.0 (17.10.2013)
------------------
- Python 3.3 support
- Pyramid 1.5 support

0.3.5 (05.10.2013)
------------------
- added 'cmd_options' to pdf and unoconv API
  methods for specifying arbitrary command line parameters
  for the external converters

0.3.4 (05.10.2013)
------------------
- added 'cleanup' API

0.3.3 (05.10.2013)
------------------
- added 'version' and 'converter' API methods

0.3.2 (04.10.2013)
------------------
- added support for EPUB conversion using ``Calibre``

0.3.1 (03.10.2013)
------------------
- updated documentation

0.3.0 (14.07.2013)
------------------
- unoconv conversion now returns a ZIP archive
(e.g. a HTML file + extracted images)

0.2.7 (06.07.2013)
------------------
- added support for Phantom.js converter

0.2.5 (05.07.2013)
------------------
- better detection of prince and pdfreactor binaries

0.2.2 (05.07.2013)
------------------
- updated the documentation
- minor cleanup

0.2.1 (04.07.2013)
------------------
- re-added poll API

0.2.0 (03.07.2013)
------------------
- converted XML-RPC api to REST api

0.1.9 (01.07.2013)
------------------
- monkeypatch pyramid_xmlrpc.parse_xmlrpc_request in order
to by-pass its stupid DOS request body check

0.1.7 (29.06.2013)
------------------
- more tests
- fixes
- updated documentation

0.1.5 (27.06.2013)
------------------
- test for synchronous operations
- fixes

0.1.0 (24.06.2013)
------------------
- initial release
