Read and write PDFs with Python, powered by qpdf

pikepdf
=======

**pikepdf** is a Python library for reading and writing PDF files.

.. |travis| image:: https://img.shields.io/travis/pikepdf/pikepdf/master.svg?label=Linux%2fmacOS%20build
    :target: https://travis-ci.org/pikepdf/pikepdf
    :alt: Travis CI build status (Linux and macOS)

.. |windows| image:: https://img.shields.io/appveyor/ci/jbarlow83/pikepdf/master.svg?label=Windows%20build
    :target: https://ci.appveyor.com/project/jbarlow83/pikepdf
    :alt: AppVeyor CI build status (Windows)

.. |pypi| image:: https://img.shields.io/pypi/v/pikepdf.svg
    :target: https://pypi.org/project/pikepdf/
    :alt: PyPI


|travis| |windows| |pypi|

pikepdf is based on `QPDF <https://github.com/qpdf/qpdf>`_, a powerful PDF
manipulation and repair library.

Python + QPDF = "py" + "qpdf" = "pyqpdf", which looks like a dyslexia test. Say it
out loud, and it sounds like "pikepdf".

Python 3.5, 3.6 and 3.7 are fully supported.

**To install:**

.. code-block:: bash

    pip install pikepdf

Key features:

- Editing, manipulation and transformation of existing PDFs
- Based on the mature, proven QPDF C++ library
- Works with encrypted PDFs
- Supports all PDF compression filters
- Can create "fast web view" (linearized) PDFs
- Creates standards-compliant PDFs that pass validation in other tools
- Automatically repairs damaged PDFs, just like QPDF
- Implements more of the PDF specification than existing Python PDF tools
- IPython notebook and Jupyter integration

.. code-block:: python

    # Elegant, Pythonic API
    import pikepdf

    pdf = pikepdf.open('input.pdf')
    num_pages = len(pdf.pages)
    del pdf.pages[-1]  # remove the last page
    pdf.save('output.pdf')


pikepdf is `documented <https://pikepdf.readthedocs.io/en/latest/index.html>`_
and actively maintained. Commercial support is available.

Feature comparison
------------------

This library is similar to PyPDF2 and pdfrw: it provides low-level access to PDF
features and allows editing and content transformation of existing PDFs. Some
knowledge of the PDF specification may be helpful. It cannot render a PDF to an
image.

Python 2.7 and earlier versions of Python 3 are not currently supported, though
adding support is probably not difficult. Pull requests are welcome.

+---------------------------------------------------------------------+-----------------------+-------------------+-----------------+
| **Feature**                                                         | **pikepdf**           | **PyPDF2**        | **pdfrw**       |
+---------------------------------------------------------------------+-----------------------+-------------------+-----------------+
| Editing, manipulation and transformation of existing PDFs           | ✔                     | ✔                 | ✔               |
+---------------------------------------------------------------------+-----------------------+-------------------+-----------------+
| Based on an existing, mature PDF library                            | QPDF                  | ✘                 | ✘               |
+---------------------------------------------------------------------+-----------------------+-------------------+-----------------+
| Implementation speed                                                | C++                   | Python            | Python          |
+---------------------------------------------------------------------+-----------------------+-------------------+-----------------+
| PDF versions supported                                              | 1.1 to 1.7            | 1.3?              | 1.7             |
+---------------------------------------------------------------------+-----------------------+-------------------+-----------------+
| Python versions supported                                           | 3.5-3.7               | 2.6-3.6           | 2.6-3.6         |
+---------------------------------------------------------------------+-----------------------+-------------------+-----------------+
| Supports password protected (encrypted) PDFs                        | ✔ (except public key) | Only obsolete RC4 | ✘               |
+---------------------------------------------------------------------+-----------------------+-------------------+-----------------+
| Save and load PDF compressed object streams (PDF 1.5)               | ✔                     | ✘                 | ✘               |
+---------------------------------------------------------------------+-----------------------+-------------------+-----------------+
| Creates linearized ("fast web view") PDFs                           | ✔                     | ✘                 | ✘               |
+---------------------------------------------------------------------+-----------------------+-------------------+-----------------+
| Actively maintained                                                 | |commits|             | |pypdf2-commits|  | |pdfrw-commits| |
+---------------------------------------------------------------------+-----------------------+-------------------+-----------------+
| Test suite coverage                                                 | ~86%                  | very low          | unknown         |
+---------------------------------------------------------------------+-----------------------+-------------------+-----------------+
| Creates PDFs that pass PDF validation tests                         | ✔                     | ✘                 | ?               |
+---------------------------------------------------------------------+-----------------------+-------------------+-----------------+
| Modifies PDF/A without breaking PDF/A compliance                    | ✔                     | ✘                 | ?               |
+---------------------------------------------------------------------+-----------------------+-------------------+-----------------+
| Automatically repairs PDFs with internal errors                     | ✔                     | ✘                 | ✘               |
+---------------------------------------------------------------------+-----------------------+-------------------+-----------------+
| Documentation                                                       | ✔                     | ✘                 | ✔               |
+---------------------------------------------------------------------+-----------------------+-------------------+-----------------+
| Integrates with Jupyter and IPython notebooks for rapid development | ✔                     | ✘                 | ✘               |
+---------------------------------------------------------------------+-----------------------+-------------------+-----------------+

License
-------

pikepdf is provided under the `Mozilla Public License 2.0 <https://www.mozilla.org/en-US/MPL/2.0/>`_
license (MPL) that can be found in the LICENSE file. By using, distributing, or
contributing to this project, you agree to the terms and conditions of this license.
We exclude Exhibit B, so pikepdf is compatible with secondary licenses.
At your option, you may additionally distribute pikepdf under a secondary license.

`Informally <https://www.mozilla.org/en-US/MPL/2.0/FAQ/>`_, MPL 2.0 is not a "viral" license.
It may be combined with other work, including commercial software. However, you must disclose your modifications
*to pikepdf* in source code form. In other words, fork this repository on GitHub or elsewhere and commit your
contributions there, and you've satisfied the license.

The ``tests/resources/copyright`` file describes licensing terms for the test
suite and the provenance of test resources.


.. |commits| image:: https://img.shields.io/github/commit-activity/y/pikepdf/pikepdf.svg
    :target: https://github.com/pikepdf/pikepdf/graphs/commit-activity

.. |pypdf2-commits| image:: https://img.shields.io/github/commit-activity/y/mstamy2/PyPDF2.svg
    :target: https://github.com/mstamy2/PyPDF2/graphs/commit-activity

.. |pdfrw-commits| image:: https://img.shields.io/github/commit-activity/y/pmaupin/pdfrw.svg
    :target: https://github.com/pmaupin/pdfrw/graphs/commit-activity


Download files
--------------

Download the file for your platform.

==============================================  ========  =========  ==============  ============
Filename                                        Size      File type  Python version  Upload date
==============================================  ========  =========  ==============  ============
pikepdf-0.3.1-cp35-cp35m-macosx_10_6_intel.whl  1.0 MB    Wheel      cp35            Aug 13, 2018
pikepdf-0.3.1-cp35-cp35m-manylinux1_x86_64.whl  7.1 MB    Wheel      cp35            Aug 13, 2018
pikepdf-0.3.1-cp35-cp35m-win_amd64.whl          854.8 kB  Wheel      cp35            Aug 12, 2018
pikepdf-0.3.1-cp36-cp36m-macosx_10_6_intel.whl  1.0 MB    Wheel      cp36            Aug 13, 2018
pikepdf-0.3.1-cp36-cp36m-manylinux1_x86_64.whl  7.1 MB    Wheel      cp36            Aug 13, 2018
pikepdf-0.3.1-cp36-cp36m-win_amd64.whl          854.8 kB  Wheel      cp36            Aug 12, 2018
pikepdf-0.3.1-cp37-cp37m-macosx_10_6_intel.whl  1.0 MB    Wheel      cp37            Aug 13, 2018
pikepdf-0.3.1-cp37-cp37m-manylinux1_x86_64.whl  7.1 MB    Wheel      cp37            Aug 13, 2018
pikepdf-0.3.1-cp37-cp37m-win_amd64.whl          854.8 kB  Wheel      cp37            Aug 12, 2018
pikepdf-0.3.1.tar.gz                            1.6 MB    Source     None            Aug 13, 2018
==============================================  ========  =========  ==============  ============
