Rebased Python bindings for the PDF toolkit and renderer MuPDF - without shared libraries

Project description

PyMuPDF 1.23.0rc1

Release date: August 10, 2023

On PyPI since August 2016.

Author

Artifex, based on code by Jorj X. McKie and Ruikai Liu.

Introduction

PyMuPDF adds Python bindings and abstractions to MuPDF, a lightweight PDF, XPS, and eBook viewer, renderer, and toolkit. Both PyMuPDF and MuPDF are maintained and developed by Artifex Software, Inc.

MuPDF can access files in PDF, XPS, OpenXPS, CBZ, EPUB, MOBI and FB2 (eBooks) formats, and it is known for its top performance and exceptional rendering quality.

With PyMuPDF you can access files with extensions like .pdf, .xps, .oxps, .cbz, .fb2, .mobi or .epub. In addition, about 10 popular image formats can also be handled like documents: .png, .jpg, .bmp, .tiff, .svg etc.

Usage

For all supported document types (i.e. including images) you can:

  • Decrypt the document.
  • Access meta information, links and bookmarks.
  • Render pages in raster formats (PNG and some others), or the vector format SVG.
  • Search for text.
  • Extract text and images.
  • Convert to other formats: PDF, (X)HTML, XML, JSON, text.
  • Do OCR (Optical Character Recognition) if Tesseract is installed.

To some degree, PyMuPDF can also be used as an image converter: it can read a range of input formats and can produce Portable Network Graphics (PNG), Portable Anymaps (PNM, etc.), Portable Arbitrary Maps (PAM), Adobe PostScript and Adobe Photoshop documents, making the use of other graphics packages obsolete in these cases. Interfacing with e.g. PIL/Pillow for image input and output is easy as well.

For PDF documents, many additional features are available: they can be created, joined or split up. Pages can be inserted, deleted, re-arranged or modified in many ways (including annotations and form fields).

  • Images and fonts can be extracted or inserted.

    You may want to have a look at this cool GUI example script, which lets you insert, delete, replace or re-position images under your visual control.

    If fontTools is installed, subsets can be built for eligible fonts based on their usage in the document. Especially for new PDFs, this can lead to significant file size reductions.

  • Embedded files are fully supported.

  • PDFs can be reformatted to support double-sided printing, posterizing, applying logos or watermarks.

  • Password protection is fully supported: decryption, encryption, encryption method selection, permission level and user / owner password setting.

  • Support of the PDF Optional Content concept for images, text and drawings.

  • Low-level PDF structures can be accessed and modified.

  • Command line module "python -m fitz ...". A versatile utility with the following features:

    • encryption / decryption / optimization
    • creation of sub-documents
    • document joining
    • image / font extraction
    • full support of embedded files
    • layout-preserving text extraction (all documents)

Have a look at the basic demos, the examples (which contain complete, working programs), and notebooks.

Documentation

Documentation is written using Sphinx and is available online. It is currently a combination of a reference guide and user manual.

  • You can view it online at Read the Docs. This site also provides download options for PDF.
  • For a quick start look at the tutorial and the recipes chapters.

The latest changelog can be viewed here.

Installation

PyMuPDF requires Python 3.8 or later.

For Python 3.8 and later, wheels exist for Windows (32-bit and 64-bit), Linux (64-bit, Intel and ARM) and macOS (64-bit, Intel only), so PyMuPDF can be installed from PyPI in the usual way. To ensure that pip supports the latest wheel platform tags, we strongly recommend always upgrading pip first:

python -m pip install --upgrade pip
python -m pip install --upgrade pymupdf
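
One way to confirm which version pip installed, without importing the package itself, is to query the installed distribution metadata. The sketch below uses only the standard library (Python 3.8 or later); installed_version is a name chosen for this example.

```python
from importlib import metadata


def installed_version(dist_name="PyMuPDF"):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print(installed_version())  # None means the distribution is not installed
```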

There are no mandatory external dependencies. However, some optional features become available only if additional packages are installed:

  • Pillow for using Pillow image output directly from PyMuPDF.
  • fontTools for creating font subsets.
  • pymupdf-fonts contains some nice fonts for your text output.
  • Tesseract-OCR for optical character recognition in images and document pages. Tesseract is separate software, not a Python package. To enable OCR functions in PyMuPDF, the system environment variable "TESSDATA_PREFIX" must be defined and contain the tessdata folder name of the Tesseract installation location.
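
Before calling any OCR function, it can help to check that TESSDATA_PREFIX is set as described above. The following is a small standard-library sketch; tessdata_dir is a hypothetical helper, and treating "set and pointing at an existing folder" as a valid configuration is an assumption of this example.

```python
import os
from pathlib import Path


def tessdata_dir(env=os.environ):
    """Return the folder named by TESSDATA_PREFIX, or None if unusable."""
    value = env.get("TESSDATA_PREFIX")
    if not value:
        return None          # variable missing or empty
    path = Path(value)
    return path if path.is_dir() else None


folder = tessdata_dir()
if folder is None:
    print("TESSDATA_PREFIX is not set or does not point to a folder")
else:
    print("tessdata folder:", folder)
```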

Older wheels - also with support for older Python versions - can be found here and on PyPI.

Note: If pip cannot find a wheel that is compatible with your platform, it will automatically build and install from source using the PyMuPDF sdist; this requires only that SWIG is installed on your system.

Alternative 'rebased' implementation.

A new implementation of PyMuPDF is available as module fitz_new.

Benefits

  • Access to the underlying MuPDF Python API.

    The MuPDF Python API is available as fitz_new.mupdf - this is not possible with native PyMuPDF, and can give useful flexibility to the user.

  • Simplified implementation.

    The underlying MuPDF C++/Python APIs' automated reference counting, automatic contexts, and native C++ and Python exceptions make the implementation simpler than classic PyMuPDF.

    This also simplifies development of new PyMuPDF functionality.

  • Optional tracing of MuPDF C function calls using environment variables.

    This is a feature of the MuPDF C++ and Python APIs, which can be very useful during development and when reporting bugs. See: https://mupdf.readthedocs.io/en/latest/language-bindings.html#environmental-variables

  • Possible future support for multithreaded use.

    Classic PyMuPDF is explicitly single-threaded, but the MuPDF C++/Python APIs have automated per-thread contexts.

Known issues

  • import fitz_new is known to fail with a SEGV on Windows with Python-3.10.
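
Code that wants to try the rebased module and fall back to the classic one could first check which module names are available. The sketch below uses only the standard library; pick_binding is a hypothetical helper. Note that the lookup does not import the module, so it cannot detect an import-time crash such as the one described under Known issues.

```python
import importlib
import importlib.util


def pick_binding(candidates=("fitz_new", "fitz")):
    """Return the first candidate module name that can be located, else None."""
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


name = pick_binding()
if name is None:
    print("PyMuPDF does not appear to be installed")
else:
    fitz = importlib.import_module(name)  # the actual import happens here
    print("using module", name)
```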

Secondary wheel PyMuPDFb

Installation of PyMuPDF with pip will automatically install a second wheel called PyMuPDFb containing Python-independent libraries.

License and Copyright

PyMuPDF and MuPDF are available under both open-source AGPL and commercial license agreements.

Please read the full text of the AGPL license agreement (which is also included here in file COPYING) to ensure that your use case complies with the guidelines of this license. If you determine you cannot meet the requirements of the AGPL, please contact Artifex for more information regarding a commercial license.

Artifex is the exclusive commercial licensing agent for MuPDF.

Artifex, the Artifex logo, MuPDF, and the MuPDF logo are registered trademarks of Artifex Software Inc. PyMuPDF and the PyMuPDF logo are trademarks of Artifex Software, Inc. © 2022 Artifex Software, Inc. All rights reserved.

Contact

Please use the Discussions menu for questions, comments, or asking for help, and submit issues here.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • PyMuPDF-1.23.0rc1.tar.gz (60.1 MB): Source

Built Distributions

  • PyMuPDF-1.23.0rc1-cp311-none-win_amd64.whl (3.4 MB): CPython 3.11, Windows x86-64
  • PyMuPDF-1.23.0rc1-cp311-none-win32.whl (3.2 MB): CPython 3.11, Windows x86
  • PyMuPDF-1.23.0rc1-cp311-none-manylinux2014_x86_64.whl (4.3 MB): CPython 3.11
  • PyMuPDF-1.23.0rc1-cp311-none-macosx_10_9_x86_64.whl (4.1 MB): CPython 3.11, macOS 10.9+ x86-64
  • PyMuPDF-1.23.0rc1-cp310-none-win_amd64.whl (3.4 MB): CPython 3.10, Windows x86-64
  • PyMuPDF-1.23.0rc1-cp310-none-win32.whl (3.2 MB): CPython 3.10, Windows x86
  • PyMuPDF-1.23.0rc1-cp310-none-manylinux2014_x86_64.whl (4.3 MB): CPython 3.10
  • PyMuPDF-1.23.0rc1-cp310-none-macosx_10_9_x86_64.whl (4.1 MB): CPython 3.10, macOS 10.9+ x86-64
  • PyMuPDF-1.23.0rc1-cp39-none-win_amd64.whl (3.4 MB): CPython 3.9, Windows x86-64
  • PyMuPDF-1.23.0rc1-cp39-none-win32.whl (3.2 MB): CPython 3.9, Windows x86
  • PyMuPDF-1.23.0rc1-cp39-none-manylinux2014_x86_64.whl (4.2 MB): CPython 3.9
  • PyMuPDF-1.23.0rc1-cp39-none-macosx_10_9_x86_64.whl (4.1 MB): CPython 3.9, macOS 10.9+ x86-64
  • PyMuPDF-1.23.0rc1-cp38-none-win_amd64.whl (3.4 MB): CPython 3.8, Windows x86-64
  • PyMuPDF-1.23.0rc1-cp38-none-win32.whl (3.2 MB): CPython 3.8, Windows x86
  • PyMuPDF-1.23.0rc1-cp38-none-manylinux2014_x86_64.whl (4.2 MB): CPython 3.8
  • PyMuPDF-1.23.0rc1-cp38-none-macosx_10_9_x86_64.whl (4.1 MB): CPython 3.8, macOS 10.9+ x86-64
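
The wheel file names listed above follow the standard wheel naming scheme: distribution, version, an optional build tag, then the Python, ABI and platform tags. A small standard-library sketch of splitting such a name into its parts is shown below; parse_wheel_filename is a helper written for this example and performs no validation beyond counting the fields.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, py_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        name, version, build, py_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": py_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


info = parse_wheel_filename("PyMuPDF-1.23.0rc1-cp311-none-win_amd64.whl")
print(info["python"], info["abi"], info["platform"])
```

pip compares these tags with the ones supported by the running interpreter and platform, which is why an outdated pip may fail to recognize a newer platform tag.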

File details

Details for the file PyMuPDF-1.23.0rc1.tar.gz.

File metadata

  • Download URL: PyMuPDF-1.23.0rc1.tar.gz
  • Upload date:
  • Size: 60.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for PyMuPDF-1.23.0rc1.tar.gz
Algorithm Hash digest
SHA256 db475c54be82caa56375bea2dd7ad7f6c696c8f25368b328c741b123d0676897
MD5 7778326a472ea95ef8eb6e8c5fd38aad
BLAKE2b-256 2e229114db17818b426358caf0d8404425f3b70612712bd8fb902688b06a3002

See more details on using hashes here.
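
A downloaded file can be checked against a published digest with the standard library. The sketch below streams the file through SHA256 and compares the result with the value listed above for the source distribution; the function names and the local file path are assumptions of this example.

```python
import hashlib

# SHA256 digest published above for PyMuPDF-1.23.0rc1.tar.gz
EXPECTED_SHA256 = "db475c54be82caa56375bea2dd7ad7f6c696c8f25368b328c741b123d0676897"


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected value."""
    return sha256_of(path) == expected.lower()
```

Calling matches_published_hash("PyMuPDF-1.23.0rc1.tar.gz") on a local download would then return True only for an unmodified file.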

File details

Details for the file PyMuPDF-1.23.0rc1-cp311-none-win_amd64.whl.

File hashes

Hashes for PyMuPDF-1.23.0rc1-cp311-none-win_amd64.whl
Algorithm Hash digest
SHA256 dae153011d0feb3a8c33d45823a560405b1b6d48fa44bf646d64c48d2af1481f
MD5 72744e5494c45dace6fe013f286ddf8d
BLAKE2b-256 9993d0964ed0b03eeb137052f327f2b397966c07c4c7ca5d4ebd4d3d349f5b5b

File details

Details for the file PyMuPDF-1.23.0rc1-cp311-none-win32.whl.

File metadata

  • Download URL: PyMuPDF-1.23.0rc1-cp311-none-win32.whl
  • Upload date:
  • Size: 3.2 MB
  • Tags: CPython 3.11, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for PyMuPDF-1.23.0rc1-cp311-none-win32.whl
Algorithm Hash digest
SHA256 7427a85af21ac303a8180cf31f93df9a9e19709eac04e7ad30d2b7a7b67ed515
MD5 54df4e6f75961b2145c7bbd845b07814
BLAKE2b-256 09d5439a21627afe859a66424efd084c832c31778a013cec31512fae564c6d83

File details

Details for the file PyMuPDF-1.23.0rc1-cp311-none-manylinux2014_x86_64.whl.

File hashes

Hashes for PyMuPDF-1.23.0rc1-cp311-none-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 9fce4c3ff923cc9777a36a82b0265b2b7fb70b025135569496e33303c5fbeeea
MD5 9e24f6ff8654a9dbc59a4551786526b3
BLAKE2b-256 9c930da9150d787aba6bef1cd323ddd79aea4669877fdcd7022a6e8a3d5c5d60

File details

Details for the file PyMuPDF-1.23.0rc1-cp311-none-macosx_10_9_x86_64.whl.

File hashes

Hashes for PyMuPDF-1.23.0rc1-cp311-none-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 56112dde6feb7256eecbfb79b3e8e9ea456dcc07b8ec03e776d5fce239cb6571
MD5 99d1c4cfaebb7d04f7f720b9d41ada43
BLAKE2b-256 12eaf9adb8748d9baea3e44bd7729bf8069d14c13cce92bcf6fd3480ac6944b2

File details

Details for the file PyMuPDF-1.23.0rc1-cp310-none-win_amd64.whl.

File hashes

Hashes for PyMuPDF-1.23.0rc1-cp310-none-win_amd64.whl
Algorithm Hash digest
SHA256 bcbae01d1c5bfd0d53a52c4aed7429b96f85a820895e9b56647c81f4a85559f4
MD5 a88d03823f1065f80a25394efea5aec2
BLAKE2b-256 ad6ffc5aedf0d95ec0eeb5a07d4efc0f6e019cb1346edc89c96bb6b0dbbc2bf1

File details

Details for the file PyMuPDF-1.23.0rc1-cp310-none-win32.whl.

File metadata

  • Download URL: PyMuPDF-1.23.0rc1-cp310-none-win32.whl
  • Upload date:
  • Size: 3.2 MB
  • Tags: CPython 3.10, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for PyMuPDF-1.23.0rc1-cp310-none-win32.whl
Algorithm Hash digest
SHA256 2956b77e0e33e5a7cfebb90699d07367f703c3c8009fc16e739b08ea6e1064ac
MD5 c6c87cda0fed6571e739eedf199d46b8
BLAKE2b-256 62dbdf072d28352d96789c49b6a23ff3f630555d8ff8d7a5d6e826beab81fc15

File details

Details for the file PyMuPDF-1.23.0rc1-cp310-none-manylinux2014_x86_64.whl.

File hashes

Hashes for PyMuPDF-1.23.0rc1-cp310-none-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 105db47fbd6ad197a03062c542308eb4a1a13197633a7f69553f24cb3dcee371
MD5 e2c58f86ee2ee12601f53595e76c2ac2
BLAKE2b-256 05403914191e4002acebbc31f008ed3326c3e3fe76487a8bc1da60f72ed7d834

File details

Details for the file PyMuPDF-1.23.0rc1-cp310-none-macosx_10_9_x86_64.whl.

File hashes

Hashes for PyMuPDF-1.23.0rc1-cp310-none-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 6698a1950edf9406c5ff6818082c28c31484fcc386f0d434288fd5b6a3abc591
MD5 829efc4d45e85ebede84997ac03f6bca
BLAKE2b-256 f39b34251341d98442945710dfd8b84da43a75c563ac91ac0ce60f80b25ed66c

File details

Details for the file PyMuPDF-1.23.0rc1-cp39-none-win_amd64.whl.

File hashes

Hashes for PyMuPDF-1.23.0rc1-cp39-none-win_amd64.whl
Algorithm Hash digest
SHA256 75eb55da6a22fd4ee1ec3202d5fa31a2304c5f24348583cef0fc4d10a66963f8
MD5 15cd12550436fad72c8ad2ab8af1d172
BLAKE2b-256 8951ce998d87021d7e9f935fe9c072ef759520bd57ef68208d6553fcf594afc5

File details

Details for the file PyMuPDF-1.23.0rc1-cp39-none-win32.whl.

File metadata

  • Download URL: PyMuPDF-1.23.0rc1-cp39-none-win32.whl
  • Upload date:
  • Size: 3.2 MB
  • Tags: CPython 3.9, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for PyMuPDF-1.23.0rc1-cp39-none-win32.whl
Algorithm Hash digest
SHA256 6d064b571eeeb3f59c3c02f94af7901b21eb114c54b1c498e520db7cdb9fc7f4
MD5 834cbf38376856e6cd23d3f549c32966
BLAKE2b-256 0cb1f93f952796420957c887fb9b4432cb09446ec56c84da0ba7acd02efbd125

File details

Details for the file PyMuPDF-1.23.0rc1-cp39-none-manylinux2014_x86_64.whl.

File hashes

Hashes for PyMuPDF-1.23.0rc1-cp39-none-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 0b781c4b2021d1da7d9b97fa4db6f70b0f57be91860ad6ac2ae5c28fb83f7982
MD5 8adc8b358da6fdf0a85d89293a9136b0
BLAKE2b-256 38096b251dfa25489f5a1037d568efb9bd1ff5582922b452a68d5afbe6115efe

File details

Details for the file PyMuPDF-1.23.0rc1-cp39-none-macosx_10_9_x86_64.whl.

File hashes

Hashes for PyMuPDF-1.23.0rc1-cp39-none-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 af927d984ac20ca9eea7eed3600eadc433ed8000ba2b00fab4bfac63516e3650
MD5 003b7bcba7b858029fb305ff4b345087
BLAKE2b-256 8bbb1360e06d248ec82c19b70380a7200d0f82184cb1a3eea4982305691f3594

File details

Details for the file PyMuPDF-1.23.0rc1-cp38-none-win_amd64.whl.

File hashes

Hashes for PyMuPDF-1.23.0rc1-cp38-none-win_amd64.whl
Algorithm Hash digest
SHA256 002d41c1f1714674210a74e3d83e7c63b6419ee5bba8bab2c43106696f609e29
MD5 ebdc1be4f0fc218af6ee9d21462fb183
BLAKE2b-256 25886287f2e188504938760ea9ff55a11b033cb0f34a70a0898e50e58008e6cf

File details

Details for the file PyMuPDF-1.23.0rc1-cp38-none-win32.whl.

File metadata

  • Download URL: PyMuPDF-1.23.0rc1-cp38-none-win32.whl
  • Upload date:
  • Size: 3.2 MB
  • Tags: CPython 3.8, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for PyMuPDF-1.23.0rc1-cp38-none-win32.whl
Algorithm Hash digest
SHA256 2ddb61d6a3bc763ea8340eabe1097c78c3968e7cf6615710cd8479fa7c3d221c
MD5 bc76579ac0ca824483f0e85501d8d466
BLAKE2b-256 d46a80082ec9e5ba17427f8cff3eeb53bacb0e9c39fc1236b3621ca1377caa3e

File details

Details for the file PyMuPDF-1.23.0rc1-cp38-none-manylinux2014_x86_64.whl.

File hashes

Hashes for PyMuPDF-1.23.0rc1-cp38-none-manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 8eb27775c489aef6e05a0cb3ce069a5a4a6a225ff47b4801d0ef577be4dc48d8
MD5 9f8eddcb52e2a9da2aa85e83d4b380f5
BLAKE2b-256 a39cbf74a054b18ec055e4705dd74c8b19c0931425135d87ab3bd034ca52b0d2

File details

Details for the file PyMuPDF-1.23.0rc1-cp38-none-macosx_10_9_x86_64.whl.

File hashes

Hashes for PyMuPDF-1.23.0rc1-cp38-none-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 de94f87eb3c6a8674ad05a1dcfd8d813c4b26e6c45e40a7fd614b6c94b651d6f
MD5 5397f07c836e957661f39f0015e4d568
BLAKE2b-256 13855b71f2e86a3ddeb85bc75ae4c96060b5fe529738a790f5c336ce7b4d5c55
