PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

Installation

PyMuPDF requires Python 3.8 or later. Install it using pip:

pip install PyMuPDF

There are no mandatory external dependencies. However, some optional features become available only if additional packages are installed.

You can also try it without installing by visiting PyMuPDF.io.

Usage

Basic usage is as follows:

import fitz # imports the pymupdf library
doc = fitz.open("example.pdf") # open a document
for page in doc: # iterate the document pages
  text = page.get_text() # get the page's plain text as a Python str
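The loop above overwrites `text` on every page. As a minimal sketch of keeping the text of the whole document, the pages can be joined into one string. The helper names `join_page_texts` and `extract_text` and the form-feed separator are illustrative assumptions, not part of PyMuPDF; only `fitz.open`, page iteration, `page.get_text()` and `doc.close()` come from the library.

```python
def join_page_texts(pages, separator="\f"):
    """Join the plain text of each page into one string.

    `pages` is any iterable of objects with a get_text() method,
    for example an opened PyMuPDF document. The form-feed separator
    is an arbitrary choice for this sketch.
    """
    return separator.join(page.get_text() for page in pages)


def extract_text(path):
    """Open the document at `path` and return the text of all pages."""
    import fitz  # imported here so the helper above works without PyMuPDF

    doc = fitz.open(path)
    try:
        return join_page_texts(doc)
    finally:
        doc.close()  # release the file handle even if extraction fails
```

Calling `extract_text("example.pdf")` would then return the text of every page in reading order of the page sequence, separated by form feeds.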

Documentation

Full documentation can be found on pymupdf.readthedocs.io.

Optional Features

  • fontTools for creating font subsets.
  • pymupdf-fonts contains some nice fonts for your text output.
  • Tesseract-OCR for optical character recognition in images and document pages.
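To see which of these optional components are present on a machine, a small check along the following lines may help. It is a sketch under assumptions: the import names `fontTools` and `pymupdf_fonts` are assumed to match the packages listed above, and looking for a `tesseract` executable on the PATH is only a heuristic for a Tesseract-OCR installation. The function name `optional_features` is hypothetical.

```python
import importlib.util
import shutil


def optional_features():
    """Report which optional components appear to be installed.

    The module names and the 'tesseract' executable name are assumptions
    made for this sketch; a missing entry does not prove a feature is
    unusable, and a present one does not prove it is configured.
    """
    return {
        "fontTools": importlib.util.find_spec("fontTools") is not None,
        "pymupdf-fonts": importlib.util.find_spec("pymupdf_fonts") is not None,
        "Tesseract-OCR": shutil.which("tesseract") is not None,
    }


if __name__ == "__main__":
    for name, available in optional_features().items():
        print(f"{name}: {'found' if available else 'not found'}")
```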

About

PyMuPDF adds Python bindings and abstractions to MuPDF, a lightweight PDF, XPS, and eBook viewer, renderer, and toolkit. Both PyMuPDF and MuPDF are maintained and developed by Artifex Software, Inc.

PyMuPDF was originally written by Jorj X. McKie.

License and Copyright

PyMuPDF is available under open-source AGPL and commercial license agreements. If you determine you cannot meet the requirements of the AGPL, please contact Artifex for more information regarding a commercial license.

Contact

Join us on Discord here: #pymupdf

Download files

Source Distribution

AqPyMuPdf-1.23.7.tar.gz (78.4 MB)

Uploaded Source

Built Distribution

AqPyMuPdf-1.23.7-cp310-none-macosx_12_0_x86_64.whl (34.3 MB)

Uploaded CPython 3.10 macOS 12.0+ x86-64

File details

Details for the file AqPyMuPdf-1.23.7.tar.gz.

File metadata

  • Download URL: AqPyMuPdf-1.23.7.tar.gz
  • Size: 78.4 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.10.13

File hashes

Hashes for AqPyMuPdf-1.23.7.tar.gz
Algorithm Hash digest
SHA256 29b3c23e0c80e81b0a4792f82660d2b7f3602e1ed44a72d56cd3952b00cbf1ce
MD5 5b8bcf659b7b0167467c7240cf40b290
BLAKE2b-256 450e8b6214af6662a43e74e95899966045a13ed6959122731a54472dc0ea9109
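A downloaded archive can be compared against the SHA256 digest listed above using only the Python standard library. The following is a minimal sketch; the function names and the chunk size are illustrative choices, and the local file path is whatever the download was saved as.

```python
import hashlib

# SHA256 digest published above for AqPyMuPdf-1.23.7.tar.gz
EXPECTED_SHA256 = "29b3c23e0c80e81b0a4792f82660d2b7f3602e1ed44a72d56cd3952b00cbf1ce"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_published_hash("AqPyMuPdf-1.23.7.tar.gz")` should return `True` for an intact copy of the source distribution.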

File details

Details for the file AqPyMuPdf-1.23.7-cp310-none-macosx_12_0_x86_64.whl.

File hashes

Hashes for AqPyMuPdf-1.23.7-cp310-none-macosx_12_0_x86_64.whl
Algorithm Hash digest
SHA256 faabbcb1054b997e66b079107a1fa5e6c3bd9bcaebfe3f59fe9addf1c0feec76
MD5 5f1abe739e9907b4c1d6e01badde2ded
BLAKE2b-256 2fcf8ee4485b20bef5057fe695a82eff87fdd6fd203ad6721da205ff82366011
