Powerful and Pythonic PDF processing library based on xpdf-4.02

Project description

pyxpdf is a fast, memory-efficient Python module for parsing PDF documents, built on the xpdf reader sources.




  • Almost 20 times faster than pure-Python PDF parsers (see Speed Comparison)

  • Extracts text while preserving the original document layout as closely as possible

  • Supports almost all PDF encodings, CMaps and predefined CMaps

  • Extracts LZW, RLE, CCITTFax, DCT, JBIG2 and JPX compressed images and image masks, along with their bounding boxes (BBox)

  • Renders PDF pages as images, with support for the ‘1’, ‘L’, ‘LA’, ‘RGB’, ‘RGBA’ and ‘CMYK’ color modes

  • No explicit dependencies (except optional ones, see Installation)

  • Thread safe
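
As a rough illustration of the text-extraction feature, the sketch below collects the text of each page into a list. It is not taken from this page: it assumes that pyxpdf exposes a `Document` class that accepts a file path, that a `Document` can be iterated page by page, and that each page offers a `text()` method. The helper name `extract_text` is hypothetical. Check the project documentation before relying on any of these names.

```python
def extract_text(pdf_path):
    """Return a list with one text string per page of the PDF at pdf_path."""
    # Imported inside the function so this module loads even where pyxpdf
    # is not installed. Assumption: pyxpdf provides a Document class.
    from pyxpdf import Document

    # Assumption: Document accepts a filesystem path given as a string.
    doc = Document(str(pdf_path))

    # Assumption: iterating a Document yields page objects, and each page
    # has a text() method returning the extracted text.
    return [page.text() for page in doc]
```

Calling `extract_text("report.pdf")` (with a path of your own) would then give one string per page, which can be joined or searched as needed. Returning a list rather than one large string keeps page boundaries available to the caller.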

More Information


pyxpdf is licensed under the GNU General Public License (GPL), version 3. See the LICENSE file.


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • pyxpdf-0.2.3.tar.gz (1.9 MB)

Built Distributions

  • pyxpdf-0.2.3-cp38-cp38-win_amd64.whl (1.3 MB)

  • pyxpdf-0.2.3-cp38-cp38-win32.whl (1.1 MB)

  • pyxpdf-0.2.3-cp37-cp37m-win_amd64.whl (1.3 MB)

  • pyxpdf-0.2.3-cp37-cp37m-win32.whl (1.1 MB)

  • pyxpdf-0.2.3-cp36-cp36m-win_amd64.whl (1.3 MB)

  • pyxpdf-0.2.3-cp36-cp36m-win32.whl (1.1 MB)

  • pyxpdf-0.2.3-cp35-cp35m-win_amd64.whl (1.3 MB)

  • pyxpdf-0.2.3-cp35-cp35m-win32.whl (1.1 MB)
