Powerful and Pythonic PDF processing library based on xpdf-4.02

Project description

pyxpdf is a fast and memory-efficient Python module for parsing PDF documents, based on the xpdf reader sources.

Features

  • Almost 20 times faster than pure-Python PDF parsers (see Speed Comparison)
  • Extract text while preserving the original document layout as closely as possible
  • Support almost all PDF encodings, CMaps, and predefined CMaps
  • Extract LZW, RLE, CCITTFax, DCT, JBIG2, and JPX compressed images and image masks, along with their bounding boxes (BBox)
  • Render PDF pages as images, with support for the ‘1’, ‘L’, ‘LA’, ‘RGB’, ‘RGBA’, and ‘CMYK’ color modes
  • No explicit dependencies (except optional ones, see Installation)
  • Thread safe

License

pyxpdf is licensed under the GNU General Public License (GPL), version 3. See the LICENSE file.

Credits

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • pyxpdf-0.2.3.tar.gz (1.9 MB)

Built Distributions

  • pyxpdf-0.2.3-cp38-cp38-win_amd64.whl (1.3 MB)
  • pyxpdf-0.2.3-cp38-cp38-win32.whl (1.1 MB)
  • pyxpdf-0.2.3-cp37-cp37m-win_amd64.whl (1.3 MB)
  • pyxpdf-0.2.3-cp37-cp37m-win32.whl (1.1 MB)
  • pyxpdf-0.2.3-cp36-cp36m-win_amd64.whl (1.3 MB)
  • pyxpdf-0.2.3-cp36-cp36m-win32.whl (1.1 MB)
  • pyxpdf-0.2.3-cp35-cp35m-win_amd64.whl (1.3 MB)
  • pyxpdf-0.2.3-cp35-cp35m-win32.whl (1.1 MB)
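
If you are unsure which built distribution matches your setup, the wheel filenames above encode the answer: under the standard wheel naming convention (PEP 427) the last three dash-separated fields are the Python tag, the ABI tag, and the platform tag. The sketch below is not part of pyxpdf; the helper names (`parse_wheel_filename`, `current_tags`, `matches_interpreter`) are hypothetical, it assumes a CPython interpreter, and it performs only a rough exact-match comparison rather than the full compatibility check that pip does.

```python
import sys
import sysconfig
from typing import NamedTuple, Tuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # name-version[-build]-python-abi-platform: 5 fields, or 6 with a build tag.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(parts[0], parts[1], python_tag, abi_tag, platform_tag)


def current_tags() -> Tuple[str, str]:
    """Return (python_tag, platform_tag) for the running interpreter.

    Assumes CPython; the platform tag is derived by normalizing
    sysconfig.get_platform(), e.g. 'win-amd64' becomes 'win_amd64'.
    """
    python_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    platform_tag = sysconfig.get_platform().replace("-", "_").replace(".", "_")
    return python_tag, platform_tag


def matches_interpreter(filename: str) -> bool:
    """Rough check: do the wheel's Python and platform tags equal ours?"""
    tags = parse_wheel_filename(filename)
    return (tags.python_tag, tags.platform_tag) == current_tags()


if __name__ == "__main__":
    print("This interpreter:", current_tags())
    for wheel in (
        "pyxpdf-0.2.3-cp38-cp38-win_amd64.whl",
        "pyxpdf-0.2.3-cp37-cp37m-win32.whl",
    ):
        print(wheel, "->", "match" if matches_interpreter(wheel) else "no match")
```

When no listed wheel matches, as on non-Windows platforms for this release, pip falls back to building from the source distribution.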
