Python interface to Apache PDFBox command-line tools.

Project description

Package Description

Provides a simple Python 3 interface to the Apache PDFBox command-line tools.

Requirements

Aside from Python 3 and the packages specified in setup.py, python-pdfbox requires the java executable to be present on the system path.
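
Because the package depends on a java executable being discoverable, it can help to check for one before use. The following is a minimal sketch using only the standard library; it is not part of python-pdfbox, and the helper names find_java and java_version and the search_path parameter are hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def find_java(search_path: Optional[str] = None) -> Optional[str]:
    """Return the path of the java executable, or None if it is not found.

    search_path is a hypothetical override (useful for testing);
    None means the PATH environment variable is searched.
    """
    return shutil.which("java", path=search_path)


def java_version(java_path: str) -> str:
    """Return the banner printed by `java -version`.

    The JVM writes this banner to stderr, so stderr is checked first.
    """
    result = subprocess.run(
        [java_path, "-version"],
        capture_output=True,  # requires Python 3.7 or later
        text=True,
        check=False,
    )
    return (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    java = find_java()
    if java is None:
        print("java was not found on the system path; python-pdfbox requires it")
    else:
        print(f"java found at {java}")
        print(java_version(java))
```

Running the script prints either the location and version banner of the detected java executable or a short message saying that none was found.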

Installation

The package may be installed as follows:

pip install python-pdfbox

The location of the PDFBox jar file may be specified via the PDFBOX environment variable. If it is not set, python-pdfbox looks for the jar file in the platform-specific user cache directory and, if the file is not present there, downloads and caches it automatically.
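
To catch a mistyped jar path early, the PDFBOX variable can be validated before the package is used. This is a sketch under stated assumptions, not the package's own lookup logic: the helper resolve_pdfbox_jar and its env parameter are hypothetical, and only the variable name PDFBOX comes from the description above.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_pdfbox_jar(env: Optional[Mapping[str, str]] = None) -> Optional[Path]:
    """Return the jar path named by the PDFBOX variable, or None if it is unset.

    env is a hypothetical parameter that defaults to os.environ; passing a
    plain dict makes the function easy to test. A value that points to a
    missing file raises FileNotFoundError instead of being silently ignored.
    """
    env = os.environ if env is None else env
    value = env.get("PDFBOX")
    if not value:
        # Unset or empty: python-pdfbox is described as falling back to the
        # user cache directory and downloading the jar if needed.
        return None
    jar = Path(value).expanduser()
    if not jar.is_file():
        raise FileNotFoundError(f"PDFBOX points to a missing file: {jar}")
    return jar
```

A return value of None means the automatic cache-and-download behavior described above would apply; a returned path is the jar the variable points to.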

Usage

The interface currently exposes only two PDFBox features, text extraction and conversion to images:

import pdfbox
p = pdfbox.PDFBox()
text = p.extract_text('/path/to/my_file.pdf')
p.pdf_to_images('/path/to/my_file.pdf')

Development

The latest release of the package may be obtained from GitHub.

Author

See the included AUTHORS.rst file for more information.

License

This software is licensed under the Apache 2.0 License. See the included LICENSE.rst file for more information.

Download files

Source Distribution

python-pdfbox-0.1.5.tar.gz (32.3 kB, Source)

Built Distribution

python_pdfbox-0.1.5-py3-none-any.whl (5.6 kB, Python 3)

File details

Details for the file python-pdfbox-0.1.5.tar.gz.

File metadata

  • Download URL: python-pdfbox-0.1.5.tar.gz
  • Upload date:
  • Size: 32.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.7.1

File hashes

Hashes for python-pdfbox-0.1.5.tar.gz
Algorithm    Hash digest
SHA256       4520f6a711d848fad22b6dc92a4e25ec4d539102f015991f30aeb116040cf5a2
MD5          b61076a1dca6ba6032562b81f659fe95
BLAKE2b-256  e5115942e30dc0d9dd01c34631be0d08887c14abade60e933338a53d0b9db722
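
A downloaded archive can be checked against the SHA256 digest listed above. The following sketch uses only the standard library; the helper names sha256_of_file and matches_expected are hypothetical, and the local file name in the usage lines is assumed to match the download.

```python
import hashlib

# SHA256 digest listed above for python-pdfbox-0.1.5.tar.gz
EXPECTED_SHA256 = "4520f6a711d848fad22b6dc92a4e25ec4d539102f015991f30aeb116040cf5a2"


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str = EXPECTED_SHA256) -> bool:
    """Compare a file's digest with the expected value, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, after downloading the source distribution into the current directory, matches_expected("python-pdfbox-0.1.5.tar.gz") should return True only if the file is byte-for-byte identical to the one the digest was computed from.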

File details

Details for the file python_pdfbox-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: python_pdfbox-0.1.5-py3-none-any.whl
  • Upload date:
  • Size: 5.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.7.1

File hashes

Hashes for python_pdfbox-0.1.5-py3-none-any.whl
Algorithm    Hash digest
SHA256       14e4bd757347a60ca98202d46152ed1c70196e7b6b1bd2d76a0683b3d23dc2e9
MD5          65560612a7b6c2d5a2374896f3c0f9f8
BLAKE2b-256  adfbe47e39b7525eb654edde5d34202abe26b6951cbd3e828a181c8314df2ccc
