Python interface to Apache PDFBox command-line tools.

Package Description

Provides a simple Python 3 interface to the Apache PDFBox command-line tools.

Requirements

Aside from Python 3 and the packages specified in setup.py, python-pdfbox requires Java to be present on the system path.
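
Because Java must be discoverable on the system path, it can be useful to check for it before first use. The following is a minimal sketch using only the standard library; `find_java` is a hypothetical helper introduced here for illustration and is not part of python-pdfbox.

```python
import shutil


def find_java(executable="java"):
    """Return the full path of the Java launcher, or None if it is not on PATH."""
    return shutil.which(executable)


if __name__ == "__main__":
    java_path = find_java()
    if java_path is None:
        print("java was not found on the system path")
    else:
        print(f"java found at {java_path}")
```

If the check reports that Java is missing, installing a Java runtime and adding it to the path should satisfy the requirement.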

Installation

The package may be installed as follows:

pip install python-pdfbox

The location of the PDFBox jar file may be specified via the PDFBOX environment variable. If it is not set, python-pdfbox looks for the jar file in the platform-specific user cache directory, and automatically downloads and caches it if it is not present.
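
The lookup order described above can be sketched roughly as follows. This is an illustration only, not the package's actual code: the function `resolve_jar_path`, its `cache_dir` argument, and the default jar file name are all assumptions made for the example.

```python
import os
from pathlib import Path


def resolve_jar_path(cache_dir, jar_name="pdfbox-app.jar", environ=None):
    """Illustrative lookup: prefer the PDFBOX variable, else a cached jar.

    Returns a (path, needs_download) tuple. `cache_dir` and `jar_name`
    are hypothetical stand-ins for whatever the package really uses.
    """
    environ = os.environ if environ is None else environ
    override = environ.get("PDFBOX")
    if override:
        # An explicit location always wins and is never downloaded.
        return Path(override), False
    cached = Path(cache_dir) / jar_name
    # A download is only needed when no cached copy exists yet.
    return cached, not cached.exists()
```

In practice, one would set PDFBOX in the shell or through `os.environ` before using the package, so that the explicit location takes precedence over the cache.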

Usage

The interface currently exposes only two PDFBox features, text extraction and conversion to images:

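import pdfbox

# Create the interface object
p = pdfbox.PDFBox()

# Extract the text of the PDF
text = p.extract_text('/path/to/my_file.pdf')

# Convert the pages of the PDF to images
p.pdf_to_images('/path/to/my_file.pdf')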
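
When extracting text from files supplied by a user, it may help to validate the path before handing it to the interface. The sketch below assumes only the `PDFBox()` constructor and `extract_text()` call shown above; `extract_text_checked` and its `backend` parameter are hypothetical names added for illustration.

```python
from pathlib import Path


def extract_text_checked(pdf_path, backend=None):
    """Check that the file exists, then delegate to backend.extract_text()."""
    path = Path(pdf_path)
    if not path.is_file():
        raise FileNotFoundError(f"no such PDF: {path}")
    if backend is None:
        # Imported lazily so the helper can also be used with a stand-in backend.
        import pdfbox
        backend = pdfbox.PDFBox()
    return backend.extract_text(str(path))
```

Passing an explicit `backend` allows one `PDFBox` object to be reused across many files, and makes the helper easy to exercise without Java installed.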

Development

The latest release of the package may be obtained from GitHub.

Author

See the included AUTHORS.rst file for more information.

License

This software is licensed under the Apache 2.0 License. See the included LICENSE.rst file for more information.
