Python interface to Apache PDFBox command-line tools.

Package Description

Provides a simple Python 3 interface to the Apache PDFBox command-line tools.

Requirements

Aside from Python 3 and the packages specified in setup.py, python-pdfbox requires the java executable to be present on the system path.
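
Before installing, it can help to confirm that the Java requirement is met. The following is a minimal sketch using only the standard library; the helper names find_java and java_version are illustrative and are not part of python-pdfbox.

```python
import shutil
import subprocess


def find_java():
    """Return the full path to the java executable, or None if it is not on PATH."""
    return shutil.which("java")


def java_version(java_path):
    """Return the banner printed by `java -version`.

    The JVM writes this banner to stderr rather than stdout, so stderr is
    checked first.
    """
    result = subprocess.run(
        [java_path, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stderr.strip() or result.stdout.strip()


if __name__ == "__main__":
    java = find_java()
    if java is None:
        print("java was not found on the system path")
    else:
        print(java)
        print(java_version(java))
```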

Installation

The package may be installed as follows:

pip install python-pdfbox

The location of the PDFBox jar file may be specified via the PDFBOX environment variable. If it is not set, python-pdfbox looks for the jar file in the platform-specific user cache directory, and downloads and caches it automatically if it is not present.
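
The lookup order described above can be sketched as follows. This only illustrates the documented behaviour and is not the package's own code: the function resolve_jar and the file name pdfbox-app.jar are hypothetical, and the cache directory is passed in by the caller because the exact platform-specific location is not stated here.

```python
import os


def resolve_jar(env, cache_dir, jar_name="pdfbox-app.jar"):
    """Return a (path, needs_download) tuple following the documented order.

    env       -- a mapping such as os.environ
    cache_dir -- stand-in for the platform-specific user cache directory
    jar_name  -- hypothetical file name, used for illustration only
    """
    override = env.get("PDFBOX")
    if override:
        # An explicit PDFBOX setting takes precedence over the cache.
        return override, False
    cached = os.path.join(cache_dir, jar_name)
    # Without an override, the cached jar is used; it has to be
    # downloaded first if it is not there yet.
    return cached, not os.path.exists(cached)
```

In practice, setting the variable before the interface is created (for example, os.environ['PDFBOX'] = '/path/to/pdfbox.jar') should be enough to select a specific jar file.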

Usage

The interface currently exposes the text extraction feature of PDFBox only:

import pdfbox
p = pdfbox.PDFBox()
text = p.extract_text('/path/to/my_file.pdf')
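
To process several files, the call above can be wrapped in a small helper. This is a sketch that assumes extract_text returns the extracted text, as the usage example shows; the name extract_many and its extract parameter are illustrative and are not part of python-pdfbox.

```python
def extract_many(paths, extract=None):
    """Map each PDF path to its extracted text.

    extract is any callable that takes a path and returns text. When it is
    omitted, the extract_text method of a PDFBox instance is used.
    """
    if extract is None:
        # Imported lazily so the helper can be exercised without Java
        # or python-pdfbox installed.
        import pdfbox
        extract = pdfbox.PDFBox().extract_text
    return {path: extract(path) for path in paths}
```

Accepting the extractor as a parameter makes it possible to substitute a stub in tests, so the helper can be checked without invoking Java.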

Development

The latest release of the package may be obtained from GitHub.

Author

See the included AUTHORS.rst file for more information.

License

This software is licensed under the Apache 2.0 License. See the included LICENSE.rst file for more information.

Download files

Source Distribution

python-pdfbox-0.1.4.tar.gz (30.1 kB)

Uploaded Source

Built Distribution

python_pdfbox-0.1.4-py3-none-any.whl (4.9 kB)

Uploaded Python 3

File details

Details for the file python-pdfbox-0.1.4.tar.gz.

File metadata

  • Download URL: python-pdfbox-0.1.4.tar.gz
  • Size: 30.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.1

File hashes

Hashes for python-pdfbox-0.1.4.tar.gz
Algorithm Hash digest
SHA256 847755dd099f7a66d4c9291bef02f0aa17f1a8bd31e2896798413d0169b08df9
MD5 d417c0a4d3324b6d50845a057c4f24e8
BLAKE2b-256 d72e049d1d2f8fff27f967844ecda17c3187d4ae50a9e627c4d10f4126ae3c52
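
A downloaded file can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch; the helper names sha256_of and matches_published are illustrative.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's digest with a published hex digest, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_published('python-pdfbox-0.1.4.tar.gz', '847755dd099f7a66d4c9291bef02f0aa17f1a8bd31e2896798413d0169b08df9') should be true for an intact copy of the source distribution.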

File details

Details for the file python_pdfbox-0.1.4-py3-none-any.whl.

File metadata

  • Download URL: python_pdfbox-0.1.4-py3-none-any.whl
  • Size: 4.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.1

File hashes

Hashes for python_pdfbox-0.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 de931778d4de1868591b3f9da6cb9f5df93a6bc3e9a4ae4a5ac704cdda65761b
MD5 56b80752c7fc64e6519ae8014b28eef3
BLAKE2b-256 81387667b7ce4479460e2805755a206e9f06bed71d2e9fac08a6d7b1587fd7c6
