PDFMiner Wrapper & Other PDF utilities
pdf-wrangler
A PDFMiner wrapper that simplifies PDF extraction. More functionality is planned to make it a more general-purpose PDF utility.
Document class
The Document class represents a PDF document. It provides access to the raw text by page, the PDF metadata, and the document's images as PDFMiner LTImage objects.
Example Usage
from pdf_wrangler import Document
pdf_document = Document('path/to/pdf', password='optional password')
# to access pdf metadata
pdf_document.get_metadata()
# to access pdf text
pdf_document.get_text()
# to access pdf text on first page
pdf_document.pages[0].get_text()
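The per-page access shown above can be combined into small helpers. The following is a minimal sketch, not part of pdf-wrangler itself: `text_by_page` and `pages_containing` are hypothetical names, and the code assumes that `pages` is an ordered sequence and that each page's `get_text()` returns a string.

```python
from typing import Any, Dict, List


def text_by_page(document: Any) -> Dict[int, str]:
    """Map 1-based page numbers to the text of each page.

    Assumption: `document.pages` is an ordered sequence whose items
    expose `get_text()` returning a string, as in the usage above.
    """
    return {
        number: page.get_text()
        for number, page in enumerate(document.pages, start=1)
    }


def pages_containing(document: Any, term: str) -> List[int]:
    """Return the 1-based numbers of pages whose text contains `term`.

    The match is case-insensitive. Both helpers are hypothetical and
    rely only on `pages` and `get_text()`.
    """
    needle = term.lower()
    return [
        number
        for number, text in text_by_page(document).items()
        if needle in text.lower()
    ]


# Usage with a real document (requires pdf-wrangler to be installed):
#
#     from pdf_wrangler import Document
#     pdf_document = Document('path/to/pdf')
#     print(pages_containing(pdf_document, 'invoice'))
```

Because the helpers depend only on `pages` and `get_text()`, they can be exercised with simple stand-in objects before being pointed at a real PDF.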
Installation
To install, run:
pip install pdf-wrangler
Download files
Source Distribution
pdf_wrangler-0.0.24.tar.gz (3.7 kB)
Built Distribution
Hashes for pdf_wrangler-0.0.24-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | d645ca4c7213a67efe151134b7bd5daed7850fd5dd9efab5118ede183a284c46
MD5 | 8188a55975c7e4551e73091cddd9280a
BLAKE2b-256 | 6022dbc6a74ffe9e553538d566ea665db370bb327734884adaf7bafc3fbe2deb