Crop, rotate, and extract text from your PDFs so you can delete them
Project description
Delete-Your-PDF
Delete your PDF is a set of tools to export information from your PDFs so you can delete them.
Image files can be passed in as either base64 strings or BytesIO objects.
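Since image inputs may be either base64 strings or BytesIO objects, it can help to convert between the two forms. The following is a minimal sketch using only the standard library. The helper names toBytesIO and toBase64 are hypothetical and not part of delete-your-pdf, and the sketch assumes the base64 strings carry no "data:" URI prefix.

```python
import base64
import io


def toBytesIO(b64_string: str) -> io.BytesIO:
    """Decode a base64 string into an in-memory binary stream."""
    return io.BytesIO(base64.b64decode(b64_string))


def toBase64(stream: io.BytesIO) -> str:
    """Encode the full contents of a BytesIO object as a base64 string."""
    return base64.b64encode(stream.getvalue()).decode("ascii")


# Round trip: the bytes survive conversion in both directions.
original = io.BytesIO(b"\x89PNG\r\n\x1a\n")
assert toBytesIO(toBase64(original)).getvalue() == original.getvalue()
```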
Pip Library: https://pypi.org/project/delete-your-pdf/
Pip Repo: https://github.com/darefail/Delete-Your-PDF
Live Demo
Live Demo: https://pdf.darefail.com
Demo Opensource Repo: https://github.com/DareFail/AI-Video-Boilerplate-Pro
Installation
pip install delete-your-pdf
How to use
countPdfPages: Counts the number of pages in a PDF and returns an int
from deleteYourPDF import countPdfPages
numberOfPages = countPdfPages(file="PDF_FILE_HERE")
pdfToImagePages: Converts a PDF to a list of pages, each a PNG image encoded as a base64 string
from deleteYourPDF import pdfToImagePages
# Return a list containing all pages in order as images
listOfImagePages = pdfToImagePages(file="PDF_FILE_HERE")
# Return a list containing only an image of page 7
listOfImagePages = pdfToImagePages(file="PDF_FILE_HERE", page_number=7)
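If the pages should be kept as files before the PDF is deleted, the base64 strings can be decoded and written to disk. This is a sketch under assumptions: savePagesAsPng is a hypothetical helper, not part of delete-your-pdf, and the file naming scheme is arbitrary. It also strips a "data:" URI prefix in case one is present, although the documentation above does not say whether the library adds one.

```python
import base64
from pathlib import Path


def savePagesAsPng(listOfImagePages, outputDir):
    """Write each base64-encoded page to outputDir as page_001.png, page_002.png, ..."""
    outputPath = Path(outputDir)
    outputPath.mkdir(parents=True, exist_ok=True)
    savedFiles = []
    for index, pageBase64 in enumerate(listOfImagePages, start=1):
        # Assumption: drop a "data:image/png;base64," prefix if one is present.
        if pageBase64.startswith("data:"):
            pageBase64 = pageBase64.split(",", 1)[1]
        target = outputPath / f"page_{index:03d}.png"
        target.write_bytes(base64.b64decode(pageBase64))
        savedFiles.append(target)
    return savedFiles
```

The list returned by pdfToImagePages could be passed in directly, for example savePagesAsPng(listOfImagePages, "pages").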
imageWidthHeight: Gets the width and height of an image in pixels as a dictionary, e.g. {"width": 100, "height": 100}
from deleteYourPDF import imageWidthHeight
image_dimensions = imageWidthHeight(file="IMAGE_FILE_HERE")
width = image_dimensions["width"]
height = image_dimensions["height"]
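To cross-check the returned dictionary against a PNG page produced by pdfToImagePages, the dimensions can also be read straight from the PNG header. This is a sketch using only the standard library. pngDimensionsFromBase64 is a hypothetical helper, not part of delete-your-pdf, and it is not necessarily how the library computes these values.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def pngDimensionsFromBase64(pngBase64):
    """Read width and height from the IHDR chunk of a base64-encoded PNG."""
    data = base64.b64decode(pngBase64)
    # A PNG starts with an 8-byte signature, a 4-byte chunk length, then "IHDR".
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    # IHDR data starts at byte 16: 4-byte big-endian width, then height.
    width, height = struct.unpack(">II", data[16:24])
    return {"width": width, "height": height}
```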
cropRotateImage: Crop and rotate an image and return a PNG image as a base64 string
from deleteYourPDF import cropRotateImage
# Returns the top-left 100x100 square of an image, rotated 90 degrees to the right; the new image dimensions will match the rotation
croppedAndRotatedImage = cropRotateImage(file="IMAGE_FILE_HERE", x=0, y=0, width=100, height=100, rotation_degrees=90)
# Returns the top-left 100x100 square of an image, rotated 30 degrees, while keeping the original image dimensions
croppedAndRotatedImage = cropRotateImage(file="IMAGE_FILE_HERE", x=0, y=0, width=100, height=100, rotation_degrees=30, expand_for_rotation=False)
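To estimate how large the output becomes when the dimensions follow the rotation, the bounding box of a rotated rectangle can be computed directly. This sketch assumes that expanding for rotation means enlarging the canvas to the smallest upright box containing the rotated crop, which the descriptions above imply but do not spell out. rotatedBoundingBox is a hypothetical helper, not part of delete-your-pdf, and the library's rounding may differ by a pixel.

```python
import math


def rotatedBoundingBox(width, height, rotation_degrees):
    """Return the (width, height) of the smallest upright box that holds the rotated crop."""
    angle = math.radians(rotation_degrees)
    cosA, sinA = abs(math.cos(angle)), abs(math.sin(angle))
    newWidth = round(width * cosA + height * sinA)
    newHeight = round(width * sinA + height * cosA)
    return newWidth, newHeight


print(rotatedBoundingBox(100, 50, 90))   # (50, 100)
print(rotatedBoundingBox(100, 100, 30))  # (137, 137)
```

A 90-degree rotation simply swaps width and height, while a 30-degree rotation of a 100x100 crop needs roughly a 137x137 canvas to avoid clipping the corners.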
imageToText_Roboflow: Converts an image to text with Roboflow OCR and returns a string
from deleteYourPDF import imageToText_Roboflow
# Returns the text from a local image file
text = imageToText_Roboflow(file="IMAGE_FILE_HERE", api_key="ROBOFLOW_API_KEY_HERE")
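Rather than hardcoding the API key in source files, it can be read from the environment. This is a minimal sketch: getRoboflowApiKey is a hypothetical helper and ROBOFLOW_API_KEY is an assumed variable name. Neither is defined by delete-your-pdf.

```python
import os


def getRoboflowApiKey(envVar="ROBOFLOW_API_KEY"):
    """Return the API key stored in the given environment variable."""
    api_key = os.environ.get(envVar, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {envVar} environment variable first")
    return api_key
```

The result can then be passed as the api_key argument, for example imageToText_Roboflow(file="IMAGE_FILE_HERE", api_key=getRoboflowApiKey()).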
Example 1: Convert the top 100 pixels of all pages of a PDF to a list of text
from deleteYourPDF import pdfToImagePages, imageToText_Roboflow, cropRotateImage, imageWidthHeight
listOfText = []
listOfImagePages = pdfToImagePages(file="PDF_FILE_HERE")
for imagePage in listOfImagePages:
    # Use the full page width so the crop spans the whole top strip
    image_dimensions = imageWidthHeight(file=imagePage)
    width = image_dimensions["width"]
    croppedAndRotatedImage = cropRotateImage(file=imagePage, x=0, y=0, width=width, height=100)
    listOfText.append(imageToText_Roboflow(file=croppedAndRotatedImage, api_key="ROBOFLOW_API_KEY_HERE"))
print(listOfText)
Example 2: Crop a 100x100 box from the center of page 7 of a PDF, rotate it 90 degrees to the right, and print the text
from deleteYourPDF import countPdfPages, pdfToImagePages, imageToText_Roboflow, cropRotateImage, imageWidthHeight
if countPdfPages(file="PDF_FILE_HERE") > 7:
    # pdfToImagePages returns a list, so take its only element
    imagePage = pdfToImagePages(file="PDF_FILE_HERE", page_number=7)[0]
    image_dimensions = imageWidthHeight(file=imagePage)
    width = image_dimensions["width"]
    height = image_dimensions["height"]
    # Integer division keeps the crop origin on whole pixels
    x = (width - 100) // 2
    y = (height - 100) // 2
    croppedAndRotatedImage = cropRotateImage(file=imagePage, x=x, y=y, width=100, height=100, rotation_degrees=90)
    print(imageToText_Roboflow(file=croppedAndRotatedImage, api_key="ROBOFLOW_API_KEY_HERE"))
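The centering arithmetic in Example 2 can be factored into a small helper that also guards against pages smaller than the requested box. This is a sketch only: centeredBox is a hypothetical function, not part of delete-your-pdf. Its dictionary keys are chosen to match the x, y, width, and height parameters of cropRotateImage shown above.

```python
def centeredBox(imageWidth, imageHeight, boxWidth=100, boxHeight=100):
    """Return integer x, y, width, height for a box centered in the image."""
    # Never ask for a box larger than the image itself.
    boxWidth = min(boxWidth, imageWidth)
    boxHeight = min(boxHeight, imageHeight)
    return {
        "x": (imageWidth - boxWidth) // 2,
        "y": (imageHeight - boxHeight) // 2,
        "width": boxWidth,
        "height": boxHeight,
    }
```

The result can be unpacked into the call, for example cropRotateImage(file=imagePage, rotation_degrees=90, **centeredBox(width, height)).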
Acknowledgements
Thanks to Roboflow for sponsoring this project. Get your free API key at: Roboflow
License
Distributed under the Apache 2.0 License. See LICENSE for more information.
Contact
Twitter: @darefailed
YouTube: How-to video coming soon
Project Link: https://github.com/darefail/Delete-Your-PDF
Update Package
python3 -m build
python3 -m twine upload dist/*
File details
Details for the file delete_your_pdf-1.0.3.tar.gz.
File metadata
- Download URL: delete_your_pdf-1.0.3.tar.gz
- Upload date:
- Size: 7.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | e7f25b9ef1fde1768e6aa705f0658db2d816d615478332527363ad7d436d0b86
MD5 | 2af418838215df5255558dffdbc074d5
BLAKE2b-256 | 64813c7a112e0371bd5f0fbbf03b2d9713d32571d4c3a7bf788edcb325e7bb18
File details
Details for the file delete_your_pdf-1.0.3-py3-none-any.whl.
File metadata
- Download URL: delete_your_pdf-1.0.3-py3-none-any.whl
- Upload date:
- Size: 7.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.12.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 81ad63ae29f3f4d57b01cb3e4bad95c56b748da07c936ee0c04018e1a3fff0f8
MD5 | 339a30de8a65e3ca992eed0ff60e3067
BLAKE2b-256 | 7df9f6e143afc4a89d69e9b52c962af6b457734488406177500bdbc593b430be