File Crusher
Compresses PDFs with PNG compression.
Tired of bumping against upload size limits? This tool compresses PDFs and PNGs by combining several of the best compression tools in one. It can be slow, but it shrinks file sizes substantially and helps you get under limits such as the common 5 MB upload cap.
It works by splitting a PDF into PNGs and compressing them with advpng, pngcrush and pngquant. It then combines them back into a PDF and applies a round of lossless PDF compression. Optionally, it can apply OCR (Optical Character Recognition) to make a scanned PDF searchable. It also exposes its internal processors, so you can use it as a PNG compressor and file converter.
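To make the PNG stage more concrete, here is a minimal sketch of how the three tools could be chained from Python. It is not FileCrusher's internal code: the function names, the stage order, the intermediate file name and the chosen flags are all assumptions for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_png_commands(src: str, dst: str) -> list[list[str]]:
    """Return command lines for a three-stage PNG chain (assumed order)."""
    # Hypothetical intermediate file, e.g. out.png -> out.quant.png
    tmp = str(Path(dst).with_suffix(".quant.png"))
    return [
        # Lossy palette reduction; --force overwrites an existing output file.
        ["pngquant", "--force", "--output", tmp, src],
        # Lossless re-encoding that also removes ancillary chunks.
        ["pngcrush", "-rem", "alla", tmp, dst],
        # In-place lossless recompression at the strongest level.
        ["advpng", "-z", "-4", dst],
    ]


def compress_png(src: str, dst: str) -> None:
    """Run the chain, failing early if a tool is missing from PATH."""
    commands = build_png_commands(src, dst)
    missing = [cmd[0] for cmd in commands if shutil.which(cmd[0]) is None]
    if missing:
        raise FileNotFoundError(f"missing tools: {', '.join(missing)}")
    for cmd in commands:
        subprocess.run(cmd, check=True)
    # Remove the intermediate file written by the first stage.
    Path(commands[0][3]).unlink(missing_ok=True)
```

Only the first stage is lossy in this sketch; the other two re-encode the result without changing pixels.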
Installation
1. Install the Python library
pip install file-crusher
2. Install the compression tools
Windows
Already bundled in the compressor_lib directory.
Linux (Ubuntu)
sudo apt install pngquant -y && sudo apt install advancecomp -y && sudo apt install pngcrush -y
Then install Wine, which is needed to run cpdfsqueeze:
apt install wine -y
3. Optionally install pytesseract for OCR
Windows (via GUI)
Download and install Tesseract. During setup, select any additional languages you want (for example, German under Additional Language Data).
Linux
apt install tesseract-ocr
Add additional language packs:
apt install tesseract-ocr-<language-shortform> -y
Example for German:
apt install tesseract-ocr-deu -y
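After installing, it can be useful to confirm that the external tools are actually reachable. The snippet below is a small sketch, not part of file-crusher: the tool lists and the helper name `find_missing` are assumptions based on the packages named above (the advancecomp package provides the `advpng` binary).

```python
import shutil

# Executables expected on PATH on Linux (assumed from the install steps above).
REQUIRED_TOOLS = ["pngquant", "advpng", "pngcrush", "wine"]
# Only needed when OCR is used.
OPTIONAL_TOOLS = ["tesseract"]


def find_missing(tools):
    """Return the tools from the given list that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    for label, tools in (("required", REQUIRED_TOOLS), ("optional", OPTIONAL_TOOLS)):
        missing = find_missing(tools)
        status = "all found" if not missing else "missing: " + ", ".join(missing)
        print(f"{label} tools: {status}")
```

On Windows the compression tools ship in the compressor_lib directory, so a PATH check like this mainly applies to Linux installs.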
Usage
CLI Usage
# for pdfs
python3 -m file_crusher input.pdf output.pdf --pdfcompressor
# or for pngs
python3 -m file_crusher input.png output.png --pngcompressor
# for other processors see
python3 -m file_crusher --help
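To compress a whole folder, the CLI calls shown above can be wrapped in a short script. This is a sketch under assumptions: `build_cli_command` and `crush_directory` are hypothetical helpers, and the script simply invokes the documented command once per file. Whether the CLI itself accepts directories is not covered here; check `--help` for that.

```python
import subprocess
import sys
from pathlib import Path

# Flags taken from the CLI examples above, keyed by file extension.
FLAGS = {".pdf": "--pdfcompressor", ".png": "--pngcompressor"}


def build_cli_command(src: Path, dst: Path) -> list[str]:
    """Build the file_crusher command line for one PDF or PNG file."""
    flag = FLAGS[src.suffix.lower()]
    return [sys.executable, "-m", "file_crusher", str(src), str(dst), flag]


def crush_directory(in_dir: str, out_dir: str) -> None:
    """Compress every PDF and PNG in in_dir, writing results to out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(in_dir).iterdir()):
        if src.suffix.lower() in FLAGS:
            subprocess.run(build_cli_command(src, out / src.name), check=True)
```

Using `sys.executable` instead of a hard-coded `python3` keeps the call inside the interpreter where file-crusher is installed.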
Python Usage
from file_crusher import PNGCompressor, PDFCompressor
compressor = PNGCompressor()
compressor.process_file("input.png", "output.png")
# extreme mode
compressor = PNGCompressor(0)
compressor.process_file("input.png", "output.png")
# fast mode
compressor = PNGCompressor(5)
compressor.process_file("input.png", "output.png")
# also check the other options
compressor = PDFCompressor(default_pdf_dpi=200)
compressor.process_file("input.pdf", "output.pdf")
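To see how much a given setting actually saves, the `process_file` call can be wrapped in a small helper that compares file sizes. This is a sketch, not part of the library: `saved_fraction` and `compress_and_report` are hypothetical names, and the helper only assumes an object with a `process_file(src, dst)` method like the compressors shown above.

```python
import os


def saved_fraction(before_bytes: int, after_bytes: int) -> float:
    """Return the fraction of bytes saved (0.25 means 25% smaller)."""
    if before_bytes <= 0:
        return 0.0
    return 1 - after_bytes / before_bytes


def compress_and_report(compressor, src: str, dst: str) -> float:
    """Run compressor.process_file(src, dst) and return the fraction saved."""
    compressor.process_file(src, dst)
    return saved_fraction(os.path.getsize(src), os.path.getsize(dst))
```

For example, `compress_and_report(PNGCompressor(), "input.png", "output.png")` would return the share of bytes saved for that file, which makes it easy to compare modes on your own data.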
Disclaimer
Lossy compression discards quality or data, so always check the output file to make sure it meets your requirements.
If you run into problems while using the library or have suggestions for improving it, please create an issue: https://github.com/pIlIp-d/FileCrusher/issues