CLI utility to remove PII watermarks from pdfs downloaded from Move USP/ESALQ
Move Unmarker
Very small CLI utility to remove PII watermarks from pdfs downloaded from Move USP/ESALQ, using PyMuPDF.
Beware that there is no input sanitization or error checking; you are on your own. This tool will run (as opposed to work) without fail on any
pdf that has at least one content stream per page, which is basically every pdf in the wild. It only does its job when the watermark is the 2nd
content stream of every page. On any other pdf it will either do nothing (apart from changing compression options and possibly other idiosyncrasies
of how PyMuPDF writes a pdf) or, if a page has 2 or more content streams, keep just the first one and likely make the file useless,
though it will still open. Most pdf writers concatenate multiple content streams into one, so on an arbitrary pdf chances are it won't do anything, or it will just crash.
This tool will overwrite, without confirmation, any file with the same name as --output (default "unmarked.pdf").
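The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual source: the function names `keep_first_content_stream` and `unmark` are hypothetical, and the only PyMuPDF calls assumed are `Page.get_contents()`, `Page.set_contents()`, `pymupdf.open()` and `Document.save()`.

```python
def keep_first_content_stream(doc) -> int:
    """Point every page's /Contents at its first content stream only.

    `doc` is anything that iterates over pages exposing get_contents()
    and set_contents(), such as a pymupdf.Document.
    Returns the number of pages that were changed.
    """
    changed = 0
    for page in doc:
        xrefs = page.get_contents()  # xrefs of the page's content streams
        first = xrefs[0]  # raises IndexError if the page has no content stream
        if len(xrefs) > 1:
            # Drop every stream after the first, including the assumed
            # watermark in 2nd position.
            page.set_contents(first)
            changed += 1
    return changed


def unmark(input_path: str, output_path: str = "unmarked.pdf", garbage: int = 1) -> None:
    """Hypothetical end-to-end helper: open, strip, save (overwrites output_path)."""
    import pymupdf  # imported here so the helper above has no hard dependency

    doc = pymupdf.open(input_path)
    try:
        keep_first_content_stream(doc)
        # garbage >= 1 lets PyMuPDF discard objects that are no longer referenced
        doc.save(output_path, garbage=garbage)
    finally:
        doc.close()
```

With this shape, a page with a single content stream is left untouched, a page with several keeps only the first, and a page with none raises an error, which matches the caveats listed above.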
Installation
- Make sure Python 3.8 or higher and pip are installed
- Run
pip install move-unmarker
Usage
usage: unmarker [-h] [-o OUTPUT] [-g GARBAGE] input
Utility to remove PII watermarks from pdfs downloaded from Move USP/ESALQ.
positional arguments:
input input filename
options:
-h, --help show this help message and exit
-o OUTPUT, --output OUTPUT
output filename (default: "unmarked.pdf")
-g GARBAGE, --garbage GARBAGE
level of garbage collection (default: 1)
See the pymupdf.Document.save method for more details on garbage collection.
TLDR
unmarker watermarked.pdf
unmarker -o unmarked.pdf watermarked.pdf
unmarker --garbage 3 watermarked.pdf
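For reference, a parser that would produce the usage text and accept the invocations above could look like the sketch below. It is reconstructed from the help output only; `build_parser` is a hypothetical name, and parsing `--garbage` as an integer is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build an argparse parser mirroring the documented `unmarker` options."""
    parser = argparse.ArgumentParser(
        prog="unmarker",
        description="Utility to remove PII watermarks from pdfs "
        "downloaded from Move USP/ESALQ.",
    )
    parser.add_argument("input", help="input filename")
    parser.add_argument(
        "-o",
        "--output",
        default="unmarked.pdf",
        help='output filename (default: "unmarked.pdf")',
    )
    parser.add_argument(
        "-g",
        "--garbage",
        type=int,  # assumption: the level is passed on as an integer
        default=1,
        help="level of garbage collection (default: 1)",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.input, args.output, args.garbage)
```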
Development
- Check Python's version
    python -V
- Install Python 3.8 or higher and pip, if they aren't already installed:
  - Windows
      winget install Python.Python.3.X
    (replace X with the desired minor version)
  - Ubuntu/Debian based distros
      apt install python3 python3-pip
  - Arch based distros
      pacman -S python python-pip
  - Fedora
      dnf install python3 python3-pip
- Clone this repo
    git clone https://github.com/joaofauvel/move-unmarker.git && cd move-unmarker
- Install requirements
    poetry install