Python script to make documents look like they were scanned.
Project description
look-like-scanned
- Local, private, secure, open-source and transparent!
- Converts every page of a given PDF file into an image-based page and applies a random askew (tilt) and a very mild brightness effect to simulate the appearance of scanned documents.
- The resulting pages are then combined back into an output PDF file.
- Granular CLI options to combine / convert image files into PDF as well.
- Supports conversion of multi-page TIFF files and password-protected PDF files.
- Output PDF files are saved in the same folder with an "_output" suffix, e.g. "File_Name_output.pdf".
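The page-by-page idea described above can be sketched with Pillow alone. This is a minimal illustration under stated assumptions, not the project's actual code: the function names (`make_page_look_scanned`, `output_path_for`, `images_to_scanned_pdf`), the angle and brightness ranges, and the white corner fill are all hypothetical, and rendering PDF pages to images (which the package does with pypdfium2) is left out, so the input here is a list of image files.

```python
import random
from pathlib import Path

from PIL import Image, ImageEnhance


def make_page_look_scanned(page, max_angle=0.5, max_brightness_shift=0.05, rng=None):
    """Apply a small random tilt and a mild brightness change to one page image."""
    rng = rng or random.Random()
    page = page.convert("RGB")
    # Assumed range: tilt by at most +/- max_angle degrees, keeping the page size
    # and filling the exposed corners with white.
    angle = rng.uniform(-max_angle, max_angle)
    tilted = page.rotate(angle, resample=Image.BICUBIC, expand=False,
                         fillcolor=(255, 255, 255))
    # A factor of 1.0 leaves brightness unchanged; stay close to it.
    factor = 1.0 + rng.uniform(-max_brightness_shift, max_brightness_shift)
    return ImageEnhance.Brightness(tilted).enhance(factor)


def output_path_for(input_path):
    """Build '<name>_output.pdf' next to the input file."""
    input_path = Path(input_path)
    return input_path.with_name(input_path.stem + "_output.pdf")


def images_to_scanned_pdf(image_paths, output_pdf):
    """Process each image and combine the results into one image-based PDF."""
    pages = []
    for path in image_paths:
        with Image.open(path) as img:
            pages.append(make_page_look_scanned(img))
    if not pages:
        raise ValueError("no input images given")
    pages[0].save(output_pdf, "PDF", save_all=True, append_images=pages[1:])
```

Calling `images_to_scanned_pdf(["a.png", "b.png"], output_path_for("a.png"))` would write `a_output.pdf` beside the first input file.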
Installation
Install from the Python Package Index (PyPI)
pip install look-like-scanned
Or, to install the latest version from GitHub:
git clone https://github.com/navchandar/look-like-scanned.git
cd look-like-scanned
pip install poetry
poetry install
pip install .
Verify Installation:
# Print help message and usage options available
scanner -h
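The same check can be done from Python. The sketch below uses only the standard library; the helper names are hypothetical, and it assumes the distribution is named look-like-scanned and the console command is scanner, as shown above.

```python
import shutil
from importlib import metadata


def installed_version(dist_name="look-like-scanned"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def scanner_on_path(command="scanner"):
    """Report whether the CLI entry point can be found on PATH."""
    return shutil.which(command) is not None


if __name__ == "__main__":
    print("version:", installed_version())
    print("scanner command found:", scanner_on_path())
```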
Usage
This package uses PIL and pypdfium2 to convert and manipulate image and pdf objects.
This is extended to provide a command-line interface (CLI) for easy usage.
# Convert all pdf files in folder to scanned pdf
scanner -i .\tests
scanner -i .\tests -f "pdf"
# Convert all pdf files in folder to scanned pdf, set contrast, sharpness and brightness factors
scanner -i .\tests -c 2 -sh 10 -br 2
# Convert all pdf files in folder to scanned pdf without askew
scanner -i .\tests -a no
# Convert specific pdf file in folder to scanned pdf
scanner -i .\tests -f "test.pdf"
# Convert all jpg, jpeg, png, webp files in folder to one pdf file
scanner -i .\tests -f "image"
# Convert all image files in folder in the order of file names
scanner -i .\tests -f "image" -s "name"
# Convert all png files in folder to one pdf file with 100% quality
scanner -i .\tests -f "png" -q 100
# Convert specific jpg file in folder to a pdf file with 75% quality
scanner -i .\tests -f "JPG_Test.jpg" -q 75
# Convert all PDF files including sub folders
scanner -i .\tests -f "pdf" -r yes
# Convert all Images including sub folders into one PDF
scanner -i .\tests -f "image" -r yes
# Convert all PDF files including sub folders and save in black & white format
scanner -i .\tests -f "pdf" -r yes -b yes
# Convert all png files including sub folders to black & white and make them a little blurry
scanner -i .\tests -f "png" -r yes -b yes -l yes
# Convert all pdf files with a slight amount of noise (grain)
scanner -i .\tests -f "pdf" -n 2
# Convert all pdf files with depth of field (uneven blur)
scanner -i .\tests -f "pdf" -v yes
# Add noise, uneven blur, and make it look like a photocopy
scanner -i .\tests -f "pdf" -n 20 -v yes -b yes
# Convert specific image with heavy noise
scanner -i .\tests -f "test.jpg" -n 50
# Simulate a "Bad/Old Scanner" (Low quality, high noise, blur, and high contrast)
scanner -i .\tests -q 75 -n 15 -l yes -c 1.4 -sh 0.8
# Simulate a "High-Quality/Modern Scanner" (High quality, slight noise for texture, sharpened)
scanner -i .\tests -q 95 -n 3 -a yes -sh 1.5 -br 1.1
# Target specific image formats only (e.g., just HEIC files from an iPhone)
scanner -i .\photos -f "heic"
# Convert specific locked PDF with password and save output without password
scanner -i .\secure_docs -f "Locked_doc.pdf" -p p@ss123
# Interactive Mode: Process encrypted PDFs by entering passwords when prompted
scanner -i .\secure_docs -f "pdf"
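For readers curious how options such as -c, -sh, -br, -b, -l and -n in the commands above might translate into image operations, here is a rough Pillow sketch. It is an assumption-laden illustration, not the package's implementation: the function names, the order of operations, the blur radius, and the rule that a noise value of n touches roughly n/1000 of the pixels are all made up for this example.

```python
import random

from PIL import Image, ImageEnhance, ImageFilter


def add_salt_and_pepper(img, amount, rng=None):
    """Set random pixels to pure black or white (amount is 0 to 100)."""
    rng = rng or random.Random()
    out = img.copy()
    width, height = out.size
    # Assumed density: amount / 1000 of all pixels are replaced.
    count = int(width * height * amount / 1000)
    white = 255 if out.mode == "L" else (255, 255, 255)
    black = 0 if out.mode == "L" else (0, 0, 0)
    for _ in range(count):
        xy = (rng.randrange(width), rng.randrange(height))
        out.putpixel(xy, rng.choice((black, white)))
    return out


def apply_effects(img, contrast=1.0, sharpness=1.0, brightness=1.0,
                  black_and_white=False, blur=False, noise=0):
    """Apply enhancement factors, optional grayscale, blur and noise."""
    img = img.convert("L" if black_and_white else "RGB")
    # For each enhancer, a factor of 1.0 returns the image unchanged.
    img = ImageEnhance.Contrast(img).enhance(contrast)
    img = ImageEnhance.Sharpness(img).enhance(sharpness)
    img = ImageEnhance.Brightness(img).enhance(brightness)
    if blur:
        img = img.filter(ImageFilter.GaussianBlur(radius=1))  # assumed radius
    if noise:
        img = add_salt_and_pepper(img, noise)
    return img
```

Under these assumptions, the "Bad/Old Scanner" command would correspond to something like `apply_effects(page, contrast=1.4, sharpness=0.8, blur=True, noise=15)`.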
Arguments
These are the command-line arguments accepted:
- -i, --input_folder: Specifies the input folder to read files from and convert. The default value is the current directory.
  - Example: -i /path/to/files or -i C:\files\documents
- -f, --file_type_or_name: Specifies the file types to process or the file name to convert. The default value is "pdf" to convert all pdf files in the given input folder.
  - Example: -f pdf or -f image.jpg or -f image
- -q, --file_quality: Specifies the quality of the converted output files. The value must be between 50 and 100. The default value is 95.
  - Example: -q 90
- -a, --askew: Controls whether to make the output documents slightly askew or slightly tilted. Accepted values are "yes" or "no". The default value is "yes".
  - Example: -a yes or --askew no
- -b, --black_and_white: Controls whether to save output documents in black and white format (to make them look like a photocopy). Accepted values are "yes" or "no". The default value is "no".
  - Example: -b yes or --black_and_white no
- -l, --blur: Controls whether to make the output a little bit blurry. Accepted values are "yes" or "no". The default value is "no".
  - Example: -l yes or --blur no
- -v, --variation: Controls whether to apply a variable blur effect (depth of field simulation) to the image. This simulates a scanner lid that wasn't closed perfectly flat, causing one part of the document to be slightly out of focus. Accepted values are "yes" or "no". The default value is "no".
  - Example: -v yes or --variation no
- -n, --noise: Controls the amount of salt-and-pepper noise added to the image to simulate dust or scanner sensor imperfections. The value must be an integer between 0 and 100. A value of 0 means no noise, while 50 is significantly noisy. The default value is 0.
  - Example: -n 10 or --noise 50
- -c, --contrast: Controls contrast factor of the image. A factor of 0.0 gives a solid gray image. A factor of 1.0 gives the original image. Greater values increase the contrast of the image. The default value is 1.
  - Example: -c 2
- -sh, --sharpness: Controls sharpness factor of the image. A factor of 0.0 gives a blurred image. A factor of 1.0 gives the original image. Greater values increase the sharpness of the image. The default value is 1.
  - Example: -sh 2
- -br, --brightness: Controls brightness factor of the image. A factor of 0.0 gives a black image. A factor of 1.0 gives the original image. Greater values increase the brightness of the image. The default value is 1.
  - Example: -br 2
- -r, --recurse: Allows scripts to find all matching files including subdirectories. Accepted values are "yes" or "no". The default value is "yes".
  - Example: -r yes or --recurse no
- -s, --sort_by: Allows scripts to sort the files based on name, creation time or modified time. Accepted values are "name", "ctime", "mtime", "none". The default value is "name". If "none" is selected, then the default order of files returned by the OS is used for document conversion.
  - Example: -s name or --sort_by none
- -p, --password: Password for decrypting locked PDF files. By default, if omitted, the script will pause and prompt you for a password whenever it encounters a locked file. Use this flag if all your PDF files share the same password. If files have different passwords, omit this and enter them one-by-one when prompted.
  - Example: -p p@ss123 or --password p@ss123
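The options listed above can be mirrored in a small argparse parser, which may help when wrapping or testing the CLI. This is a hypothetical reconstruction from the documented flags, accepted values and defaults only; the real parser in the package may be organised differently, and the validator names and the "." default for the current directory are assumptions.

```python
import argparse

YES_NO = ("yes", "no")


def bounded_int(low, high):
    """Build a type function that accepts integers in [low, high]."""
    def parse(text):
        value = int(text)
        if not low <= value <= high:
            raise argparse.ArgumentTypeError(
                f"value must be between {low} and {high}")
        return value
    return parse


def build_parser():
    """Assemble a parser that follows the documented options and defaults."""
    p = argparse.ArgumentParser(prog="scanner")
    p.add_argument("-i", "--input_folder", default=".")
    p.add_argument("-f", "--file_type_or_name", default="pdf")
    p.add_argument("-q", "--file_quality", type=bounded_int(50, 100), default=95)
    p.add_argument("-a", "--askew", choices=YES_NO, default="yes")
    p.add_argument("-b", "--black_and_white", choices=YES_NO, default="no")
    p.add_argument("-l", "--blur", choices=YES_NO, default="no")
    p.add_argument("-v", "--variation", choices=YES_NO, default="no")
    p.add_argument("-n", "--noise", type=bounded_int(0, 100), default=0)
    p.add_argument("-c", "--contrast", type=float, default=1.0)
    p.add_argument("-sh", "--sharpness", type=float, default=1.0)
    p.add_argument("-br", "--brightness", type=float, default=1.0)
    p.add_argument("-r", "--recurse", choices=YES_NO, default="yes")
    p.add_argument("-s", "--sort_by",
                   choices=("name", "ctime", "mtime", "none"), default="name")
    # None means "not given"; the tool is described as prompting in that case.
    p.add_argument("-p", "--password", default=None)
    return p
```

For example, `build_parser().parse_args(["-i", "tests", "-c", "2", "-sh", "10", "-br", "2"])` yields a namespace with `contrast == 2.0` and `sharpness == 10.0`.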
❗❗ Note: ❗❗
- The supported file types are: ".pdf", ".jpg", ".jpeg", ".png", ".webp", ".tiff", ".tif", ".jp2", ".bmp"
- The output PDF file size will be bigger than the input file because the pages are stored in image format.
- Bookmarks / Links / Metadata will be removed when saving the output file.
- Transparency will be removed from png files when converting to pdf.
- Password-protected PDF files are also supported since v1.1.
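To illustrate the note about transparency, the sketch below shows one common way to flatten a PNG's alpha channel with Pillow before writing it into a PDF page. The helper name and the choice of a white background are assumptions made for this example; the package may handle transparency differently.

```python
from PIL import Image


def flatten_transparency(img, background=(255, 255, 255)):
    """Composite an image with alpha onto a solid background and return RGB."""
    rgba = img.convert("RGBA")
    canvas = Image.new("RGB", rgba.size, background)
    # The alpha channel acts as the paste mask, so fully transparent
    # pixels show the background colour.
    canvas.paste(rgba, mask=rgba.getchannel("A"))
    return canvas
```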
License
Authors
Testing
Run tests with detailed output:
# Run all tests
poetry run pytest -v
# Run specific tests
poetry run pytest -k="cli"
Support This Project
File details
Details for the file look_like_scanned-1.1.0.tar.gz.
File metadata
- Download URL: look_like_scanned-1.1.0.tar.gz
- Upload date:
- Size: 23.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1030ab43b54ee1c52b6a14d93185c6184227050703f96e87c7e10c586ca9c134 |
| MD5 | 528fd743dc35399ed8c180bae1735a7d |
| BLAKE2b-256 | 89777eded9241f9ff28caf7b89bb25b982adc87b512859978d5ec6efda2c6725 |
File details
Details for the file look_like_scanned-1.1.0-py3-none-any.whl.
File metadata
- Download URL: look_like_scanned-1.1.0-py3-none-any.whl
- Upload date:
- Size: 18.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9a59c91e72331862f3591b07c43bd9f6db1567a591c0fd9b42a613ed281dc33a |
| MD5 | 9bd66db6629c8b568583478e30d7e761 |
| BLAKE2b-256 | c460ac633e7c48eeedfcd3d306882092c9cfde4005e79de5da2de665258ea2de |
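A downloaded file can be checked against the SHA256 digests listed above with Python's standard hashlib module. The helper names below are hypothetical, and the file is assumed to be already saved locally.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare the file's SHA256 with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For instance, `matches_published_hash("look_like_scanned-1.1.0.tar.gz", "1030ab43...")` with the full digest from the table should return True for an intact download.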