Python script to make documents look like they were scanned.

look-like-scanned

  • Local, Private, Secure, Open-Source and Transparent!

  • Converts every page of a given PDF file into an image-based page and applies a small random skew and a very mild brightness change to simulate the appearance of a scanned document.

  • The resulting pages are then combined back into an Output PDF file.

  • Granular CLI options to combine / convert image files into PDF as well.

  • Supports conversion of multi-page TIFF files and password-protected PDF files.

  • Output PDF files are saved in the same folder as the input, with "_output" appended to the file name (for example, "File_Name_output.pdf").
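
The output naming rule in the last bullet can be sketched with pathlib. This is only an illustration: `output_path_for` is a hypothetical helper, not a function of the package, and it assumes the suffix is the literal "_output" placed before a ".pdf" extension, as the "File_Name_output.pdf" example suggests.

```python
from pathlib import Path


def output_path_for(input_path: Path) -> Path:
    """Return a sibling path named '<stem>_output.pdf' (assumed naming rule)."""
    # Hypothetical helper: keep the folder, replace only the file name.
    return input_path.with_name(f"{input_path.stem}_output.pdf")


if __name__ == "__main__":
    print(output_path_for(Path("docs") / "File_Name.pdf").name)
```

Because only the file name changes, the result stays next to the input file regardless of the folder it lives in.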

Installation

Install from the Python Package Index (PyPI)

pip install look-like-scanned

Or install the latest version from GitHub:

git clone https://github.com/navchandar/look-like-scanned.git
cd look-like-scanned
pip install poetry
poetry install
pip install .

Verify Installation:

# Print help message and usage options available
scanner -h

Usage

This package uses PIL (Pillow) and pypdfium2 to convert and manipulate image and PDF objects.

A command-line interface (CLI) wraps this functionality for easy use.
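
As a rough idea of what the image manipulation might look like, here is a minimal Pillow sketch that tilts a page image by a small random angle and nudges its brightness. It is not the package's actual code: `scan_effect`, `max_angle`, `seed` and the 0.98 to 1.02 brightness range are assumptions made for illustration, and the PDF rendering step done with pypdfium2 is left out.

```python
import random

from PIL import Image, ImageEnhance


def scan_effect(page, max_angle=1.0, seed=None):
    """Apply a small random tilt and a mild brightness change to one page image."""
    rng = random.Random(seed)
    page = page.convert("RGB")
    # Tilt by a small random angle; exposed corners are filled with white.
    angle = rng.uniform(-max_angle, max_angle)
    tilted = page.rotate(angle, resample=Image.BICUBIC, fillcolor=(255, 255, 255))
    # A brightness factor of 1.0 leaves the image unchanged, so stay close to it.
    factor = rng.uniform(0.98, 1.02)  # assumed range, chosen to be "very mild"
    return ImageEnhance.Brightness(tilted).enhance(factor)


if __name__ == "__main__":
    blank_page = Image.new("RGB", (600, 800), "white")
    scanned = scan_effect(blank_page, seed=42)
    print(scanned.size, scanned.mode)
```

Rotating without `expand` keeps the page size unchanged, which is convenient when the pages are later combined into a single PDF.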

# Convert all pdf files in folder to scanned pdf
scanner -i .\tests
scanner -i .\tests -f "pdf"

# Convert all pdf files in folder to scanned pdf, set contrast, sharpness and brightness factors
scanner -i .\tests -c 2 -sh 10 -br 2

# Convert all pdf files in folder to scanned pdf without the skew effect
scanner -i .\tests -a no

# Convert specific pdf file in folder to scanned pdf
scanner -i .\tests -f "test.pdf"

# Convert all jpg, jpeg, png, webp files in folder to one pdf file
scanner -i .\tests -f "image"

# Convert all image files in folder in the order of file names
scanner -i .\tests -f "image" -s "name"

# Convert all png files in folder to pdf with 100% quality to one pdf file
scanner -i .\tests -f "png" -q 100

# Convert specific jpg file in folder to pdf with 75% quality to one pdf file
scanner -i .\tests -f "JPG_Test.jpg" -q 75

# Convert all PDF files including sub folders
scanner -i .\tests -f "pdf" -r yes

# Convert all Images including sub folders into one PDF
scanner -i .\tests -f "image" -r yes

# Convert all PDF files including sub folders and save in black & white format
scanner -i .\tests -f "pdf" -r yes -b yes

# Convert all png files including sub folders, in black & white and a little blurry
scanner -i .\tests -f "png" -r yes -b yes -l yes

# Convert all pdf files with a slight amount of noise (grain)
scanner -i .\tests -f "pdf" -n 2

# Convert all pdf files with depth of field (uneven blur)
scanner -i .\tests -f "pdf" -v yes

# Add noise, uneven blur, and make it look like a photocopy
scanner -i .\tests -f "pdf" -n 20 -v yes -b yes

# Convert specific image with heavy noise
scanner -i .\tests -f "test.jpg" -n 50

# Simulate a "Bad/Old Scanner" (Low quality, high noise, blur, and high contrast)
scanner -i .\tests -q 75 -n 15 -l yes -c 1.4 -sh 0.8

# Simulate a "High-Quality/Modern Scanner" (High quality, slight noise for texture, sharpened)
scanner -i .\tests -q 95 -n 3 -a yes -sh 1.5 -br 1.1

# Target specific image formats only (e.g., just HEIC files from an iPhone)
scanner -i .\photos -f "heic"

# Convert specific locked PDF with password and save output without password
scanner -i .\secure_docs -f "Locked_doc.pdf" -p p@ss123

# Interactive Mode: Process encrypted PDFs by entering passwords when prompted
scanner -i .\secure_docs -f "pdf"
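
The commands above can also be driven from a Python script. The sketch below only assembles the argument list from flags documented here (`-i`, `-f`, `-q`, `-n`, `-b`); `build_scanner_command` is a hypothetical helper, and the call that would actually run the tool is left commented out because it requires the package to be installed.

```python
import subprocess
from pathlib import Path


def build_scanner_command(folder, file_type="pdf", quality=95, noise=0,
                          black_and_white=False):
    """Build the argument list for the `scanner` CLI (hypothetical helper)."""
    return [
        "scanner",
        "-i", str(Path(folder)),
        "-f", file_type,
        "-q", str(quality),
        "-n", str(noise),
        "-b", "yes" if black_and_white else "no",
    ]


if __name__ == "__main__":
    cmd = build_scanner_command("tests", noise=2)
    print(" ".join(cmd))
    # Uncomment to run the tool (requires look-like-scanned to be installed):
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with folder names that contain spaces.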

Arguments

These are the command-line arguments accepted:

  • -i, --input_folder : Specifies the input folder to read files from and convert. The default value is the current directory.

    • Example: -i /path/to/files or -i C:\files\documents
  • -f, --file_type_or_name : Specifies the file types to process or the file name to convert. The default value is "pdf" to convert all pdf files in the given input folder.

    • Example: -f pdf or -f image.jpg or -f image
  • -q, --file_quality : Specifies the quality of the converted output files. The value must be between 50 and 100. The default value is 95.

    • Example: -q 90
  • -a, --askew : Controls whether to make the output documents slightly askew (tilted). Accepted values are "yes" or "no". The default value is "yes".

    • Example: -a yes or --askew no
  • -b, --black_and_white : Controls whether to save output documents in black and white format (to make them look like a photocopy). Accepted values are "yes" or "no". The default value is "no".

    • Example: -b yes or --black_and_white no
  • -l, --blur : Controls whether to make the output a little bit blurry. Accepted values are "yes" or "no". The default value is "no".

    • Example: -l yes or --blur no
  • -v, --variation : Controls whether to apply a variable blur effect (depth of field simulation) to the image. This simulates a scanner lid that wasn't closed perfectly flat, causing one part of the document to be slightly out of focus. Accepted values are "yes" or "no". The default value is "no".

    • Example: -v yes or --variation no
  • -n, --noise : Controls the amount of salt-and-pepper noise added to the image to simulate dust or scanner sensor imperfections. The value must be an integer between 0 and 100. A value of 0 means no noise, while 50 is significantly noisy. The default value is 0.

    • Example: -n 10 or --noise 50
  • -c, --contrast : Controls the contrast factor of the image. A factor of 0.0 gives a solid gray image. A factor of 1.0 gives the original image. Greater values increase the contrast of the image. The default value is 1.

    • Example: -c 2
  • -sh, --sharpness : Controls the sharpness factor of the image. A factor of 0.0 gives a blurred image. A factor of 1.0 gives the original image. Greater values increase the sharpness of the image. The default value is 1.

    • Example: -sh 2
  • -br, --brightness : Controls the brightness factor of the image. A factor of 0.0 gives a black image. A factor of 1.0 gives the original image. Greater values increase the brightness of the image. The default value is 1.

    • Example: -br 2
  • -r, --recurse : Controls whether the script also searches subdirectories for matching files. Accepted values are "yes" or "no". The default value is "yes".

    • Example: -r yes or --recurse no
  • -s, --sort_by : Controls how the script orders the files before conversion: by name, creation time or modified time. Accepted values are "name", "ctime", "mtime", "none". The default value is "name". If "none" is selected, files are converted in the order returned by the OS.

    • Example: -s name or --sort_by none
  • -p, --password : Password for decrypting locked PDF files. By default, if omitted, the script will pause and prompt you for a password whenever it encounters a locked file. Use this flag if all your PDF files share the same password. If files have different passwords, omit this and enter them one-by-one when prompted.

    • Example: -p p@ss123 or --password p@ss123
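
The ranges and accepted values described above can be expressed with argparse. The following is a sketch covering a subset of the options, not the package's real parser: `bounded_int`, `build_parser` and the program name `scanner-sketch` are hypothetical, and only the flag names, defaults and limits are taken from the list above.

```python
import argparse


def bounded_int(low, high):
    """Return an argparse type that accepts integers in [low, high]."""
    def parse(text):
        value = int(text)
        if not low <= value <= high:
            raise argparse.ArgumentTypeError(f"must be between {low} and {high}")
        return value
    return parse


def build_parser():
    """Build a parser for a subset of the documented options (illustrative)."""
    parser = argparse.ArgumentParser(prog="scanner-sketch")
    parser.add_argument("-i", "--input_folder", default=".")
    parser.add_argument("-f", "--file_type_or_name", default="pdf")
    parser.add_argument("-q", "--file_quality", type=bounded_int(50, 100), default=95)
    parser.add_argument("-n", "--noise", type=bounded_int(0, 100), default=0)
    parser.add_argument("-a", "--askew", choices=["yes", "no"], default="yes")
    parser.add_argument("-s", "--sort_by",
                        choices=["name", "ctime", "mtime", "none"], default="name")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-q", "90", "-a", "no"])
    print(args.file_quality, args.askew, args.sort_by)
```

With this setup, an out-of-range value such as `-q 10` makes argparse print an error and exit instead of silently continuing with a bad setting.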

❗❗ Note: ❗❗

  • The supported file types are: ".pdf", ".jpg", ".jpeg", ".png", ".webp", ".tiff", ".tif", ".jp2", ".bmp"

  • The output PDF file size will be bigger than the input file because the pages are stored in image format.

  • Bookmarks / Links / Metadata will be removed when saving the output file.

  • Transparency will be removed from png files when converting to pdf.

  • Password-protected PDF files are also supported since v1.1.

  • Youtube: How to Insert a Signature on a PDF File
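
The supported file types listed in the note, together with the -f and -r options, imply a file-matching step before conversion. A minimal sketch of that step, assuming matching is done on the file extension: `find_inputs` and `IMAGE_EXTS` are hypothetical names, and selecting a single file by name (such as -f "test.pdf") is not covered here.

```python
from pathlib import Path

# Image extensions copied from the supported file types note.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp", ".tiff", ".tif", ".jp2", ".bmp"}


def find_inputs(folder, file_type="pdf", recurse=True):
    """Return sorted files under `folder` matching 'pdf', 'image', or one extension."""
    if file_type == "image":
        wanted = IMAGE_EXTS
    else:
        wanted = {"." + file_type.lower().lstrip(".")}
    # rglob walks subdirectories too; glob stays in the top-level folder.
    walker = Path(folder).rglob("*") if recurse else Path(folder).glob("*")
    return sorted(p for p in walker if p.is_file() and p.suffix.lower() in wanted)


if __name__ == "__main__":
    for path in find_inputs(".", "image", recurse=False):
        print(path)
```

Comparing lower-cased suffixes means files such as "Scan.PNG" are matched as well.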

License

MIT license

Authors

Testing

Run tests with detailed output:

# Run all tests
poetry run pytest -v

# Run specific tests
poetry run pytest -k="cli"
