Collection of image tools for checking photos for blur and duplicates and for handling fake PNG transparency
imgtoolkit
A powerful Python toolkit for image processing, specializing in finding duplicate and blurry images. This tool helps you organize your image collections by identifying and managing duplicate images and detecting low-quality or blurry photos.
Features
- Find Duplicate Images: Uses perceptual hashing (dhash) to identify visually identical images
- Detect Blurry Images: Identifies and separates blurry or low-quality images
- Remove Fake PNG Backgrounds: Converts fake transparent PNG images to true transparent PNGs
- Multi-format Support: Works with various image formats (JPG, JPEG, PNG, BMP, TIFF)
- Configurable: Supports JSON configuration files for customized settings
- Parallel Processing: Uses multiprocessing for improved performance
- Progress Tracking: Shows progress bars for long-running operations
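To make the duplicate-detection idea concrete, here is a minimal pure-Python sketch of a row-wise difference hash. It assumes the image has already been converted to grayscale and downscaled to an 8x9 grid; `dhash_bits` and `hamming` are hypothetical names used only for illustration, and the dhash package's own implementation may differ.

```python
def dhash_bits(gray, hash_size=8):
    """Row-wise difference hash of a grayscale grid.

    `gray` must have `hash_size` rows and `hash_size + 1` columns, i.e. the
    image is assumed to be grayscale and already downscaled.
    """
    bits = 0
    for row in gray[:hash_size]:
        for x in range(hash_size):
            # One bit per adjacent pixel pair: 1 if brightness rises to the right.
            bits = (bits << 1) | (1 if row[x] < row[x + 1] else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Synthetic 8x9 grids: a left-to-right gradient, a brighter copy, and a mirror image.
gradient = [[x * 10 for x in range(9)] for _ in range(8)]
brighter = [[v + 5 for v in row] for row in gradient]
mirrored = [row[::-1] for row in gradient]

print(hamming(dhash_bits(gradient), dhash_bits(brighter)))  # 0
print(hamming(dhash_bits(gradient), dhash_bits(mirrored)))  # 64
```

Because the hash only records whether brightness rises or falls between neighbours, a uniform brightness shift leaves it unchanged, while a structurally different image flips many bits.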
Installation
pip install imgtoolkit
Upgrade to the Latest Version
pip install --upgrade imgtoolkit
Usage
Command Line Interface
# Show help
imgtoolkit --help
# Default run (no subcommand): find blurry images, then duplicates
imgtoolkit
# Find duplicate images only
imgtoolkit find-duplicates [--folder OUTPUT_FOLDER] [--prefix PREFIX] [--formats jpg png]
# Find blurry images only
imgtoolkit find-blur [--folder OUTPUT_FOLDER] [--threshold BLUR_THRESHOLD] [--formats jpg png]
# Remove duplicate prefix from images
imgtoolkit remove-duplicate-prefix FOLDER [--prefix PREFIX]
# Remove fake transparent background from PNG
imgtoolkit remove-fakepng-bg SOURCE_PNG DESTINATION_PNG
# Show version
imgtoolkit version
Using Configuration File
Create a JSON configuration file (e.g., config.json):
{
  "duplicate": {
    "folder": "duplicates/",
    "prefix": "DUP_",
    "formats": ["jpg", "png"]
  },
  "blur": {
    "folder": "blurry/",
    "threshold": 5.0,
    "formats": ["jpg", "png"]
  }
}
Then run with the config file:
imgtoolkit --config config.json find-duplicates
imgtoolkit --config config.json find-blur
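One plausible way such a file could be combined with the built-in defaults is sketched below. The default values are the ones listed under Command Options; `merge_config` is a hypothetical helper, and rejecting unknown sections is an assumption rather than documented imgtoolkit behaviour.

```python
import copy
import json

# Defaults as listed under "Command Options"; the layout mirrors the sample config.
DEFAULTS = {
    "duplicate": {
        "folder": "duplicate/",
        "prefix": "DUPLICATED_",
        "formats": ["jpg", "jpeg", "png"],
    },
    "blur": {
        "folder": "blur/",
        "threshold": 5.0,
        "formats": ["jpg", "jpeg", "png"],
    },
}


def merge_config(text):
    """Overlay a JSON config string on the defaults (hypothetical helper)."""
    user = json.loads(text)
    merged = copy.deepcopy(DEFAULTS)
    for section, values in user.items():
        # Assumption: unknown sections and non-object values are treated as errors.
        if section not in merged or not isinstance(values, dict):
            raise ValueError(f"invalid section: {section!r}")
        merged[section].update(values)
    return merged


config = merge_config('{"blur": {"threshold": 8.0}}')
print(config["blur"]["threshold"])    # 8.0
print(config["duplicate"]["prefix"])  # DUPLICATED_
```

Keys missing from the file fall back to the defaults, so a config only needs to list the settings it changes.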
Command Options
find-duplicates
- --folder: Output folder for duplicate images (default: "duplicate/")
- --prefix: Prefix for marking duplicate files (default: "DUPLICATED_")
- --formats: List of image formats to process (default: jpg, jpeg, png)
find-blur
- --folder: Output folder for blurry images (default: "blur/")
- --threshold: Blur detection threshold (default: 5.0, lower = more blurry). The detector combines a frequency-domain sharpness score with edge strength (Tenengrad) to reduce false positives on detailed images.
- --formats: List of image formats to process (default: jpg, jpeg, png)
- During the scan, any JPEG files that are unreadable/truncated (e.g. triggering "Premature end of JPEG file") are moved to a broken/ subfolder.
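The quarantine step for unreadable JPEGs can be pictured with the sketch below. How imgtoolkit actually detects a truncated file is not documented here, so this example substitutes a simple heuristic (a complete JPEG stream ends with the end-of-image marker FF D9); `looks_truncated` and `quarantine_broken` are hypothetical names.

```python
import shutil
import tempfile
from pathlib import Path

JPEG_EOI = b"\xff\xd9"  # end-of-image marker that closes a complete JPEG stream


def looks_truncated(path):
    """Heuristic stand-in: a JPEG that does not end with EOI is treated as broken."""
    return not path.read_bytes().endswith(JPEG_EOI)


def quarantine_broken(folder):
    """Move truncated .jpg/.jpeg files in `folder` into `folder/broken/`."""
    folder = Path(folder)
    moved = []
    for path in sorted(folder.iterdir()):
        if not path.is_file() or path.suffix.lower() not in {".jpg", ".jpeg"}:
            continue
        if path.stat().st_size == 0:
            continue  # zero-byte files are handled separately and are skipped here
        if looks_truncated(path):
            broken = folder / "broken"
            broken.mkdir(exist_ok=True)
            shutil.move(str(path), str(broken / path.name))
            moved.append(path.name)
    return moved


# Demo on throwaway files: one complete stream, one cut off before the EOI marker.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "ok.jpg").write_bytes(b"\xff\xd8" + b"\x00" * 16 + JPEG_EOI)
    (root / "cut.jpg").write_bytes(b"\xff\xd8" + b"\x00" * 16)
    moved = quarantine_broken(root)
    remaining = sorted(p.name for p in root.iterdir())

print(moved)      # ['cut.jpg']
print(remaining)  # ['broken', 'ok.jpg']
```

Moving rather than deleting keeps the damaged files available for inspection while letting the scan continue.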
Blur Detection Technology and Techniques
- Core stack: Implemented in Python using OpenCV (cv2) for image processing and NumPy for matrix/FFT operations.
- Frequency-domain sharpness (FFT): The image is converted to grayscale, center-cropped, resized to 512x512, transformed by 2D FFT, then scored by high-frequency energy / low-frequency energy (scaled by x1000). Lower score means less fine detail, so more likely blur.
- Edge-strength validation (Tenengrad): Sobel gradients are used to compute gradient energy (Tenengrad). This avoids false positives where FFT score is low but the image still has strong edges and visible detail.
- Final decision rule: An image is classified as blurry only when both signals are weak: low FFT score and low Tenengrad.
- Robust file handling: During find-blur, zero-byte files are deleted immediately, and corrupted/truncated JPEGs are moved to the broken/ folder so the scan can continue.
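The two-signal rule described above can be sketched with NumPy alone. This is an illustration under stated assumptions, not the package's code: the input is taken to be a grayscale array that has already been cropped and resized, the frequency cutoff `radius` and the `tenengrad_threshold` value are invented for the example, and the Sobel filter is written out by hand instead of calling OpenCV.

```python
import numpy as np


def fft_score(gray, radius=32):
    """High-frequency / low-frequency spectral energy, scaled by 1000.

    `radius` (assumed value) separates low from high frequencies.
    """
    spectrum = np.fft.fftshift(np.fft.fft2(gray.astype(np.float64)))
    power = np.abs(spectrum) ** 2
    h, w = gray.shape
    yy, xx = np.ogrid[:h, :w]
    dist = np.hypot(yy - h // 2, xx - w // 2)  # distance from the DC component
    low = power[dist <= radius].sum()
    high = power[dist > radius].sum()
    return float(1000.0 * high / max(low, 1e-12))


def tenengrad(gray):
    """Mean squared Sobel gradient magnitude over interior pixels."""
    g = gray.astype(np.float64)
    # 3x3 Sobel responses written as sums of shifted slices.
    gx = (g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:]) - (
        g[:-2, :-2] + 2 * g[1:-1, :-2] + g[2:, :-2]
    )
    gy = (g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:]) - (
        g[:-2, :-2] + 2 * g[:-2, 1:-1] + g[:-2, 2:]
    )
    return float(np.mean(gx ** 2 + gy ** 2))


def is_blurry(gray, fft_threshold=5.0, tenengrad_threshold=100.0):
    """Blurry only when BOTH signals are weak (thresholds are assumptions)."""
    return bool(
        fft_score(gray) < fft_threshold and tenengrad(gray) < tenengrad_threshold
    )


flat = np.full((128, 128), 128.0)  # no detail at all
rng = np.random.default_rng(0)
noisy = rng.integers(0, 256, size=(128, 128)).astype(np.float64)  # dense fine detail

print(is_blurry(flat))   # True
print(is_blurry(noisy))  # False
```

Requiring both scores to be low is what guards against false positives: an image with strong edges keeps a high Tenengrad value even when its FFT ratio happens to be small.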
remove-duplicate-prefix
- folder: The folder containing marked duplicate images
- --prefix: Prefix to remove from filenames (default: "DUPLICATED_")
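In effect, this command renames marked files back to their original names. A minimal sketch of that operation follows; `strip_prefix` is a hypothetical helper, and skipping a rename when the target name already exists is an assumption, not documented imgtoolkit behaviour.

```python
import tempfile
from pathlib import Path


def strip_prefix(folder, prefix="DUPLICATED_"):
    """Rename files in `folder` whose names start with `prefix`; return new names."""
    renamed = []
    for path in sorted(Path(folder).iterdir()):
        if not (path.is_file() and path.name.startswith(prefix)):
            continue
        new_name = path.name[len(prefix):]
        if not new_name:
            continue  # the file name is nothing but the prefix
        target = path.with_name(new_name)
        if target.exists():
            continue  # assumption: never overwrite an existing file
        path.rename(target)
        renamed.append(new_name)
    return renamed


# Demo on throwaway files: one clean rename and one name collision.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("DUPLICATED_a.jpg", "DUPLICATED_b.jpg", "b.jpg"):
        (root / name).write_bytes(b"x")
    renamed = strip_prefix(root)
    names = sorted(p.name for p in root.iterdir())

print(renamed)  # ['a.jpg']
print(names)    # ['DUPLICATED_b.jpg', 'a.jpg', 'b.jpg']
```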
remove-fakepng-bg
- src: Source PNG file with fake transparent background
- dst: Destination path for the fixed PNG file
Supported Image Formats
- JPEG (.jpg, .jpeg)
- PNG (.png)
- BMP (.bmp)
- TIFF (.tiff)
Requirements
- Python 3.6+
- OpenCV (cv2)
- Pillow
- dhash
- numpy
- alive-progress
Error Handling
The toolkit provides clear error messages for common issues:
- Invalid image formats
- Missing or inaccessible files
- Processing errors
- Invalid configuration
Development
To contribute to imgtoolkit:
- Clone the repository
- Install development dependencies:
pip install -e ".[dev]"
- Run tests:
pytest
Version History
v0.1.8 Improved the blurry-image detection logic
v0.1.3 Fixed a bug in handling incomplete or corrupted image files
v0.1.2 Revamped the command structure and added fake PNG background removal
License
MIT License - See LICENSE file for details.