
A collection of image tools for checking photos for blur and duplicates, and for handling fake PNG transparency


imgtoolkit

A powerful Python toolkit for image processing, specializing in finding duplicate and blurry images. This tool helps you organize your image collections by identifying and managing duplicate images and detecting low-quality or blurry photos.

Features

  • Find Duplicate Images: Uses perceptual hashing (dhash) to identify visually identical images
  • Detect Blurry Images: Identifies and separates blurry or low-quality images
  • Remove Fake PNG Backgrounds: Converts fake transparent PNG images to true transparent PNGs
  • Multi-format Support: Works with various image formats (JPG, JPEG, PNG, BMP, TIFF)
  • Configurable: Supports JSON configuration files for customized settings
  • Parallel Processing: Uses multiprocessing for improved performance
  • Progress Tracking: Shows progress bars for long-running operations

Installation

pip install imgtoolkit

Upgrade to the Latest Version

pip install --upgrade imgtoolkit

Usage

Command Line Interface

# Show help
imgtoolkit --help

# Default run (no subcommand): find blurry images, then duplicates
imgtoolkit

# Find duplicate images only
imgtoolkit find-duplicates [--folder OUTPUT_FOLDER] [--prefix PREFIX] [--formats jpg png]

# Find blurry images only
imgtoolkit find-blur [--folder OUTPUT_FOLDER] [--threshold BLUR_THRESHOLD] [--formats jpg png]

# Remove duplicate prefix from images
imgtoolkit remove-duplicate-prefix FOLDER [--prefix PREFIX]

# Remove fake transparent background from PNG
imgtoolkit remove-fakepng-bg SOURCE_PNG DESTINATION_PNG

# Show version
imgtoolkit version

Using Configuration File

Create a JSON configuration file (e.g., config.json):

{
    "duplicate": {
        "folder": "duplicates/",
        "prefix": "DUP_",
        "formats": ["jpg", "png"]
    },
    "blur": {
        "folder": "blurry/",
        "threshold": 5.0,
        "formats": ["jpg", "png"]
    }
}

Then run with the config file:

imgtoolkit --config config.json find-duplicates
imgtoolkit --config config.json find-blur
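
The README does not describe how values from the configuration file are combined with the built-in defaults. The following is a minimal sketch of one plausible approach, assuming that keys present in the file override the documented defaults section by section. The names `DEFAULTS`, `merge_config`, and `load_config` are hypothetical and are not part of imgtoolkit's API; rejecting unknown sections is likewise an assumption.

```python
import copy
import json

# Defaults as documented under "Command Options"; the dict layout mirrors
# the example config.json and is an assumption about the internal structure.
DEFAULTS = {
    "duplicate": {
        "folder": "duplicate/",
        "prefix": "DUPLICATED_",
        "formats": ["jpg", "jpeg", "png"],
    },
    "blur": {
        "folder": "blur/",
        "threshold": 5.0,
        "formats": ["jpg", "jpeg", "png"],
    },
}


def merge_config(overrides, defaults=DEFAULTS):
    """Return a copy of the defaults with per-section overrides applied."""
    merged = copy.deepcopy(defaults)  # never mutate the shared defaults
    for section, values in overrides.items():
        if section not in merged:
            raise ValueError(f"unknown config section: {section!r}")
        if not isinstance(values, dict):
            raise ValueError(f"section {section!r} must be a JSON object")
        merged[section].update(values)
    return merged


def load_config(path):
    """Read a JSON config file and merge it over the defaults."""
    with open(path, encoding="utf-8") as handle:
        return merge_config(json.load(handle))
```

With this sketch, a file that only sets `"blur": {"threshold": 7.5}` would keep the default `blur/` folder and the default duplicate settings.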

Command Options

find-duplicates

  • --folder: Output folder for duplicate images (default: "duplicate/")
  • --prefix: Prefix for marking duplicate files (default: "DUPLICATED_")
  • --formats: List of image formats to process (default: jpg, jpeg, png)
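
According to the feature list, duplicate detection relies on perceptual hashing (dhash). The sketch below illustrates the general difference-hash idea in pure Python so that it runs without dependencies. It is not the code imgtoolkit or the dhash package uses: the function names are hypothetical, the input is assumed to be a grayscale image given as a list of rows, and the nearest-neighbour resize stands in for a proper image resize.

```python
def resize_nearest(gray, width, height):
    """Downsample a grayscale grid (list of rows) by nearest-neighbour sampling."""
    src_h, src_w = len(gray), len(gray[0])
    return [
        [gray[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def difference_hash(gray, size=8):
    """Row-wise difference hash: one bit per adjacent-pixel comparison."""
    small = resize_nearest(gray, size + 1, size)  # size rows, size + 1 columns
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left < right else 0)
    return bits  # size * size bits packed into an int


def hamming_distance(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Illustrative data: a horizontal gradient, a brighter copy, and a mirror image.
gradient = [[x * 255 // 31 for x in range(32)] for _ in range(32)]
brighter = [[min(255, p + 10) for p in row] for row in gradient]
mirrored = [row[::-1] for row in gradient]
```

Because the hash only records whether each pixel is darker than its right-hand neighbour, a uniform brightness change leaves it unchanged, while a structurally different image produces a large Hamming distance. A tool built on this idea would treat images whose distance falls below some small cutoff as duplicates.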

find-blur

  • --folder: Output folder for blurry images (default: "blur/")
  • --threshold: Blur detection threshold (default: 5.0). The detector computes a frequency-domain sharpness score (ratio of high- to low-frequency energy × 1000); a lower score means a blurrier image, and typical values are roughly 3–10.
  • --formats: List of image formats to process (default: jpg, jpeg, png)
  • During the scan, any JPEG files that are unreadable/truncated (e.g. triggering "Premature end of JPEG file") are moved to a broken/ subfolder.
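
The threshold description above mentions a frequency-domain sharpness score based on a high/low energy ratio scaled by 1000. The sketch below shows one way such a score could be computed. It is an illustration under stated assumptions, not imgtoolkit's implementation: the function names, the `cutoff` that separates "low" from "high" frequencies, and the exclusion of the DC term are all assumptions. A naive pure-Python DFT is used so the example runs without dependencies; a practical version would use `numpy.fft.fft2` on real images.

```python
import cmath


def dft_1d(values):
    """Naive O(n^2) discrete Fourier transform of a 1-D sequence."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dft_2d(gray):
    """2-D DFT via row transforms followed by column transforms."""
    rows = [dft_1d(row) for row in gray]
    cols = [dft_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # indexed as spectrum[v][u]


def sharpness_score(gray, cutoff=0.25):
    """High/low frequency energy ratio x 1000 (cutoff is an assumed parameter)."""
    h, w = len(gray), len(gray[0])
    spectrum = dft_2d(gray)
    low = high = 0.0
    for v in range(h):
        for u in range(w):
            if u == 0 and v == 0:
                continue  # skip the DC term (mean brightness)
            # Normalised frequency in cycles per pixel, folded into [0, 0.5].
            fu = min(u, w - u) / w
            fv = min(v, h - v) / h
            energy = abs(spectrum[v][u]) ** 2
            if max(fu, fv) <= cutoff:
                low += energy
            else:
                high += energy
    return 1000.0 * high / (low + 1e-12)  # epsilon avoids division by zero


def box_blur_rows(gray):
    """3-tap horizontal box blur with wrap-around, used only for the demo."""
    w = len(gray[0])
    return [
        [(row[(x - 1) % w] + row[x] + row[(x + 1) % w]) / 3 for x in range(w)]
        for row in gray
    ]


# Illustrative data: a hard vertical edge and a blurred copy of it.
sharp = [[0] * 8 + [255] * 8 for _ in range(16)]
blurred = box_blur_rows(sharp)
```

Blurring suppresses high-frequency energy far more than low-frequency energy, so `sharpness_score(blurred)` comes out lower than `sharpness_score(sharp)`, which matches the "lower = more blurry" reading of the threshold.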

remove-duplicate-prefix

  • folder: The folder containing marked duplicate images
  • --prefix: Prefix to remove from filenames (default: "DUPLICATED_")
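
The effect of remove-duplicate-prefix can be pictured as a simple rename pass over the folder. The sketch below is a hypothetical illustration of that idea, not the package's code: the function name `remove_prefix` is invented here, and skipping files whose un-prefixed name already exists is an assumed safety choice.

```python
from pathlib import Path


def remove_prefix(folder, prefix="DUPLICATED_"):
    """Strip `prefix` from file names in `folder`; return the new paths."""
    renamed = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or not path.name.startswith(prefix):
            continue
        new_name = path.name[len(prefix):]
        if not new_name:
            continue  # the file name is nothing but the prefix
        target = path.with_name(new_name)
        if target.exists():
            continue  # assumed policy: never overwrite an existing file
        path.rename(target)
        renamed.append(target)
    return renamed
```

Calling `remove_prefix("photos")` would rename `photos/DUPLICATED_a.jpg` to `photos/a.jpg` and leave unmarked files untouched.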

remove-fakepng-bg

  • src: Source PNG file with fake transparent background
  • dst: Destination path for the fixed PNG file
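
The README does not say how the fake background is detected. A "fake transparent" PNG typically has the checkerboard pattern drawn into its pixels instead of carrying an alpha channel, so one naive heuristic is to make every pixel that matches a checkerboard colour fully transparent. The sketch below shows only that heuristic on in-memory pixel data. The colour values and the function name are assumptions, and this is not imgtoolkit's algorithm.

```python
# Assumed checkerboard colours (white and light grey); real files may differ.
CHECKER_COLORS = {(255, 255, 255), (204, 204, 204)}


def strip_checkerboard(pixels, checker_colors=CHECKER_COLORS):
    """Convert rows of (r, g, b) pixels to (r, g, b, a) with alpha 0 or 255."""
    return [
        [
            (r, g, b, 0) if (r, g, b) in checker_colors else (r, g, b, 255)
            for (r, g, b) in row
        ]
        for row in pixels
    ]


# Illustrative data: one red subject pixel surrounded by checkerboard squares.
fake = [
    [(255, 255, 255), (204, 204, 204)],
    [(204, 204, 204), (255, 0, 0)],
]
fixed = strip_checkerboard(fake)
```

A pure colour match also clears genuine white or grey pixels inside the subject, so a practical tool would need something stricter, such as checking that matched pixels actually form a regular checkerboard grid.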

Supported Image Formats

  • JPEG (.jpg, .jpeg)
  • PNG (.png)
  • BMP (.bmp)
  • TIFF (.tiff)

Requirements

  • Python 3.6+
  • OpenCV (cv2)
  • Pillow
  • dhash
  • numpy
  • alive-progress

Error Handling

The toolkit provides clear error messages for common issues:

  • Invalid image formats
  • Missing or inaccessible files
  • Processing errors
  • Invalid configuration
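
One processing error mentioned under find-blur is the truncated JPEG, which is moved to a broken/ subfolder. The sketch below shows one cheap way to spot such files, by checking for the JPEG start-of-image and end-of-image markers, and then quarantining them. This is an assumption-laden illustration and not the check imgtoolkit performs: the function names are hypothetical, the broken/ folder is assumed to sit inside the scanned folder, and files with trailing bytes after the end marker would be flagged even though they decode fine.

```python
import shutil
from pathlib import Path

JPEG_SOI = b"\xff\xd8"  # start-of-image marker
JPEG_EOI = b"\xff\xd9"  # end-of-image marker


def looks_truncated(path):
    """Return True if the file lacks the JPEG start or end marker."""
    with open(path, "rb") as handle:
        head = handle.read(2)
        handle.seek(0, 2)  # jump to the end to learn the file size
        if handle.tell() < 4:
            return True
        handle.seek(-2, 2)  # last two bytes
        tail = handle.read(2)
    return head != JPEG_SOI or tail != JPEG_EOI


def quarantine_broken(folder, broken_name="broken"):
    """Move suspect JPEG files into a subfolder; return their new paths."""
    folder = Path(folder)
    broken_dir = folder / broken_name
    moved = []
    for path in sorted(folder.iterdir()):
        if not path.is_file() or path.suffix.lower() not in {".jpg", ".jpeg"}:
            continue
        if looks_truncated(path):
            broken_dir.mkdir(exist_ok=True)
            destination = broken_dir / path.name
            shutil.move(str(path), str(destination))
            moved.append(destination)
    return moved
```

A marker check is fast but shallow; fully decoding each image with an imaging library catches more kinds of corruption at a higher cost.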

Development

To contribute to imgtoolkit:

  1. Clone the repository
  2. Install development dependencies:
pip install -e ".[dev]"
  3. Run tests:
pytest

Version History

v0.1.3 Fixed a bug in the handling of incomplete or corrupted image files

v0.1.2 Revamped the command structure and added fake PNG background removal

License

MIT License - See LICENSE file for details.
