
Duplicate Image Finder helps you find duplicate or similar images and delete them.


Duplicate Image Finder

Duplicate Image Finder uses image hashing to find similar or duplicate images in your local storage. All you have to do is

  1. install it,
  2. run it once (this sets up the database and its table if no configuration is provided),
  3. run it again, specifying which directory to scan for images, and finally
  4. run it once more, asking it to show the duplicate/similar images.

Please note that this is a prototype. Use it at your own discretion.
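As a rough illustration of the image-hashing idea, here is a minimal, standard-library-only sketch of an "average hash" compared by Hamming distance. It is not the project's code: the project relies on the imagehash library, and which hash algorithm and threshold it uses is not stated here. The function names, the tiny 2x2 pixel grids, and the threshold of 2 bits are all assumptions made for the example.

```python
from typing import Sequence


def average_hash(pixels: Sequence[Sequence[int]]) -> int:
    """Build a bit string (as an int): 1 where a pixel is brighter than the mean.

    `pixels` is a small grayscale grid; a real tool would first shrink and
    grayscale the image (hypothetical preprocessing, not shown here).
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_similar(a: int, b: int, threshold: int = 2) -> bool:
    """Treat two hashes as similar when few bits differ (threshold is an assumption)."""
    return hamming_distance(a, b) <= threshold


original = [[10, 200], [220, 30]]
brighter = [[20, 210], [230, 40]]   # same picture, slightly brighter
inverted = [[200, 10], [30, 220]]   # a different picture

print(is_similar(average_hash(original), average_hash(brighter)))  # True
print(is_similar(average_hash(original), average_hash(inverted)))  # False
```

The brighter copy keeps the same light/dark pattern, so its hash is identical, while the inverted grid differs in every bit. This is why hash comparison can catch near-duplicates that a byte-for-byte comparison would miss.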

For example:

# 1. installing
python3.9 -m pip install --user duplicate-image-finder

# 2. show help
duplicate-image-finder --help

# 3. add directory images and calculate hashes using 4 threads
duplicate-image-finder --add <directory> --parallel 4

# 4. show the duplicate/similar images found in your browser
duplicate-image-finder --show

Running command 4 (--show) opens a browser that shows the duplicate/similar images. If you click delete on an image, it is moved to the .Trash folder.
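Moving a file to a trash folder instead of deleting it outright could look roughly like the sketch below. This is an assumption-laden illustration, not the project's implementation: the function name move_to_trash, the trash_dir parameter, and the rename-on-collision behaviour are hypothetical, and the exact location of .Trash (for example ~/.Trash on macOS) is not specified above.

```python
import shutil
from pathlib import Path


def move_to_trash(image_path: Path, trash_dir: Path) -> Path:
    """Move image_path into trash_dir and return its new location.

    If a file with the same name is already in the trash, a numeric
    suffix is appended so nothing is overwritten (an assumed policy).
    """
    trash_dir.mkdir(parents=True, exist_ok=True)
    target = trash_dir / image_path.name
    counter = 1
    while target.exists():
        target = trash_dir / f"{image_path.stem}_{counter}{image_path.suffix}"
        counter += 1
    shutil.move(str(image_path), str(target))
    return target
```

A call such as move_to_trash(Path("photo.jpg"), Path.home() / ".Trash") would then keep the file recoverable, which seems a sensible default for a prototype.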

Requirements

Lots, but all of them are installed automatically as dependencies as long as you are using Python 3.9. Unfortunately, some of the dependencies have not been made available for Python 3.10 yet, so we are stuck on 3.9.

Poetry

Installing dependencies

poetry install

Running

poetry run python duplicate_image_finder/duplicate_finder.py --show

Testing

poetry run pytest

etc.

The source code of this duplicate image finder is inspired by, and partially copied from, https://github.com/philipbl/duplicate-images.git.

Significant changes from the referred version are:

  1. moved from mongodb to sqlite
  2. probably better at finding similar images (or perhaps I misunderstood the previous code)
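To give a feel for what storing hashes in sqlite might look like, here is a minimal sketch using Python's built-in sqlite3 module. The table name images, its two columns, and the helper functions are hypothetical and are not taken from the project. The query only groups images whose hashes match exactly; finding merely similar images would need a distance comparison on top.

```python
import sqlite3
from typing import Dict, List


def create_table(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) table that maps an image path to its hash."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS images ("
        "path TEXT PRIMARY KEY, hash TEXT NOT NULL)"
    )


def add_image(conn: sqlite3.Connection, path: str, image_hash: str) -> None:
    """Insert an image, or update its hash if the path is already known."""
    conn.execute(
        "INSERT OR REPLACE INTO images (path, hash) VALUES (?, ?)",
        (path, image_hash),
    )


def find_duplicates(conn: sqlite3.Connection) -> Dict[str, List[str]]:
    """Return {hash: [paths]} for every hash shared by more than one image."""
    rows = conn.execute(
        "SELECT hash, path FROM images WHERE hash IN ("
        "SELECT hash FROM images GROUP BY hash HAVING COUNT(*) > 1"
        ") ORDER BY hash, path"
    )
    groups: Dict[str, List[str]] = {}
    for image_hash, path in rows:
        groups.setdefault(image_hash, []).append(path)
    return groups


conn = sqlite3.connect(":memory:")  # in-memory database, for the example only
create_table(conn)
add_image(conn, "a.jpg", "ffee")
add_image(conn, "b.jpg", "ffee")
add_image(conn, "c.jpg", "0123")
duplicates = find_duplicates(conn)
print(duplicates)  # {'ffee': ['a.jpg', 'b.jpg']}
```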

Concepts/Technologies I learned/tried to learn while doing this:

  1. poetry for dependency management
  2. pytest for unit testing
  3. pysqlite3 for the database
  4. concurrency for performance
  5. imagehash for perceptual image hashing to find similar images
  6. grouping CLI arguments in Python (mutually exclusive, etc.) using argparse
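Two of the items above, argument grouping and concurrency, can be sketched together with the standard library. The flags --add, --show and --parallel come from the usage example earlier in this README; making --add and --show mutually exclusive is an assumption, as are the function names. SHA-256 stands in for the real perceptual hash purely to keep the example self-contained.

```python
import argparse
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import Dict


def build_parser() -> argparse.ArgumentParser:
    """Build a parser where exactly one of --add / --show must be given."""
    parser = argparse.ArgumentParser(prog="duplicate-image-finder")
    actions = parser.add_mutually_exclusive_group(required=True)
    actions.add_argument("--add", metavar="DIRECTORY",
                         help="directory whose images should be hashed")
    actions.add_argument("--show", action="store_true",
                         help="show duplicate/similar images")
    parser.add_argument("--parallel", type=int, default=1, metavar="N",
                        help="number of worker threads used for hashing")
    return parser


def placeholder_hash(data: bytes) -> str:
    """Stand-in for a perceptual hash (assumption: SHA-256 of the raw bytes)."""
    return hashlib.sha256(data).hexdigest()


def hash_all(blobs: Dict[str, bytes], workers: int) -> Dict[str, str]:
    """Hash every blob using a pool of `workers` threads."""
    names = list(blobs)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so digests line up with names.
        digests = list(pool.map(placeholder_hash, (blobs[n] for n in names)))
    return dict(zip(names, digests))


args = build_parser().parse_args(["--add", "photos", "--parallel", "4"])
hashes = hash_all(
    {"a.jpg": b"same", "b.jpg": b"same", "c.jpg": b"other"},
    workers=args.parallel,
)
print(hashes["a.jpg"] == hashes["b.jpg"])  # True
```

With this grouping, passing both --add and --show makes argparse exit with a usage error, so the program never has to decide between two conflicting actions itself.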


