Similar Image Finder (Simimg)

Simimg in action

Description

This is a Python GUI for displaying pictures grouped according to similarity. The main aim of the program is to help identify groups of holiday snaps that resemble each other and to inspect those groups efficiently. It makes it easy to keep only the best photos.

The program is not designed to identify modified versions of the same picture (recompressed JPEGs, cropped images, adjusted colours, etc.). Although it can be used for this, there are many better solutions available.

When Simimg is started from the command line, by default it loads the pictures it finds in the startup directory and its sub-directories into the GUI. These settings can be changed within the GUI by clicking on Settings. In particular, if you want to start the program by clicking on its icon, you may want to set an empty startup directory.

You can play with different conditions that take into account how similar two pictures are. These are the panels in the left section of the finder window. You can activate a condition by clicking on its name. The following options exist:

  • Some gradient metrics adapted from ImageHash (dhash). Basically these measure whether two images have similar patterns of brighter and darker regions.

  • I have also implemented measurements of how similar the colours are between two images, as well as between 5 regions (the four corners and the central part). The measurement in HSV (hue-saturation-value) supposedly reflects best how humans perceive image information.

  • You can further select the maximum allowed time-span between the moments the pictures were taken in order to be considered a match.

  • You can match on camera model. This means that two pictures are considered to be a match if they were taken with the same camera.

  • Finally you can match on image shape. You can choose:

    • portrait/landscape: width smaller/larger or equal to height

    • exact: width/height are identical

    • some percentage difference allowed
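The shape options above can be sketched as a small predicate. This is only an illustration and not Simimg's actual code: the function names, the mode strings and the choice to compare width/height ratios for "exact" and for the percentage option are assumptions.

```python
def orientation(width, height):
    """Portrait if width is smaller than height, landscape otherwise."""
    return "portrait" if width < height else "landscape"


def shapes_match(size_a, size_b, mode="orientation", max_percent=10.0):
    """Compare two (width, height) tuples according to the chosen mode.

    The mode names and the ratio-based comparison are assumptions made
    for this sketch.
    """
    wa, ha = size_a
    wb, hb = size_b
    if mode == "orientation":
        return orientation(wa, ha) == orientation(wb, hb)
    if mode == "exact":
        # Cross-multiply so equal ratios compare exactly, without floats.
        return wa * hb == wb * ha
    if mode == "percent":
        ratio_a = wa / ha
        ratio_b = wb / hb
        difference = abs(ratio_a - ratio_b) / max(ratio_a, ratio_b) * 100
        return difference <= max_percent
    raise ValueError("unknown mode: " + mode)
```

For example, `shapes_match((300, 400), (600, 800), mode="exact")` is true because both pictures have the same width/height ratio.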

Some of the selection criteria have additional parameters that you can play with.
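To give an idea of what a gradient metric and its numerical limit look like, here is a minimal sketch of a difference hash in the spirit of dhash. It is not the code used by Simimg or ImageHash. It assumes the picture has already been reduced to a small grid of brightness values (with pillow this could be done with `convert("L")` and `resize`), and the direction of the comparison is an arbitrary choice.

```python
def dhash_bits(gray_rows):
    """Difference hash: one bit per pair of horizontally adjacent pixels.

    gray_rows is a list of rows of brightness values, for example 8 rows
    of 9 values for a 64-bit hash. A bit is 1 when the left pixel is
    brighter than its right neighbour.
    """
    bits = []
    for row in gray_rows:
        for left, right in zip(row, row[1:]):
            bits.append(1 if left > right else 0)
    return bits


def hamming_distance(bits_a, bits_b):
    """Number of positions at which the two hashes differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(bits_a, bits_b))
```

Two pictures would then count as similar when the Hamming distance between their hashes stays below the chosen limit. Because only the pattern of brighter and darker neighbours is stored, a uniformly brighter copy of a picture gives the same hash.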

Each condition has a Must Match checkbox. If this is switched on, only those pairs that satisfy this condition are considered matches. Note that:

  1. Must Match has no effect if only one condition is active.

  2. If some condition(s) have Must Match set, other conditions without Must Match have no effect.

  3. When multiple conditions are active and no Must Match is set, two images are considered a pair if any of the conditions is satisfied.

In practice, this lets you drill down the list. For example, you can show only those groups that have similar colours and were taken with the same camera by switching on Must Match for both conditions.
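The three Must Match rules can be summarised in a few lines. This is a sketch of the logic as described here, not Simimg's implementation; the function name and the shape of the `results` dictionary are made up for the example.

```python
def is_pair(results):
    """Decide whether two images form a pair.

    results maps the name of each active condition to a tuple
    (satisfied, must_match) for the two images being compared.
    """
    if not results:
        return False
    required = [ok for ok, must in results.values() if must]
    if required:
        # Conditions without Must Match are ignored in this case.
        return all(required)
    # No Must Match anywhere: any satisfied condition is enough.
    return any(ok for ok, _ in results.values())
```

With a single active condition both branches give the same answer, which is why Must Match has no effect in that case.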

What matching groups are shown?

When the program starts, there are no active conditions and thumbnails of all files are shown in a grid sorted by filename.

Once some conditions are activated or changed the display will be updated.

For each picture that has some matches in the collection, the groups of matching thumbnails will be shown in a line. The only exception is a group that is already displayed in its entirety as a subgroup on another line.
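A possible way to build these lines is sketched below. It is an assumption-laden illustration, not the program's code: the input is taken to be a dictionary from filename to the set of matching filenames, and each line is simply sorted by filename.

```python
def display_lines(matches):
    """Return the groups to show, one list of filenames per line.

    matches maps each filename to the set of filenames it matches.
    A group is skipped when it is fully contained in another group, or
    when an identical group was already produced for an earlier file.
    """
    groups = []
    for name in sorted(matches):
        if matches[name]:
            groups.append(frozenset(matches[name]) | {name})
    lines = []
    for i, group in enumerate(groups):
        covered = any(
            group < other or (group == other and j < i)
            for j, other in enumerate(groups)
            if j != i
        )
        if not covered:
            lines.append(sorted(group))
    return lines
```

For instance, if a.jpg matches b.jpg and c.jpg, while b.jpg and c.jpg each only match a.jpg, a single line with all three thumbnails is shown.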

Simimg does its best to keep the displayed files sorted by filename. This is chosen for two reasons. First, it limits the visual changes when modifying parameters or conditions, which helps to understand the impact of the modification. Second, the filename of holiday pictures often represents a natural sorting order, for example the serial photo number or a prefix chosen to indicate where a picture was taken. Maintaining this order means related pictures have more chance of being presented close together.

Note that completely identical files (exact copies of some image file) will not be shown twice. Instead one thumbnail will be shown with a green border around it.

Available functions

Thumbnail buttons

You can click on the Hide, Delete or Move button below each image.

  • Hide will remove the thumbnail from the display but it will not delete the file from your hard-disk.

  • Delete will remove the file from the display and from your hard-disk.

  • Move will move the file to the folder selected in the move list on the bottom left.

(De)selecting thumbnails

You can select a thumbnail by clicking on it; its background will turn blue to indicate that it is selected.

Pressing the Control (Ctrl) key while clicking will select or deselect the entire line of thumbnails.

Pressing the Shift key while clicking will select all thumbnails between the current image and the last selected image.

Clicking in an empty area of the thumbnail display area deselects all images.

The little red check-mark button in the toolbar area (top-left) also toggles between selecting and unselecting all thumbnails.

Pressing Ctrl+a toggles between selecting and unselecting all thumbnails.

Actions for selected thumbnails

The Play button in the toolbar will open a window in which you can view larger versions of the selected images (Ctrl+v).

The Minus button will hide all selected thumbnails (Ctrl+h).

The Red-X button will delete all selected thumbnails (Ctrl+d).

The Two folder button will move the selected thumbnails (Ctrl+m).

Photo organisation functions

Because the Finder window is also a great way to get an overview even without using the selection functions, I have implemented a very basic organisation option in it. It is represented by the Move folders.

Imagine you have two folders defined: "WebAlbum" and "EditWithGimp". You peruse your photos, then select and delete those that are poor. Next you select those that are nice but need better framing or some adjustment of the brightness, activate the "EditWithGimp" folder and press the Move button. Finally, you have found a number of great pictures that you want to publish: select those, activate the "WebAlbum" target and press Move.

Actions in the viewer window

One design goal is a clean interface with a lot of room for the pictures themselves. Therefore there are no action buttons in the viewer.

The following actions are available in the viewer window:

  • F1 or i: show a short help window

  • arrow right or n: show the next picture

  • arrow left or p: show the previous picture

  • scrollwheel: zoom-in on part of the picture

  • delete or d: delete the picture from disk

  • m: move the file to the move-target directory selected in the finder window

  • 1: move the file to the first move-target directory

  • 2: move the file to the second move-target directory

  • 3: move the file to the third move-target directory

  • escape or q: quit the viewer

Tips

There are a few features that are not immediately obvious. Camera Model and Picture Shape can be set to "different". By themselves these options are not useful because they will show unrelated pictures together. They can become interesting in the following scenario:

Several people have taken pictures of the same scene, and you select pictures taken close in time or with similar colours. If you impose a different Camera Model, you can concentrate on similar pictures taken by different people.

The folder select dialog for Move does not allow creating folders on some platforms. Selecting the parent directory and typing the name of the target folder you would like to create before pressing OK will create the directory.

Technical remarks

I have seen quite a variety of 'success', meaning that some algorithm detects matches that I myself would also call a match. It depends a lot on the set of images that one uses as input. I find it useful to play around a bit with selecting different algorithms and playing with the numerical limits. To help with this, the tooltip of the limit selectors will tell you at which value the first match happens and at which value more than 10 matches are found.
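The two values shown in the tooltip could be derived from the list of pairwise distances roughly as follows. This is a sketch under assumptions: a pair is taken to match when its distance is at or below the limit, "matches" is read as the number of matching pairs, and the function name is invented for the example.

```python
def limit_hints(distances, many=10):
    """Return (limit for the first match, limit for more than `many` matches).

    distances holds one value per pair of pictures. A value of None
    means the corresponding situation cannot be reached.
    """
    ordered = sorted(distances)
    first = ordered[0] if ordered else None
    # With the limit at the (many + 1)-th smallest distance, at least
    # many + 1 pairs match, which is more than `many`.
    more_than_many = ordered[many] if len(ordered) > many else None
    return first, more_than_many
```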

In my experience, for the purpose of detecting the most interesting similar holiday pictures, the "Average" and "Perception" algorithms can be useful, but "HSV (5 regions)" in the Colours conditions gives the best results.
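As an illustration of what a five-region HSV comparison might look like, here is a sketch using only the standard library. It is not Simimg's algorithm. The layout of the regions (four corner quadrants plus a centred box), the averaging in RGB before converting to HSV, and the "largest difference" distance are all assumptions. The picture is represented as rows of (r, g, b) tuples with values from 0 to 255 and must be at least 2x2 pixels.

```python
import colorsys


def region_boxes(width, height):
    """Four corner quadrants plus a centred box (an assumed layout)."""
    hw, hh = width // 2, height // 2
    qw, qh = width // 4, height // 4
    return [
        (0, 0, hw, hh),                     # top-left
        (hw, 0, width, hh),                 # top-right
        (0, hh, hw, height),                # bottom-left
        (hw, hh, width, height),            # bottom-right
        (qw, qh, width - qw, height - qh),  # centre
    ]


def mean_hsv(pixels, box):
    """Average the RGB values inside box, then convert to HSV (0..1)."""
    left, top, right, bottom = box
    count = (right - left) * (bottom - top)
    sums = [0, 0, 0]
    for y in range(top, bottom):
        for x in range(left, right):
            for channel in range(3):
                sums[channel] += pixels[y][x][channel]
    r, g, b = (total / (255 * count) for total in sums)
    return colorsys.rgb_to_hsv(r, g, b)


def hsv_signature(pixels):
    """One (h, s, v) tuple per region."""
    height, width = len(pixels), len(pixels[0])
    return [mean_hsv(pixels, box) for box in region_boxes(width, height)]


def hsv_distance(sig_a, sig_b):
    """Largest per-channel difference over all regions; hue wraps around."""
    worst = 0.0
    for (h1, s1, v1), (h2, s2, v2) in zip(sig_a, sig_b):
        dh = abs(h1 - h2)
        dh = min(dh, 1.0 - dh)
        worst = max(worst, dh, abs(s1 - s2), abs(v1 - v2))
    return worst
```

Comparing region by region means that two pictures with the same overall colours but a different arrangement (for example sky at the top versus sky at the bottom) end up further apart than with a single whole-image average.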

The other conditions should be considered optional to further limit the shown matches.

Some of the calculations can be time-consuming and Simimg tries to be clever about not recalculating. It will store the calculated values in a database for future use. It recognises the picture files by their MD5 hash, which means that even if you move files or rename them, their image properties will not be recalculated.
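The idea of keying cached values on the file content rather than on the filename can be sketched as follows. The use of sqlite3, the table layout and the class and method names are assumptions for this example; the actual database used by Simimg is not described here.

```python
import hashlib
import sqlite3


def file_md5(path, chunk_size=1 << 20):
    """MD5 hex digest of the file content, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


class PropertyCache:
    """Stores computed properties keyed by (MD5 of file, property name)."""

    def __init__(self, db_path=":memory:"):
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS props "
            "(md5 TEXT, name TEXT, value TEXT, PRIMARY KEY (md5, name))"
        )

    def get_or_compute(self, path, name, compute):
        """Return the cached value as text, computing it only if missing."""
        key = file_md5(path)
        row = self.db.execute(
            "SELECT value FROM props WHERE md5 = ? AND name = ?", (key, name)
        ).fetchone()
        if row is not None:
            return row[0]
        value = str(compute(path))
        self.db.execute("INSERT INTO props VALUES (?, ?, ?)", (key, name, value))
        self.db.commit()
        return value
```

Because the key is derived from the bytes of the file, renaming or moving a picture leaves its cache entry valid, while editing the picture changes the hash and triggers a fresh calculation.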

It attempts to do the most expensive calculations in parallel making optimal use of the CPU capabilities.
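How the work is distributed is not detailed here, but the general pattern of running one calculation per file in parallel can be sketched with the standard library. The helper name is hypothetical and the use of a thread pool is an assumption made to keep the example simple.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def compute_for_all(func, paths, workers=None):
    """Apply func to every path in parallel and return {path: result}."""
    paths = list(paths)
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves the input order, so zip() pairs results correctly.
        results = list(pool.map(func, paths))
    return dict(zip(paths, results))
```

Threads only help when the calculation releases the interpreter lock, as many image library routines do. For pure Python calculations, `ProcessPoolExecutor` from the same module offers the same `map` interface and uses several CPU cores, at the cost of requiring a picklable function.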

Note that for reasons of speed, at most about 300 thumbnails will be shown.

Note that for reasons of speed and memory, at most 900 files will be loaded when adding a folder.

Credit

This project uses the following open source packages:

  • Python: version 3

  • tkinter, which should normally come with your Python

  • pillow for image reading and processing.

  • The tooltip code is adapted from an example found on Daniweb.

Some of the algorithms used have been inspired by code found at imagedupes, pyimagesearch and ImageHash.
