Measure visual similarity of a reference image to other images.
imagesearch
imagesearch measures visual similarity between a reference image and a set of other images. This can be used to search for a similar image in a large or deep directory structure.
Installation
pip install imagesearch
See imagesearch on PyPI.
Examples
- Compare a reference image to all images in a search path:
> imagesearch needle.jpg haystack\
28 haystack\0.jpg
38 haystack\1.jpg
12 haystack\2.jpg
18 haystack\3.jpg
32 haystack\4.jpg
29 haystack\5.jpg
0 haystack\6.jpg
29 haystack\7.jpg
5 haystack\8.jpg
28 haystack\9.jpg
In this example, haystack\6.jpg is most similar.
- Compare against a single image:
> imagesearch needle.jpg haystack\1.jpg
38 haystack\1.jpg
- Only return images with similarity less than or equal to 10:
> imagesearch needle.jpg haystack\ --threshold 10
0 haystack\6.jpg
5 haystack\8.jpg
- Return the first image found at or under the threshold (0, in this case) and stop searching immediately:
> imagesearch needle.jpg haystack\ -t 0 -1
0 haystack\6.jpg
- Specify a different algorithm:
> imagesearch needle.jpg haystack\ --algorithm colorhash
...
- Get more help:
> imagesearch --help
...
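The threshold and first-match options shown above can be pictured as a filter over (score, path) pairs. The following is a minimal sketch in Python, not the project's actual implementation: the function name `filter_matches` and its parameters are hypothetical, and the scores are copied from the first example.

```python
from typing import Iterable, Iterator, Optional, Tuple


def filter_matches(
    scored: Iterable[Tuple[int, str]],
    threshold: Optional[int] = None,
    stop_on_first: bool = False,
) -> Iterator[Tuple[int, str]]:
    """Yield (score, path) pairs whose score is at or under the threshold.

    Hypothetical helper: it mirrors the behaviour described for
    --threshold and -1, but is not taken from the imagesearch source.
    """
    for score, path in scored:
        if threshold is not None and score > threshold:
            continue  # too dissimilar, skip it
        yield score, path
        if stop_on_first:
            return  # stop searching after the first accepted image


# Scores copied from the first example above.
results = [
    (28, r"haystack\0.jpg"), (38, r"haystack\1.jpg"), (12, r"haystack\2.jpg"),
    (18, r"haystack\3.jpg"), (32, r"haystack\4.jpg"), (29, r"haystack\5.jpg"),
    (0, r"haystack\6.jpg"), (29, r"haystack\7.jpg"), (5, r"haystack\8.jpg"),
    (28, r"haystack\9.jpg"),
]

for score, path in filter_matches(results, threshold=10):
    print(score, path)
```

Writing the filter as a generator means that stopping on the first match also stops consuming the input, so no further images would need to be fingerprinted.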
Visual Similarity
imagesearch returns a nonnegative integer that quantifies the visual similarity between the reference image and another image. It does this by creating a fingerprint of each image and measuring the difference between the two fingerprints.
A critical feature of these fingerprints is that they can be numerically compared (by Hamming distance). Images that look different will have large differences between their fingerprints, and images that look alike will have small ones.
A value of 0 indicates the highest level of similarity, or possibly a true match.
Unless you have a good understanding of the algorithms used, values should be treated as opaque and subjective. They depend on the algorithm used to create the fingerprints and on your criteria for what "similar" means.
This project uses the imagehash library to produce these fingerprints, and more information about the techniques can be found there.
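As a rough illustration of how two fingerprints are compared, the sketch below computes the Hamming distance between fingerprints represented as plain integers. This is an assumption made for the example: the 16-bit values are invented, and imagehash uses its own hash objects rather than bare integers.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions at which two equal-length fingerprints differ."""
    return bin(a ^ b).count("1")


# Invented 16-bit fingerprints, for illustration only.
reference = 0b1111000011110000
near = 0b1111000011110001  # differs from the reference in one bit
far = 0b0000111100001111   # differs from the reference in every bit

print(hamming_distance(reference, reference))  # 0
print(hamming_distance(reference, near))       # 1
print(hamming_distance(reference, far))        # 16
```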
Algorithms
All the fingerprinting algorithms in imagesearch come from imagehash. In imagesearch, you may specify which algorithm to use by passing the appropriate option value to the -a or --algorithm flag:
- ahash: Average hashing (aHash)
- phash: 2-axis perceptual hashing (pHash)
- phash-simple: 1-axis perceptual hashing (pHash)
- dhash: Horizontal difference hashing (dHash)
- dhash-vert: Vertical difference hashing (dHash)
- whash-haar: Haar wavelet hashing (wHash)
- whash-db4: Daubechies wavelet hashing (wHash)
- colorhash: HSV color hashing (colorhash)
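To give a feel for why the choice of algorithm matters, here is a toy difference hash over a small grayscale grid, in a horizontal and a vertical variant. It is a simplified sketch, not the imagehash implementation: real difference hashing first resizes and converts the image, and the bit convention used here (1 when the next pixel is brighter) is an assumption of this example.

```python
from typing import List

Grid = List[List[int]]  # rows of grayscale pixel values


def dhash_horizontal(pixels: Grid) -> List[int]:
    """One bit per horizontally adjacent pair: 1 if the right pixel is brighter."""
    return [
        int(row[c + 1] > row[c])
        for row in pixels
        for c in range(len(row) - 1)
    ]


def dhash_vertical(pixels: Grid) -> List[int]:
    """One bit per vertically adjacent pair: 1 if the lower pixel is brighter."""
    return [
        int(pixels[r + 1][c] > pixels[r][c])
        for r in range(len(pixels) - 1)
        for c in range(len(pixels[0]))
    ]


# A left-to-right gradient: brightness changes along rows, not down columns.
gradient_lr = [
    [0, 50, 100],
    [0, 50, 100],
    [0, 50, 100],
]

print(dhash_horizontal(gradient_lr))  # [1, 1, 1, 1, 1, 1]
print(dhash_vertical(gradient_lr))    # [0, 0, 0, 0, 0, 0]
```

The same image yields very different fingerprints depending on which direction is examined, which is one reason scores from different algorithms are not comparable with each other.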
Collisions
These algorithms trade away accuracy for speed and size, usually with acceptable results. Instead of producing an artifact that exactly identifies an image, they analyze some more abstract quality of the image, such as its luminance or signal frequency. This allows us to:
- do less processing
- get a fingerprint with a small size
- get a fingerprint that exists in a linear space for comparison
However, because the image analysis is abstract and produces a fixed-size fingerprint, it's absolutely possible for two different images to have the same fingerprint. This is analogous to a cryptographic hash collision, so it's important to understand what kinds of scenarios may cause it!
See this section of the imagehash documentation for examples of different images that produce the same fingerprint. The source code of that project also references other pages that explain the qualities of each algorithm.
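A collision is easy to reproduce with a toy average hash. The sketch below is a simplification written for this explanation, not the imagehash code: it skips resizing and works directly on an invented 2x2 grayscale grid, setting each bit to 1 when the pixel is brighter than the mean.

```python
from typing import List

Grid = List[List[int]]  # rows of grayscale pixel values


def toy_average_hash(pixels: Grid) -> List[int]:
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [int(p > mean) for p in flat]


# Two clearly different images (values invented for illustration).
image_a = [[10, 200], [200, 10]]  # mean 105.0
image_b = [[90, 120], [255, 0]]   # mean 116.25

print(toy_average_hash(image_a))  # [0, 1, 1, 0]
print(toy_average_hash(image_b))  # [0, 1, 1, 0]
```

Both grids reduce to the same four bits because the fingerprint only records which pixels sit above the mean, not by how much. A score of 0 therefore means "indistinguishable to this algorithm", which is not necessarily the same as "identical".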
Contributing
Bug Fixes/Features
Submit a PR from an appropriately named feature branch off of master.
Releasing
- Bump the version with bumpversion [patch|minor|major]. This will update the version number around the project, commit and tag it.
- Push the repo. A GitHub release will be made and published to PyPI.
Hashes for imagesearch-0.1.9-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 4d031b9c178a9cd141ecdf9ea71aa6134267812093f8401098f2d3db204ff3f9
MD5 | e73c44aead75aaccb24c224abc502db7
BLAKE2b-256 | 9efe85d949787feda40434c28627809c75ff31889ff195a364421b2e63b096db