Antidupe
Image deduplicator using CNN, Cosine Similarity, Image Hashing, Structural Similarity Index Measure (SSIM), and Euclidean Distance
Installation
You can install Antidupe using pip:
pip install antidupe
Usage
Basic Usage
from antidupe import Antidupe
from PIL import Image
# Initialize Antidupe
antidupe = Antidupe()
# Load images (as numpy arrays or PIL.Image objects)
image1 = Image.open("image1.jpg")
image2 = Image.open("image2.jpg")
# Check for duplicates
is_duplicate = antidupe.predict([image1, image2])
if is_duplicate:
    print("Duplicate images detected!")
else:
    print("Images are not duplicates.")
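The example above compares one pair of images. To screen a larger collection, one option is to test every pair. The sketch below shows that loop; `find_duplicate_pairs` and its `is_duplicate` callback are hypothetical helpers, not part of Antidupe, and the sketch assumes only that the check returns a truthy value for duplicates, as in the example above.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple, TypeVar

T = TypeVar("T")


def find_duplicate_pairs(
    items: Dict[str, T],
    is_duplicate: Callable[[T, T], bool],
) -> List[Tuple[str, str]]:
    """Return the name pairs whose items the callback flags as duplicates."""
    pairs = []
    # combinations() visits each unordered pair once: n * (n - 1) / 2 checks.
    for (name_a, item_a), (name_b, item_b) in combinations(items.items(), 2):
        if is_duplicate(item_a, item_b):
            pairs.append((name_a, name_b))
    return pairs
```

With Antidupe, `items` could map file names to loaded images and the callback could be `lambda a, b: antidupe.predict([a, b])`. The number of checks grows quadratically with the collection size, so this suits small folders better than large archives.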
Customizing Thresholds
You can customize the similarity thresholds for each technique during runtime or initialization:
# Initialize Antidupe with custom thresholds
custom_thresholds = {
    'ih': 0.2,      # Image Hash
    'ssim': 0.2,    # SSIM
    'cs': 0.2,      # Cosine Similarity
    'cnn': 0.2,     # CNN
    'dedup': 0.85   # MobileNet
}
antidupe = Antidupe(limits=custom_thresholds)
# Check for duplicates
is_duplicate = antidupe.predict([image1, image2])
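A mistyped key in the thresholds dictionary is easy to miss, so it can help to build the dictionary through a small checker. This is a minimal sketch: `build_limits` and `EXAMPLE_LIMITS` are hypothetical names, the base values are copied from the example above (they are not claimed to be the library defaults), and the 0 to 1 range check is an assumption.

```python
from typing import Dict

# Keys and values copied from the example above; used only as a starting
# point, not as a statement of the library's defaults.
EXAMPLE_LIMITS: Dict[str, float] = {
    "ih": 0.2, "ssim": 0.2, "cs": 0.2, "cnn": 0.2, "dedup": 0.85,
}


def build_limits(**overrides: float) -> Dict[str, float]:
    """Merge overrides into EXAMPLE_LIMITS, rejecting unknown keys."""
    unknown = set(overrides) - set(EXAMPLE_LIMITS)
    if unknown:
        raise KeyError(f"unknown threshold key(s): {sorted(unknown)}")
    limits = dict(EXAMPLE_LIMITS)
    for key, value in overrides.items():
        # Assumption: thresholds are fractions between 0 and 1.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{key} must be within [0, 1], got {value}")
        limits[key] = float(value)
    return limits
```

The result can be passed straight through, for example `Antidupe(limits=build_limits(ssim=0.1))`.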
Debugging
You can enable debug mode to print debugging messages:
# Initialize Antidupe with debug mode enabled
antidupe = Antidupe(debug=True)
# Check for duplicates
is_duplicate = antidupe.predict([image1, image2])
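If you want to keep the debug messages instead of reading them on the console, one approach is to capture standard output around the call. The sketch assumes the messages are written with `print` to stdout; `run_and_capture` is a hypothetical helper, not part of Antidupe.

```python
import contextlib
import io
from typing import Callable, Tuple, TypeVar

R = TypeVar("R")


def run_and_capture(func: Callable[[], R]) -> Tuple[R, str]:
    """Call func() and return its result plus everything it printed to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func()
    return result, buffer.getvalue()
```

A call such as `run_and_capture(lambda: antidupe.predict([image1, image2]))` would then return the prediction together with the captured text, which can be written to a log file.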
Changing Limits During Runtime
You can change the similarity thresholds during runtime:
# Set new limits during runtime
new_thresholds = {
    'ih': 0.1,
    'ssim': 0.1,
    'cs': 0.1,
    'cnn': 0.1,
    'dedup': 0.8
}
antidupe.set_limits(limits=new_thresholds)
# Check for duplicates
is_duplicate = antidupe.predict([image1, image2])
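When stricter or looser limits are only needed for a few comparisons, the change can be wrapped in a context manager so the earlier values are always put back. This is a sketch under assumptions: `temporary_limits` is a hypothetical helper, and it relies only on `set_limits(limits=...)` as shown above. Because this README does not show a way to read the current limits, the caller passes the values to restore.

```python
from contextlib import contextmanager
from typing import Dict, Iterator


@contextmanager
def temporary_limits(detector, temporary: Dict[str, float],
                     restore: Dict[str, float]) -> Iterator[None]:
    """Apply `temporary` limits inside the block, then re-apply `restore`."""
    detector.set_limits(limits=temporary)
    try:
        yield
    finally:
        # The caller supplies the values to restore, because this sketch does
        # not assume any way of reading the current limits back.
        detector.set_limits(limits=restore)
```

Usage would look like `with temporary_limits(antidupe, new_thresholds, custom_thresholds): ...`, with both dictionaries listing all five keys as in the examples above.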
Requirements
- Python 3.x
- SSIM-PIL
- imagededup
- NumPy
- Matplotlib
- Pillow
- ImageHash
- PyTorch
- EfficientNet-PyTorch
- torchvision
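pip normally installs these dependencies for you. If an import fails anyway, a quick check like the one below can show which modules are missing. The import names in `ASSUMED_MODULES` are assumptions about how each listed package is imported and should be verified against the packages' own documentation; `missing_modules` is a hypothetical helper.

```python
from importlib.util import find_spec
from typing import Iterable, List

# Assumed import names for the listed requirements; verify against each
# package's own documentation.
ASSUMED_MODULES = [
    "SSIM_PIL", "imagededup", "numpy", "matplotlib", "PIL",
    "imagehash", "torch", "efficientnet_pytorch", "torchvision",
]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]
```

Calling `missing_modules(ASSUMED_MODULES)` returns an empty list when everything can be located.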
License
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
Download files
File details
Details for the file antidupe-0.0.7.tar.gz.
File metadata
- Download URL: antidupe-0.0.7.tar.gz
- Upload date:
- Size: 9.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | 0fa1afc4a4d337a52160d9d6fb813434252485affed7e4ec8135c65724176c54
MD5 | 02a3c69350ae157fcc65cbb21a4c28f4
BLAKE2b-256 | 6768d12ed5ca9cc22e9609aac3053d4d73fe6672e74fad42e3d0e5bd3b7ad63f
File details
Details for the file antidupe-0.0.7-py3-none-any.whl.
File metadata
- Download URL: antidupe-0.0.7-py3-none-any.whl
- Upload date:
- Size: 10.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.0.0 CPython/3.12.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2955801b214fbe2214718444abbfeb76eaee415219a56a2d9ee60f7b75ee57cf
MD5 | cc357788c6846b85fa84a24ec0cbca99
BLAKE2b-256 | 35bf344271c2570f5622c2af8df2e146fc37c9ff548efacfcb0ceba1274a25fa
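The published digests can be used to check a manually downloaded file. The sketch below computes a file's SHA256 digest with the standard library and compares it with an expected value; `sha256_of` and `matches_digest` are hypothetical helper names, and the file path is whatever location you saved the download to.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the published one, ignoring case."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

For the source distribution, the expected value would be the SHA256 entry listed for antidupe-0.0.7.tar.gz above.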