Detect duplicates in the Wagtail images library.
Wagtail Images De-duplicator
wagtail-images-deduplicator is a Wagtail app to detect duplicate images in the admin. It's built with imagehash.
Requirements
Wagtail Images De-duplicator works with wagtail>=3.0.
Installation
Use pip to install this package:
pip install wagtail-images-deduplicator
Configuration
- Add wagtail_images_deduplicator to your INSTALLED_APPS in your project's settings.
- Add the DuplicateFindingMixin to your custom image model. An example is shown below:
from django.db import models
from wagtail.images.models import Image, AbstractImage, AbstractRendition
from wagtail_images_deduplicator.models import DuplicateFindingMixin


class CustomImage(DuplicateFindingMixin, AbstractImage):
    # Reuse the default image form fields in the admin
    admin_form_fields = Image.admin_form_fields


class CustomRendition(AbstractRendition):
    image = models.ForeignKey(
        CustomImage, on_delete=models.CASCADE, related_name="renditions"
    )

    class Meta:
        unique_together = (("image", "filter_spec", "focal_point_key"),)
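The two configuration steps above might translate into a settings fragment like the following. This is a minimal sketch: the `images` app label is a hypothetical name for the app that defines `CustomImage`, and `WAGTAILIMAGES_IMAGE_MODEL` is the standard Wagtail setting for pointing at a custom image model.

```python
# settings.py (fragment)
INSTALLED_APPS = [
    "wagtail.images",
    "wagtail_images_deduplicator",
    "images",  # hypothetical app that defines CustomImage
]

# Tell Wagtail to use the custom image model ("<app_label>.<ModelName>").
WAGTAILIMAGES_IMAGE_MODEL = "images.CustomImage"
```

Adjust the app label to match wherever your custom image model actually lives.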
If you choose to add the mixin and have existing image data, you will need to call save() on all existing instances to fill in the new hash value:
from wagtail.images import get_image_model

# Re-saving each existing image fills in its hash value
for image in get_image_model().objects.all():
    image.save()
Settings
WAGTAILIMAGESDEDUPLICATOR_HASH_FUNC
This setting determines the hash function to use.
| Hash function | Reference | Setting name |
|---|---|---|
| Average hashing | http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html | average_hash |
| Perceptual hashing | http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html | phash (default) |
| Difference hashing | http://www.hackerfactor.com/blog/index.php?/archives/529-Kind-of-Like-That.html | dhash or dhash_vertical |
| Wavelet hashing | https://fullstackml.com/2016/07/02/wavelet-image-hash-in-python/ | whash |
| HSV color hashing | | colorhash |
| Crop-resistant hashing | https://ieeexplore.ieee.org/document/6980335 | crop_resistant_hash |
WAGTAILIMAGESDEDUPLICATOR_MAX_DISTANCE_THRESOLD
This setting determines the maximum distance between two images for them to be considered duplicates.
The default value is 5.
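As an illustration, both settings could be overridden in your project's settings module as shown here. This sketch assumes the hash function is given as a string matching the "Setting name" column of the table and that the threshold is an integer; the values `"dhash"` and `3` are arbitrary examples, not recommendations.

```python
# settings.py (fragment)

# Assumption: the value is the setting name from the table, as a string.
WAGTAILIMAGESDEDUPLICATOR_HASH_FUNC = "dhash"

# A lower threshold is stricter: fewer images are reported as duplicates.
WAGTAILIMAGESDEDUPLICATOR_MAX_DISTANCE_THRESOLD = 3
```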
To help you assess how these different algorithms behave and to learn more about hash distances, check out the examples section of the imagehash library's README.
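For intuition about what a distance threshold means, single-value hashes such as phash or dhash are typically compared by Hamming distance, the number of bits that differ between two hashes. The sketch below reproduces that idea in plain Python. It is not this package's code: the function names and the hex hash values are hypothetical, and treating the threshold as inclusive is an assumption.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the differing bits between two equal-length hex-encoded hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 bit wherever the two hashes disagree.
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def is_duplicate(hash_a: str, hash_b: str, max_distance: int = 5) -> bool:
    # Assumption: the threshold is inclusive (distance <= max_distance).
    return hamming_distance(hash_a, hash_b) <= max_distance


# Hypothetical 64-bit hashes, written as 16 hex digits.
original = "ffd8e0c0c0e0f0f8"
near_copy = "ffd8e0c0c0e0f0f9"  # differs from `original` in a single bit
unrelated = "0027ff3f3f1f0f07"

print(hamming_distance(original, near_copy))  # 1
print(is_duplicate(original, near_copy))      # True
print(is_duplicate(original, unrelated))      # False
```

With the default threshold of 5, two 64-bit hashes may differ in up to 5 bits and still be flagged as duplicates under this sketch.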