Antidupe

Image deduplicator using a CNN, Cosine Similarity, Image Hashing, the Structural Similarity Index Measure (SSIM), and Euclidean Distance
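
For readers unfamiliar with two of the measures named above, the following is a minimal sketch of cosine similarity and Euclidean distance on plain feature vectors. It is illustrative only and is not Antidupe's implementation; the function names are hypothetical and the vectors stand in for whatever features the library actually extracts.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors (0.0 means identical)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Cosine similarity ignores vector magnitude and compares direction only, while Euclidean distance is sensitive to both, which is one reason a deduplicator might combine several measures.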

Installation

You can install Antidupe using pip:

pip install antidupe

Usage

Basic Usage

from antidupe import Antidupe
from PIL import Image

# Initialize Antidupe
antidupe = Antidupe()

# Load images (as numpy arrays or PIL.Image objects)
image1 = Image.open("image1.jpg")
image2 = Image.open("image2.jpg")

# Check for duplicates
is_duplicate = antidupe.predict([image1, image2])

if is_duplicate:
    print("Duplicate images detected!")
else:
    print("Images are not duplicates.")
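
The basic example compares two images. To scan a larger collection, one option is to apply the same check to every pair. The helper below is a sketch under that assumption: `find_duplicate_pairs` is a hypothetical name, not part of Antidupe, and it takes the comparison as a callable so it can be tried without the library installed.

```python
from itertools import combinations


def find_duplicate_pairs(items, is_duplicate):
    """Return index pairs (i, j) with i < j for which is_duplicate(items[i], items[j]) is truthy."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(items), 2)
        if is_duplicate(a, b)
    ]


# With Antidupe, the callable could wrap the documented predict() call,
# assuming it returns a truthy value for duplicates as in the example above:
#   pairs = find_duplicate_pairs(images, lambda a, b: antidupe.predict([a, b]))
```

This performs one comparison per pair, so the number of comparisons grows quadratically with the size of the collection.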

Customizing Thresholds

You can set the similarity threshold for each technique when you create the instance (shown here) or later at runtime (see Changing Limits During Runtime):

# Initialize Antidupe with custom thresholds
custom_thresholds = {
    'ih': 0.2,   # Image Hash
    'ssim': 0.2, # SSIM
    'cs': 0.2,   # Cosine Similarity
    'cnn': 0.2   # CNN
}
antidupe = Antidupe(limits=custom_thresholds)

# Check for duplicates
is_duplicate = antidupe.predict([image1, image2])
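
A mistyped key in the thresholds dictionary is easy to miss. The sketch below checks a dictionary against the four keys shown above before it is passed to `Antidupe(limits=...)`. `build_limits` and `KNOWN_KEYS` are hypothetical names; whether Antidupe validates keys itself, or accepts a dictionary with only some of the keys, is not stated here and is left as an assumption.

```python
# Keys taken from the example above; treat this list as an assumption
# about what the library recognises.
KNOWN_KEYS = ("ih", "ssim", "cs", "cnn")


def build_limits(**overrides):
    """Return a limits dict after rejecting unknown keys and non-numeric values."""
    unknown = sorted(set(overrides) - set(KNOWN_KEYS))
    if unknown:
        raise KeyError(f"unknown threshold key(s): {unknown}")
    for key, value in overrides.items():
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(
                f"threshold {key!r} must be a number, got {type(value).__name__}"
            )
    return dict(overrides)


# Possible use with the documented constructor argument:
#   antidupe = Antidupe(limits=build_limits(ih=0.2, ssim=0.2, cs=0.2, cnn=0.2))
```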

Debugging

You can enable debug mode to print debugging messages:

# Initialize Antidupe with debug mode enabled
antidupe = Antidupe(debug=True)

# Check for duplicates
is_duplicate = antidupe.predict([image1, image2])

Changing Limits During Runtime

You can also change the similarity thresholds on an existing instance with set_limits:

# Set new limits during runtime
new_thresholds = {
    'ih': 0.1,
    'ssim': 0.1,
    'cs': 0.1,
    'cnn': 0.1
}
antidupe.set_limits(limits=new_thresholds)

# Check for duplicates
is_duplicate = antidupe.predict([image1, image2])
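
Because limits can be changed on a live instance, one way to choose values is to run the same comparison under several candidate settings and compare the outcomes. This is a sketch only: `sweep_limits` is a hypothetical helper that relies solely on the `set_limits(limits=...)` and `predict(...)` calls shown above.

```python
def sweep_limits(detector, images, limit_sets):
    """Run detector.predict(images) once per limits dict and collect the results.

    Returns a list of (limits, result) tuples in the order given.
    Note: the detector is left configured with the last limits tried.
    """
    results = []
    for limits in limit_sets:
        detector.set_limits(limits=limits)
        results.append((limits, detector.predict(images)))
    return results


# Possible use with the objects from the examples above:
#   for limits, result in sweep_limits(antidupe, [image1, image2],
#                                      [custom_thresholds, new_thresholds]):
#       print(limits, result)
```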

Requirements

  • Python 3.x
  • NumPy
  • Pillow
  • imagehash
  • torch
  • efficientnet_pytorch
  • torchvision
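
To check which of these packages can be imported in the current environment, a short standard-library sketch such as the following may help. The names are import names rather than distribution names (Pillow is imported as `PIL`), and `missing_modules` is a hypothetical helper, not part of Antidupe.

```python
from importlib.util import find_spec

# Import names for the requirements listed above.
REQUIRED_MODULES = (
    "numpy",
    "PIL",
    "imagehash",
    "torch",
    "efficientnet_pytorch",
    "torchvision",
)


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing:", ", ".join(missing) if missing else "none")
```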

License

This project is licensed under the Apache License 2.0; see the LICENSE file for details.

