imgdd: Image DeDuplication

imgdd is a performance-first perceptual hashing library that combines Rust's speed with Python's accessibility, making it perfect for handling large datasets. It is designed to quickly process nested folder structures, which are common in image datasets.

Features

  • Multiple Hashing Algorithms: Supports aHash, dHash, mHash, pHash, wHash.
  • Multiple Filter Types: Supports Nearest, Triangle, CatmullRom, Gaussian, Lanczos3.
  • Identify Duplicates: Quickly identify duplicate hash pairs.
  • Simplicity: Simple interface, robust performance.

Why imgdd?

imgdd is inspired by imagehash and aims to be a lightning-fast replacement with additional features. To verify its performance, imgdd has been benchmarked against imagehash: in Python, it consistently outperforms imagehash by ~60%–95%, a significant reduction in hashing time per image.
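To check a comparison like this on your own data, a small timing harness is enough. The sketch below is not the project's benchmark suite. The helper names `seconds_per_image` and `percent_faster` are hypothetical, and it assumes the quoted percentage means a reduction in wall-clock time per image.

```python
import time
from typing import Callable


def seconds_per_image(hash_fn: Callable[[], object], n_images: int, repeats: int = 3) -> float:
    """Return the best-of-`repeats` wall-clock time per image for `hash_fn`."""
    if n_images <= 0:
        raise ValueError("n_images must be positive")
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hash_fn()  # one full pass over the dataset
        best = min(best, time.perf_counter() - start)
    return best / n_images


def percent_faster(baseline: float, candidate: float) -> float:
    """Percentage reduction in time of `candidate` relative to `baseline`."""
    return (baseline - candidate) / baseline * 100.0


# Hypothetical usage (requires imgdd and a folder of images):
#   import imgdd as dd
#   t = seconds_per_image(lambda: dd.hash(path="path/to/images"), n_images=1000)
```

Taking the best of several runs reduces noise from disk caching and other processes. Time both libraries on the same folder before comparing.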


Quick Start

Installation

pip install imgdd

Usage Examples

Hash Images

import imgdd as dd

results = dd.hash(
    path="path/to/images",
    algo="dhash",  # Optional: default = dhash
    filter="triangle",  # Optional: default = triangle
    sort=False,  # Optional: default = False
)
print(results)
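Perceptual hashes are usually compared by Hamming distance, where a small distance suggests visually similar images. The sketch below assumes that `dd.hash` returns a mapping of file path to hexadecimal hash string. That return shape is an assumption, not something stated above, so the example runs on a hypothetical sample dictionary instead of calling imgdd. `hamming_distance` and `near_pairs` are illustrative helpers, not part of imgdd.

```python
from typing import Dict, List, Tuple


def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count differing bits between two equal-length hexadecimal hash strings."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def near_pairs(hashes: Dict[str, str], max_distance: int) -> List[Tuple[str, str, int]]:
    """Return (path_a, path_b, distance) for every pair within `max_distance` bits."""
    paths = sorted(hashes)
    pairs = []
    for i, first in enumerate(paths):
        for second in paths[i + 1:]:
            distance = hamming_distance(hashes[first], hashes[second])
            if distance <= max_distance:
                pairs.append((first, second, distance))
    return pairs


# Hypothetical sample standing in for the (assumed) result of dd.hash(...).
sample = {
    "a.png": "ffff0000ffff0000",
    "a_copy.png": "ffff0000ffff0001",
    "b.png": "0000ffff0000ffff",
}
close = near_pairs(sample, max_distance=4)
```

The pairwise loop is quadratic in the number of images, so it suits spot checks rather than very large datasets.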

Find Duplicates

import imgdd as dd

duplicates = dd.dupes(
    path="path/to/images",
    algo="dhash",  # Optional: default = dhash
    filter="triangle",  # Optional: default = triangle
    remove=False,  # Optional: default = False
)
print(duplicates)
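Conceptually, finding exact hash duplicates means grouping paths by hash and keeping only groups with more than one member. The sketch below shows that idea on a hypothetical path-to-hash mapping. It does not show how imgdd implements `dupes` or what `remove=True` does internally. Keeping the first sorted path of each group is an arbitrary choice made here for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def group_duplicates(hashes: Dict[str, str]) -> Dict[str, List[str]]:
    """Group file paths by hash, keeping only hashes shared by 2+ files."""
    groups = defaultdict(list)
    for path, image_hash in hashes.items():
        groups[image_hash].append(path)
    return {h: sorted(paths) for h, paths in groups.items() if len(paths) > 1}


def removal_plan(groups: Dict[str, List[str]]) -> List[str]:
    """List the paths a cleanup would delete, keeping the first path per group."""
    return sorted(path for paths in groups.values() for path in paths[1:])


# Hypothetical sample: two files share a hash, one is unique.
sample = {"x/1.png": "aa", "x/2.png": "aa", "y/3.png": "bb"}
groups = group_duplicates(sample)
to_remove = removal_plan(groups)
```

Reviewing a plan like `to_remove` before deleting anything is a cautious default, in the same spirit as `remove=False`.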

Supported Algorithms

  • aHash: Average Hash
  • mHash: Median Hash
  • dHash: Difference Hash
  • pHash: Perceptual Hash
  • wHash: Wavelet Hash
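For intuition, the three simplest algorithms can be sketched in pure Python on a small grid of grayscale values. This uses the common textbook definitions and is not imgdd's Rust implementation. Bit order and comparison direction vary between libraries, so these outputs need not match imgdd's. The grid is assumed to be already resized and converted to grayscale.

```python
from statistics import mean, median
from typing import List, Sequence

Grid = Sequence[Sequence[int]]


def bits_to_hex(bits: List[bool]) -> str:
    """Pack bits (most significant first) into a zero-padded hex string."""
    value = 0
    for bit in bits:
        value = (value << 1) | int(bit)
    width = (len(bits) + 3) // 4
    return f"{value:0{width}x}"


def ahash(pixels: Grid) -> str:
    """Average hash: a bit is set where the pixel is brighter than the mean."""
    flat = [p for row in pixels for p in row]
    threshold = mean(flat)
    return bits_to_hex([p > threshold for p in flat])


def mhash(pixels: Grid) -> str:
    """Median hash: like aHash, but thresholded at the median."""
    flat = [p for row in pixels for p in row]
    threshold = median(flat)
    return bits_to_hex([p > threshold for p in flat])


def dhash(pixels: Grid) -> str:
    """Difference hash: a bit is set where the right neighbour is brighter."""
    bits = [row[i + 1] > row[i] for row in pixels for i in range(len(row) - 1)]
    return bits_to_hex(bits)


outlier_grid = [[1, 2], [3, 100]]  # one very bright pixel skews the mean
```

On `outlier_grid`, `ahash` sets only the bit for the bright pixel while `mhash` sets two bits. This shows why a median threshold is less sensitive to outliers.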

Supported Filters

  • Nearest, Triangle, CatmullRom, Gaussian, Lanczos3
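These filter names match the standard resampling kernels used when an image is resized before hashing. The sketch below gives the usual mathematical definitions of each kernel as a function of distance from the sample point. It is not taken from imgdd's source, and the Gaussian `sigma` default is an assumed parameter.

```python
import math


def nearest(x: float) -> float:
    """Box kernel: full weight within half a pixel, none elsewhere."""
    return 1.0 if -0.5 <= x < 0.5 else 0.0


def triangle(x: float) -> float:
    """Linear (bilinear) kernel: weight falls off linearly to zero at |x| = 1."""
    return max(0.0, 1.0 - abs(x))


def catmull_rom(x: float) -> float:
    """Cubic kernel with the Catmull-Rom coefficients, support |x| < 2."""
    x = abs(x)
    if x < 1.0:
        return 1.5 * x**3 - 2.5 * x**2 + 1.0
    if x < 2.0:
        return -0.5 * x**3 + 2.5 * x**2 - 4.0 * x + 2.0
    return 0.0


def gaussian(x: float, sigma: float = 0.5) -> float:
    """Gaussian kernel; sigma = 0.5 is an assumption made for this sketch."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)


def sinc(x: float) -> float:
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def lanczos3(x: float) -> float:
    """Lanczos kernel with a window of 3: sinc(x) * sinc(x / 3) for |x| < 3."""
    return sinc(x) * sinc(x / 3.0) if abs(x) < 3.0 else 0.0
```

Wider kernels such as Lanczos3 consider more neighbouring pixels per output pixel, which generally costs more time than Nearest or Triangle.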

Contributing

Contributions are always welcome! 🚀

Found a bug or have a question? Open a GitHub issue. Pull requests for new features or fixes are encouraged!

Download files

Source Distribution

  • imgdd-0.1.4.tar.gz (55.0 kB): Source

Built Distribution

  • imgdd-0.1.4-cp39-abi3-macosx_11_0_arm64.whl (1.3 MB): CPython 3.9+, macOS 11.0+ ARM64

File details

Details for the file imgdd-0.1.4.tar.gz.

File metadata

  • Download URL: imgdd-0.1.4.tar.gz
  • Upload date:
  • Size: 55.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for imgdd-0.1.4.tar.gz

  • SHA256: 389115a3a5f589840716bba118664ece71d47d9caee5c9216e08a7267801d418
  • MD5: cbe36d1a1b997b7ce2064168b6b6f695
  • BLAKE2b-256: 0f1b9de2ac120e7109007a66fd10fa0c0a03e3d85371e1b98653351e43cdecaa

Provenance

The following attestation bundles were made for imgdd-0.1.4.tar.gz:

Publisher: release-python.yml on aastopher/imgdd

File details

Details for the file imgdd-0.1.4-cp39-abi3-macosx_11_0_arm64.whl.

File metadata

  • Download URL: imgdd-0.1.4-cp39-abi3-macosx_11_0_arm64.whl
  • Upload date:
  • Size: 1.3 MB
  • Tags: CPython 3.9+, macOS 11.0+ ARM64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for imgdd-0.1.4-cp39-abi3-macosx_11_0_arm64.whl

  • SHA256: 01451606fc3fce14752dd7964c49dcf3dd7ec2a55f195304d6d2bb51945350d8
  • MD5: 4bc0a370bbca9234f855123f8161422d
  • BLAKE2b-256: 664fa2d1ff664fc0e66827c91e9d7790f131e7fe41c9c618ccd841e00480d21b

Provenance

The following attestation bundles were made for imgdd-0.1.4-cp39-abi3-macosx_11_0_arm64.whl:

Publisher: release-python.yml on aastopher/imgdd
