imgdd: Image DeDuplication

imgdd is a performance-first perceptual hashing library that combines Rust's speed with Python's accessibility, making it well suited to large datasets. It is designed to quickly process the nested folder structures commonly found in image datasets.

Features

  • Multiple Hashing Algorithms: Supports aHash, dHash, mHash, pHash, wHash.
  • Multiple Filter Types: Supports Nearest, Triangle, CatmullRom, Gaussian, Lanczos3.
  • Identify Duplicates: Quickly identify duplicate hash pairs.
  • Simplicity: Simple interface, robust performance.

Why imgdd?

imgdd is inspired by imagehash and aims to be a lightning-fast replacement with additional features. To verify its performance, imgdd has been benchmarked against imagehash. In Python, imgdd consistently outperforms imagehash by roughly 60%–95%, which reflects a significant reduction in hashing time per image.


Quick Start

Installation

pip install imgdd

Usage Examples

Hash Images

import imgdd as dd

results = dd.hash(
    path="path/to/images",
    algo="dhash",  # Optional: default = dhash
    filter="triangle",  # Optional: default = triangle
    sort=False,  # Optional: default = False
)
print(results)

Find Duplicates

import imgdd as dd

duplicates = dd.dupes(
    path="path/to/images",
    algo="dhash",  # Optional: default = dhash
    filter="triangle",  # Optional: default = triangle
    remove=False,  # Optional: default = False
)
print(duplicates)
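
To illustrate what identifying duplicate hashes involves, here is a minimal pure-Python sketch that groups file paths by hash value and keeps only the hashes that occur more than once. It does not call imgdd. The `group_duplicates` helper, the `(hash_value, path)` pair shape, and the sample paths are assumptions made for illustration and are not imgdd's documented return types.

```python
from collections import defaultdict


def group_duplicates(hashed):
    """Group paths by hash and keep only hashes shared by two or more files.

    `hashed` is an iterable of (hash_value, path) pairs. This shape is a
    hypothetical stand-in chosen for the example, not imgdd's API.
    """
    groups = defaultdict(list)
    for hash_value, path in hashed:
        groups[hash_value].append(path)
    # A hash that appears only once has no duplicate partner, so drop it.
    return {h: paths for h, paths in groups.items() if len(paths) > 1}


# Hypothetical sample data: two files share the same hash value.
sample = [
    (0xF0F0, "cats/a.png"),
    (0x1234, "dogs/b.png"),
    (0xF0F0, "cats/copies/a_copy.png"),
]

dupes = group_duplicates(sample)
print(dupes)
```

Grouping through a dictionary needs a single pass over the hashes, so it avoids comparing every pair of images directly.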

Supported Algorithms

  • aHash: Average Hash
  • mHash: Median Hash
  • dHash: Difference Hash
  • pHash: Perceptual Hash
  • wHash: Wavelet Hash
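
As a rough illustration of how two of these algorithms derive bits from pixel values, the sketch below computes an average hash and a difference hash over a tiny grayscale grid in pure Python. It is a textbook-style approximation, not imgdd's Rust implementation: the function names, the bit order, and the comparison direction are assumptions, and real implementations first resize the image to a small fixed grid (commonly 8x8 for aHash and 9x8 for dHash) to obtain 64 bits.

```python
def ahash_bits(gray):
    """Average hash: one bit per pixel, set when the pixel exceeds the mean."""
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def dhash_bits(gray):
    """Difference hash: one bit per horizontal neighbour pair.

    A bit is set when the left pixel is brighter than the right one
    (the direction is a convention assumed for this sketch).
    """
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


# Toy 2x3 grayscale grid standing in for an already-resized image.
tiny = [
    [10, 20, 5],
    [7, 7, 9],
]

print(format(ahash_bits(tiny), "06b"))  # six pixels give six bits
print(format(dhash_bits(tiny), "04b"))  # two neighbour pairs per row give four bits
```

Because dHash compares neighbouring pixels rather than absolute brightness, uniformly brightening the image leaves its bits unchanged, whereas aHash depends on each pixel's relation to the global mean.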

Supported Filters

  • Nearest, Triangle, CatmullRom, Gaussian, Lanczos3
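
These filter names match common image-resampling kernels, which decide how much each source pixel contributes when an image is shrunk before hashing. The sketch below writes out the standard mathematical definitions of three of them. It assumes the names carry their usual meaning and is not taken from imgdd's source; the function names are chosen for this example only.

```python
import math


def nearest(x):
    """Box kernel: full weight to the closest sample, none to the rest."""
    return 1.0 if -0.5 <= x < 0.5 else 0.0


def triangle(x):
    """Triangle (linear) kernel: weight falls off linearly to zero at |x| = 1."""
    return max(0.0, 1.0 - abs(x))


def lanczos3(x):
    """Lanczos kernel with a = 3: sinc(x) * sinc(x / 3) for |x| < 3, else 0."""
    if x == 0:
        return 1.0
    if abs(x) >= 3:
        return 0.0
    px = math.pi * x
    # sin(px)/px * sin(px/3)/(px/3) simplifies to the expression below.
    return 3.0 * math.sin(px) * math.sin(px / 3.0) / (px * px)


# Weight each kernel assigns to a sample at distance x from the output pixel.
for x in (0.0, 0.25, 0.5, 1.0, 1.5):
    print(x, nearest(x), triangle(x), round(lanczos3(x), 4))
```

Wider kernels such as Lanczos3 draw on more neighbouring pixels per output pixel, which generally costs more time than Nearest or Triangle.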

Contributing

Contributions are always welcome! 🚀

Found a bug or have a question? Open a GitHub issue. Pull requests for new features or fixes are encouraged!
