Very fast two-folder image duplicate finder built with pickle and cv2

Project description

imgdups

Most image duplicate checkers find duplicates within a single folder. imgdups instead verifies that no image from one path (search) also exists in another path (target). It uses OpenCV to create image descriptors and caches them in a pickle file, so every run after the first is faster. Because images are compared by a match score, it finds not only exact duplicates but also similar images.
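The caching idea described above can be sketched as follows. This is a minimal illustration, not the code imgdups actually ships: the function names, the cache layout (path mapped to modification time and descriptor), and the `compute_descriptor` callback are all assumptions made for the example.

```python
import os
import pickle


def load_cache(cache_file):
    """Return the cached {path: (mtime, descriptor)} dict, or an empty dict."""
    if not os.path.isfile(cache_file):
        return {}
    with open(cache_file, "rb") as handle:
        return pickle.load(handle)


def cached_descriptors(image_paths, cache_file, compute_descriptor):
    """Compute descriptors only for new or modified files, then save the cache.

    compute_descriptor is a hypothetical callback taking a file path and
    returning any picklable descriptor object.
    """
    cache = load_cache(cache_file)
    result = {}
    for path in image_paths:
        mtime = os.path.getmtime(path)
        entry = cache.get(path)
        if entry is None or entry[0] != mtime:
            # Cache miss: the file is new or has changed since the last run.
            entry = (mtime, compute_descriptor(path))
            cache[path] = entry
        result[path] = entry[1]
    with open(cache_file, "wb") as handle:
        pickle.dump(cache, handle)
    return result
```

With OpenCV, `compute_descriptor` could wrap a feature extractor such as `cv2.ORB_create().detectAndCompute(image, None)`; which detector imgdups uses is not stated here. Pickle files should only be loaded from locations you trust, since unpickling can execute arbitrary code.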

Requirements

Tested with Python 3.6+.

sudo apt install python3 python3-pip

Option 1: Install from Source

git clone https://github.com/ChuckNorrison/imgdups
cd imgdups
pip3 install .

Option 2: Install from PyPI (recommended)

pip3 install imgdups

CLI Usage

imgdups --search "/path/to/reference" --target "/path/to/check"

Or, if the package is not installed, run the script from a clone of the repository:

cd imgdups
python3 imgdups.py --search "/path/to/reference" --target "/path/to/check"

Python example

#!/usr/bin/env python3
import imgdups

SEARCH_PATH = "/path/to/reference"
TARGET_PATH = "/path/to/check"

img_dups = imgdups.ImgDups(TARGET_PATH, SEARCH_PATH)
duplicates = img_dups.find_duplicates()

# Print each match; the % operator fills in the format placeholders.
for duplicate in duplicates:
    print("%s == %s (score: %d)" % (
        duplicate["target"],
        duplicate["search"],
        duplicate["score"],
    ))

print("%d duplicates found" % len(duplicates))
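Since each result is a dict with "target", "search" and "score" keys, the list can be post-processed with plain Python. The sketch below groups matches by search image and drops weak ones. The helper name `group_by_search`, the `min_score` parameter, and the assumption that a higher score means a closer match are illustrative and not part of imgdups.

```python
from collections import defaultdict


def group_by_search(duplicates, min_score=0):
    """Map each search image to its (score, target) matches, best first.

    Assumes a higher score means a closer match (not confirmed by the README).
    """
    groups = defaultdict(list)
    for dup in duplicates:
        if dup["score"] >= min_score:
            groups[dup["search"]].append((dup["score"], dup["target"]))
    # Sort each group so the highest-scoring target comes first.
    return {
        search: sorted(matches, reverse=True)
        for search, matches in groups.items()
    }
```

For example, `group_by_search(duplicates, min_score=50)` would keep only matches scoring at least 50; a suitable threshold depends on how imgdups scales its scores.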

Download files

Source Distribution

imgdups-0.1.4.tar.gz (18.0 kB)

Uploaded Source

Built Distribution

imgdups-0.1.4-py3-none-any.whl (18.2 kB)

Uploaded Python 3

File details

Details for the file imgdups-0.1.4.tar.gz.

File metadata

  • Download URL: imgdups-0.1.4.tar.gz
  • Upload date:
  • Size: 18.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.6

File hashes

Hashes for imgdups-0.1.4.tar.gz
Algorithm Hash digest
SHA256 fca5638a0a77c1f0fea6f777c7ae765fe0f39ffb1e9a11877485ddc24b4eca9d
MD5 66bdbcbf4840c22d1e7bd3a4de206f56
BLAKE2b-256 5d0eaa105b5ba697e5a2ee4162cd827e1890c802c704b876ad75c22830b00e30

File details

Details for the file imgdups-0.1.4-py3-none-any.whl.

File metadata

  • Download URL: imgdups-0.1.4-py3-none-any.whl
  • Upload date:
  • Size: 18.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.6

File hashes

Hashes for imgdups-0.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 e73c33a545e99e0855e27fcbdc532ded02965c6cb7672b78bf27178040eb2a46
MD5 6ad3754279c0bf20d5cca4f1dab56c28
BLAKE2b-256 3fe1c55b07969e1c487d6492683cc79a944bb8edb3501eb202cd4910b5d1c971
