ddup

Compare images in two image lists and find duplicates.

Install

pip install ddup

Usage

  • Use it on the command line
ddup {--list1 img1 img2 img3| --path1 imglist_path} \
     [{--list2 img1 img2 img3| --path2 imglist_path}] \
     [--out output_dir] [--log]

The comparison result is saved in ddup_output.json.

Example 1

Compare the images in list file 1 to those in list file 2 and save the results in the specified folder.

ddup --path1 imglist1.txt --path2 imglist2.txt --out /mnt/Storage

Input for Ex1

  • path1

    imglist1.txt

/mnt/Storage/test1/000001.jpg
/mnt/Storage/test1/000002.jpg
  • path2

    imglist2.txt

/mnt/Storage/test2/000001.jpg
/mnt/Storage/test2/000002.jpg
/mnt/Storage/test2/000003.jpg
  • out
/mnt/Storage

Output for Ex1

  • hash1.hdf5

    Store the hashes of images in list1 in .hdf5 format.

  • hash2.hdf5

    Store the hashes of images in list2 in .hdf5 format.

  • ddup_output.json

    Store the comparison results in JSON format. An image in list1 is mapped to the one or more images in list2 that are similar to it.

{
  "/mnt/Storage/test1/000001.jpg": ["/mnt/Storage/test2/000001.jpg"],
  "/mnt/Storage/test1/000002.jpg": ["/mnt/Storage/test2/000002.jpg", "/mnt/Storage/test2/000003.jpg"]
}
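
The result file can be read back with a few lines of Python. This is a minimal sketch, assuming the JSON layout shown above and that the file sits in the `--out` folder from Example 1; the helper names `load_matches` and `flatten_pairs` are hypothetical and not part of ddup.

```python
import json
from pathlib import Path


def load_matches(result_file):
    """Return the {list1_image: [similar_list2_images]} mapping from a result file."""
    with open(result_file, encoding="utf-8") as f:
        return json.load(f)


def flatten_pairs(matches):
    """Turn the mapping into a sorted list of (list1_image, list2_image) pairs."""
    return sorted((src, dup) for src, dups in matches.items() for dup in dups)


if __name__ == "__main__":
    # Assumed location: the --out folder used in Example 1.
    result = Path("/mnt/Storage") / "ddup_output.json"
    if result.exists():
        for src, dup in flatten_pairs(load_matches(result)):
            print(f"{src} ~ {dup}")
```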

Example 2

Compare a list of images to themselves.

You can give a single list, or give two same lists.

ddup --list1 1.jpg 2.jpg 3.jpg 4.jpg
ddup --list1 1.jpg 2.jpg 3.jpg 4.jpg --list2 1.jpg 2.jpg 3.jpg 4.jpg

Input for Ex2

["1.jpg", "2.jpg", "3.jpg", "4.jpg"]

Output for Ex2

  • hash1.hdf5

    Store the hashes of images in list1 in .hdf5 format.

  • ddup_output.json

    Store the comparison results in JSON format.

    For self-comparison, similar images are organized into groups, with the first image in each group as the key and the whole group as the value.

{
  "1.jpg": ["1.jpg", "2.jpg", "3.jpg"]
}
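
One way to use the grouped output is to keep the key image of each group and treat the remaining members as redundant copies. The sketch below assumes the group layout shown above; `redundant_copies` is a hypothetical helper, not a ddup function.

```python
def redundant_copies(groups):
    """For each group, keep the key image and return every other member."""
    extras = []
    for keeper, members in groups.items():
        # The key is also listed inside its own group, so skip it.
        extras.extend(m for m in members if m != keeper)
    return extras


groups = {"1.jpg": ["1.jpg", "2.jpg", "3.jpg"]}
print(redundant_copies(groups))  # ['2.jpg', '3.jpg']
```

Reviewing the returned list before deleting anything is advisable, since similar images are not necessarily identical.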

Parameters

list1 and path1 are considered as input1.

Exactly one of them must be provided.

list2 and path2 are considered as input2.

At most one of them can be provided.

If neither is provided, input1 will be compared with itself.

If one of them is provided, input1 will be compared with input2.
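
The input rules above can be illustrated with `argparse` mutually exclusive groups. This is only a sketch of the rules, not ddup's actual argument parser; the program name `ddup-args-sketch` and the helper `is_self_comparison` are made up for illustration.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="ddup-args-sketch")
    # input1: exactly one of --list1 / --path1 must be given.
    input1 = parser.add_mutually_exclusive_group(required=True)
    input1.add_argument("--list1", nargs="+", help="image paths given directly")
    input1.add_argument("--path1", help="path of an image list file")
    # input2: at most one of --list2 / --path2 may be given.
    input2 = parser.add_mutually_exclusive_group()
    input2.add_argument("--list2", nargs="+")
    input2.add_argument("--path2")
    return parser


def is_self_comparison(args):
    """With no input2, input1 is compared with itself."""
    return args.list2 is None and args.path2 is None


if __name__ == "__main__":
    args = build_parser().parse_args(["--list1", "1.jpg", "2.jpg"])
    print(is_self_comparison(args))  # True
```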

  • list1

    Directly give paths of several images.

--list1 1.jpg img/2.jpg img2/3.jpg
  • path1

    Path of the first image list file.

    If there are many images to compare, an image list file can be provided instead.

    Image paths in the list should be absolute paths.

--path1 imglist1.txt
  • list2

    Same as list1

  • path2

    Same as path1

  • out [optional]

    Folder in which to save the result files.

    The folder will be created if it does not exist.

    The default is the folder ddup_output in the current path.

  • --log [optional]

    With this option, the program prints a detailed log for each thread and each duplicate image pair. This can flood the screen with messages, so redirecting the output to a log file is recommended.
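
An image list file for path1 or path2 can be generated rather than written by hand. The sketch below assumes the one-absolute-path-per-line layout shown in Example 1; the helper `write_image_list` and the set of file suffixes are assumptions, since the formats ddup accepts are not stated here.

```python
from pathlib import Path

# Assumed set of image suffixes; adjust to the formats you actually use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def write_image_list(image_dir, list_file):
    """Write one absolute image path per line; return the number of paths written."""
    paths = sorted(
        p.resolve()
        for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    Path(list_file).write_text("".join(f"{p}\n" for p in paths), encoding="utf-8")
    return len(paths)
```

For example, `write_image_list("/mnt/Storage/test1", "imglist1.txt")` would produce a file that can be passed as `--path1 imglist1.txt`.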
