
Python package undouble

Project description

undouble


The aim of undouble is to detect (near-)identical images. It works in multiple steps: pre-processing the images (grayscaling, normalizing, and scaling), computing the image hash, and grouping the images by hash. A threshold of 0 groups only images with an identical image hash. The results can be explored with the plotting functionality, and images can be moved with the move functionality. When moving images, the image with the largest resolution in each group is copied, and all other images in the group are moved to the undouble subdirectory. If you want to cluster your images instead, I recommend reading the blog and using the clustimage library.

The following steps are taken in the undouble library:

  • Recursively read all images with the specified extensions from the directory.
  • Compute image hash.
  • Group similar images.
  • Move if desired.
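
The hashing and grouping steps above can be illustrated with a small, dependency-free sketch. It does not use undouble's own code: it assumes a plain average hash as a stand-in for whichever hash method the library applies, skips the normalizing step, and represents images as nested lists of RGB tuples. All function names here are hypothetical.

```python
from itertools import combinations


def to_grayscale(image):
    """Convert rows of (R, G, B) tuples to rows of luminance values."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row] for row in image]


def resize(gray, size):
    """Nearest-neighbour downscale of a 2D grid to size x size."""
    height, width = len(gray), len(gray[0])
    return [[gray[y * height // size][x * width // size] for x in range(size)]
            for y in range(size)]


def average_hash(image, hash_size=8):
    """One bit per pixel: True when the pixel is brighter than the mean."""
    small = resize(to_grayscale(image), hash_size)
    pixels = [p for row in small for p in row]
    mean = sum(pixels) / len(pixels)
    return tuple(p > mean for p in pixels)


def hamming(a, b):
    """Number of positions at which two hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def group_similar(hashes, threshold=0):
    """Group names whose hashes differ by at most `threshold` bits."""
    names = list(hashes)
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if hamming(hashes[a], hashes[b]) <= threshold:
            parent[find(b)] = find(a)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    # Only groups with more than one member are duplicates.
    return [members for members in groups.values() if len(members) > 1]


# Synthetic data: the same horizontal gradient at two resolutions,
# plus a vertical gradient that should not match.
small = [[(x * 16,) * 3 for x in range(16)] for _ in range(16)]
large = [[(x * 8,) * 3 for x in range(32)] for _ in range(32)]
other = [[(y * 16,) * 3 for _ in range(16)] for y in range(16)]

hashes = {
    "small.png": average_hash(small),
    "large.png": average_hash(large),
    "other.png": average_hash(other),
}
groups = group_similar(hashes, threshold=0)
print(groups)
```

With threshold=0 only the two gradients that produce an identical hash end up in the same group; raising the threshold would also admit hashes that differ in a few bits.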

⭐️ Star this repo if you like it ⭐️

Blogs

Documentation pages

On the documentation pages you can find detailed information about how undouble works, with many examples.

Installation

It is advisable to create a new environment (e.g. with Conda).
conda create -n env_undouble python=3.8
conda activate env_undouble
Install undouble from PyPI
pip install undouble            # new install
pip install -U undouble         # update to latest version
Install directly from the GitHub source
pip install git+https://github.com/erdogant/undouble
Import Undouble package
from undouble import Undouble

Examples:

Example: Grouping similar images of the flower dataset

Example: List all file names that are identical

Example: Moving similar images in the flower dataset
# -------------------------------------------------
# >You are at the point of physically moving files.
# -------------------------------------------------
# >[7] similar images are detected over [3] groups.
# >[4] images will be moved to the [undouble] subdirectory.
# >[3] images will be copied to the [undouble] subdirectory.

# >[C]ontinue moving all files.
# >[W]ait in each directory.
# >[Q]uit
# >Answer: w
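
The counts in the prompt above follow from the rule that, per group, the image with the largest resolution is copied and the rest are moved. The sketch below only plans that split and never touches the disk. It is not undouble's implementation: the function name, the file names, and the choice to measure resolution as width times height are assumptions made for illustration.

```python
def plan_move(groups):
    """Split each group into one file to copy and the remaining files to move.

    `groups` is a list of groups; each group is a list of
    (filename, width, height) tuples.
    """
    to_copy, to_move = [], []
    for group in groups:
        # Largest pixel count first; ties keep their original order.
        ranked = sorted(group, key=lambda item: item[1] * item[2], reverse=True)
        to_copy.append(ranked[0][0])
        to_move.extend(name for name, _, _ in ranked[1:])
    return to_copy, to_move


# Hypothetical example: 7 similar images spread over 3 groups.
groups = [
    [("rose_a.png", 640, 480), ("rose_b.png", 1280, 960), ("rose_c.png", 320, 240)],
    [("tulip_a.png", 800, 600), ("tulip_b.png", 400, 300)],
    [("daisy_a.png", 256, 256), ("daisy_b.png", 512, 512)],
]
to_copy, to_move = plan_move(groups)
print(f"[{len(to_move)}] images will be moved, [{len(to_copy)}] images will be copied.")
```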

Example: Plot the image hashes

Example: Three different imports

The input can be one of the following three types:

* Path to directory
* List of file locations
* Numpy array containing images
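
As a rough illustration of how the first two input types could be reduced to one list of image files, here is a minimal sketch using only the standard library. It is not undouble's code: the function name and the extension set are assumptions, and the NumPy array case is left out because arrays carry pixel data rather than file locations.

```python
import tempfile
from pathlib import Path

# Assumed set of extensions; the library's own defaults may differ.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tiff", ".bmp", ".gif"}


def collect_image_paths(source, extensions=IMAGE_EXTENSIONS):
    """Return a sorted list of image files from a directory or a list of paths."""
    if isinstance(source, (str, Path)):
        # Directory input: walk it recursively.
        candidates = Path(source).rglob("*")
    else:
        # List input: take the given file locations as they are.
        candidates = (Path(item) for item in source)
    return sorted(p for p in candidates
                  if p.is_file() and p.suffix.lower() in extensions)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    for name in ("a.png", "sub/b.jpg", "notes.txt"):
        (root / name).write_bytes(b"")

    from_dir = [p.name for p in collect_image_paths(root)]
    from_list = [p.name for p in collect_image_paths([root / "a.png", root / "notes.txt"])]

print(from_dir)
print(from_list)
```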

Example: Finding identical mnist digits


Citation

Please cite in your publications if this is useful for your research (see citation).

Maintainers

Contribute

  • All kinds of contributions are welcome!
  • If you wish to buy me a coffee for this work, it is much appreciated :)

Licence

See LICENSE for details.

Other interesting stuff

