Utility to find similar files based on filename or hash.

Fiddup

Version 2.2.0 MIT License Flake8 Tests

File DeDuplicator

A small tool that quickly scans a directory for files with similar names or matching content. It is useful for sorting through archives of books, documents, downloads, movies, music, and so on.

Two modes are available: Assistant (name-based comparison) and Hash mode (hash comparison).

Fiddup is non-destructive. It will report similarities and duplicates, but it will not remove them.

To keep run time and memory use low, hash mode hashes only parts of each file rather than its full contents. If you see false positives, first try increasing the --chunk_count flag (default: 5).
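
The idea of hashing only a few chunks per file can be sketched roughly as follows. This is an illustration only, not Fiddup's actual code: the function name partial_hash, the chunk_size of 4096 bytes, the evenly spaced offsets, and the choice of SHA-256 are all assumptions.

```python
import hashlib
import os


def partial_hash(path, chunk_count=5, chunk_size=4096):
    """Hash `chunk_count` evenly spaced chunks of a file, not the whole file.

    Hypothetical helper for illustration; not taken from Fiddup.
    """
    size = os.path.getsize(path)
    digest = hashlib.sha256()
    # Include the file size in the hashed data (an assumption of this sketch).
    digest.update(str(size).encode())
    with open(path, "rb") as handle:
        for index in range(chunk_count):
            # Spread the sample offsets evenly across the file.
            offset = (size * index) // chunk_count
            handle.seek(offset)
            digest.update(handle.read(chunk_size))
    return digest.hexdigest()
```

Two files that differ only in a region that is never sampled produce the same partial hash, which is the kind of false positive a higher chunk count makes less likely.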

Installation

From PyPI

pip3 install fiddup

From Source

  • git clone https://github.com/jarviscodes/fiddup

  • cd fiddup

  • python setup.py install

Usage

Usage: python -m fiddup [OPTIONS]

  Fiddup is a Non-destructive file deduplicator that can assist you to find
  similar or duplicate files.

Options:
  -i, --inpath TEXT      Path to scan for duplicates.  [required]
  -a, --assistant        Toggles Assistant mode (name similarity search).
  -t, --threshold FLOAT  Similarity threshold. Assistant will only show
                         similarities > this.
  -e, --extensions TEXT  List of extensions to scan for, specify multiple with
                         e.g.: -e zip -e txt -e pdf.  [required]
  -d, --directory        Include directories in comparison. Only available in
                         assistant mode.
  -v, --verbose          Show verbose output.
  -h, --hashmode         Toggles hash mode (file hash comparison).
  --chunk_count INTEGER  Number of chunks to read from files while hashing.
                         Higher = more accuracy = Slower.
  --help                 Show this message and exit.

Assistant

Outputs a table with three columns: filename1, filename2, and name similarity. Useful when sorting files manually by name.
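
A name-similarity table of this shape could be produced along the following lines. This is a minimal sketch under stated assumptions: the README does not say which similarity metric Fiddup uses, so difflib.SequenceMatcher from the standard library stands in for it, and the function name similar_names and the 0.7 threshold are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_names(names, threshold=0.7):
    """Return (name1, name2, ratio) rows whose similarity exceeds `threshold`.

    Hypothetical helper; the metric and default threshold are assumptions.
    """
    rows = []
    for first, second in combinations(names, 2):
        # Compare case-insensitively; ratio() is a float between 0 and 1.
        ratio = SequenceMatcher(None, first.lower(), second.lower()).ratio()
        if ratio > threshold:
            rows.append((first, second, round(ratio, 2)))
    # Most similar pairs first.
    return sorted(rows, key=lambda row: row[2], reverse=True)


for row in similar_names(["holiday_2020.zip", "holiday_2020 (1).zip", "invoice.pdf"]):
    print(row)
```

Comparing every pair is quadratic in the number of names, which is acceptable for a single directory but becomes slow for very large archives.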

Hashmode

Computes a hash for each file and compares the hashes to find files with matching content.
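
Grouping files by hash to find duplicates might look roughly like this. The sketch is illustrative rather than Fiddup's implementation: the function name find_duplicates is hypothetical, the recursive directory walk is an assumption, and for simplicity it hashes each file in full with SHA-256 instead of sampling chunks.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(inpath, extensions):
    """Group files under `inpath` by SHA-256 digest; return groups of 2+ files.

    Hypothetical helper: hashes whole files, which is a simplification.
    """
    groups = defaultdict(list)
    for path in Path(inpath).rglob("*"):
        # Only consider regular files with one of the requested extensions.
        if path.is_file() and path.suffix.lstrip(".").lower() in extensions:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Files that share a digest are reported together; nothing is deleted.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Like Fiddup itself, the sketch only reports duplicates and leaves every file in place.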

Testing

python -m unittest discover -s tests

Source Distribution

fiddup-2.3.0.tar.gz (6.0 kB view details)

Uploaded Source

Built Distribution

fiddup-2.3.0-py3-none-any.whl (7.3 kB view details)

Uploaded Python 3

File details

Details for the file fiddup-2.3.0.tar.gz.

File metadata

  • Download URL: fiddup-2.3.0.tar.gz
  • Upload date:
  • Size: 6.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.8.2 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.9

File hashes

Hashes for fiddup-2.3.0.tar.gz
Algorithm Hash digest
SHA256 45e42ff755f334125184e8d16233319e88cb4dcd38509f5ac3ecdffafca942bb
MD5 0d6d75004a1aaa5878cb3092c56e9706
BLAKE2b-256 c263f29d1aeed5df56bb5b3334114ca586ac6977d12e44adb854fc0ea76b8bbc

File details

Details for the file fiddup-2.3.0-py3-none-any.whl.

File metadata

  • Download URL: fiddup-2.3.0-py3-none-any.whl
  • Upload date:
  • Size: 7.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.8.2 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.9

File hashes

Hashes for fiddup-2.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 8c557c6a3b86f90e04e4a478d711995a2723c480d8ce30115711c0f292ac4100
MD5 5b0af08f79afdc00137c0429c018e248
BLAKE2b-256 c26bbf7fa6658567e4f3b7b08cd944807e8b9e5c85df6347192068b019aed954
