Utility to find similar files based on filename or hash.
Fiddup
File DeDuplicator
A small tool that quickly scans a directory for files with similar names. It is useful for combing through archives of books, documents, downloads, movies, music, and so on.
Two modes are available: Assistant (name-based comparison) and Hash mode (hash comparison).
Fiddup is non-destructive. It will report similarities and duplicates, but it will not remove them.
To keep hashing fast and memory use bounded, hash mode only hashes parts of each file.
If you get false positives, first try increasing the --chunk_count flag (default: 5).
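The partial hashing described above can be sketched as follows. This is a minimal illustration, not Fiddup's actual implementation: the function name `partial_hash`, the `chunk_size` of 4096 bytes, the evenly spaced offsets, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import os


def partial_hash(path, chunk_count=5, chunk_size=4096):
    """Hash `chunk_count` evenly spaced chunks of a file instead of the whole file.

    Hypothetical helper: chunk placement and hash algorithm are assumptions.
    """
    size = os.path.getsize(path)
    digest = hashlib.sha256()
    # Mix in the file size so a length difference alone changes the digest.
    digest.update(str(size).encode())
    with open(path, "rb") as handle:
        for index in range(chunk_count):
            # Spread the read offsets evenly across the file.
            offset = (size * index) // chunk_count
            handle.seek(offset)
            digest.update(handle.read(chunk_size))
    return digest.hexdigest()
```

Two files of equal size that differ only in a region no chunk covers produce the same digest, which is the false positive mentioned above. Sampling more chunks covers more of the file and makes that less likely, at the cost of more reads.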
Installation
From PyPi
pip3 install fiddup
From Source
- git clone https://github.com/jarviscodes/fiddup
- python setup.py install (run from inside the cloned fiddup directory)
Usage
Usage: python -m fiddup [OPTIONS]
Fiddup is a Non-destructive file deduplicator that can assist you to find
similar or duplicate files.
Options:
  -i, --inpath TEXT      Path to scan for duplicates.  [required]
  -a, --assistant        Toggles Assistant mode (name similarity search).
  -t, --threshold FLOAT  Similarity threshold. Assistant will only show
                         similarities > this.
  -e, --extensions TEXT  List of extensions to scan for, specify multiple with
                         e.g.: -e zip -e txt -e pdf.  [required]
  -d, --directory        Include directories in comparison. Only available in
                         assistant mode.
  -v, --verbose          Show verbose output.
  -h, --hashmode         Toggles hash mode (file hash comparison).
  --chunk_count INTEGER  Number of chunks to read from files while hashing.
                         Higher = more accuracy = Slower.
  --help                 Show this message and exit.
Assistant
Outputs a table with three columns: filename 1, filename 2, and name similarity. Useful when sorting things out manually by name.
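A name similarity table of this kind can be sketched with the standard library. This is only an illustration under assumptions: the source does not say which similarity measure Fiddup uses, so `difflib.SequenceMatcher` is a stand-in, and the function name `similar_names` and the threshold of 0.7 are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_names(names, threshold=0.7):
    """Return (name1, name2, ratio) rows for pairs whose similarity exceeds threshold.

    Hypothetical helper: SequenceMatcher is an assumed similarity measure.
    """
    rows = []
    for first, second in combinations(names, 2):
        # Compare case-insensitively; ratio() is a float between 0 and 1.
        ratio = SequenceMatcher(None, first.lower(), second.lower()).ratio()
        if ratio > threshold:
            rows.append((first, second, round(ratio, 2)))
    return rows


if __name__ == "__main__":
    for row in similar_names(["holiday_2019.zip", "holiday_2019 (1).zip", "invoice.pdf"]):
        print(row)
```

Comparing every pair is quadratic in the number of files, which is acceptable for a single directory but may be slow on very large archives.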
Hashmode
Hashes the files and compares them by content using those hashes.
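The comparison step can be sketched as grouping files by digest. This is a simplified illustration, not Fiddup's code: it hashes whole files rather than a limited number of chunks, it assumes a recursive scan, and the name `find_duplicates` is hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(inpath, extensions):
    """Group files under inpath by content digest; return groups with 2+ files.

    Hypothetical helper: whole-file hashing and recursion are assumptions.
    """
    # Normalise "txt" and ".txt" to the same suffix form.
    wanted = {"." + ext.lower().lstrip(".") for ext in extensions}
    groups = defaultdict(list)
    for path in sorted(Path(inpath).rglob("*")):
        if path.is_file() and path.suffix.lower() in wanted:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Only digests shared by more than one file indicate duplicates.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Like Fiddup itself, the sketch only reports the groups and leaves every file in place.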
Testing
python -m unittest discover -s tests
File details
Details for the file fiddup-2.3.0.tar.gz.
File metadata
- Download URL: fiddup-2.3.0.tar.gz
- Upload date:
- Size: 6.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.7.1 importlib_metadata/4.8.2 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 45e42ff755f334125184e8d16233319e88cb4dcd38509f5ac3ecdffafca942bb
MD5 | 0d6d75004a1aaa5878cb3092c56e9706
BLAKE2b-256 | c263f29d1aeed5df56bb5b3334114ca586ac6977d12e44adb854fc0ea76b8bbc
File details
Details for the file fiddup-2.3.0-py3-none-any.whl.
File metadata
- Download URL: fiddup-2.3.0-py3-none-any.whl
- Upload date:
- Size: 7.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.7.1 importlib_metadata/4.8.2 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 8c557c6a3b86f90e04e4a478d711995a2723c480d8ce30115711c0f292ac4100
MD5 | 5b0af08f79afdc00137c0429c018e248
BLAKE2b-256 | c26bbf7fa6658567e4f3b7b08cd944807e8b9e5c85df6347192068b019aed954