Utility to find similar files based on filename or hash.
Fiddup
File DeDuplicator
A small tool that quickly scans a directory for files with similar names. Useful for going through archives of books, documents, downloads, movies, music, and so on.
Two modes are available: Assistant (name-based comparison) and Hash mode (hash comparison).
Fiddup is non-destructive. It will report similarities and duplicates, but it will not remove them.
To keep things fast and memory use bounded, hashmode hashes only parts of each file.
If you get false positives, first try increasing the --chunk_count flag (default: 5).
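To illustrate why partial hashing can produce false positives and why a higher chunk count helps, here is a minimal sketch of sampled hashing. It is not Fiddup's actual implementation: the function name `sampled_digest`, the `chunk_size` parameter, the choice of SHA-256, and the evenly spaced offsets are all assumptions made for the example.

```python
import hashlib
import os


def sampled_digest(path, chunk_count=5, chunk_size=4096):
    """Hash the file size plus `chunk_count` evenly spaced chunks of the file.

    Hypothetical helper for illustration only; Fiddup may sample differently.
    """
    size = os.path.getsize(path)
    digest = hashlib.sha256()
    # Mix in the size so files of different lengths get different input.
    digest.update(f"{size}:".encode("ascii"))
    with open(path, "rb") as handle:
        for index in range(chunk_count):
            # Spread the sample offsets evenly, starting at the beginning.
            offset = (size * index) // chunk_count
            handle.seek(offset)
            digest.update(handle.read(chunk_size))
    return digest.hexdigest()
```

Two files of equal size that differ only in a region no chunk touches get the same digest in this sketch. Raising `chunk_count` samples more of the file, which makes such a collision less likely at the cost of more reads.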
Installation
From PyPI
pip3 install fiddup
From Source
- git clone https://github.com/jarviscodes/fiddup
- python setup.py install (run from inside the cloned fiddup directory)
Usage
(env) E:\Users\Jarvis\PycharmProjects\fiddup>python -m fiddup --help
Usage: python -m fiddup [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
assistant
hashmode
Fiddup v3.0.0
Usage: python -m fiddup assistant [OPTIONS]
Options:
-i, --in_path TEXT Path to scan for duplicates. [required]
-t, --threshold FLOAT Similarity threshold. Assistant will only show
similarities > this.
-e, --extensions TEXT List of extensions to scan for. Specify multiple with
e.g.: -e zip -e txt -e pdf. [required]
-d, --directory Include directories in comparison. Only available in
assistant mode.
-v, --verbose Show verbose output.
--help Show this message and exit.
Fiddup v3.0.0
Usage: python -m fiddup hashmode [OPTIONS]
Options:
-i, --in_path TEXT Path to scan for duplicates. [required]
-e, --extensions TEXT List of extensions to scan for. Specify multiple with
e.g.: -e zip -e txt -e pdf. [required]
-v, --verbose Show verbose output.
--chunk_count INTEGER Number of chunks to read from files while hashing.
Higher = more accuracy = Slower.
--help Show this message and exit.
Assistant
Outputs a table of filename1, filename2, and name similarity. Useful when sorting things out manually by name.
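As a rough idea of what a name-based comparison involves, the sketch below scores every pair of filenames with the standard library's `difflib.SequenceMatcher` and keeps pairs above a threshold. This is an assumption-laden illustration, not Fiddup's code: the function name `similar_names`, the default threshold of 0.7, and the choice of `SequenceMatcher` as the similarity metric are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations
from pathlib import Path


def similar_names(filenames, threshold=0.7):
    """Return (name1, name2, score) rows whose similarity exceeds `threshold`.

    Hypothetical helper; the metric Fiddup actually uses may differ.
    """
    rows = []
    for first, second in combinations(filenames, 2):
        # Compare lowercased stems so a shared extension doesn't inflate the score.
        left = Path(first).stem.lower()
        right = Path(second).stem.lower()
        score = SequenceMatcher(None, left, right).ratio()
        if score > threshold:
            rows.append((first, second, round(score, 2)))
    # Most similar pairs first.
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

Comparing all pairs is quadratic in the number of files, which is acceptable for a manual review aid but worth keeping in mind for very large directories.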
Hashmode
Computes a hash for each file and compares the hashes to find files with the same content.
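The general idea of hash-based duplicate detection can be sketched as grouping files by digest and reporting every group with more than one member. The example below is only an illustration under stated assumptions: `group_by_digest` is a hypothetical name, it scans recursively, and it hashes whole files with SHA-256 for simplicity, whereas Fiddup hashes only parts of each file.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_by_digest(in_path, extensions):
    """Map each content digest to the files sharing it (duplicates only).

    Hypothetical helper; recursion and full-file hashing are assumptions.
    """
    # Accept extensions with or without a leading dot, e.g. "txt" or ".txt".
    wanted = {"." + ext.lower().lstrip(".") for ext in extensions}
    groups = defaultdict(list)
    for path in sorted(Path(in_path).rglob("*")):
        if path.is_file() and path.suffix.lower() in wanted:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Only digests shared by two or more files indicate duplicates.
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Like Fiddup itself, this sketch only reports matches and never deletes anything.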
Testing
python -m unittest discover -s tests
or
python -m pytest
File details
Details for the file fiddup-3.0.0.tar.gz.
File metadata
- Download URL: fiddup-3.0.0.tar.gz
- Upload date:
- Size: 5.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 1f2cf4f18b4786aefd088bf38ec99b760a4682b782169fe1026d70607c313eca
MD5 | 0dd7dcf033a92afc09f0c6cf8bfb58a6
BLAKE2b-256 | 15e6cf1e32dcd053c89213c4cca16c26d96b2f86246d90ddd8962230f18d008d
File details
Details for the file fiddup-3.0.0-py3-none-any.whl.
File metadata
- Download URL: fiddup-3.0.0-py3-none-any.whl
- Upload date:
- Size: 7.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.9
File hashes
Algorithm | Hash digest
---|---
SHA256 | 8ae541006e15dae2d01d0867d1c9cdc8ffe95f817b47ea51914c1b2b28b87867
MD5 | ce68439c50087164747da5079b30b2bb
BLAKE2b-256 | cc825e413bdf9ede993f9594842c3fb984148d630da9d2ebf176cff800be2087