Utility to find similar files based on filename or hash.

Fiddup

Version 3.0.0 MIT License Flake8 Tests Stable Build

File DeDuplicator

A small tool that quickly scans a directory for files with similar names. Useful for going through archives of books, documents, downloads, movies, music, and so on.

Two modes are available: Assistant (name-based comparison) and Hashmode (hash comparison).

Fiddup is non-destructive. It will report similarities and duplicates, but it will not remove them.

To keep things fast and memory-bounded, hashmode hashes only parts of each file. If you get false positives, first try increasing the --chunk_count option (default: 5).
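
The idea of hashing only parts of a file can be sketched as follows. This is an illustration and not Fiddup's actual implementation: the function name `partial_hash`, the `chunk_size` parameter, the evenly spaced offsets, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import os


def partial_hash(path, chunk_count=5, chunk_size=4096):
    """Hash `chunk_count` evenly spaced chunks of a file (hypothetical helper)."""
    size = os.path.getsize(path)
    digest = hashlib.sha256()
    # Mix in the file size so that a difference in length changes the digest.
    digest.update(str(size).encode())
    with open(path, "rb") as handle:
        for index in range(chunk_count):
            # Spread the read offsets evenly, starting at the beginning of the file.
            offset = (size * index) // chunk_count
            handle.seek(offset)
            digest.update(handle.read(chunk_size))
    return digest.hexdigest()
```

Two files that differ only in a region that no chunk covers produce the same digest, which is the kind of false positive mentioned above. Sampling more chunks makes it more likely that a differing region is read, at the cost of more I/O.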

Installation

From PyPI

pip3 install fiddup

From Sauce

  • git clone https://github.com/jarviscodes/fiddup

  • cd fiddup

  • python setup.py install

Usage

(env) E:\Users\Jarvis\PycharmProjects\fiddup>python -m fiddup --help
Usage: python -m fiddup [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  assistant
  hashmode

Fiddup v3.0.0
Usage: python -m fiddup assistant [OPTIONS]

Options:
  -i, --in_path TEXT     Path to scan for duplicates.  [required]
  -t, --threshold FLOAT  Similarity threshold. Assistant will only show
                         similarities > this.
  -e, --extensions TEXT  List of extensions to scan for. Specify multiple with
                         e.g.: -e zip -e txt -e pdf.  [required]
  -d, --directory        Include directories in comparison. Only available in
                         assistant mode.
  -v, --verbose          Show verbose output.
  --help                 Show this message and exit.

Fiddup v3.0.0
Usage: python -m fiddup hashmode [OPTIONS]

Options:
  -i, --in_path TEXT     Path to scan for duplicates.  [required]
  -e, --extensions TEXT  List of extensions to scan for. Specify multiple with
                         e.g.: -e zip -e txt -e pdf.  [required]
  -v, --verbose          Show verbose output.
  --chunk_count INTEGER  Number of chunks to read from files while hashing.
                         Higher = more accuracy = Slower.
  --help                 Show this message and exit.

Assistant

Outputs a table of filename pairs (filename1, filename2) with their name similarity. Useful when sorting things out manually by name.
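
A minimal sketch of a name-similarity table like this, using `difflib.SequenceMatcher` from the standard library. Whether Fiddup uses the same metric is not stated here, so treat the metric, the function name `similar_names`, and the default threshold of 0.7 as assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_names(names, threshold=0.7):
    """Return (name1, name2, ratio) rows whose similarity exceeds `threshold`."""
    rows = []
    for first, second in combinations(names, 2):
        # Compare case-insensitively; ratio() is a float between 0 and 1.
        ratio = SequenceMatcher(None, first.lower(), second.lower()).ratio()
        if ratio > threshold:
            rows.append((first, second, round(ratio, 2)))
    # Most similar pairs first.
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    files = ["report_final.pdf", "report_final(1).pdf", "holiday.zip"]
    for name1, name2, ratio in similar_names(files):
        print(f"{name1}\t{name2}\t{ratio}")
```

Comparing every pair is quadratic in the number of names, which is fine for a single directory but slows down on very large archives.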

Hashmode

Computes hashes of the files and compares them to find files with the same content.
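
Content comparison by hash can be sketched as grouping files by digest. The example below is an assumption-laden illustration, not Fiddup's code: `find_duplicates` is a hypothetical name, it hashes whole files with SHA-256 rather than a limited number of chunks, and it scans subdirectories recursively.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(in_path, extensions):
    """Group files under `in_path` by SHA-256 digest; return groups of 2 or more."""
    groups = defaultdict(list)
    # Accept extensions with or without a leading dot, e.g. "pdf" or ".pdf".
    suffixes = {"." + ext.lower().lstrip(".") for ext in extensions}
    for path in sorted(Path(in_path).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            # Reads the whole file into memory; fine for a sketch, costly for big files.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Like Fiddup itself, this only reports the groups; deciding what to delete is left to the user.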

Testing

python -m unittest discover -s tests

or

python -m pytest
