Recursively scan one or more given directories for duplicate files.

Project description

yadupe

yadupe is yet another tool to find and remove duplicate files in a system. It recursively reads one or more source directories, looking for duplicate files. Two files are considered duplicates if they have the same size and content, even if their names differ.
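The duplicate criterion above (same size, same content) can be illustrated with a minimal sketch. This is not yadupe's actual implementation: the function names `file_digest` and `find_duplicates` are hypothetical, and a SHA-256 digest is used here as a stand-in for comparing content.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(*source_dirs):
    """Return lists of paths (2 or more each) that share size and digest.

    Hypothetical helper for illustration; not part of the yadupe API.
    """
    # First pass: bucket regular files by size, which is cheap to obtain.
    by_size = defaultdict(list)
    for source in source_dirs:
        for root, _dirs, files in os.walk(source):
            for name in files:
                path = os.path.join(root, name)
                if os.path.isfile(path) and not os.path.islink(path):
                    by_size[os.path.getsize(path)].append(path)

    # Second pass: hash only files whose size is shared with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a file with a unique size cannot have a duplicate
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return groups
```

Grouping by size first avoids reading files that cannot possibly match, so only candidates with a shared size are hashed.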

In search mode, the utility reports the list of duplicate files.

In deduplicate mode, the utility moves duplicate files into the given destination directory. Only one file from each group of duplicates is kept in the source directory. A report file containing the paths of all moved duplicates is also saved in the destination directory.
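A rough sketch of the deduplicate step, under stated assumptions: the groups of duplicate paths are already known, there is a single source root, and the report is a plain text file. The function name `move_duplicates`, the `report.txt` file name and the report line format are all hypothetical; yadupe's real report may look different.

```python
import os
import shutil


def move_duplicates(groups, source_root, destination, report_name="report.txt"):
    """Keep the first path of each group and move the rest under destination.

    Hypothetical helper for illustration; not part of the yadupe API.
    """
    moved = []
    for group in groups:
        keep, *extras = sorted(group)  # the first path stays in the source
        for path in extras:
            # Preserve the path relative to the source root in the destination.
            relative = os.path.relpath(path, source_root)
            target = os.path.join(destination, relative)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            shutil.move(path, target)
            moved.append((path, target, keep))

    # Write one report line per moved file (assumed format).
    os.makedirs(destination, exist_ok=True)
    report_path = os.path.join(destination, report_name)
    with open(report_path, "w", encoding="utf-8") as report:
        for original, target, keep in moved:
            report.write(f"{original} -> {target} (kept: {keep})\n")
    return moved
```

Moving rather than deleting means a mistaken match can be undone by hand using the report.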

In the alternate (unique) mode, the utility moves unique files into the given destination directory. Only one file from each group of duplicates is moved into the destination directory.
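The unique mode can be pictured as the mirror image of deduplication: one representative per distinct content is moved, and the remaining copies stay behind. The sketch below assumes the caller supplies groups of identical files (a file without duplicates is a group of one) and that relative paths are preserved in the destination; `move_uniques` is a hypothetical name, not a yadupe function.

```python
import os
import shutil


def move_uniques(groups, source_root, destination):
    """Move one representative of each content group under destination.

    Hypothetical helper for illustration; not part of the yadupe API.
    """
    moved = []
    for group in groups:
        representative = sorted(group)[0]  # deterministic choice per group
        relative = os.path.relpath(representative, source_root)
        target = os.path.join(destination, relative)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.move(representative, target)
        moved.append((representative, target))
    return moved
```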

Prerequisites

Install

% pip install yadupe

If you don't have pip it's also easy to install: https://pip.pypa.io/en/stable/installing/

Once yadupe is installed, it is available on the CLI:

% yadupe -h

Usage

  1. Search for and remove duplicate files in the directories /home/user/source_a and /home/user/source_b. Duplicates found are moved into /home/user/duplicates, together with a report on the moved files. Empty subfolders in source_a and source_b are removed as well.
% yadupe /home/user/source_a /home/user/source_b -d -p -r /home/user/duplicates
  2. Search for duplicates in the directory /home/user/source_a and print the list of duplicates.
% yadupe /home/user/source_a
  3. Search for and move unique files in the directories /home/user/source_a and /home/user/source_b. Unique files found are moved into /home/user/uniques, together with a report on the moved files.
% yadupe /home/user/source_a /home/user/source_b -u -p -r /home/user/uniques
  4. The examples directory contains a couple of examples of using the yadupe package in Python applications.
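The package's Python API is not described in this document, so the sketch below drives the documented command line from Python instead. The helpers `build_yadupe_command` and `run_yadupe` are hypothetical wrappers; only the flags `-d`, `-u`, `-p` and `-r` come from the usage text, and capturing standard output assumes the duplicate list is printed there.

```python
import subprocess


def build_yadupe_command(sources, result=None, mode=None, purge=False):
    """Build an argv list for the yadupe CLI from its documented options."""
    if mode not in (None, "deduplicate", "unique"):
        raise ValueError(f"unknown mode: {mode!r}")
    command = ["yadupe", *sources]
    if mode == "deduplicate":
        command.append("-d")
    elif mode == "unique":
        command.append("-u")
    if purge:
        command.append("-p")
    if result is not None:
        command += ["-r", result]
    return command


def run_yadupe(sources, **options):
    """Run yadupe and return its captured output (needs yadupe on PATH)."""
    completed = subprocess.run(
        build_yadupe_command(sources, **options),
        check=True,
        capture_output=True,
        text=True,
    )
    return completed.stdout
```

For instance, `build_yadupe_command(["/home/user/source_a"], mode="deduplicate", purge=True, result="/home/user/duplicates")` produces the same arguments as the first usage example, restricted to one source directory.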

Options

% yadupe -h

usage: yadupe [-h] [-d] [-u] [-p] [-r PATH] PATH [PATH ...]

Recursively scan one or more given directories for duplicate files. Found
duplicates list could be saved into report or printed out in console. Also,
duplicates could be moved into destination directory in safe way, preserving
it relative path. In this case file name is written in the report, as well as
new path for the file. If empty sub-directories turn up after duplicates
removal, the could be deleted as well.

positional arguments:
  PATH                  Source path to search duplicated files.

optional arguments:
  -h, --help            show this help message and exit
  -d, --deduplicate     Scan and remove mode. Duplicates will be moved into
                        given directory.
  -u, --unique          Scan and move mode. Unique files will be moved into
                        given directory.
  -p, --purge           Remove empty subdirs after duplicates or uniques move.
  -r PATH, --result PATH
                        Path to report dir (optional for default search mode)
                        OR directory to move duplicated files into.
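The -p/--purge option is described as removing empty subdirectories after files are moved. One way such a cleanup could be written is a bottom-up directory walk, shown here as a sketch; `remove_empty_subdirs` is a hypothetical name, and whether yadupe keeps the source root itself is an assumption.

```python
import os


def remove_empty_subdirs(root):
    """Delete empty directories below root and return the removed paths.

    The root directory itself is kept. Hypothetical helper for
    illustration; not part of the yadupe API.
    """
    removed = []
    # Bottom-up order visits children before parents, so a directory that
    # becomes empty once its children are removed is caught in the same pass.
    for current, _dirs, _files in os.walk(root, topdown=False):
        if current == root:
            continue
        if not os.listdir(current):  # check the live state, not the stale listing
            os.rmdir(current)
            removed.append(current)
    return removed
```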

Testing

To run the unit tests in the test directory, first unzip the test-data.zip archive inside the test-data directory. This creates the directory tree required by the tests.
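The extraction step can also be done from Python with the standard zipfile module. This is a small sketch; the default paths below are assumptions about where the archive lives and where it should be unpacked, so adjust them to the actual repository layout.

```python
import zipfile


def extract_test_data(archive="test-data/test-data.zip", target="test-data"):
    """Extract the test archive into target and return the member names.

    Both default paths are assumed, not taken from the project.
    """
    with zipfile.ZipFile(archive) as bundle:
        bundle.extractall(target)
        return bundle.namelist()
```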

Download files

Source Distribution

yadupe-1.1.0.tar.gz (13.3 kB)

  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.25.0 requests-toolbelt/0.9.1 tqdm/4.33.0 CPython/3.8.10

Hashes for yadupe-1.1.0.tar.gz
Algorithm Hash digest
SHA256 6d436be33384b1c87ece08c0df3ee69ff6c0e0b43bd2533f436ce4a537c9d4f2
MD5 f1b67fe376b7fae0cbad5144bbc8b9ac
BLAKE2b-256 00b80d8a3bbed48e31e55ce23573b88a9da50723bdff5472bf1ea93d96dc4375

Built Distribution

yadupe-1.1.0-py3-none-any.whl (10.0 kB)

  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.0 pkginfo/1.8.2 requests/2.25.0 requests-toolbelt/0.9.1 tqdm/4.33.0 CPython/3.8.10

Hashes for yadupe-1.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 af93701d02f034daa09cab2f3080ef358943aedaa92cc5f4a9420be897bc2644
MD5 733b138f3acf0b91c0c3ccad76074dc7
BLAKE2b-256 74ca651f0a1153537f1fa3f9776f80a5b3abc27f3e862112ac80632075b7a2a4