Duple is a CLI that finds and removes duplicate files.

Project description

Duple is a small package that finds and removes duplicate files. I created duple only because there is no port of rmlint to Windows. On Unix/Linux systems I suggest using rmlint instead; it is far superior to duple.

Duple iterates through all the files and directories it is given and calculates a hash value for each file; the hash function can be specified. It then groups the results into a dictionary keyed by hash. A dictionary entry with two or more values is a set of duplicates. An optional flag specifies which duplicate in each set is kept. To see the flags and their descriptions:

duple scan --help
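
The grouping step described above can be sketched in a few lines of Python. This is a minimal illustration of the idea, not duple's actual implementation: the function names hash_file and find_duplicates are hypothetical, and the sketch leaves out duple's output files and duplicate resolution flags.

```python
import hashlib
import os
from collections import defaultdict


def hash_file(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root, algorithm="sha256"):
    """Map each hash to the files that share it, keeping only groups of 2+."""
    groups = defaultdict(list)
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(directory, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                groups[hash_file(path, algorithm)].append(path)
    return {digest: sorted(paths) for digest, paths in groups.items() if len(paths) > 1}
```

Calling find_duplicates(".") returns a dictionary in which every value is a list of two or more paths with identical content, which mirrors the "2 or more values" rule described above.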

Once the scan is complete, review the output file (duple.delete) and modify it at your discretion. When you are satisfied with it, run the following command to send the duplicates to the trash:

duple rm

Installation

It is strongly recommended to use the latest version of duple.

pip install duple

or if you need to upgrade:

pip install duple --upgrade

You may need to add the Python Scripts folder on your computer to the PATH.

Windows

Open PowerShell (Start > [search for powershell]) and copy/paste the following text to the command line:

python3 -c "from duple.info import get_user_scripts_path
get_user_scripts_path()"

Go to Start > [search for 'edit environment variables for your account'] > User variables for [user name] > select Path in the top list box > click Edit...

Once the window pops up, add the result of the PowerShell command above to the bottom of the list.
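
If the helper above is unavailable, the per-user scripts folder can also be looked up with the standard library alone. The sketch below is an assumption about what is useful here, not a description of what duple.info does; user_scripts_dir is a hypothetical name, and the result only applies to packages installed with pip's per-user scheme (a system-wide or virtual environment install uses a different folder).

```python
import os
import sysconfig


def user_scripts_dir():
    """Return the per-user scripts directory reported by the standard library."""
    if hasattr(sysconfig, "get_preferred_scheme"):  # available in Python 3.10+
        scheme = sysconfig.get_preferred_scheme("user")
    else:
        scheme = f"{os.name}_user"  # e.g. "nt_user" on Windows
    return sysconfig.get_path("scripts", scheme)


print(user_scripts_dir())
```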

Usage

duple has two primary sub-commands: scan and rm. scan scans your system according to the arguments you give it and writes the results to output files, whose names it reports when it finishes. rm sends the duplicates listed in duple.delete to the trash.

An Example:

The command below scans the current directory and calculates a hash for each file to determine whether there are duplicates:

duple scan -d . 'sha256'

Argument    Description
-d          specifies the duplicate resolution behavior; in this case, duple keeps the duplicate with the lowest filesystem depth
.           specifies the directory to scan, here the current directory
'sha256'    specifies the hash function duple uses when it calculates hashes to determine whether files are duplicates
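
To make "lowest filesystem depth" concrete, here is a small sketch of how one duplicate in a set could be chosen to keep. It is an illustration under stated assumptions rather than duple's own code: path_depth and split_keep_and_remove are hypothetical names, depth is taken to be the number of path components, and ties are broken alphabetically, which the description above does not specify.

```python
from pathlib import PurePath


def path_depth(path):
    """Count the components of a path; fewer components means shallower."""
    return len(PurePath(path).parts)


def split_keep_and_remove(duplicates):
    """Pick the shallowest path to keep; ties go to the alphabetically first."""
    ordered = sorted(duplicates, key=lambda p: (path_depth(p), str(p)))
    return ordered[0], ordered[1:]


keep, remove = split_keep_and_remove(
    ["photos/2020/trip/img.jpg", "img.jpg", "backup/img.jpg"]
)
print(keep)    # img.jpg
print(remove)  # ['backup/img.jpg', 'photos/2020/trip/img.jpg']
```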

Version History

1.0.0 Refactored and Improved Output and Reporting

-refactored code to be easier to follow and more modular
-improved reporting of results to duple.delete and duple.json
-improved duple.json output, adding additional data
-added dry run and verbose flags to duple rm
-added hash-stats to calculate performance times for each available hash
-added make-test-files to make test files for the user to learn how duple works on test data
-improved README with better installation and setup instructions

0.5.0 Improve Data Outputs

-added dictionary to duple.json for file stats, now each entry has a key to describe the number
-fixed progress bar for pre-processing directories
-added output file duple.all_files.json with file statistics on all files within the specified path for 'duple scan'
-improved summary statistics output for 'duple scan'

0.4.0 Performance Improvements

-added multiprocessing, taking advantage of multiple cores
-eliminated files with unique sizes from analysis - files with unique size are not duplicates of another file

0.3.0 Added Capability

-added mv function that will move 'duple.delete' paths instead of deleting them

0.2.0 Added license

-Added license

0.1.1 Misc. Fixes

-Fixed typos in help strings
-Added support for sending duplicates to trash ('duple rm')

0.1.0 Initial Release

This is the initial release of duple

Download files

Source Distribution

duple-1.1.0.tar.gz (21.8 kB)

Uploaded Source

Built Distribution

duple-1.1.0-py3-none-any.whl (21.7 kB)

Uploaded Python 3

File details

Details for the file duple-1.1.0.tar.gz.

File metadata

  • Download URL: duple-1.1.0.tar.gz
  • Size: 21.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.5 Darwin/23.5.0

File hashes

Hashes for duple-1.1.0.tar.gz
Algorithm Hash digest
SHA256 f4ae89f48d6d4f590e7cc3aef2316df4bc2a516060f12881048e6fa3439d1560
MD5 f0ec97e3b0d14d01d16232e12adf79cb
BLAKE2b-256 3b7d4d0b58f08a4534958e8f3c71d1266e47338cbb328aacb192532bc6fa893f

File details

Details for the file duple-1.1.0-py3-none-any.whl.

File metadata

  • Download URL: duple-1.1.0-py3-none-any.whl
  • Size: 21.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.8.3 CPython/3.12.5 Darwin/23.5.0

File hashes

Hashes for duple-1.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 544f56f234c6995c833daa1195d5affd3d1a64d43250457344c949bcd80cad01
MD5 ca832eb7f62039d3b52ae2110af427d1
BLAKE2b-256 c3351a975edabfee75420850b5e5a2effe4d3f1e94edb44efd985de0da017432
