Identify duplicate files and optionally create hardlinks to save storage
Duplicates
Identify duplicate files and replace them with hardlinks on any OS.
Intended to reduce the storage space taken up by multiple copies of similar backups (e.g. regular Google Takeout exports).
Usage
Run it from a command line on Linux, macOS or Windows. It recursively scans a directory, identifies any duplicate files found and optionally replaces them with hardlinks.
WARNING: Hardlinking files means that if you change any one "copy", all "copies" will change.
WARNING: If other hardlinks to the same files exist outside the scanned directories, they may no longer point to the same inode as the files within the scanned directories. Treat the result as undefined.
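To make the first warning concrete, here is a minimal sketch using only the Python standard library (it does not use this package). It assumes a filesystem that supports hard links; the file names a.txt and b.txt are placeholders.

```python
import os
import tempfile
from pathlib import Path


def hardlink_demo() -> tuple[str, bool]:
    """Show that two hard-linked names share one underlying file."""
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "a.txt"
        linked = Path(tmp) / "b.txt"
        original.write_text("first version")

        # Create a second directory entry for the same data.
        os.link(original, linked)

        # Writing in place through one name is visible through the other.
        linked.write_text("edited")

        same_inode = os.stat(original).st_ino == os.stat(linked).st_ino
        return original.read_text(), same_inode


content, same_inode = hardlink_demo()
print(content, same_inode)
```

Reading a.txt returns the text written through b.txt, because both names refer to the same inode.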
Command line
dupes PATH
will display the number of duplicate files found under PATH
dupes PATH1 PATH2 ...
will display the number of duplicate files found under and across PATH1 and PATH2
dupes --list PATHS...
will list the full sets of duplicate files found
dupes --short PATHS...
will only list sets of duplicates where there are different file names
and finally ...
dupes --link PATHS...
will replace duplicate files with hard links
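For readers curious about what such a scan involves, the sketch below shows one common approach: group files by size, confirm matches with a SHA-256 digest, then swap each duplicate for a hard link. This illustrates the general technique under stated assumptions and is not the package's actual implementation. The names file_digest, find_duplicates and link_group are hypothetical, and the scanned roots are assumed not to overlap.

```python
import hashlib
import os
import tempfile
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never fully in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(*roots: Path) -> list[list[Path]]:
    """Return groups of files with identical content (hypothetical helper)."""
    by_size = defaultdict(list)
    for root in roots:
        for path in Path(root).rglob("*"):
            if path.is_file() and not path.is_symlink():
                by_size[path.stat().st_size].append(path)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups


def link_group(group: list[Path]) -> None:
    """Replace every file after the first with a hard link to the first."""
    keep, *others = group
    for dup in others:
        if os.path.samefile(keep, dup):
            continue  # already the same inode
        tmp = dup.with_name(dup.name + ".linktmp")
        os.link(keep, tmp)    # raises OSError across filesystems
        os.replace(tmp, dup)  # swap the link into place in one step


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.txt").write_bytes(b"same")
    (root / "sub").mkdir()
    (root / "sub" / "b.txt").write_bytes(b"same")
    (root / "c.txt").write_bytes(b"diff")

    groups = find_duplicates(root)
    group_names = [[p.name for p in g] for g in groups]
    link_group(groups[0])
    linked = os.path.samefile(root / "a.txt", root / "sub" / "b.txt")

print(group_names, linked)
```

Creating the link under a temporary name and then renaming it over the duplicate means the duplicate's path never disappears if the link step fails.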
Python
You can also use the class DuplicateFiles to identify and optionally link duplicates.
Additionally, BufferedIOFile provides a binary file which knows its Path and offers a readchunk() method similar to the text file readline().
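The constructor arguments and exact signatures of these classes are not documented here, so the following does not call the package. It is a hedged, standard-library sketch of the idea described above: a binary file object that remembers its Path and returns fixed-size chunks, much as readline() returns lines. The class name PathAwareBinaryFile and the chunk_size parameter are assumptions made for illustration.

```python
import tempfile
from pathlib import Path
from typing import Iterator


class PathAwareBinaryFile:
    """Hypothetical stand-in: a binary file that remembers its Path."""

    def __init__(self, path: Path, chunk_size: int = 4096) -> None:
        self.path = Path(path)
        self.chunk_size = chunk_size
        self._file = self.path.open("rb")

    def readchunk(self) -> bytes:
        """Return the next chunk, or b"" at end of file."""
        return self._file.read(self.chunk_size)

    def __iter__(self) -> Iterator[bytes]:
        # Call readchunk() repeatedly until it returns b"".
        return iter(self.readchunk, b"")

    def __enter__(self) -> "PathAwareBinaryFile":
        return self

    def __exit__(self, *exc_info) -> None:
        self._file.close()


# Read a 10-byte file in 4-byte chunks.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"0123456789")
    with PathAwareBinaryFile(sample, chunk_size=4) as f:
        chunks = list(f)
        name = f.path.name

print(name, chunks)
```

Reading in chunks lets two large files be compared or hashed piece by piece without loading either one fully into memory.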
Up Next
- Keep original file mode after hardlinking
- Select leading inode for linking
- Improved exception handling from the command line
Please vote on any issues which are important to you.
File details
Details for the file link-duplicates-1.0.0.tar.gz.
File metadata
- Download URL: link-duplicates-1.0.0.tar.gz
- Upload date:
- Size: 7.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/4.0.2 CPython/3.11.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 20942f16e40975f4df54b3f5ab3aac4d0a4cb69942f8aba135f32a9be76b2cd5
MD5 | de42a0ee12f6391d937ccbc2d60722aa
BLAKE2b-256 | 636254784462c2d4535bc8f3e662378724fb5a8d86dd6f0a6640dbe99af51eea
File details
Details for the file link_duplicates-1.0.0-py3-none-any.whl.
File metadata
- Download URL: link_duplicates-1.0.0-py3-none-any.whl
- Upload date:
- Size: 7.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/4.0.2 CPython/3.11.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 72d037abc38c5156cb5c4e130344ba50d72b5dc37882e20c88b6bca8548fbd47
MD5 | 44b4da32fba702e049935bfe8c1cd518
BLAKE2b-256 | 3f1c106defaa05d38633de9b81cd398fbad8f7f9d7a2e3d23fdb9087850d5db3