A tool for detecting duplicate files and comparing directories.

pathdiff

Pathdiff is a tool for:

  1. Detecting duplicate files in directories (based on https://stackoverflow.com/a/36113168/300783)
  2. Comparing directories for differences in structure and content

There are a number of alternative tools for both use cases, but pathdiff is a simple implementation that is easy to modify (unlike complex GUI tools) and provides small conveniences such as progress bars (unlike more basic command-line tools such as diff).


Installation

pip install pathdiff

Usage

❯ pathdiff find-duplicates tests/test1
Fetching files from tests/test1
|████████████████████████████████████████| 9 in 0.1s (88.59/s)
Fetching file sizes
|████████████████████████████████████████| 9/9 [100%] in 0.1s (88.10/s)
Computing small hashes
|████████████████████████████████████████| 9/9 [100%] in 0.1s (88.25/s)
Computing full hashes
|████████████████████████████████████████| 9/9 [100%] in 0.1s (88.01/s)

Duplicates found:
tests/test1/f2/file7.txt
tests/test1/f1/file4.txt


Duplicates found:
tests/test1/f1/file2.txt
tests/test1/f1/file3.txt
tests/test1/f1/file1.txt


Duplicates found:
tests/test1/f2/file8.txt
tests/test1/f2/file9.txt


Duplicates found:
tests/test1/f1/file5.txt
tests/test1/f1/file6.txt
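
The progress output above lists four stages: fetching files, fetching file sizes, computing small hashes, and computing full hashes. A minimal sketch of such a staged duplicate search is shown below. It is not pathdiff's actual code: the names `find_duplicates`, `_hash_file` and `_regroup`, the choice of SHA-1, and the 1 KiB "small hash" chunk size are all assumptions made for illustration.

```python
import hashlib
import os
from collections import defaultdict


def _hash_file(path, first_chunk_only=False, chunk_size=1024):
    """Return a SHA-1 digest of the first chunk or of the whole file.

    SHA-1 and the 1 KiB chunk size are illustrative assumptions.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        if first_chunk_only:
            digest.update(handle.read(chunk_size))
        else:
            # Read in larger blocks so big files are never loaded at once.
            for chunk in iter(lambda: handle.read(chunk_size * 64), b""):
                digest.update(chunk)
    return digest.digest()


def _regroup(groups, key_func):
    """Split each candidate group by key_func, keeping groups of 2+ files."""
    result = []
    for group in groups:
        buckets = defaultdict(list)
        for path in group:
            buckets[key_func(path)].append(path)
        result.extend(b for b in buckets.values() if len(b) > 1)
    return result


def find_duplicates(root):
    """Return a list of groups (sorted lists of paths) with identical content."""
    all_files = [
        os.path.join(dirpath, name)
        for dirpath, _dirs, names in os.walk(root)
        for name in names
    ]
    # Stage 1: only files of equal size can be duplicates.
    candidates = _regroup([all_files], os.path.getsize)
    # Stage 2: a cheap hash of the first chunk rules out most false candidates.
    candidates = _regroup(
        candidates, lambda p: _hash_file(p, first_chunk_only=True)
    )
    # Stage 3: a full-content hash confirms the remaining candidates.
    candidates = _regroup(candidates, _hash_file)
    return [sorted(group) for group in candidates]
```

Each stage only examines files that survived the previous one, so the expensive full hash is computed for as few files as possible.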

❯ pathdiff compare-directories tests/test2/path1 tests/test2/path2
Comparing directories tests/test2/path1 and tests/test2/path2
Comparing directory structures
|████████████████████████████████████████| 28 in 0.1s (265.55/s)
Comparing common files
|████████████████████████████████████████| 9/9 [100%] in 0.1s (85.30/s)

Paths found in tests/test2/path1 but not found in tests/test2/path2:
file3-1.txt
f6/f1-1
f5/f1/file7-1.txt
f1/file3-1.txt
f1/f1/file3-1.txt

Paths found in tests/test2/path2 but not found in tests/test2/path1:
file3-2.txt
f6/f1-2
f5/f1/file7-2.txt
f1/file3-2.txt
f1/f1/file3-2.txt

Files found in tests/test2/path1 and tests/test2/path2 but contents do not match:
file2.txt
f4/f1/file6.txt
f1/file2.txt
f1/f1/file2.txt
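
The report above has three parts: paths present on only one side, and common files whose contents differ. A rough sketch of how such a comparison could be built from the standard library follows. The function names `compare_directories` and `_relative_paths` are hypothetical, and pathdiff itself may work differently (for example, this sketch also lists every path inside a directory that is missing on the other side).

```python
import filecmp
import os


def _relative_paths(root):
    """Collect relative paths of all files and subdirectories under root."""
    paths = set()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            paths.add(os.path.relpath(os.path.join(dirpath, name), root))
    return paths


def compare_directories(path1, path2):
    """Return (only_in_path1, only_in_path2, mismatched) as sorted lists."""
    paths1 = _relative_paths(path1)
    paths2 = _relative_paths(path2)
    mismatched = []
    for rel in sorted(paths1 & paths2):
        left = os.path.join(path1, rel)
        right = os.path.join(path2, rel)
        if os.path.isfile(left) and os.path.isfile(right):
            # shallow=False forces a byte-by-byte comparison of the contents.
            if not filecmp.cmp(left, right, shallow=False):
                mismatched.append(rel)
        elif os.path.isfile(left) != os.path.isfile(right):
            # Same name, but a file on one side and a directory on the other.
            mismatched.append(rel)
    return sorted(paths1 - paths2), sorted(paths2 - paths1), mismatched
```

Comparing sets of relative paths handles the structural differences; only paths that exist on both sides need their contents read.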

❯ pathdiff compare-contents tests/test3/path1 tests/test3/path2
Comparing contents in directories tests/test3/path1 and tests/test3/path2
Fetching files from tests/test3/path1
|████████████████████████████████████████| 12 in 0.1s (118.00/s)
Fetching files from tests/test3/path2
|████████████████████████████████████████| 8 in 0.1s (77.02/s)
Fetching file sizes
|████████████████████████████████████████| 20/20 [100%] in 0.1s (196.16/s)
Computing small hashes
|████████████████████████████████████████| 20/20 [100%] in 0.1s (196.73/s)
Computing full hashes
|████████████████████████████████████████| 16/16 [100%] in 0.1s (155.59/s)

Files found in tests/test3/path1 but not found in tests/test3/path2 (by content, names may match):
f1/f1/file5.txt
f1/f1/file4-1.txt

Files found in tests/test3/path2 but not found in tests/test3/path1 (by content, names may match):
file5.txt
file4-2.txt

Files which do not match one-to-one or have different names:
Group of duplicate files:
Duplicates from tests/test3/path1:
file7.txt
f1/f1/file7.txt
Duplicates from tests/test3/path2:

Group of duplicate files:
Duplicates from tests/test3/path1:
file9-1.txt
f1/f1/file9.txt
Duplicates from tests/test3/path2:
file9.txt
Group of duplicate files:
Duplicates from tests/test3/path1:
file8.txt
f1/f1/file8.txt
Duplicates from tests/test3/path2:
file8.txt
Group of duplicate files:
Duplicates from tests/test3/path1:
f1/f1/file6-1.txt
Duplicates from tests/test3/path2:
file6-2.txt
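
Unlike compare-directories, compare-contents matches files by content rather than by location. The sketch below illustrates one way to do that by indexing both trees by content hash. It is an assumption-laden illustration, not pathdiff's implementation: `compare_contents` and `_content_index` are hypothetical names, SHA-256 is an arbitrary choice, the staged size and small-hash filtering shown in the output is omitted for brevity, and "same name" is assumed to mean the same base file name.

```python
import hashlib
import os
from collections import defaultdict


def _content_index(root):
    """Map a SHA-256 digest of file contents to sorted relative paths."""
    index = defaultdict(list)
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            # Reading the whole file is fine for a sketch; large files
            # would be better hashed in chunks.
            with open(full, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            index[digest].append(os.path.relpath(full, root))
    return {digest: sorted(paths) for digest, paths in index.items()}


def compare_contents(path1, path2):
    """Return (only_in_path1, only_in_path2, groups).

    groups holds (paths_in_path1, paths_in_path2) pairs for content found
    on both sides that is not a one-to-one match with the same file name.
    """
    index1 = _content_index(path1)
    index2 = _content_index(path2)
    only1 = sorted(
        p for d, paths in index1.items() if d not in index2 for p in paths
    )
    only2 = sorted(
        p for d, paths in index2.items() if d not in index1 for p in paths
    )
    groups = []
    for digest in index1.keys() & index2.keys():
        left, right = index1[digest], index2[digest]
        one_to_one = len(left) == 1 and len(right) == 1
        # Assumption: names "match" when the base file names are equal.
        same_name = one_to_one and (
            os.path.basename(left[0]) == os.path.basename(right[0])
        )
        if not same_name:
            groups.append((left, right))
    return only1, only2, sorted(groups)
```

Grouping by digest means a file that was moved or renamed is still recognised as present on both sides, and is reported only when the match is not one-to-one or the name changed.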

License

pathdiff is distributed under the terms of the MIT license.
