Generate, Compare and Analyse Directory Merkle Trees

dmerk

dmerk (pronounced dee-merk) is a program that creates a merkle tree for your directories.

This is useful, for example, to detect which files were modified, or to find duplicate files across two backups. Think of hash digest / checksum verification, but instead of comparing a single pair of digests, dmerk compares two trees of digests.
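To make the idea concrete, here is a minimal sketch of a directory merkle digest. It is not dmerk's actual implementation: the function names and the rule for combining child digests (MD5 over the sorted child digests, ignoring names) are assumptions chosen to mirror the behaviour described in this README.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """MD5 of a file's contents, read in chunks to bound memory use."""
    md5 = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            md5.update(chunk)
    return md5.hexdigest()


def tree_digest(path: Path) -> str:
    """Digest of a file, or of a directory derived from its children's digests."""
    if path.is_file():
        return file_digest(path)
    # Assumption: a directory digest is the MD5 of its sorted child digests.
    # Child names are deliberately left out, so only contents matter.
    child_digests = sorted(tree_digest(child) for child in path.iterdir())
    return hashlib.md5("".join(child_digests).encode("ascii")).hexdigest()


def demo() -> tuple[str, str, str]:
    """Build two identical trees, then modify one nested file in the copy."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("original", "backup"):
            (root / name / "sub").mkdir(parents=True)
            (root / name / "a.txt").write_text("hello")
            (root / name / "sub" / "b.txt").write_text("world")
        before = tree_digest(root / "original")
        same = tree_digest(root / "backup")
        (root / "backup" / "sub" / "b.txt").write_text("changed")
        after = tree_digest(root / "backup")
        return before, same, after
```

In this sketch, two trees with identical contents get the same root digest, and changing one nested file changes the digest of every directory above it. That is what lets a comparison skip whole subtrees whose digests match.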

Installation

pip install dmerk

Shell Autocomplete

dmerk uses argcomplete to support shell autocompletion. To enable it, run the command for your shell:

# zsh/bash shell(s)
activate-global-python-argcomplete

# fish shell
register-python-argcomplete --shell fish dmerk | source

Usage / Quickstart

TUI (Terminal User Interface)

Launch the TUI for a more interactive experience:

dmerk tui

The TUI is built with Textual and provides a powerful interface for all dmerk functionality. It's especially useful for the compare operation, allowing you to quickly navigate and compare different submerkles at various hierarchical levels of two merkle trees, which is more cumbersome with the CLI alone.

TUI Features

  • Keyboard navigation: Use arrow keys to navigate between widgets; Ctrl+Arrow to force navigation
  • Fuzzy filtering: Type to filter files and directories with fuzzy matching
  • Favorites sidebar: Quick access to frequently used directories (f to add, r to remove, d to reset)
  • Color-coded comparisons: Matching items highlighted with digest-based colors
  • Synchronized scrolling: Compare widgets scroll together for matching items

Generate

Generate a merkle tree for a directory:

dmerk generate /path/to/directory

Options:

  • -p, --print: Print the merkle output to stdout
  • -f FILENAME, --filename FILENAME: Provide a custom filename or file path for saving
  • --fail-on-error: Immediately fail upon encountering errors (such as broken symlinks)
  • --no-compress: Save as uncompressed JSON (.dmerk) instead of gzip (.dmerk.gz)
  • --no-save: Do not save the generated merkle tree to a file (not recommended, since generating merkle trees is computationally expensive)

By default, output is saved as a gzip-compressed file with a .dmerk.gz extension. This reduces file size by approximately 85% compared to uncompressed JSON.
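Since the saved files are described as JSON, optionally gzip-compressed, they can likely be inspected with the Python standard library alone. The sketch below is hedged: `save_json` and `load_json` are hypothetical helpers, and the internal JSON schema of a .dmerk file is not documented here, so the code treats the payload as an opaque dictionary.

```python
import gzip
import json
import tempfile
from pathlib import Path


def save_json(data: dict, path: Path, compress: bool = True) -> None:
    """Write data as JSON, gzip-compressed unless compress is False."""
    text = json.dumps(data)
    if compress:
        with gzip.open(path, "wt", encoding="utf-8") as fh:
            fh.write(text)
    else:
        path.write_text(text, encoding="utf-8")


def load_json(path: Path) -> dict:
    """Read JSON from a plain file or from a gzip file, chosen by suffix."""
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as fh:
        return json.load(fh)


def demo_sizes() -> tuple[int, int]:
    """Return (compressed, uncompressed) sizes for a repetitive payload."""
    # Hypothetical payload, not dmerk's schema.
    payload = {f"file_{i}": "d41d8cd98f00b204e9800998ecf8427e" for i in range(500)}
    with tempfile.TemporaryDirectory() as tmp:
        gz_path = Path(tmp) / "example.dmerk.gz"
        raw_path = Path(tmp) / "example.dmerk"
        save_json(payload, gz_path)
        save_json(payload, raw_path, compress=False)
        assert load_json(gz_path) == load_json(raw_path) == payload
        return gz_path.stat().st_size, raw_path.stat().st_size
```

The actual compression ratio depends on the data. The figure of roughly 85% quoted above is the project's own measurement and is not something this sketch reproduces.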

Compare

Compare two directory merkle trees and return the diffs and matches:

dmerk compare -p1 PATH1 -p2 PATH2 [-sp1 SUBPATH1] [-sp2 SUBPATH2]

The paths PATH1 and PATH2 are required and can be either:

  • Paths to directories to compare
  • Paths to .dmerk or .dmerk.gz files created using the generate command

Options:

  • --no-save: If specified, the generated merkle trees will not be saved to file (only applies when comparing directories)

Examples:

dmerk compare -p1=/home/raghuram/Documents -p2=/media/raghuram/BACKUP_DRIVE/Documents
dmerk compare -p1=Documents_e6eaccb4.dmerk.gz -p2=Documents_b2a7cef7.dmerk.gz

When using .dmerk or .dmerk.gz files, you can optionally provide subpaths to compare specific subdirectories:

dmerk compare \
-p1=Documents_e6eaccb4.dmerk.gz \
-p2=Documents_b2a7cef7.dmerk.gz \
-sp1=Receipts/Rent \
-sp2=Receipts/Rent

Subpaths are particularly useful because the compare operation performs a "shallow comparison" that only shows diffs/matches among the immediate children of the two trees being compared.
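A shallow comparison can be pictured as grouping the immediate children of each side by digest. The sketch below assumes a hypothetical `{name: digest}` mapping per side and is only an illustration of the idea, not dmerk's internal data model or output format.

```python
from collections import defaultdict


def shallow_compare(children1: dict[str, str], children2: dict[str, str]):
    """Compare immediate children by digest.

    Returns (matches, only1, only2), where matches maps each shared digest
    to the names carrying it on each side, and only1/only2 list the names
    whose digest appears on one side only.
    """
    by_digest1: dict[str, list[str]] = defaultdict(list)
    by_digest2: dict[str, list[str]] = defaultdict(list)
    for name, digest in children1.items():
        by_digest1[digest].append(name)
    for name, digest in children2.items():
        by_digest2[digest].append(name)

    shared = by_digest1.keys() & by_digest2.keys()
    matches = {d: (sorted(by_digest1[d]), sorted(by_digest2[d])) for d in shared}
    only1 = sorted(n for d, ns in by_digest1.items() if d not in shared for n in ns)
    only2 = sorted(n for d, ns in by_digest2.items() if d not in shared for n in ns)
    return matches, only1, only2


# Hypothetical digests, shortened for readability.
left = {"Rent": "aaa", "Utilities": "bbb", "notes.txt": "ccc"}
right = {"Rent": "aaa", "Utilities": "ddd", "notes-copy.txt": "ccc"}
matches, only_left, only_right = shallow_compare(left, right)
```

Here `Rent` matches on both sides, `notes.txt` matches `notes-copy.txt` despite the different name, and `Utilities` differs. To find out what changed inside `Utilities`, you would run the comparison again one level down, which is the role of the -sp1 and -sp2 subpaths.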

Features and Limitations

Current Support

  • Primary testing on Linux; Windows and macOS support coming soon™
  • Handles regular files, directories, and symlinks to regular files/directories
  • Processes hidden files and directories
  • Uses MD5 as the digest algorithm for speed (configurable options planned)
  • Gzip compression by default (~85% file size reduction)

Requirements

  • Read permission for files
  • Read and execute permissions for directories
  • File and directory names must be valid UTF-8 byte sequences
    • For support of non-UTF-8 filenames, please upvote this issue
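If you are unsure whether a directory meets the filename requirement, a pre-flight check along the following lines may help. It is a sketch for POSIX systems, where filenames are raw bytes. `find_non_utf8_names` is a hypothetical helper, not part of dmerk. Note that `os.walk` silently skips directories it cannot read, so this does not check permissions.

```python
import os
import tempfile


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw filename bytes decode as UTF-8."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_non_utf8_names(root: str) -> list[bytes]:
    """Walk root using bytes paths and collect entries with non-UTF-8 names."""
    bad = []
    # Passing bytes to os.walk makes it yield raw bytes names.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            if not is_valid_utf8(name):
                bad.append(os.path.join(dirpath, name))
    return bad


def demo_clean_tree() -> list[bytes]:
    """Run the check on a small temporary tree with only UTF-8 names."""
    with tempfile.TemporaryDirectory() as tmp:
        os.mkdir(os.path.join(tmp, "docs"))
        with open(os.path.join(tmp, "docs", "café.txt"), "w") as fh:
            fh.write("ok")
        return find_non_utf8_names(tmp)
```

An empty result means every name under the root decodes as UTF-8. Any paths returned are candidates for renaming before running dmerk generate.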

Limitations

  • Does not support special files (character/block devices, sockets, pipes)
  • Symlinks to special files will cause exceptions
  • Directory digests are currently based only on file contents, not filenames or metadata (permissions, owner, timestamps, etc.)
    • If you need directory digests that include metadata, please open a new issue explaining your use case

Development

Contributing

If you want to report bugs, request features, or contribute improvements, please file an issue on GitHub. I appreciate your interest in this project and will respond as soon as possible 😁.

Setup

# Clone and set up the repository
git clone https://github.com/krishraghuram/dmerk.git
cd dmerk
python3 -m venv venv
source venv/bin/activate

# Install development dependencies
pip install -e ".[dev]"

# Verify installation
dmerk --help

Textual Development Tools

The TUI is built with Textual. You can use Textual's development tools:

# Run in dev mode with access to logs and console
textual run --dev dmerk.tui

For more information, see the Textual DevTools documentation.

Code Quality

We maintain code quality through automated checks that run as Git hooks:

  • Pre-commit: isort, lint, format, and type checking
  • Pre-push: unit tests

You can also run these checks manually:

# Run individual checks
nox --session isort
nox --session lint
nox --session format
nox --session mypy
nox --session test

Build and Publish

python -m build
python -m pip install --force-reinstall dist/dmerk-0.3.1-py3-none-any.whl
python -m twine upload dist/*
