diskdump

Deduplicated disk image transfer over SSH. Only blocks that the block store does not already hold are transferred, so data is deduplicated both across machines and over time. Blocks are stored compressed in a content-addressable block store.

How it works

  1. The client script is deployed to the remote machine via SCP
  2. The client reads the disk/file in configurable batches (default 10 MB: 80 blocks of 128 KB)
  3. For each batch, the client hashes the blocks (SHA-256) and asks the server which ones it needs
  4. Only new blocks are sent (zlib-compressed on the wire, lz4-compressed for storage)
  5. Each dump is stored as a lightweight manifest file pointing into a shared global block store

Streaming is single-pass: the source is never read a second time, and blocks stay in memory only for the duration of one batch.
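The batch loop described above can be sketched roughly as follows. This is a minimal illustration and not diskdump's actual client code: the names `iter_batches` and `dump_stream` are made up for the example, and the `server_has` set is a hypothetical stand-in for the round trip in which the real tool asks the server which blocks it needs.

```python
import hashlib
import io
import zlib

BLOCK_SIZE = 131072  # 128 KB, the documented default
BATCH_SIZE = 80      # blocks per batch, the documented default


def iter_batches(stream, block_size=BLOCK_SIZE, batch_size=BATCH_SIZE):
    """Yield lists of blocks, reading the source exactly once."""
    batch = []
    while True:
        block = stream.read(block_size)
        if not block:
            break
        batch.append(block)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly short, batch


def dump_stream(stream, server_has, **kwargs):
    """Return (manifest, sent) for one source stream.

    manifest: one SHA-256 hex digest per block, in source order.
    sent: digest -> zlib-compressed payload for blocks the server lacked.
    server_has: hypothetical stand-in for the server's block index.
    """
    manifest = []
    sent = {}
    for batch in iter_batches(stream, **kwargs):
        for block in batch:
            digest = hashlib.sha256(block).hexdigest()
            manifest.append(digest)
            # Skip blocks the server already holds or that were
            # already sent earlier in this same run.
            if digest not in server_has and digest not in sent:
                sent[digest] = zlib.compress(block)
    return manifest, sent


# Tiny demo with 8-byte blocks: the first and third blocks are identical,
# so three manifest entries reference only two transferred blocks.
demo_data = b"A" * 8 + b"B" * 8 + b"A" * 8
demo_manifest, demo_sent = dump_stream(
    io.BytesIO(demo_data), set(), block_size=8, batch_size=2
)
```

The sketch assumes a buffered stream whose `read()` returns full blocks until end of input. Only one batch of raw blocks is held at a time; the `sent` dictionary exists here only so the example can be inspected without a real server.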

Usage

# Dump from remote machines (parallel, per-target --sudo)
diskdump dump server01:/dev/sda --sudo --as server01.sda server02:/dev/sda --as server02.sda

# Dump files (not block devices)
diskdump dump localhost:/path/to/file.img --as my-disk.img

# Restore
diskdump restore 2026/04/24/server01-sda.manifest -o restored.img
diskdump restore 2026/04/24/server01-sda.manifest | dd of=/dev/sda

# Import existing raw image into block store
diskdump import existing.img --as server01:/dev/sda

# Info & management
diskdump info 2026/04/24/server01-sda.manifest
diskdump status
diskdump verify
diskdump gc              # remove unreferenced blocks
diskdump gc --dry-run
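Conceptually, removing unreferenced blocks is a mark-and-sweep pass over the store: collect every hash named by any manifest, then delete block files that no manifest mentions. The sketch below illustrates that idea under stated assumptions and is not the tool's implementation. It assumes manifest comment lines start with `#` (the README only says "header comments") and that block files are named `<hash>.lz4` under `.blockstore/`; the function names are hypothetical.

```python
from pathlib import Path


def referenced_hashes(root):
    """Mark phase: collect every hash listed in any *.manifest under root."""
    refs = set()
    for manifest in Path(root).rglob("*.manifest"):
        for line in manifest.read_text().splitlines():
            line = line.strip()
            # Assumption: header comments are lines starting with '#'.
            if line and not line.startswith("#"):
                refs.add(line)
    return refs


def gc(root, dry_run=True):
    """Sweep phase: return unreferenced block files, deleting them unless dry_run."""
    refs = referenced_hashes(root)
    store = Path(root) / ".blockstore"
    victims = sorted(p for p in store.rglob("*.lz4") if p.stem not in refs)
    if not dry_run:
        for path in victims:
            path.unlink()
    return victims
```

With `dry_run=True` the function only reports what would be removed, which mirrors the intent of `--dry-run`.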

Options

Flag            Description
--sudo          Use sudo on the preceding target (repeatable, per target)
--user, -u      SSH user
--block-size    Block size in bytes (default 131072 = 128 KB)
--batch-size    Blocks per batch (default 80, which is 10 MB at the default block size)
--as            Custom manifest name (follows the target it applies to)
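The 10 MB batch figure follows directly from the two defaults: 131072 bytes per block times 80 blocks per batch is 10485760 bytes. The helper below is a hypothetical illustration (not part of diskdump) for estimating how many blocks and batches a source of a given size produces when you change either option.

```python
def batch_plan(image_bytes, block_size=131072, batch_size=80):
    """Estimate block and batch counts for a source of image_bytes bytes."""
    blocks = -(-image_bytes // block_size)   # ceiling division
    batches = -(-blocks // batch_size)       # last batch may be short
    return {
        "batch_bytes": block_size * batch_size,
        "blocks": blocks,
        "batches": batches,
    }


# Example: a 1 GiB image with the default options.
plan = batch_plan(2**30)
```

Larger batches mean fewer round trips to the server but more memory held on the client at once, since a whole batch of blocks is kept in memory.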

Storage layout

disks/
  .blockstore/
    config.json
    ab/
      cd/
        abcdef...full_sha256_hash.lz4
  2026/
    04/
      24/
        server01-sda.manifest

  • Blocks are content-addressed by SHA-256 and stored lz4-compressed
  • Two-level directory fanout: the first two bytes of the hash (one byte, i.e. two hex characters, per level) give 256 × 256 = 65536 buckets
  • Manifests are plain text: header comments followed by one hash per line
  • Global dedup: blocks are shared across all dumps
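To make the layout concrete, here is a small sketch of how a hash maps to a path in the store and how a manifest could be replayed into an image. It is an illustration under assumptions, not the project's code: `block_path` and `restore` are hypothetical names, comment lines are assumed to start with `#`, and the decompressor is passed in as a parameter because the README does not say which lz4 format is used on disk.

```python
from pathlib import Path


def block_path(root, digest):
    """Map a hex digest to <root>/.blockstore/ab/cd/<digest>.lz4."""
    return Path(root) / ".blockstore" / digest[:2] / digest[2:4] / f"{digest}.lz4"


def restore(root, manifest_text, out, decompress):
    """Write the blocks listed in manifest_text to the binary file object out.

    decompress: callable taking the stored bytes and returning the raw block
    (an assumption; supply whichever lz4 routine matches the store).
    """
    written = 0
    for line in manifest_text.splitlines():
        digest = line.strip()
        # Assumption: header comments are lines starting with '#'.
        if not digest or digest.startswith("#"):
            continue
        raw = decompress(block_path(root, digest).read_bytes())
        out.write(raw)
        written += len(raw)
    return written
```

Because a manifest is only an ordered list of hashes, a block that appears many times in an image, or in many images, is stored once and simply read again during restore.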

Dependencies

  • Server (local): Python 3, lz4 (pip install lz4)
  • Client (remote): Python 3 only (stdlib — no external deps)

Setup

python3 -m venv .venv
.venv/bin/pip install -e .
