Deduplicated disk image transfer over SSH
diskdump
Deduplicated disk image transfer over SSH. Transfers only unique blocks across machines and over time, and stores them compressed in a content-addressable block store.
How it works
- Client script is deployed to remote machine via SCP
- Client reads the disk/file in configurable batches (default 10MB)
- For each batch: hashes blocks (SHA256), asks server which are needed
- Only new blocks are sent (zlib-compressed on wire, lz4-compressed for storage)
- Each dump is stored as a lightweight manifest file pointing into a shared global block store
Single-pass streaming — no second read of the source. Blocks stay in memory only for one batch.
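The per-batch exchange described above can be sketched as follows. This is a minimal illustration, not diskdump's actual code: the function names (`split_blocks`, `plan_batch`) are hypothetical, and the in-memory `server_has` set stands in for the server's reply about which blocks it already holds.

```python
import hashlib
import zlib

BLOCK_SIZE = 131072  # 128 KiB, the documented default block size


def split_blocks(batch: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut one batch into fixed-size blocks (the last one may be shorter)."""
    return [batch[i:i + block_size] for i in range(0, len(batch), block_size)]


def plan_batch(batch: bytes, server_has: set[str], block_size: int = BLOCK_SIZE):
    """Return (manifest_hashes, payloads) for one batch.

    manifest_hashes: SHA256 hex digest of every block, in disk order.
    payloads: zlib-compressed data for blocks the server lacks, keyed by hash.
    """
    manifest_hashes = []
    payloads = {}
    for block in split_blocks(batch, block_size):
        digest = hashlib.sha256(block).hexdigest()
        manifest_hashes.append(digest)
        # Send each missing block once, even if it repeats within the batch.
        if digest not in server_has and digest not in payloads:
            payloads[digest] = zlib.compress(block)
    return manifest_hashes, payloads


# Hypothetical batch: two identical zero blocks followed by one distinct block.
batch = bytes(BLOCK_SIZE) * 2 + b"\x01" * BLOCK_SIZE
hashes, to_send = plan_batch(batch, server_has=set())
print(len(hashes), len(to_send))  # 3 2
```

All three hashes go into the manifest, but only two payloads cross the wire. On a later dump where the server already holds both blocks, `to_send` would be empty.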
Usage
# Dump from remote machines (parallel, per-target --sudo)
diskdump dump server01:/dev/sda --sudo --as server01.sda server02:/dev/sda --as server02.sda
# Dump files (not block devices)
diskdump dump localhost:/path/to/file.img --as my-disk.img
# Restore
diskdump restore 2026/04/24/server01-sda.manifest -o restored.img
diskdump restore 2026/04/24/server01-sda.manifest | dd of=/dev/sda
# Import existing raw image into block store
diskdump import existing.img --as server01:/dev/sda
# Info & management
diskdump info 2026/04/24/server01-sda.manifest
diskdump status
diskdump verify
diskdump gc # remove unreferenced blocks
diskdump gc --dry-run
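Conceptually, `gc` amounts to a mark-and-sweep over the block store: every hash named by some manifest is live, and any stored block outside that set is a removal candidate. The sketch below works on in-memory data only. `find_unreferenced` and its inputs are hypothetical and are not taken from diskdump's implementation.

```python
def find_unreferenced(manifests: dict[str, list[str]], stored: set[str]) -> set[str]:
    """Return stored block hashes that no manifest refers to."""
    live = set()
    for hashes in manifests.values():
        live.update(hashes)  # mark: every block some dump still needs
    return stored - live     # sweep candidates: stored but never referenced


# Hypothetical store with three blocks, one of them orphaned.
# The short strings stand in for full SHA256 digests.
manifests = {
    "2026/04/24/server01-sda.manifest": ["aa11", "bb22"],
    "2026/04/24/server02-sda.manifest": ["bb22"],
}
stored = {"aa11", "bb22", "cc33"}
orphans = find_unreferenced(manifests, stored)
print(sorted(orphans))  # ['cc33']
```

Block `bb22` is shared by both dumps and stays, which is why deleting a single manifest does not necessarily free any space.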
Options
| Flag | Description |
|---|---|
| `--sudo` | Use sudo on preceding target (repeatable per-target) |
| `--user`, `-u` | SSH user |
| `--block-size` | Block size in bytes (default 131072 = 128KB) |
| `--batch-size` | Blocks per batch (default 80 = 10MB) |
| `--as` | Custom manifest name (follows target it applies to) |
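The two size defaults are linked: the batch size is counted in blocks, so the number of bytes read per batch is their product. A quick check of the documented defaults (the variable names are illustrative only):

```python
block_size = 131072   # bytes per block (--block-size default)
batch_size = 80       # blocks per batch (--batch-size default)

batch_bytes = block_size * batch_size
print(batch_bytes)               # 10485760
print(batch_bytes / 1024 ** 2)   # 10.0 (MiB)
```

Since blocks are held in memory for one batch at a time, raising either value should increase per-batch memory use proportionally.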
Storage layout
disks/
    .blockstore/
        config.json
        ab/
            cd/
                abcdef...full_sha256_hash.lz4
    2026/
        04/
            24/
                server01-sda.manifest
- Blocks are content-addressed by SHA256, stored lz4-compressed
- Two-level directory fanout (first 2 bytes of hash = 65536 buckets)
- Manifests are plain text: header comments + one hash per line
- Global dedup: blocks shared across all dumps
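The layout rules above can be illustrated with two small helpers. This is a sketch under assumptions, not diskdump's code: `block_path` and `parse_manifest` are hypothetical names, the `#` comment marker is a guess at what "header comments" look like, and the header fields in the sample manifest are invented.

```python
import hashlib
from pathlib import PurePosixPath


def block_path(digest: str, root: str = "disks/.blockstore") -> PurePosixPath:
    """Map a SHA256 hex digest to <root>/<byte 1>/<byte 2>/<digest>.lz4."""
    # Two hex characters encode one byte, so each directory level has 256 entries.
    return PurePosixPath(root) / digest[0:2] / digest[2:4] / f"{digest}.lz4"


def parse_manifest(text: str) -> list[str]:
    """Return block hashes in order, skipping blank lines and '#' comments."""
    hashes = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):  # assumed comment syntax
            hashes.append(line)
    return hashes


digest = hashlib.sha256(b"example block").hexdigest()
print(block_path(digest))

# Hypothetical manifest: two header comments, then the same block twice.
manifest_text = (
    "# source: server01:/dev/sda\n"
    "# block_size: 131072\n"
    f"{digest}\n"
    f"{digest}\n"
)
print(len(parse_manifest(manifest_text)))  # 2
```

A manifest may list the same hash many times (for example, runs of zeroed blocks), while the block store holds that block only once.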
Dependencies
- Server (local): Python 3, lz4 (pip install lz4)
- Client (remote): Python 3 only (stdlib, no external deps)
Setup
python3 -m venv .venv
.venv/bin/pip install -e .