Backup integrity audit: verify your photos and videos are actually backed up

DriveTidy

Verify your photos and videos are actually backed up — in 113 seconds, not 2 hours.

DriveTidy is a backup integrity auditor. It answers one question fast: "For every file on this source drive, is there a copy on my backups?"

Traditional Chinese README ↓


Why I built this

A photographer friend showed me his backup setup: four external drives, the same shoots copied three times, color-coded labels, a spreadsheet tracking which card had been ingested where. Beautiful system. He still couldn't sleep the night before formatting an SD card.

"How do I actually know every file made it across?" he asked. "Diff tools choke on the size. rsync doesn't tell me what's missing, only what to copy. And half the time my folder names don't even match between drives — same shoot, different naming convention each backup pass."

I tried to answer that for him with a one-off Python script. The script became a tool. The tool became this.

DriveTidy walks your source drive, compares filename + size + mtime (and optionally EXIF for JPEGs) against your backup drives, and tells you exactly what's missing — grouped by folder, sorted by size, so you fix the biggest gaps first. No cloud, no subscription, no telemetry. Your file paths never leave your machine.

Now I run it before I format any card. If you've ever hesitated before a "format card" prompt, maybe you'll find a use for it too.

What it does

  • Fast: ~113 s to verify a 5 TB HDD (M1 MacBook Air, USB-C, ~2 M source files, early-stop search; full walk on the same drive takes ~2 h). Your numbers will vary with drive type, file count, and how many files are actually missing.
  • Probabilistic match with evidence: filename + size + mtime, with optional EXIF disambiguation for JPEGs
  • HTML report: missing files grouped by folder, sorted by size, so you fix the biggest gaps first
  • One-click backup of what's missing (drivetidy backup-missing): rsync the gap, additive only — never deletes
  • Both CLI and GUI: pip install drivetidy for the terminal, or download the binary for the visual version
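
The matching and reporting described above can be sketched in a few lines of Python. This is a minimal illustration, not DriveTidy's actual implementation: FileMeta, match_key, find_missing, and group_by_folder are hypothetical names, and truncating mtime to whole seconds is an assumption made here to tolerate filesystems with coarse timestamps. Each file is keyed on filename + size + mtime while its folder is ignored, so a shoot that was renamed on the backup still matches. Whatever is missing is then grouped by folder, largest gaps first.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import NamedTuple


class FileMeta(NamedTuple):
    path: str     # path relative to the drive root
    size: int     # size in bytes
    mtime: float  # modification time, seconds since the epoch


def match_key(meta: FileMeta) -> tuple[str, int, int]:
    """Key on filename + size + mtime; the containing folder is ignored.

    mtime is truncated to whole seconds (an assumption of this sketch).
    """
    return (PurePosixPath(meta.path).name, meta.size, int(meta.mtime))


def find_missing(source: list[FileMeta], backups: list[FileMeta]) -> list[FileMeta]:
    """Return the source files that have no match on any backup."""
    covered = {match_key(m) for m in backups}
    return [m for m in source if match_key(m) not in covered]


def group_by_folder(missing: list[FileMeta]) -> dict[str, list[FileMeta]]:
    """Group missing files by folder; biggest folders and biggest files first."""
    groups: dict[str, list[FileMeta]] = defaultdict(list)
    for meta in missing:
        groups[str(PurePosixPath(meta.path).parent)].append(meta)
    for files in groups.values():
        files.sort(key=lambda m: m.size, reverse=True)
    by_total_size = sorted(
        groups.items(),
        key=lambda item: sum(m.size for m in item[1]),
        reverse=True,
    )
    return dict(by_total_size)


if __name__ == "__main__":
    source = [
        FileMeta("shoot-a/IMG_0001.JPG", 100, 1000.0),
        FileMeta("shoot-a/IMG_0002.JPG", 300, 1001.0),
        FileMeta("shoot-b/CLIP.MOV", 5000, 1002.0),
    ]
    # Same photo, different folder name on the backup drive.
    backups = [FileMeta("2024-05 wedding/IMG_0001.JPG", 100, 1000.4)]
    for folder, files in group_by_folder(find_missing(source, backups)).items():
        print(folder, [PurePosixPath(m.path).name for m in files])
```

A key built this way is evidence rather than proof: two different files can share a name, size, and timestamp, which is presumably why EXIF is offered as an extra tiebreaker for JPEGs.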

Installation

CLI (Python users)

pip install drivetidy

Requires Python 3.11+. You also need fd or rclone on PATH for fast directory walking:

brew install fd        # macOS
apt install fd-find    # Debian/Ubuntu (the binary is installed as fdfind)
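
To check which of these tools is visible on PATH before running a scan, a small Python check like the one below may help. It is a sketch only: find_walkers is a hypothetical helper, and including fdfind (the name Debian and Ubuntu give the fd binary) is an assumption, since this README does not say whether DriveTidy looks for that name.

```python
import shutil

# Executable names to look for. "fdfind" is the name used by the
# Debian/Ubuntu fd-find package; whether DriveTidy accepts it is an assumption.
CANDIDATES = ("fd", "fdfind", "rclone")


def find_walkers(candidates: tuple[str, ...] = CANDIDATES) -> dict[str, str]:
    """Map each candidate name that is on PATH to its resolved location."""
    found = {}
    for name in candidates:
        location = shutil.which(name)
        if location is not None:
            found[name] = location
    return found


if __name__ == "__main__":
    walkers = find_walkers()
    if walkers:
        for name, location in walkers.items():
            print(f"{name}: {location}")
    else:
        print("Neither fd nor rclone was found on PATH.")
```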

GUI (everyone else)

Download the latest release for your platform from the Releases page:

  • macOS: drivetidy-macos-arm64.zip — unzip, then double-click start.command. First launch: right-click → Open to bypass Gatekeeper (the app is unsigned).
  • Windows: drivetidy-windows-x64.zip — unzip, then double-click drivetidy.exe. SmartScreen may warn; click "More info" → "Run anyway".

Quick start

# 1. Scan each drive once (builds a metadata index — fast even on HDDs)
drivetidy scan /Volumes/SD-Card --label sd-source
drivetidy scan /Volumes/Backup-1 --label backup-1
drivetidy scan /Volumes/Backup-2 --label backup-2

# 2. Audit — is everything on sd-source covered by the backups?
#    --out also writes an HTML report grouping missing files by folder.
drivetidy audit sd-source --against backup-1,backup-2 --out missing.html

# 3. (Optional) Copy the missing files to a backup — additive, never deletes.
#    The audit run id (e.g. #5) is printed at the end of step 2.
drivetidy backup-missing 5 --dest-ident backup-1 --apply
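
For a sense of what an additive, never-deleting copy looks like at the rsync level, here is a sketch that builds such a command in Python. It is not DriveTidy's actual invocation: build_rsync_command and the paths are hypothetical, and the dry-run default is an assumption loosely modeled on the --apply flag in step 3. The rsync options themselves (-a, --ignore-existing, --files-from, --dry-run) are standard, and no --delete option is ever passed.

```python
import shlex


def build_rsync_command(
    source_root: str,
    dest_root: str,
    list_file: str,
    apply: bool = False,
) -> list[str]:
    """Build an rsync command that only adds files listed in list_file.

    list_file holds one path per line, relative to source_root.
    Without apply=True the command is a dry run and changes nothing.
    """
    command = [
        "rsync",
        "-a",                        # preserve times, permissions, etc.
        "--ignore-existing",         # never overwrite files already on the backup
        f"--files-from={list_file}", # copy only the files reported missing
    ]
    if not apply:
        command.append("--dry-run")
    # Trailing slashes make rsync treat both arguments as directory contents.
    command.append(source_root.rstrip("/") + "/")
    command.append(dest_root.rstrip("/") + "/")
    return command


if __name__ == "__main__":
    # Hypothetical paths for illustration only.
    command = build_rsync_command("/Volumes/SD-Card", "/Volumes/Backup-1", "missing.txt")
    print(shlex.join(command))
```

Building the argument list separately from running it makes the command easy to inspect before anything is copied.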

License

GPL-3.0-or-later. See LICENSE.

This means:

  • You can use, study, modify, and redistribute DriveTidy freely.
  • If you distribute a modified version, you must also release your changes under the GPL (version 3 or later).
  • Commercial use is allowed, but anyone who receives a copy from you must also be able to get the source.

Contributing

Bug reports, feature requests, and pull requests welcome. See CONTRIBUTING.md.


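Traditional Chinese README

Verify a 5 TB drive in 113 seconds and confirm your photos and videos really are backed up.

DriveTidy is a backup integrity checker. It answers exactly one question: "For every file on the source drive, is there a copy on my backup drives?"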

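Why I built this

A photographer friend showed me his backup routine: four external drives, every shoot copied three times, color-coded labels, and an Excel sheet tracking which card went onto which drive. A beautiful system. Yet he still slept badly before every SD card format.

"How can I be sure every file really got copied over?" he asked me. "Diff tools can't handle this volume. rsync only tells you what it is going to copy, not what you're missing. And half the time I can't even line up the folder names across my backup drives: the same shoot gets a different name on every backup pass."

I wrote a small Python script to solve it for him. The script became a tool, and the tool became this.

DriveTidy scans your source drive, compares it against your backup drives using filename + size + modification time (with optional EXIF checks for JPEGs), and tells you exactly which files were never backed up, grouped by folder and sorted by size so you can fill the biggest gaps first. No cloud, no subscription, no data sent anywhere. Your file paths stay on your own computer.

Now I run it every time before I format a memory card. If you have ever hesitated at an "Are you sure you want to format?" dialog, this tool may be useful to you too.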

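What it does

  • Fast: ~113 s to verify a 5 TB HDD (M1 MacBook Air, USB-C, ~2 million files, early-stop search; a full walk of the same drive takes ~2 hours). Actual speed depends on your drive type, file count, and how many files are missing.
  • Multi-factor matching with evidence: filename + size + modification time, with optional EXIF checks for JPEGs
  • HTML report: missing files grouped by folder and sorted by size, so you fill the biggest gaps first
  • One-click backfill (drivetidy backup-missing): rsync the missing files; only adds, never deletes
  • Both CLI and GUI: developers run pip install drivetidy; everyone else downloads the installer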

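Installation

CLI (command-line version)

pip install drivetidy

Requires Python 3.11+, with fd or rclone installed:

brew install fd        # macOS

GUI (windowed version)

Download from the Releases page:

  • macOS: drivetidy-macos-arm64.zip. Unzip, then double-click start.command. On first launch, right-click → Open to get past Gatekeeper (the app is unsigned).
  • Windows: drivetidy-windows-x64.zip. Unzip, then double-click drivetidy.exe. SmartScreen may show a warning; click "More info" → "Run anyway".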

Quick start

# 1. Scan each drive once (builds an index; fast even on HDDs)
drivetidy scan /Volumes/SD卡 --label sd-source
drivetidy scan /Volumes/備份-1 --label backup-1
drivetidy scan /Volumes/備份-2 --label backup-2

# 2. Audit: does everything on sd-source exist on the two backups?
#    Adding --out also writes an HTML report of missing files (grouped by folder).
drivetidy audit sd-source --against backup-1,backup-2 --out 缺檔報告.html

# 3. (Optional) Copy the missing files to one of the backups (adds only, never deletes).
#    The audit prints a run id when it finishes (e.g. #5); use that id below.
drivetidy backup-missing 5 --dest-ident backup-1 --apply

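License

GPL-3.0-or-later. See LICENSE.

This means:

  • You can freely use, study, modify, and redistribute DriveTidy.
  • If you distribute a modified version, it must also be open source under GPL-3.
  • Commercial use is allowed, but the code must remain open source.

Contributing

Issues, pull requests, and feature suggestions are welcome. See CONTRIBUTING.md.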

Download files

Source Distribution

drivetidy-1.0.2.tar.gz (101.5 kB)

Uploaded Source

Built Distribution

drivetidy-1.0.2-py3-none-any.whl (85.2 kB)

Uploaded Python 3

File details

Details for the file drivetidy-1.0.2.tar.gz.

File metadata

  • Download URL: drivetidy-1.0.2.tar.gz
  • Upload date:
  • Size: 101.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for drivetidy-1.0.2.tar.gz
Algorithm    Hash digest
SHA256       63d58d7452a028be52334c29d86030f5116b969262c90d5c970c2a629603a9f3
MD5          ee6563cf6796bef38933bef7570442a3
BLAKE2b-256  48ba1f5f90a3c0b5bf57bd8c02dd70fa7248313989233ef36df0463c5c061a2d
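
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard hashlib module. The sketch below is one way to do it; sha256_of and matches_published_hash are hypothetical helper names, and the local file path is an assumption about where the download was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for drivetidy-1.0.2.tar.gz.
PUBLISHED_SHA256 = "63d58d7452a028be52334c29d86030f5116b969262c90d5c970c2a629603a9f3"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives are never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA256 with a published hex digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed download location; adjust to wherever the file was saved.
    archive = Path("drivetidy-1.0.2.tar.gz")
    if archive.exists():
        print("hash OK" if matches_published_hash(archive, PUBLISHED_SHA256) else "hash MISMATCH")
    else:
        print(f"{archive} not found")
```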

Provenance

The following attestation bundles were made for drivetidy-1.0.2.tar.gz:

Publisher: cli-release.yml on adom-near/drivetidy

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file drivetidy-1.0.2-py3-none-any.whl.

File metadata

  • Download URL: drivetidy-1.0.2-py3-none-any.whl
  • Upload date:
  • Size: 85.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for drivetidy-1.0.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       53d7fb364418cb9b0a79f1d0e2e03c2fe8059be1c23b3e0fab78f6a6e0979d58
MD5          eea37b2b7fbf55e7bc2cf3f907f91751
BLAKE2b-256  01307b3335e0226bf0e025905ec47ba8241e4093f6c43a8acaf854d0e5911e00

Provenance

The following attestation bundles were made for drivetidy-1.0.2-py3-none-any.whl:

Publisher: cli-release.yml on adom-near/drivetidy

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
