A tool for preserving files with path normalization and verification

Preserve

Track files and always get them back to where you need them.

A cross-platform file preservation tool with path normalization, verification, and restoration capabilities. Preserve tracks where your files came from so you can always restore them, whether to their original location or anywhere else you need them.

Why another backup tool?

Have you ever run out of space on a hard drive and needed to robocopy, rsync, or move an assortment of ad hoc folders and files from one drive to another, while keeping an easy way to map those files back to their original source directories later?

Or perhaps you've needed to copy a batch of folders from one machine to multiple lab computers that share a similar folder layout, and got tired of copying the folders one by one to each box while sorting out subtle differences. (Pro tip: Syncthing and BTSync/Resilio are handy, but not perfect when there are differences to sort out with Beyond Compare.)

Or have you grown frustrated with writing one-off scripts each time files needed to be distributed, and worried whether each copy actually transferred correctly with all files intact?

Enter preserve...

Features

  • Path Preservation: Copy or move files with multiple path preservation styles:
    • Relative paths that maintain directory structure (--rel)
    • Absolute paths with drive letter preservation (--abs)
    • Flat structure with all files in one directory (--flat)
  • Verification: File integrity verification with multiple hash algorithms (MD5, SHA1, SHA256, SHA512)
  • Metadata: Preserve file attributes (timestamps, permissions, etc.)
  • Manifests: Detailed operation tracking with automatic versioning for multiple operations
  • Restoration: Restore files to their original locations with verification
  • DazzleLink: Optional integration with dazzlelink for enhanced metadata storage and file references
  • Cross-Platform: Works on Windows, Linux, and macOS

Installation

pip install dazzle-preserve

Note: The PyPI package is named dazzle-preserve, but you still import and use it as preserve:

import preserve  # Import name stays the same

For full functionality on Windows, install with the Windows extras:

pip install dazzle-preserve[windows]

For dazzlelink integration:

pip install dazzle-preserve[dazzlelink]

Usage

Basic Usage

Most common: Back up an entire directory with all subdirectories

preserve COPY "C:/my-project" --recursive --rel --includeBase --dst "E:/backup"

Important: Always use --recursive (or -r) when copying directories. Without it, preserve will only look for files directly in the source directory, not in subdirectories.

Copy specific file types from a directory tree

preserve COPY --glob "*.docx" --srchPath "C:/documents" --recursive --dst "D:/archive"

Copy files from a list (--loadIncludes)

preserve COPY --loadIncludes "files-to-copy.txt" --dst "E:/backup" --rel

The files-to-copy.txt file should contain one file or directory path per line:

C:/data/report.docx
C:/projects/myapp/src/main.py
C:/photos/vacation/
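
If the list is long, it can be generated rather than typed by hand. The following is a minimal sketch, not part of preserve itself: the helper name write_include_list is hypothetical, and it assumes that a UTF-8 text file with one forward-slash path per line is acceptable input for --loadIncludes.

```python
from pathlib import Path


def write_include_list(root, patterns, list_path):
    """Write one matching file path per line and return the lines written.

    Hypothetical helper: collects files under `root` whose names match any
    of the glob `patterns`, then writes them to `list_path`.
    """
    root = Path(root)
    matches = set()
    for pattern in patterns:
        # rglob searches root and all of its subdirectories.
        matches.update(p for p in root.rglob(pattern) if p.is_file())
    # Forward slashes mirror the path style used in the example list above.
    lines = sorted(p.as_posix() for p in matches)
    Path(list_path).write_text(
        "".join(line + "\n" for line in lines), encoding="utf-8"
    )
    return lines


if __name__ == "__main__":
    # Example paths only; adjust to your own layout.
    write_include_list("C:/documents", ["*.docx", "*.xlsx"], "files-to-copy.txt")
```

The resulting file can then be passed to preserve COPY --loadIncludes as shown above.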

Verify backup integrity

preserve VERIFY --src "C:/my-project" --dst "E:/backup" --hash SHA256

Restore files to their original locations

# Restore latest backup
preserve RESTORE --src "E:/backup"

# List all available restore points
preserve RESTORE --src "E:/backup" --list

# Restore specific operation by number
preserve RESTORE --src "E:/backup" --number 2

Path Preservation Options

  • --rel: Preserve relative paths
  • --abs: Preserve absolute paths (with drive letter as directory)
  • --flat: Flatten directory structure (all files in destination root)
  • --includeBase: Include base directory name in destination path
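
To make the difference between the styles concrete, here is an illustrative sketch of how a source path could map to a destination path under each option. It is not preserve's actual implementation: the function destination_path and its parameters are hypothetical, and the behavior of --rel without --includeBase (dropping the base directory name) is an assumption inferred from the option descriptions.

```python
from pathlib import PureWindowsPath


def destination_path(src, dst_root, style, base=None, include_base=False):
    """Return the destination path for `src` under one path style.

    Illustrative only. PureWindowsPath is used because it understands both
    drive letters and forward slashes, and it never touches the filesystem.
    """
    src = PureWindowsPath(src)
    dst_root = PureWindowsPath(dst_root)

    if style == "flat":
        # Keep only the file name; all files land in the destination root.
        rel = PureWindowsPath(src.name)
    elif style == "abs":
        # Turn the drive letter into a directory: C:/data -> C/data.
        drive = src.drive.rstrip(":")
        parts = [part for part in src.parts if part != src.anchor]
        rel = PureWindowsPath(drive, *parts) if drive else PureWindowsPath(*parts)
    elif style == "rel":
        # Keep the structure below the base directory.
        base = PureWindowsPath(base)
        rel = src.relative_to(base)
        if include_base:
            rel = PureWindowsPath(base.name) / rel
    else:
        raise ValueError(f"unknown style: {style!r}")

    return (dst_root / rel).as_posix()


if __name__ == "__main__":
    src = "C:/my-project/src/main.py"
    print(destination_path(src, "E:/backup", "rel", base="C:/my-project",
                           include_base=True))
    print(destination_path(src, "E:/backup", "abs"))
    print(destination_path(src, "E:/backup", "flat"))
```

Note that with the flat style two source files sharing a name would map to the same destination, which is one reason to prefer --rel or --abs for directory trees.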

Other Options

  • --hash: Specify hash algorithm(s) for verification (MD5, SHA1, SHA256, SHA512)
  • --verify: Verify files after operation
  • --dazzlelink: Create dazzlelinks to original files
  • --dry-run: Show what would be done without making changes
  • --overwrite: Overwrite existing files in destination

See preserve --help for full documentation and examples.
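
When the same directory has to go to several destinations, the command line can be assembled from a script. The sketch below is a hedged example rather than an official API: build_copy_command and copy_to_many are hypothetical helpers, they use only flags listed in this README, and they assume that preserve is on PATH and exits with a non-zero status on failure.

```python
import subprocess


def build_copy_command(source, destination, style="--rel", include_base=True,
                       hash_name=None, dry_run=False):
    """Build the argument list for a recursive preserve COPY call."""
    if style not in ("--rel", "--abs", "--flat"):
        raise ValueError(f"unknown path style: {style!r}")
    cmd = ["preserve", "COPY", source, "--recursive", style]
    if include_base:
        cmd.append("--includeBase")
    cmd += ["--dst", destination]
    if hash_name:
        cmd += ["--hash", hash_name]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def copy_to_many(source, destinations, **options):
    """Run one COPY per destination, stopping at the first failure."""
    for destination in destinations:
        cmd = build_copy_command(source, destination, **options)
        # check=True raises CalledProcessError on a non-zero exit status
        # (assumes preserve reports failures through its exit status).
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Preview first; drop dry_run=True once the output looks right.
    copy_to_many("C:/my-project", ["E:/backup", "F:/backup"],
                 hash_name="SHA256", dry_run=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.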

Working with Multiple Operations

Preserve automatically manages manifests when you run multiple operations to the same destination:

Sequential Operations Example

# First operation - copies dataset A
preserve COPY "C:/data/dataset-A" -r --includeBase --rel --dst "E:/backup"
# Creates: preserve_manifest.json

# Second operation - copies dataset B
preserve COPY "C:/data/dataset-B" -r --includeBase --rel --dst "E:/backup"
# Auto-migrates first manifest to preserve_manifest_001.json
# Creates: preserve_manifest_002.json

# Third operation - copies dataset C
preserve COPY "C:/data/dataset-C" -r --includeBase --rel --dst "E:/backup"
# Creates: preserve_manifest_003.json

Managing Multiple Manifests

# List all available restore points
preserve RESTORE --src "E:/backup" --list
# Output:
#   1. preserve_manifest_001.json (2025-09-18 14:30:00, 150 files)
#   2. preserve_manifest_002.json (2025-09-18 14:55:00, 75 files)
#   3. preserve_manifest_003.json (2025-09-18 15:20:00, 200 files)

# Restore specific dataset (e.g., dataset B from operation 2)
preserve RESTORE --src "E:/backup" --number 2

# Restore latest operation (dataset C)
preserve RESTORE --src "E:/backup"

# Restore with short option
preserve RESTORE --src "E:/backup" -n 1  # Restores dataset A

User-Friendly Manifest Naming

You can rename manifests to include descriptions:

# Rename manifests for clarity (Windows)
ren preserve_manifest_001.json preserve_manifest_001__dataset-A.json
ren preserve_manifest_002.json preserve_manifest_002__dataset-B.json

# On Linux/Mac
mv preserve_manifest_001.json preserve_manifest_001__dataset-A.json

# The descriptions appear in --list output:
preserve RESTORE --src "E:/backup" --list
#   1. preserve_manifest_001__dataset-A.json - dataset-A (2025-09-18 14:30:00, 150 files)
#   2. preserve_manifest_002__dataset-B.json - dataset-B (2025-09-18 14:55:00, 75 files)
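
The naming convention shown above (an optional sequence number followed by an optional double-underscore description) is easy to work with from your own scripts. The following sketch is an assumption-based illustration, not preserve's own parser: the regular expression is inferred from the file names in this README, and it assumes the manifests sit in the root of the backup directory.

```python
import re
from pathlib import Path

# Matches preserve_manifest.json, preserve_manifest_002.json and
# preserve_manifest_001__dataset-A.json (pattern inferred from the examples).
MANIFEST_RE = re.compile(r"^preserve_manifest(?:_(\d+))?(?:__(.+))?\.json$")


def parse_manifest_name(filename):
    """Return (number, description) for a manifest name, else None."""
    match = MANIFEST_RE.match(filename)
    if match is None:
        return None
    number, description = match.groups()
    return (int(number) if number else None, description)


def list_manifests(backup_dir):
    """Return (number, description, filename) tuples sorted by number."""
    found = []
    for path in Path(backup_dir).iterdir():
        parsed = parse_manifest_name(path.name)
        if parsed is not None:
            found.append((parsed[0], parsed[1], path.name))
    # An unnumbered single manifest sorts before the numbered ones.
    return sorted(found, key=lambda item: (item[0] is not None, item[0] or 0))


if __name__ == "__main__":
    for number, description, name in list_manifests("E:/backup"):
        print(number, description or "", name)
```

For the authoritative view of restore points, preserve RESTORE --list remains the reference.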

Important Notes

  • No Overwrites: Each operation creates a new manifest, preserving all history
  • Backward Compatible: Single operations still work exactly as before
  • Auto-Migration: The system automatically handles the transition from single to multiple manifests
  • Independent Restoration: Each manifest can be restored independently

Recommended Workflow for Important Data

For data you really care about, follow this multi-step workflow to ensure end-to-end data integrity:

Step 1: Pre-Verification (Create baseline hashes)

# Windows: Use certutil to create SHA256 hashes
# (inside a .bat file, write %%i instead of %i)
cd C:\my-project
for /r %i in (*) do certutil -hashfile "%i" SHA256 >> ..\source-hashes.txt

# Linux/Mac: Use sha256sum (or "shasum -a 256" where sha256sum is unavailable)
cd /path/to/my-project
find . -type f -exec sha256sum {} \; > ../source-hashes.txt
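
A baseline can also be produced with a short cross-platform script, which keeps the paths relative and the ordering stable on every operating system. This is a minimal sketch using only Python's standard library; the function names are hypothetical, and the output format (digest, two spaces, relative path) is chosen to resemble sha256sum output rather than anything preserve requires.

```python
import hashlib
from pathlib import Path


def hash_file(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash one file in chunks so large files do not fill memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root, algorithm="sha256"):
    """Map each file's path relative to `root` to its hex digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hash_file(path, algorithm)
        for path in root.rglob("*")
        if path.is_file()
    }


def write_hash_list(root, out_path, algorithm="sha256"):
    """Write 'digest  relative/path' lines sorted by path; return the map."""
    hashes = hash_tree(root, algorithm)
    with open(out_path, "w", encoding="utf-8") as out:
        for rel_path in sorted(hashes):
            out.write(f"{hashes[rel_path]}  {rel_path}\n")
    return hashes


if __name__ == "__main__":
    # Example paths only. Keep the output file outside the hashed tree.
    write_hash_list("C:/my-project", "C:/source-hashes.txt")
```

Because the recorded paths are relative to the project root, the same script can be run later against a restored copy and the two lists compared line by line.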

Step 2: Copy with Structure Preservation

# Copy with relative paths and include base directory
preserve COPY "C:/my-project" --recursive --rel --includeBase --dst "E:/backup" --hash SHA256

Step 3: Post-Copy Verification

# Verify all files match their source
preserve VERIFY --src "C:/my-project" --dst "E:/backup" --hash SHA256 --report verify-report.txt

# Check the report for any mismatches (Windows; use cat on Linux/Mac)
type verify-report.txt

Step 4: Test Restoration

# First, do a dry run to see what would be restored
preserve RESTORE --src "E:/backup" --dry-run

# If everything looks correct, restore to a test location
preserve RESTORE --src "E:/backup" --dst "C:/test-restore" --verify

Understanding RESTORE --dst Behavior

When using --dst to restore to a different location, preserve maintains the backup's directory structure, not the original source structure. This respects the path style choice (--rel, --abs, --flat) made during backup creation.

Examples:

  1. Relative with Base Directory (--rel --includeBase):

    # Backup created with:
    preserve COPY my-project/ --dst backup/ --rel --includeBase
    # Creates: backup/my-project/file.txt
    
    # Restore to new location:
    preserve RESTORE --src backup/ --dst restored/
    # Result: restored/my-project/file.txt
    
  2. Flat Structure (--flat):

    # Backup created with:
    preserve COPY my-project/ --dst backup/ --flat
    # Creates: backup/file.txt (no subdirectories)
    
    # Restore to new location:
    preserve RESTORE --src backup/ --dst restored/
    # Result: restored/file.txt (maintains flat structure)
    
  3. Absolute Paths (--abs):

    # Backup created with:
    preserve COPY C:/data/file.txt --dst backup/ --abs
    # Creates: backup/C/data/file.txt
    
    # Restore to new location:
    preserve RESTORE --src backup/ --dst restored/
    # Result: restored/C/data/file.txt (preserves full path structure)
    

Key Point: The restored structure mirrors what's in your backup directory, preserving your original backup organization choice.

Step 5: Validate Restoration

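# Compare restored files with original hashes
# Windows
# Note: certutil prints each file's full path, so fc always flags those path lines;
# only a differing hash line indicates a problem.
cd C:\test-restore\my-project
for /r %i in (*) do certutil -hashfile "%i" SHA256 >> ..\..\restored-hashes.txt
fc ..\..\source-hashes.txt ..\..\restored-hashes.txt

# Linux/Mac (bash or zsh)
# Sort both lists by path before comparing, because find does not guarantee a stable order
cd /path/to/test-restore/my-project
find . -type f -exec sha256sum {} \; > ../../restored-hashes.txt
diff <(sort -k 2 ../../source-hashes.txt) <(sort -k 2 ../../restored-hashes.txt)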
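
For large trees a raw diff can be hard to read. As an alternative, the sketch below compares two hash lists in sha256sum format and groups the differences into missing, extra, and changed files. It is a hedged illustration with hypothetical function names; it assumes each line is "digest, whitespace, path" and does not handle the escaped file names that sha256sum emits for paths containing backslashes or newlines.

```python
def load_hash_list(path):
    """Parse 'digest  path' lines (sha256sum style) into {path: digest}."""
    entries = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip("\n").split(None, 1)
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            digest, name = parts
            # Binary-mode output marks the file name with a leading '*'.
            if name.startswith("*"):
                name = name[1:]
            entries[name] = digest.lower()
    return entries


def compare_hash_lists(source_list, restored_list):
    """Return paths that are missing, extra, or changed after a restore."""
    source = load_hash_list(source_list)
    restored = load_hash_list(restored_list)
    shared = source.keys() & restored.keys()
    return {
        "missing": sorted(source.keys() - restored.keys()),
        "extra": sorted(restored.keys() - source.keys()),
        "changed": sorted(n for n in shared if source[n] != restored[n]),
    }


if __name__ == "__main__":
    # Example paths only; point these at the lists created in Steps 1 and 5.
    report = compare_hash_lists("source-hashes.txt", "restored-hashes.txt")
    for category, names in report.items():
        print(f"{category}: {len(names)}")
        for name in names:
            print(f"  {name}")
```

An empty result in all three categories means every file listed in the baseline was restored with an identical hash.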

Step 6: Source Cleanup (Only after validation)

# Only remove originals after all verifications pass
# Keep the preserve_manifest*.json files in the backup for future restoration

Important: Never delete your source files until you've verified the backup AND successfully tested restoration.

What's New

See CHANGELOG.md for a detailed history of changes.

Latest Release (v0.7.x)

v0.7.x focuses on destination awareness and safety:

  • Pre-operation destination scanning to detect conflicts before COPY/MOVE
  • Deep path cycle detection to prevent data loss from nested junctions/symlinks
  • CLEANUP command for recovering from interrupted MOVE operations
  • Smart path mode warnings to catch common mistakes

Recent Highlights

  • Deep Cycle Detection (v0.7.2): Walks source tree to find nested junctions/symlinks that could cause data loss
  • Link Handling Modes (v0.7.3): --link-handling skip|unlink for moving directories containing links
  • Destination Awareness (v0.7.0): Scan destination before operations, --on-conflict modes
  • CLEANUP Command (v0.7.0): Recover from interrupted MOVE with --mode complete|rollback
  • Link Creation (v0.6.0): Create junctions/symlinks after MOVE with -L junction
  • Advanced Filtering: Exclude patterns, depth control, time-based selection
  • Three-Way Verification: Verify against the source, the destination, or both during restore operations
  • Sequential Manifests: Support for multiple operations to same destination

Contributing

Contributions are welcome! Feel free to submit a pull request.

Acknowledgments

  • dazzlelink - Enhanced metadata storage and file references
  • GitRepoKit - Automated version management system
  • Community contributors - Testing, feedback, and improvements

License

preserve, aka preserve.py, Copyright (C) 2025 Dustin Darcy

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
