Skip to main content

A backup utility based on snapshots and rsync.

Project description

blue-backup

A backup utility based on snapshots and rsync.

blue-backup organizes backups in snapshot folders. Each snapshot folder has a complete copy of the backed up data, meaning that no special software is required to view or restore the backup.

Copy-On-Write (COW) or hard-links are used to create the snapshots. Therefore, each snapshot only consumes the disk space required to store the changes since the last snapshot. If the target file-system is btrfs, blue-backup uses btrfs COW based snapshots. Otherwise, it relies on hard-link copies.

blue-backup relies on the versatility and efficiency of rsync to keep the backup synchronized. blue-backup was inspired by rsnapshot. The main difference is that blue-backup labels snapshots by their dates instead of the daily.0 style labels used by rsnapshot.

blue-backup supports three modes of operation:

Installation

blue-backup can be installed from pypi:

pip3 install blue-backup

Also, because it is a single python script, it can be copied directly to your path:

git clone https://github.com/udifuchs/blue-backup.git
sudo cp blue-backup/blue-backup /usr/local/bin/

blue-backup depends on paramiko to access remote backup targets using SSH. If your backup target is local, paramiko is not required. If you are using python 3.10 or earlier, you will also need to install tomli. On Debian/Ubuntu these python packages can be installed using:

apt install python3-paramiko python3-tomli

Snapshot backups {#snapshot-mode}

Setting up a snapshot backup requires a TOML configuration file. A minimal configuration file would look like:

target-location = "/mnt/blue/backup/{TODAY}"

[backup-folders]
"/home" = {}

target-location specifies the folder in which the backup snapshots will be stored. This folder can be local or remote. The {TODAY} keyword in the target location indicates that this is a snapshot backup. It would be replaced with the current date in the format YYYY-MM-DD.

Each entry in the [backup-folders] table specifies a source folder to be backed up. Each entry has a sub-table with its options. An empty table {} indicates the use of default options.

The next blue-snapshot.toml example, demonstrates other configuration possibilities:

target-location = "/mnt/blue/backup/{TODAY}"

rsync-options = [ "--numeric-ids" ]

# Global exclude list applied to all backup-folders:
exclude = [ "cache", ".cache", "__pycache__" ]

[backup-folders]
"/home" = {}
"/etc" = {}
"/var" = { exclude = [ "/log", "/tmp" ] }

"user1@pc1:/home" = { target = "pc1/home", chown = "user1:users-group", chmod = "550" }
"user1@pc1:/etc" = { target = "pc1/etc", chown = "user1:users-group", chmod = "550" }

[backup-folders."user7@win-pc7:/home"]
target = "my-pc/home"
chown = "user7:users-group"
chmod = "550"
rsync-options = [ "--rsync-path=c:/msys64/usr/bin/rsync.exe" ]

This example demonstrates how the [backup-folder] table can also point to remote folders.

exclude in the main TOML section specifies excluded folders that would be applied to all backup folders. These folders are added as an rsync --exclude option and follow rsync rules for this option.

The target field in this sub-table is appended to the target-location. If target is not specified, it would be the same as the source folder. For example, /home folder will be backed up to /mnt/blue/backup/YYYY-MM-DD/home. For remote folders a target must be specified.

Notice that source and remote folders cannot be both remote. At least one of them has to be local. This is a limitation of rsync.

chown and chmod control the owner and protection mode bits of the backup files. It is mostly useful when backing up folders on remote computers.

rsync-options allow for additional options to the rsync commands. These options are added to the rsync options that blue-backup applies on its own.

The backup is started with the command:

$ blue-backup --first-time blue-snapshot.toml

The output of this command should like this (the table format was inspired by rsnapreport):

Backup snapshot target: /mnt/blue/backup/2020-02-02 at 00:00:00-00:00
Target     | Total files /   bytes | Transferred /  bytes |    Time
-----------+-----------------------+----------------------+--------
home       |     123,456 / 123.45G |    123,456 / 123.45G | 1:23:45
etc        |       1,234 /  12.34M |      1,234 /  12.34M | 0:00:12
var        |      12,345 / 234.56M |     12,345 / 234.56M | 0:01:23
pc1/home   |     543,210 / 345.67G |    543,210 / 345.67G | 0:23:45
pc1/etc    |      54,321 /  34.56M |     54,321 /  34.56M | 0:02:34
my-pc/home |     765,432 / 234.56G |    765,432 / 234.56G | 0:34:56
Kept backups: 1 monthly, 0 daily
Target device usage: 2.3T / 3.7T (63%) available: 1.4T

The --first-time option must be specified on first invocation. This informs blue-backup that there should be no existing dated snapshot folders and that a full backup should be made. The --first-time option must not be specified on subsequent invocations. This informs blue-backup to expect at least one existing snapshot and that it should be making an incremental backup.

In practice, after the first run, subsequent runs should be automatically scheduled with a tool such as crontab. For example by adding the line:

37 * * * * /usr/local/bin/blue-backup /path-to/blue-snapshot.toml >> /var/log/blue-snapshot.log 2>&1

In this example blue-backup is invoked once an hour. This means an incremental backup once an hour. But since the snapshot name is based on the date, a new snapshot is created only once a day. Therefore, if you notice that you lost a file 2 hours ago, you will probably won't be able to restore it from today's backup. But you should be able to restore it from yesterday's backup. On the other hand, if your disk crashes, you should have a fresh backup from the last hour.

blue-backup keeps snapshots of the last 20 daily backups. It also keeps the first snapshot of every month. Older daily backups are purged. Monthly backups are never purged. It is your responsibility to decide when to remove older monthly backups.

Collecting backups to a central server {#collect-mode}

blue-backup's collection mode is used to synchronize files from several remote computers to a central server. Strictly speaking, this mode by itself is not a backup. If a file gets corrupt on a remote computer, collection mode would synchronize this corrupt file to the central server. But assuming that the central server is being properly backed up, the corrupt file could be restored from that backup.

The configuration file for collection mode is similar to the one for snapshot mode. The only difference is that target-location does not contain the {TODAY} keyword. Here is a sample configuration file for collection mode:

target-location = "/data/backup/"

[backup-folders]
"user1@pc1:/home" = { target = "pc1/home", chown = "user1:users", chmod = "550" }
"user1@pc1:/etc" = { target = "pc1/etc", chown = "user1:users", chmod = "550" }
"user2@pc2:/home" = { target = "pc2/home", chown = "user2:users", chmod = "550" }
"user2@pc2:/etc" = { target = "pc2/etc", chown = "user2:users", chmod = "550" }


[backup-folders."user4@win-pc3:/c/Users/user3/Documents"]
target = "win-pc3/Documents"
chown = "user3:user"
chmod = "550"
rsync-options = [ "--rsync-path=c:/msys64/usr/bin/rsync.exe" ]

The backup is started with the command:

$ blue-backup blue-collect.toml

There is no need to specify the --first-time option.

Off-site secondary backups {#offsite-mode}

A good backup policy is to have at least two backup copies in two physically distinct locations. This is what the offsite mode is for.

The configuration file for offsite mode is similar to the one for snapshot mode. The difference is that instead of the {TODAY} keyword, one has to specify the {LATEST} keyword both in the target-location and in the backup-folders source folder. Here is a sample configuration file for offsite mode:

target-location = "root@offsite-server:/blue/offsite/{LATEST}"

rsync-options = [ "--numeric-ids" ]

[backup-folders]
"/mnt/blue/backup/{LATEST}" = { target = "" }

blue-backup would look up the latest snapshot in the source folder and create a copy in the target location. Like in snapshot mode, --first-time must be specified on first invocation to let blue-backup know that it needs to create a full copy. In subsequent invocations, --first-time must not be specified and blue-backup would create an incremental copy.

Operating system

blue-backup was developed on a linux operating system. Running on linux is not strictly a requirement, but it was not tested in other environments. blue-backup assumes that the OS of the target location has the following executables:

  • mkdir
  • rm
  • cp - uses hard-links
  • mv
  • stat
  • sync
  • df
  • btrfs - only required if backing up to a btrfs partition

python3 (version 3.8 or newer) must be available on the host running the blue-backup script.

rsync must be installed on both the source and target hosts.

The source host can be an MS-Windows OS. I installed rsync on MS-Windows using msys2. From the msys2 shell, install rsync:

pacman -S rsync

You will also need to install OpenSSH.Server.

Finally, you should add to the source folder in blue-backup TOML configuration file the entry:

rsync-options = [ "--rsync-path=c:/msys64/usr/bin/rsync.exe" ]

Tests

Tests are run using tox.

There are two special requirements for running tests:

  1. Some of the tests require ssh. Therefore, it is recommended to copy your SSH public key to your local host:

    ssh-copy-id 127.0.0.1
    
  2. The mkfs.btrfs executable is required to create the test btrfs partition. On Debian/Ubuntu it is installed using:

    sudo apt install btrfs-progs
    

    In order to mount the btrfs partition without root privileges, udisksctl is used. This works as long as you are logged-in to a desktop environment. If you get an error like:

    Error setting up loop device for btrfs.img: GDBus.Error:org.freedesktop.UDisks2.Error.NotAuthorizedCanObtain: Not authorized to perform operation
    

    Then you can either run tox as sudo (not very recommended) or ignore the btrfs tests failures.

History

1.0.0 (2026-01-??)

  • Initial release.

Project details


Release history Release notifications | RSS feed

This version

1.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

blue_backup-1.0.tar.gz (24.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

blue_backup-1.0-py3-none-any.whl (15.8 kB view details)

Uploaded Python 3

File details

Details for the file blue_backup-1.0.tar.gz.

File metadata

  • Download URL: blue_backup-1.0.tar.gz
  • Upload date:
  • Size: 24.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for blue_backup-1.0.tar.gz
Algorithm Hash digest
SHA256 7a74d7640cf35737ca5b9a0ca152de8a17f5e754059e81010e3302cacadc6acd
MD5 bb3f4fa8f2232faacbbb16dd154cc3a1
BLAKE2b-256 50d9d13081904f04f30a4bebf71c5846cf5ca1ae1346a45221ebc15f24cb68eb

See more details on using hashes here.

File details

Details for the file blue_backup-1.0-py3-none-any.whl.

File metadata

  • Download URL: blue_backup-1.0-py3-none-any.whl
  • Upload date:
  • Size: 15.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for blue_backup-1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 54bbdf2f04d6b0014c8bbb0d4121e9c30611c1709fafb64d1682a51dc2f156c9
MD5 04d43271920212717e8ba14de49e0b77
BLAKE2b-256 370ff2de233edb337184da2a01d26eb50732ed031202bd3707f7fac460a37f64

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page