Skip to main content

Linux system backups powered by rsync

Project description

https://travis-ci.org/xolox/python-rsync-system-backup.svg?branch=master https://coveralls.io/repos/xolox/python-rsync-system-backup/badge.svg?branch=master

The rsync-system-backup program uses rsync to create full system backups of Linux systems. Supported backup destinations include local disks (possibly encrypted using LUKS) and remote systems that are running an SSH server or rsync daemon. Each backup produces a timestamped snapshot and these snapshots are rotated according to a rotation scheme that you can configure. The package is currently tested on cPython 2.7, 3.4, 3.5, 3.6, 3.7 and PyPy (2.7).

Status

While this project brings together more than ten years of experience in creating (system) backups using rsync, all of the actual Python code was written in the first few months of 2017 and has seen limited real world use. The project does however have an automated test suite with more than 90% test coverage and my intention is to extend the test coverage further.

In May 2018 I changed the status from alpha to beta as part of release 1.0. The bump in major version number was triggered by a backwards incompatible code change, however at that point I had been using rsync-system-backup to make local backups of several of my Linux systems for the majority of a year. Also several colleagues of mine have used the how-to on setting up unattended backups to an encrypted USB disk.

Installation

The rsync-system-backup package is available on PyPI which means installation should be as simple as:

$ pip install rsync-system-backup

There’s actually a multitude of ways to install Python packages (e.g. the per user site-packages directory, virtual environments or just installing system wide) and I have no intention of getting into that discussion here, so if this intimidates you then read up on your options before returning to these instructions ;-).

Usage

There are two ways to use the rsync-system-backup package: As the command line program rsync-system-backup and as a Python API. For details about the Python API please refer to the API documentation available on Read the Docs. The command line interface is described below.

Command line

Usage: rsync-system-backup [OPTIONS] [SOURCE] DESTINATION

Use rsync to create full system backups.

The required DESTINATION argument specifies the (possibly remote) location where the backup is stored, in the syntax of rsync’s command line interface. The optional SOURCE argument defaults to ‘/’ which means the complete root filesystem will be included in the backup (other filesystems are excluded).

Please use the --dry-run option when getting familiar with this program and don’t remove it until you’re confident that you have the right command line, because using this program in the wrong way can cause data loss (for example by accidentally swapping the SOURCE and DESTINATION arguments).

Supported locations include:

  • Local disks (possibly encrypted using LUKS).

  • Remote systems that allow SSH connections.

  • Remote systems that are running an rsync daemon.

  • Connections to rsync daemons tunneled over SSH.

The backup process consists of several steps:

  1. First rsync is used to transfer all (relevant) files to a destination directory (whether on the local system or a remote system). Every time a backup is made, this same destination directory is updated.

  2. After the files have been transferred a ‘snapshot’ of the destination directory is taken and stored in a directory with a timestamp in its name. These snapshots are created using ‘cp --archive --link’.

  3. Finally the existing snapshots are rotated to purge old backups according to a rotation scheme that you can customize.

Supported options:

Option

Description

-b, --backup

Create a backup using rsync but don’t create a snapshot and don’t rotate old snapshots unless the --snapshot and/or --rotate options are also given.

-s, --snapshot

Create a snapshot of the destination directory but don’t create a backup and don’t rotate old snapshots unless the --backup and/or --rotate options are also given.

This option can be used to create snapshots of an rsync daemon module using a ‘post-xfer exec’ command. If DESTINATION isn’t given it defaults to the value of the environment variable $RSYNC_MODULE_PATH.

-r, --rotate

Rotate old snapshots but don’t create a backup and snapshot unless the --backup and/or --snapshot options are also given.

This option can be used to rotate old snapshots of an rsync daemon module using a ‘post-xfer exec’ command. If DESTINATION isn’t given it defaults to the value of the environment variable $RSYNC_MODULE_PATH.

-m, --mount=DIRECTORY

Automatically mount the filesystem to which backups are written.

When this option is given and DIRECTORY isn’t already mounted, the ‘mount’ command is used to mount the filesystem to which backups are written before the backup starts. When ‘mount’ was called before the backup started, ‘umount’ will be called when the backup finishes.

An entry for the mount point needs to be defined in /etc/fstab for this to work.

-c, --crypto=NAME

Automatically unlock the encrypted filesystem to which backups are written.

When this option is given and the NAME device isn’t already unlocked, the cryptdisks_start command is used to unlock the encrypted filesystem to which backups are written before the backup starts. When cryptdisks_start was called before the backup started, cryptdisks_stop will be called when the backup finishes.

An entry for the encrypted filesystem needs to be defined in /etc/crypttab for this to work. If the device of the encrypted filesystem is missing and rsync-system-backup is being run non-interactively, it will exit gracefully and not show any desktop notifications.

If you want the backup process to run fully unattended you can configure a key file in /etc/crypttab, otherwise you will be asked for the password each time the encrypted filesystem is unlocked.

-t, --tunnel=TUNNEL_SPEC

Connect to an rsync daemon through an SSH tunnel. This provides encryption for rsync client to daemon connections that are not otherwise encrypted. The value of TUNNEL_SPEC is expected to be an SSH alias, host name or IP address. Optionally a username can be prefixed (followed by ‘@’) and/or a port number can be suffixed (preceded by ‘:’).

-i, --ionice=CLASS

Use the ‘ionice’ program to set the I/O scheduling class and priority of the ‘rm’ invocations used to remove backups. CLASS is expected to be one of the values ‘idle’, ‘best-effort’ or ‘realtime’. Refer to the man page of the ‘ionice’ program for details about these values.

-u, --no-sudo

By default backup and snapshot creation is performed with superuser privileges, to ensure that all files are readable and filesystem metadata is preserved. The -u, --no-sudo option disables the use of ‘sudo’ during these operations.

-n, --dry-run

Don’t make any changes, just report what would be done. This doesn’t create a backup or snapshot but it does run rsync with the --dry-run option.

--multi-fs

Allow rsync to cross filesystem boundaries. This option has the opposite effect of the rsync option --one-file-system because rsync-system-backup defaults to running rsync with --one-file-system and must be instructed not to using --multi-fs.

-x, --exclude=PATTERN

Selectively exclude certain files from being included in the backup. Refer to the rsync documentation for allowed PATTERN syntax. Note that rsync-system-backup always uses the ‘rsync --one-file-system’ option.

-f, --force

By default rsync-system-backup refuses to run on non-Linux systems because it was designed specifically for use on Linux. The use of the -f, --force option sidesteps this sanity check. Please note that you are on your own if things break!

--disable-notifications

By default a desktop notification is shown (using notify-send) before the system backup starts and after the backup finishes. The use of this option disables the notifications (notify-send will not be called at all).

-v, --verbose

Make more noise (increase logging verbosity). Can be repeated.

-q, --quiet

Make less noise (decrease logging verbosity). Can be repeated.

-h, --help

Show this message and exit.

How it works

I’ve been finetuning my approach to Linux system backups for years now and during that time rsync has become my swiss army knife of choice :-). I also believe that comprehensive documentation can be half the value of an open source project. The following sections attempt to provide a high level overview of my system backup strategy:

The (lack of) backup format

Each backup is a full copy of the filesystem tree, stored in the form of individual files and directories on the destination. This “backup format” makes it really easy to navigate through and recover from backups because you can use whatever method you are comfortable with, whether that is a file browser, terminal, Python script or even chroot :-).

Rotation of snapshots

Snapshots can be rotated according to a flexible rotation scheme, for example I’ve configured my laptop backup rotation to preserve the most recent 24 hourly backups, 30 daily backups and endless monthly backups.

Backup destinations

While developing, maintaining and evolving backup scripts for various Linux laptops and servers I’ve learned that backups for different systems require different backup destinations and connection methods:

Encrypted USB disks

There’s a LUKS encrypted USB disk on my desk at work that I use to keep hourly, daily and monthly backups of my work laptop. The disk is connected through the same USB hub that also connects my keyboard and mouse so I can’t really forget about it :-).

Automatic mounting

Before the backup starts, the encrypted disk is automatically unlocked and mounted. The use of a key file enables this process to run unattended in the background. Once the backup is done the disk will be unmounted and locked again, so that it can be unplugged at any time (as long as a backup isn’t running of course).

Local server (rsync daemon)

My personal laptop transfers hourly backups to the rsync daemon running on the server in my home network using a direct TCP connection without SSH. Most of the time the laptop has an USB Ethernet adapter connected but the backup runs fine over a wireless connection as well.

Remote server (rsync daemon over SSH tunnel)

My VPS (virtual private server) transfers nightly backups to the rsync daemon running on the server in my home network over an SSH tunnel in order to encrypt the traffic and restrict access. The SSH account is configured to allow tunneling but disallow command execution. This setup enables the rsync client and server to run with root privileges without allowing the client to run arbitrary commands on the server.

Alternative connection methods

Backing up to a local disk limits the effectiveness of backups but using SSH access between systems gives you more than you bargained for, because you’re allowing arbitrary command execution. The rsync daemon provides an alternative that does not allow arbitrary command execution. The following sections discuss this option in more detail.

Using rsync daemon

To be able to write files as root and preserve all filesystem metadata, rsync must be running with root privileges. However most of my backups are stored on remote systems and opening up remote root access over SSH just to transfer backups feels like a very blunt way to solve the problem :-).

Fortunately another solution is available: Configure an rsync daemon on the destination and instruct your rsync client to connect to the rsync daemon instead of connecting to the remote system over SSH. The rsync daemon configuration can restrict the access of the rsync client so that it can only write to the directory that contains the backup tree.

In this setup no SSH connections are used and the traffic between the rsync client and server is not encrypted. If this is a problem for you then continue reading the next section.

Enabling rsync daemon

On Debian and derivatives like Ubuntu you can enable and configure an rsync daemon quite easily:

  1. Make sure that rsync is installed:

    $ sudo apt-get install rsync
  2. Enable the rsync daemon by editing /etc/default/rsync and changing the line RSYNC_ENABLE=false to RSYNC_ENABLE=true. Here’s a one liner that accomplishes the task:

    $ sudo sed -i 's/RSYNC_ENABLE=false/RSYNC_ENABLE=true/' /etc/default/rsync
  3. Create the configuration file /etc/rsyncd.conf and define at least one module. Here’s an example based on my rsync daemon configuration:

    # Global settings.
    max connections = 4
    log file = /var/log/rsyncd.log
    
    # Defaults for modules.
    read only = no
    uid = 0
    gid = 0
    
    # Daily backups of my VPS.
    [vps_backups]
    path = /mnt/backups/vps/latest
    post-xfer exec = /usr/sbin/process-vps-backups
    
    # Hourly backups of my personal laptop.
    [laptop_backups]
    path = /mnt/backups/laptop/latest
    post-xfer exec = /usr/sbin/process-laptop-backups

    The post-xfer exec directives configure the rsync daemon to create a snapshot once the backup is done and rotate old snapshots afterwards.

  4. Once you’ve created /etc/rsyncd.conf you can start the rsync daemon:

    $ sudo service rsync start
  5. If you’re using a firewall you should make sure that the rsync daemon port is whitelisted to allow incoming connections. The rsync daemon port number defaults to 873. Here’s an iptables command to accomplish this:

    $ sudo iptables -A INPUT -p tcp -m tcp --dport 873 -m comment --comment "rsync daemon" -j ACCEPT

Tunneling rsync daemon connections

When your backups are transferred over the public internet you should definitely use SSH to encrypt the traffic, but if you’re at all security conscious then you probably won’t like having to open up remote root access over SSH just to transfer backups :-).

The alternative is to use a non privileged SSH account to set up an SSH tunnel that redirects network traffic to the rsync daemon. The login shell of the SSH account can be set to /usr/sbin/nologin (or something similar like /bin/false) to disable command execution, in this case you need to pass -N to the SSH client.

Contact

The latest version of rsync-system-backup is available on PyPI and GitHub. The documentation is hosted on Read the Docs and includes a changelog. For bug reports please create an issue on GitHub. If you have questions, suggestions, etc. feel free to send me an e-mail at peter@peterodding.com.

License

This software is licensed under the MIT license.

© 2018 Peter Odding.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rsync-system-backup-1.1.tar.gz (34.7 kB view details)

Uploaded Source

Built Distribution

rsync_system_backup-1.1-py2.py3-none-any.whl (29.3 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file rsync-system-backup-1.1.tar.gz.

File metadata

  • Download URL: rsync-system-backup-1.1.tar.gz
  • Upload date:
  • Size: 34.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.6.8

File hashes

Hashes for rsync-system-backup-1.1.tar.gz
Algorithm Hash digest
SHA256 8919f2bd0e86351dfa2e942b2f9f0f3882cea35a8b79e99e79b164b308e10824
MD5 77a9fb6ebf82385419321f81a8acd79b
BLAKE2b-256 765a59bb5aca0ddf4b00a1670b777fe029d8b2f7cce1836269a463bc769182f1

See more details on using hashes here.

File details

Details for the file rsync_system_backup-1.1-py2.py3-none-any.whl.

File metadata

  • Download URL: rsync_system_backup-1.1-py2.py3-none-any.whl
  • Upload date:
  • Size: 29.3 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.6.8

File hashes

Hashes for rsync_system_backup-1.1-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 96727367725f81f5cb23c46ec0ceae113753afe243fdd80e3d3db47704339ff9
MD5 2a8b1a54b28839a0251f97f1249cd076
BLAKE2b-256 9fcac72a7c21b060efb4343c488bed8a2febd213c7bfd6caa75af90d61de7af7

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page