
bin/backup script: sensible defaults around bin/repozo

Project description

Easy zope backup/restore recipe for buildout

This recipe is a wrapper around the bin/repozo script in your zope buildout, so that script must already be available. If it is not, you will get an error like this when you run one of the scripts: bin/repozo: No such file or directory. You should be fine when you are on Plone 3, or on Plone 4 with plone.recipe.zeoserver. Otherwise, the easiest way of getting a bin/repozo script is to add a new section to your buildout.cfg (do not forget to add it to the parts directive):

[repozo]
recipe = zc.recipe.egg
eggs = ZODB3
scripts = repozo

bin/repozo is a zope script to make backups of your Data.fs. Looking up the settings can be a chore. And you have to pick a directory where to put the backups. This recipe provides sensible defaults for your common backup tasks. Making backups a piece of cake is important!

  • bin/backup makes a backup.
  • bin/restore restores the latest backup.
  • bin/snapshotbackup makes a full backup, separate from the regular backups. Handy for copying the current production database to your laptop or right before a big change in the site.

Some extra information:

Attention!

If your buildout uses blobstorage to store files (see the var/blobstorage directory, if it exists), those files are currently not backed up by this recipe. You will have to do something yourself (create a script that makes a tarball, or uses scp or rsync or something like that). A future version of this recipe may deal with this.
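As a stopgap for the blobstorage gap described above, a tarball script could look roughly like the following. This is a minimal sketch written for Python 3 and not part of the recipe: the function name backup_blobstorage, the target directory and the tarball naming are all assumptions made for illustration, and the sketch does nothing to keep the blobs consistent with a Data.fs backup taken at a different moment.

```python
import os
import tarfile
import time


def backup_blobstorage(blob_dir, target_dir):
    """Write a gzipped tarball of blob_dir into target_dir and return its path.

    Hypothetical helper: the recipe itself does not provide this.
    """
    os.makedirs(target_dir, exist_ok=True)
    # UTC timestamp in the same yyyy-mm-dd-hh-mm-ss shape bin/restore accepts.
    stamp = time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
    tarball = os.path.join(target_dir, "blobstorage-%s.tar.gz" % stamp)
    with tarfile.open(tarball, "w:gz") as archive:
        # arcname keeps the paths inside the archive relative.
        top_level = os.path.basename(os.path.normpath(blob_dir))
        archive.add(blob_dir, arcname=top_level)
    return tarball
```

Calling backup_blobstorage('var/blobstorage', 'var/blobbackups') from the buildout directory would then produce one timestamped tarball per run.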

Detailed Documentation

Example usage

Just to isolate some test differences, we run an empty buildout once:

>>> ignore = system(buildout)

The simplest way to use it is to add a part in buildout.cfg like this:

>>> write('buildout.cfg',
... """
... [buildout]
... parts = backup
...
... [backup]
... recipe = collective.recipe.backup
... """)

Running the buildout adds backup, snapshotbackup, restore and snapshotrestore scripts to the bin/ directory and, by default, creates the var/backups and var/snapshotbackups dirs:

>>> print system(buildout) # doctest:+ELLIPSIS
Installing backup.
backup: Created /sample-buildout/var/backups
backup: Created /sample-buildout/var/snapshotbackups
Generated script '/sample-buildout/bin/backup'.
Generated script '/sample-buildout/bin/snapshotbackup'.
Generated script '/sample-buildout/bin/restore'.
Generated script '/sample-buildout/bin/snapshotrestore'.
<BLANKLINE>
>>> ls('var')
d  backups
d  snapshotbackups
>>> ls('bin')
-  backup
-  buildout
-  restore
-  snapshotbackup
-  snapshotrestore

Backup

Calling bin/backup results in a normal repozo backup. We put in place a mock repozo script that prints the options it is passed (and make it executable). It is horridly unix-specific at the moment.

>>> import sys
>>> write('bin', 'repozo',
...       "#!%s\nimport sys\nprint ' '.join(sys.argv[1:])" % sys.executable)
>>> #write('bin', 'repozo', "#!/bin/sh\necho $*")
>>> dontcare = system('chmod u+x bin/repozo')

By default, backups are done in var/backups:

>>> print system('bin/backup')
--backup -f /sample-buildout/var/filestorage/Data.fs -r /sample-buildout/var/backups --gzip
INFO: Backing up database file: ...

Restore

You can restore the very latest backup with bin/restore:

>>> print system('bin/restore')
--recover -o /sample-buildout/var/filestorage/Data.fs -r /sample-buildout/var/backups
INFO: Restoring...

You can restore the very latest snapshotbackup with bin/snapshotrestore:

>>> print system('bin/snapshotrestore')
--recover -o /sample-buildout/var/filestorage/Data.fs -r /sample-buildout/var/snapshotbackups
INFO: Restoring...

You can also restore the backup as of a certain date. Just pass a date argument. According to repozo: specify UTC (not local) time. The format is yyyy-mm-dd[-hh[-mm[-ss]]].

>>> print system('bin/restore 1972-12-25')
--recover -o /sample-buildout/var/filestorage/Data.fs -r /sample-buildout/var/backups -D 1972-12-25
INFO: Date restriction: restoring state at 1972-12-25.
INFO: Restoring...
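Because the date has to be given in UTC, it is easy to be off by a few hours when starting from a local timestamp. The following sketch (Python 3) shows one way to produce the yyyy-mm-dd-hh-mm-ss form from a timezone-aware datetime. The helper name repozo_date is made up for this example and is not something the recipe or repozo provides.

```python
from datetime import datetime, timedelta, timezone


def repozo_date(moment):
    """Format a timezone-aware datetime as yyyy-mm-dd-hh-mm-ss in UTC."""
    if moment.tzinfo is None:
        # Refuse naive datetimes: we cannot know whether they are local or UTC.
        raise ValueError("pass a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%d-%H-%M-%S")


# 01:30 in a UTC+2 zone is still the previous day in UTC.
local = datetime(2010, 12, 10, 1, 30, tzinfo=timezone(timedelta(hours=2)))
print("bin/restore " + repozo_date(local))
# prints: bin/restore 2010-12-09-23-30-00
```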

Snapshots

For quickly grabbing the current state of a production database so you can download it to your development laptop, you want a full backup. But you shouldn’t interfere with the regular backup regime. Likewise, a quick backup just before updating the production server is a good idea. For that, the bin/snapshotbackup is great. It places a full backup in, by default, var/snapshotbackups.

>>> print system('bin/snapshotbackup')
--backup -f /sample-buildout/var/filestorage/Data.fs -r /sample-buildout/var/snapshotbackups -F --gzip
INFO: Making snapshot backup:...

Names of created scripts

A backup part will normally be called [backup], leading to a bin/backup and bin/snapshotbackup. Should you name your part something else, the script names will also be different as will the created var/ directories (since version 1.2):

>>> write('buildout.cfg',
... """
... [buildout]
... parts = plonebackup
...
... [plonebackup]
... recipe = collective.recipe.backup
... """)
>>> print system(buildout) # doctest:+ELLIPSIS
Uninstalling backup.
Installing plonebackup.
backup: Created /sample-buildout/var/plonebackups
backup: Created /sample-buildout/var/plonebackup-snapshots
Generated script '/sample-buildout/bin/plonebackup'.
Generated script '/sample-buildout/bin/plonebackup-snapshot'.
Generated script '/sample-buildout/bin/plonebackup-restore'.
Generated script '/sample-buildout/bin/plonebackup-snapshotrestore'.

Note that the restore, snapshot backup and snapshotrestore scripts, which have fixed names when the part is called [backup], are now prefixed with the part name:

>>> ls('bin')
-  buildout
-  plonebackup
-  plonebackup-restore
-  plonebackup-snapshot
-  plonebackup-snapshotrestore
-  repozo

In the var/ directory, the existing backups and snapshotbackups directories are still present. The recipe of course never removes that kind of directory! The different part name did result in two directories named after the part:

>>> ls('var')
d  backups
d  plonebackup-snapshots
d  plonebackups
d  snapshotbackups
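The naming rule shown by the two buildout runs above can be summarised in a few lines of Python. This is only a sketch of the observable behaviour, not the recipe's actual code. The function name script_names and the dictionary keys are assumptions chosen for the example.

```python
def script_names(part):
    """Return script and directory names for a part, as seen in the output above."""
    if part == "backup":
        # The historical, hardcoded names are kept for a part called [backup].
        return {
            "backup": "backup",
            "snapshot": "snapshotbackup",
            "restore": "restore",
            "snapshotrestore": "snapshotrestore",
            "location": "var/backups",
            "snapshotlocation": "var/snapshotbackups",
        }
    # Any other part name becomes a prefix for scripts and directories.
    return {
        "backup": part,
        "snapshot": part + "-snapshot",
        "restore": part + "-restore",
        "snapshotrestore": part + "-snapshotrestore",
        "location": "var/" + part + "s",
        "snapshotlocation": "var/" + part + "-snapshots",
    }
```

For example, script_names('plonebackup')['restore'] gives 'plonebackup-restore', matching the generated bin/plonebackup-restore script.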

For the rest of the tests we use the [backup] name again. And we clean up the var/plonebackups and var/plonebackup-snapshots dirs:

>>> write('buildout.cfg',
... """
... [buildout]
... parts = backup
...
... [backup]
... recipe = collective.recipe.backup
... """)
>>> dont_care = system(buildout) # doctest:+ELLIPSIS
>>> rmdir('var/plonebackups')
>>> rmdir('var/plonebackup-snapshots')

Supported options

The recipe supports the following options, none of which are needed by default. The most common one to change is location, as that allows you to place your backups in some system-wide directory like /var/zopebackups/instancename/.

location
Location where backups are stored. Defaults to var/backups inside the buildout directory.
keep
Number of full backups to keep. Defaults to 2, which means that the current and the previous full backup are kept. Older backups are removed, including their incremental backups. Set it to 0 to keep all backups.
datafs
In case the Data.fs isn’t in the default var/filestorage/Data.fs location, this option can override it.
full
By default, incremental backups are made. If this option is set to ‘true’, bin/backup will always make a full backup.
debug
In rare cases when you want to know exactly what’s going on, set debug to ‘true’ to get debug level logging of the recipe itself. Repozo is also run with --verbose if this option is enabled.
snapshotlocation
Location where snapshot backups are stored. Defaults to var/snapshotbackups inside the buildout directory.
gzip
Use repozo’s zipping functionality. ‘true’ by default. Set it to ‘false’ and repozo will not gzip its files. Note that gzipped databases are called *.fsz, not *.fs.gz. Changed in 0.8: the default used to be false, but it so totally makes sense to gzip your backups that we changed the default.
additional_filestorages
Advanced option, only needed when you have split for instance a catalog.fs out of the regular Data.fs. Use it to specify the extra filestorages. (See explanation further on).
enable_snapshotrestore
Having a snapshotrestore script is very useful in development environments, but can be harmful in a production buildout. The script restores the latest snapshot directly to your filestorage without asking any questions whatsoever. If you don’t want a snapshotrestore, set this option to false.

We’ll use all options:

>>> write('buildout.cfg',
... """
... [buildout]
... parts = backup
...
... [backup]
... recipe = collective.recipe.backup
... location = ${buildout:directory}/myproject
... keep = 2
... datafs = subfolder/myproject.fs
... full = true
... debug = true
... snapshotlocation = snap/my
... gzip = false
... enable_snapshotrestore = true
... """)
>>> print system(buildout) # doctest:+ELLIPSIS
Uninstalling backup.
Installing backup.
backup: Created /sample-buildout/myproject
backup: Created /sample-buildout/snap/my
Generated script '/sample-buildout/bin/backup'.
Generated script '/sample-buildout/bin/snapshotbackup'.
Generated script '/sample-buildout/bin/restore'.
Generated script '/sample-buildout/bin/snapshotrestore'.
<BLANKLINE>

Backups are now stored in the myproject folder inside the buildout directory, and the Data.fs location is handled correctly despite not being an absolute path:

>>> print system('bin/backup')
--backup -f /sample-buildout/subfolder/myproject.fs -r /sample-buildout/myproject -F --verbose
INFO: Backing up database file: ...

The same is true for the snapshot backup.

>>> print system('bin/snapshotbackup')
--backup -f /sample-buildout/subfolder/myproject.fs -r /sample-buildout/snap/my -F --verbose
INFO: Making snapshot backup:...

Not tested in this file, as they would create directories in your root or home directory, are absolute paths (starting with a ‘/’), paths in your home directory and relative (../) paths. They do work, of course. Both ~ and $BACKUP-style environment variables are expanded.
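To make the path handling concrete, here is a small Python 3 sketch of how a configured location could be resolved: expand environment variables and ~, then anchor relative paths at the buildout directory. The helper name resolve_location is hypothetical and the recipe's real implementation may differ in details.

```python
import os


def resolve_location(value, buildout_dir):
    """Expand $VARS and ~, then anchor relative paths at the buildout directory."""
    expanded = os.path.expanduser(os.path.expandvars(value))
    # os.path.join discards buildout_dir when `expanded` is already absolute.
    return os.path.normpath(os.path.join(buildout_dir, expanded))
```

With this sketch, resolve_location('snap/my', '/sample-buildout') yields /sample-buildout/snap/my, while an absolute value such as /var/zopebackups/instancename is returned unchanged.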

Cron job integration

bin/backup is of course ideal to put in your cronjob instead of a whole bin/repozo .... line. But you don’t want the “INFO” level logging that you get, as you’ll get that in your mailbox. In your cronjob, just add -q or --quiet and bin/backup will shut up unless there’s a problem.

>>> print system('bin/backup -q')
--backup -f /sample-buildout/subfolder/myproject.fs -r /sample-buildout/myproject -F --verbose
>>> print system('bin/backup --quiet')
--backup -f /sample-buildout/subfolder/myproject.fs -r /sample-buildout/myproject -F --verbose

In our case the --backup ... lines above are just the mock repozo script that still prints something. So it proves that the command is executed, but it won’t end up in the output.
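The effect of -q and --quiet can be pictured as a change of log level: INFO messages are dropped and only warnings and errors remain. The sketch below (Python 3) illustrates that idea. The function name log_level is made up, and the generated scripts may parse their arguments differently.

```python
import argparse
import logging


def log_level(argv):
    """Return the logging level implied by the command line arguments."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-q", "--quiet", action="store_true")
    # parse_known_args leaves other arguments (such as a restore date) alone.
    options, _remaining = parser.parse_known_args(argv)
    return logging.WARNING if options.quiet else logging.INFO
```

A cron-friendly script would pass the result to logging.basicConfig(level=...), so that a successful run prints nothing and cron sends no mail.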

Speaking of cron jobs: take a look at z3c.recipe.usercrontab if you want to handle cronjobs from within your buildout. For example:

[backupcronjob]
recipe = z3c.recipe.usercrontab
times = 0 12 * * *
command = ${buildout:directory}/bin/backup

Advanced usage: multiple Data.fs files

Sometimes, a Data.fs is split into several files. The most common reason is to have a regular Data.fs and a catalog.fs which contains the portal_catalog. This is supported with the additional_filestorages option:

>>> write('buildout.cfg',
... """
... [buildout]
... parts = backup
...
... [backup]
... recipe = collective.recipe.backup
... additional_filestorages =
...     catalog
...     another
... """)

The additional backups have to be stored separately from the Data.fs backup. That’s done by appending the file’s name and creating extra backup directories named that way:

>>> print system(buildout) # doctest:+ELLIPSIS
Uninstalling backup.
Installing backup.
backup: Created /sample-buildout/var/backups_catalog
backup: Created /sample-buildout/var/snapshotbackups_catalog
backup: Created /sample-buildout/var/backups_another
backup: Created /sample-buildout/var/snapshotbackups_another
Generated script '/sample-buildout/bin/backup'.
Generated script '/sample-buildout/bin/snapshotbackup'.
Generated script '/sample-buildout/bin/restore'.
Generated script '/sample-buildout/bin/snapshotrestore'.
<BLANKLINE>
>>> ls('var')
d  backups
d  backups_another
d  backups_catalog
d  snapshotbackups
d  snapshotbackups_another
d  snapshotbackups_catalog

The various backups are done one after the other. They cannot be done at the same time with repozo. So they are not completely in sync. The “other” databases are backed up first as a small difference in the catalog is just mildly irritating, but the other way around users can get real errors:

>>> print system('bin/backup')
--backup -f /sample-buildout/var/filestorage/catalog.fs -r /sample-buildout/var/backups_catalog --gzip
--backup -f /sample-buildout/var/filestorage/another.fs -r /sample-buildout/var/backups_another --gzip
--backup -f /sample-buildout/var/filestorage/Data.fs -r /sample-buildout/var/backups --gzip
INFO: Backing up database file: ...
INFO: Backing up database file: ...
INFO: Backing up database file: ...
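The order and shape of the repozo calls above can be reproduced with a short Python 3 sketch. It only mirrors the command lines printed by the mock repozo in this document. The function name backup_arguments and its parameters are assumptions for illustration, and it covers only the default var/ layout.

```python
import os


def backup_arguments(buildout_dir, additional=(), full=False, gzip=True,
                     verbose=False):
    """Return one repozo argument list per filestorage, additional ones first."""
    # Pairs of (filestorage file name, suffix of the backup directory).
    storages = [(name + ".fs", "_" + name) for name in additional]
    storages.append(("Data.fs", ""))  # the main database always goes last
    commands = []
    for filename, suffix in storages:
        args = [
            "--backup",
            "-f", os.path.join(buildout_dir, "var", "filestorage", filename),
            "-r", os.path.join(buildout_dir, "var", "backups" + suffix),
        ]
        if full:
            args.append("-F")
        if gzip:
            args.append("--gzip")
        if verbose:
            args.append("--verbose")
        commands.append(args)
    return commands
```

Joining each list with spaces for additional=['catalog', 'another'] gives the three --backup lines shown above, in the same order.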

Same with snapshot backups:

>>> print system('bin/snapshotbackup')
--backup -f /sample-buildout/var/filestorage/catalog.fs -r /sample-buildout/var/snapshotbackups_catalog -F --gzip
--backup -f /sample-buildout/var/filestorage/another.fs -r /sample-buildout/var/snapshotbackups_another -F --gzip
--backup -f /sample-buildout/var/filestorage/Data.fs -r /sample-buildout/var/snapshotbackups -F --gzip
INFO: Making snapshot backup: ...
INFO: Making snapshot backup: ...
INFO: Making snapshot backup: ...

And a restore restores all three backups:

>>> print system('bin/restore')
--recover -o /sample-buildout/var/filestorage/catalog.fs -r /sample-buildout/var/backups_catalog
--recover -o /sample-buildout/var/filestorage/another.fs -r /sample-buildout/var/backups_another
--recover -o /sample-buildout/var/filestorage/Data.fs -r /sample-buildout/var/backups
INFO: Restoring...
INFO: Restoring...
INFO: Restoring...

We fake three old backups in all the (snapshot)backup directories to test if the ‘keep’ parameter is working correctly.

>>> dirs = ('var/backups', 'var/snapshotbackups')
>>> for tail in ('', '_catalog', '_another'):
...     for dir in dirs:
...         dir = dir + tail
...         for i in range(3):
...             write(dir, '%d.fs' % i, 'sample fs')
>>> print system('bin/backup')
--backup -f /sample-buildout/var/filestorage/catalog.fs -r /sample-buildout/var/backups_catalog --gzip
--backup -f /sample-buildout/var/filestorage/another.fs -r /sample-buildout/var/backups_another --gzip
--backup -f /sample-buildout/var/filestorage/Data.fs -r /sample-buildout/var/backups --gzip
INFO: Backing up database file:...var/backups_catalog...
INFO: Removed old backups, the latest 2 full backups have been kept.
INFO: Backing up database file:...var/backups_another...
INFO: Removed old backups, the latest 2 full backups have been kept.
INFO: Backing up database file:...var/backups...
INFO: Removed old backups, the latest 2 full backups have been kept.
<BLANKLINE>

Now unfortunately if you would do “ls(‘var/backups’)” here in the test you would still see all three files; apparently buildout and the system do not interact correctly here, as in real life the superfluous backups are really gone. So we will have to trust the above note that old backups have been removed.

Same for the snapshot backups:

>>> print system('bin/snapshotbackup')
--backup -f /sample-buildout/var/filestorage/catalog.fs -r /sample-buildout/var/snapshotbackups_catalog -F --gzip
--backup -f /sample-buildout/var/filestorage/another.fs -r /sample-buildout/var/snapshotbackups_another -F --gzip
--backup -f /sample-buildout/var/filestorage/Data.fs -r /sample-buildout/var/snapshotbackups -F --gzip
INFO: Making snapshot backup:...var/snapshotbackups_catalog...
INFO: Removed old backups, the latest 2 full backups have been kept.
INFO: Making snapshot backup:...var/snapshotbackups_another...
INFO: Removed old backups, the latest 2 full backups have been kept.
INFO: Making snapshot backup:...var/snapshotbackups...
INFO: Removed old backups, the latest 2 full backups have been kept.
<BLANKLINE>
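The retention rule behind the ‘keep’ option (keep the latest N full backups, drop everything older including incrementals) can be sketched as follows in Python 3. This is an illustration under stated assumptions, not the recipe's cleanup code: the function name prune_backups is hypothetical, and the sketch assumes file names that sort chronologically, which may not be how the recipe itself orders backups.

```python
import os

# Full backups end in .fs, or .fsz when gzipped; incrementals do not.
FULL_SUFFIXES = (".fs", ".fsz")


def prune_backups(directory, keep=2):
    """Delete files older than the keep-th newest full backup.

    Returns the list of removed file names.
    """
    if keep == 0:
        return []  # keep = 0 means: never remove anything
    names = sorted(os.listdir(directory))
    fulls = [name for name in names if name.endswith(FULL_SUFFIXES)]
    if len(fulls) <= keep:
        return []
    oldest_kept = fulls[-keep]
    # Assumption: names sort chronologically, so string comparison is enough.
    removed = [name for name in names if name < oldest_kept]
    for name in removed:
        os.remove(os.path.join(directory, name))
    return removed
```

With three full backups and keep=2, the oldest full backup and any incremental files written before the second-newest full backup are removed.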

Test disabling the snapshotrestore script. We generate a new buildout with enable_snapshotrestore set to false. The script should not be generated now (and buildout will actually remove the previously generated script).

>>> write('buildout.cfg',
... """
... [buildout]
... parts = backup
...
... [backup]
... recipe = collective.recipe.backup
... enable_snapshotrestore = false
... """)
>>> print system(buildout) # doctest:+ELLIPSIS
Uninstalling backup.
Installing backup.
Generated script '/sample-buildout/bin/backup'.
Generated script '/sample-buildout/bin/snapshotbackup'.
Generated script '/sample-buildout/bin/restore'.
<BLANKLINE>
>>> ls('bin')
-  backup
-  buildout
-  repozo
-  restore
-  snapshotbackup

Contributors

collective.recipe.backup is basically a port of ye olde instancemanager’s backup functionality. That backup functionality was coded mostly by Reinout van Rees and Maurits van Rees, both from Zest software.

Creating the buildout recipe was done by Reinout with some fixes by Maurits.

The snapshotrestore script was added by Nejc Zupan (niteoweb).

Change history

1.7 (2010-12-10)

  • Fix generated repozo commands to work also when recipe is configured to have a non Data.fs main db plus additional filestorages. e.g.: datafs= var/filestorage/main.fs additional = catalog [hplocher]

1.6 (2010-09-21)

  • Added the option enable_snapshotrestore so that the creation of the script can be removed. Backwards compatible, if you don’t specify it the script will still be created. Rationale: you may not want this script in a production buildout where mistakenly using snapshotrestore instead of snapshotbackup could hurt. [fredvd]

1.5 (2010-09-08)

  • Fix: when running buildout with a config in a separate directory (like bin/buildout -c conf/prod.cfg) the default backup directories are no longer created inside that separate directory. If you previously manually specified one of the location, snapshotlocation, or datafs parameters to work around this, you can probably remove those lines. So: slightly saner defaults. [maurits]

1.4 (2010-08-06)

  • Added documentation about how to get the required bin/repozo script in your buildout if for some reason you do not have it yet (like on Plone 4 when you do not have a zeo setup). Thanks to Vincent Fretin for the extra buildout lines. [maurits]

1.3 (2009-12-08)

  • Added snapshotrestore script. [Nejc Zupan]

1.2 (2009-10-26)

  • The part name is now reflected in the created scripts and var/ directories. Originally bin/backup, bin/snapshotbackup, bin/restore and var/backups plus var/snapshotbackups were hardcoded. Those are still there when you name your part [backup]. With a part named [NAME], you get bin/NAME, bin/NAME-snapshot, bin/NAME-restore and var/NAMEs plus var/NAME-snapshots. Request by aclark for plone.org. [reinout]

1.1 (2009-08-21)

  • Run the cleanup script (removing too old backups that we no longer want to keep) for additional file storages as well. Fixes https://bugs.launchpad.net/collective.buildout/+bug/408224 [maurits]
  • Moved everything into a src/ subdirectory to ease testing on buildbot (which would grab all eggs in the eggs/ dir that buildbot’s mechanism creates). [reinout]

1.0 (2009-02-06)

  • Quote all paths and arguments so that it works on paths that contain spaces (specially on Windows). [sidnei]

0.9 (2008-12-05)

  • Windows path compatibility fix. [Juan A. Diaz]

0.8 (2008-09-23)

  • Changed the default for gzipping to True. Adding gzip = true to all our server deployment configs gets tired pretty quickly, so doing it by default is the best default. Stuff like this needs to be changed before a 1.0 release :-) [reinout]
  • Backup of additional databases (if you have configured them) now takes place before the backup of the main database (same with restore). [reinout]

0.7 (2008-09-19)

  • Added $BACKUP-style environment variable substitution in addition to the tilde expansion offered by 0.6. [reinout, idea by Fred van Dijk]

0.6 (2008-09-19)

  • Fixed the test setup so both bin/test and python setup.py test work. [reinout+maurits]
  • Added support for ~ in path names. And fixed a bug at the same time that would occur if you call the backup script from a different location than your buildout directory in combination with a non-absolute backup location. [reinout]

0.5 (2008-09-18)

  • Added support for the additional_filestorages option, needed for, for instance, a split-out catalog.fs. [reinout]
  • Test setup fixes. [reinout+maurits]

0.4 (2008-08-19)

  • Allowed the user to make the script more quiet (say in a cronjob) by using ‘bin/backup -q’ (or --quiet). [maurits]
  • Refactored initialization template so it is easier to change. [maurits]

0.3.1 (2008-07-04)

  • Added ‘gzip’ option, including changes to the cleanup functionality that treats .fsz also as a full backup like .fs. [reinout]
  • Fixed typo: repoze is now repozo everywhere… [reinout]

0.2 (2008-07-03)

  • Extra tests and documentation change for ‘keep’: the default is to keep 2 backups instead of all backups. [reinout]
  • If debug=true, then repozo is also run in --verbose mode. [reinout]

0.1 (2008-07-03)

  • Added bin/restore. [reinout]
  • Added snapshot backups. [reinout]
  • Enabled cleaning up of older backups. [reinout]
  • First working version that runs repozo and that creates a backup dir if needed. [reinout]
  • Started project based on zopeskel template. [reinout]
