Storage-agnostic incremental backup tools: building blocks for creating incremental backup utilities.
- Uses pyrsync (a pure Python rsync implementation with SHA256 hashes) to compute patches/diffs.
- Relies on dirtools (for .gitignore-like exclusion and the helpers it provides).
This project was initially designed as a foundation for the bakthat incremental backups plugin, so features such as signatures, encryption, storage, and management of full/incremental backups are left to you.
DirIndex represents the state of a directory. It contains:
- the list of files and subdirectories
- and, for each file, the block checksums (from pyrsync)
This index should be stored with the to_file method each time a backup (full or incremental) is performed; it can be retrieved later with the from_file classmethod. The next time you perform an incremental backup, use this index to create a DiffIndex.
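To make the idea of a directory index concrete, here is a minimal standard-library sketch. It is not the DirIndex implementation: the function names (`block_checksums`, `build_index`), the dictionary layout, and the 4096-byte block size are assumptions made for illustration, and it stores only a SHA-256 digest per block rather than whatever pyrsync actually records.

```python
import hashlib
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size, not necessarily what pyrsync uses


def block_checksums(path, block_size=BLOCK_SIZE):
    """Return the SHA-256 hex digest of each fixed-size block of a file."""
    checksums = []
    with open(path, "rb") as fh:
        while True:
            block = fh.read(block_size)
            if not block:
                break
            checksums.append(hashlib.sha256(block).hexdigest())
    return checksums


def build_index(root):
    """Record files, subdirectories and per-file block checksums under root."""
    index = {"files": [], "subdirs": [], "checksums": {}}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in dirnames:
            full = os.path.join(dirpath, name)
            index["subdirs"].append(os.path.relpath(full, root))
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            index["files"].append(rel)
            index["checksums"][rel] = block_checksums(full)
    return index


# Small demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    with open(os.path.join(tmp, "a.txt"), "wb") as fh:
        fh.write(b"x" * 5000)  # spans two 4096-byte blocks
    with open(os.path.join(tmp, "sub", "b.txt"), "wb") as fh:
        fh.write(b"hello")
    index = build_index(tmp)
```

Keeping checksums per block rather than per file is what lets a later run detect which parts of a file changed instead of only that it changed.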
DiffIndex stores the changes between two DirIndex states. It contains:
- the list of created, updated, and deleted files.
- the list of deleted subdirectories.
- a list of the temporary files that contain the deltas (provided by pyrsync)
- the latest DirIndex data
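The comparison described above can be sketched with plain set operations. This is an illustration only, not the DiffIndex code: `compare_indexes` and the dictionary keys are hypothetical, and the sketch only classifies paths without computing any pyrsync delta.

```python
def compare_indexes(new, old):
    """Classify paths as created, updated or deleted between two index dicts."""
    new_files, old_files = set(new["files"]), set(old["files"])
    common = new_files & old_files
    return {
        "created": sorted(new_files - old_files),
        "deleted": sorted(old_files - new_files),
        # A file counts as updated when its block checksums differ.
        "updated": sorted(
            p for p in common if new["checksums"][p] != old["checksums"][p]
        ),
        "deleted_dirs": sorted(set(old["subdirs"]) - set(new["subdirs"])),
    }


# Hypothetical index data; the checksum strings are placeholders.
old = {
    "files": ["a.txt", "b.txt"],
    "subdirs": ["tmp"],
    "checksums": {"a.txt": ["h1"], "b.txt": ["h2"]},
}
new = {
    "files": ["a.txt", "c.txt"],
    "subdirs": [],
    "checksums": {"a.txt": ["h1-changed"], "c.txt": ["h3"]},
}
diff = compare_indexes(new, old)
```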
DiffData handles the archive creation; it needs a previously generated DiffIndex.
The archive (tar.gz) contains two directories:
- created, where the new files are stored.
- updated, which contains the pyrsync deltas.
Everything is stored at root, with the hash of the path as filename.
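One way to read this layout is that each directory is flat and every entry is named after a hash of its original path. The sketch below builds such an archive with the standard `tarfile` module. It is not the DiffData implementation: `hashed_name` and `write_archive` are hypothetical helpers, SHA-256 is an assumed choice of hash, and the payloads are plain bytes standing in for real files and deltas.

```python
import hashlib
import io
import tarfile


def hashed_name(path):
    """Flatten a relative path into a single filename (SHA-256 is assumed)."""
    return hashlib.sha256(path.encode("utf-8")).hexdigest()


def write_archive(fileobj, created, updated):
    """Write a tar.gz holding flat created/ and updated/ directories.

    Both arguments map a relative path to the bytes to store for it.
    """
    with tarfile.open(fileobj=fileobj, mode="w:gz") as tar:
        for folder, entries in (("created", created), ("updated", updated)):
            for path, payload in sorted(entries.items()):
                info = tarfile.TarInfo(name=f"{folder}/{hashed_name(path)}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))


# Build an archive in memory, then list what it contains.
buf = io.BytesIO()
write_archive(buf, {"docs/new.txt": b"hello"}, {"docs/old.txt": b"delta-bytes"})
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:gz") as tar:
    names = tar.getnames()
```

Because the names are hashes, the archive alone does not reveal the original paths; the diff index is needed to map entries back to files.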
To apply a diff to a directory, you need two things: the archive path (generated by DiffData) and the diff index data (generated by DiffIndex).
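As a rough picture of what applying a diff involves, the sketch below replays deletions and creations on a directory. It is not the library's apply_diff: `apply_changes`, the shape of the `diff` dictionary, and the `created_payloads` mapping are all assumptions, and updated files are deliberately left out because patching them depends on the pyrsync delta format.

```python
import os
import shutil
import tempfile


def apply_changes(root, diff, created_payloads):
    """Apply deletions and creations from a diff dict to the directory root."""
    for rel in diff["deleted"]:
        target = os.path.join(root, rel)
        if os.path.exists(target):
            os.remove(target)
    for rel in diff["deleted_dirs"]:
        # Missing directories are ignored rather than treated as errors.
        shutil.rmtree(os.path.join(root, rel), ignore_errors=True)
    for rel in diff["created"]:
        target = os.path.join(root, rel)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as fh:
            fh.write(created_payloads[rel])


# Demonstration on a throwaway directory with hypothetical diff data.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "gone"))
    with open(os.path.join(tmp, "old.txt"), "wb") as fh:
        fh.write(b"stale")
    new_rel = os.path.join("new", "a.txt")
    diff = {"deleted": ["old.txt"], "deleted_dirs": ["gone"], "created": [new_rel]}
    apply_changes(tmp, diff, {new_rel: b"hi"})
    remaining = sorted(os.listdir(tmp))
    with open(os.path.join(tmp, new_rel), "rb") as fh:
        content = fh.read()
```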
$ pip install incremental-backups-tools
from incremental_backups_tools import DirIndex, DiffIndex, DiffData, apply_diff
from dirtools import Dir

d = Dir('/home/thomas/mydir')

# Store the index
DirIndex(d).to_file('/home/thomas/mydir.index')

old_dir_index_data = DirIndex.from_file('/home/thomas/mydir.index')

# Make some changes in the directory

dir_index_data = DirIndex(d).data()
diff_index = DiffIndex(dir_index_data, old_dir_index_data).compute()
diff_archive = DiffData(diff_index).create_archive('/home/thomas/mydir.diff.tgz')

# Reapply these changes from the initial directory
apply_diff('/home/thomas/mydir', diff_index, diff_archive)
Copyright (c) 2013 Thomas Sileo
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.