HardLink/Deduplication Backups with Python

PyHardLinkBackup

Hardlink/Deduplication Backups with Python.

  • Backups should be saved as normal files in filesystem:

    • accessible without any extra software or extra meta files

    • non-proprietary format

  • Create backups with versioning

    • every backup run creates a complete filesystem snapshot tree

    • every snapshot tree can be deleted, without affecting the other snapshots

  • Deduplication with hardlinks:

    • Store only changed files; link all others via hardlinks

    • find duplicate files everywhere (even if files were renamed or moved)

  • usable under Windows and Linux
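
The deduplication idea from the list above can be illustrated with a minimal sketch. This is not the PyHardLinkBackup implementation: the function names, the `seen` dictionary and the choice of SHA-512 are assumptions made for the example. A file whose content is already in the backup is hardlinked, everything else is copied.

```python
import hashlib
import os
import shutil


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-512 hex digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(src, dst, seen):
    """Store src at dst; hardlink if identical content is already stored.

    `seen` (hypothetical helper structure) maps a content digest to the
    path of a file that is already part of the backup.
    Returns True if a hardlink was created, False if the file was copied.
    """
    digest = file_digest(src)
    parent = os.path.dirname(dst)
    if parent:
        os.makedirs(parent, exist_ok=True)
    existing = seen.get(digest)
    if existing is not None and os.path.exists(existing):
        os.link(existing, dst)  # same content: no additional data is stored
        return True
    shutil.copy2(src, dst)  # new content: store a real copy
    seen[digest] = dst
    return False
```

Because the lookup key is the content digest and not the file name, a renamed or moved file with unchanged content is still found and hardlinked.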

current state:

  • python 3 only (3.4, 3.5, TODO: 3.3)

  • Beta state

Please try, fork and contribute! ;)

Build Status on travis-ci.org

travis-ci.org/jedie/PyHardLinkBackup

Build Status on appveyor.com

ci.appveyor.com/project/jedie/pyhardlinkbackup

Coverage Status on coveralls.io

coveralls.io/r/jedie/PyHardLinkBackup

Requirements Status on requires.io

requires.io/github/jedie/PyHardLinkBackup/requirements/

Example

$ phlb backup ~/my/important/documents
...start backup, some time later...
$ phlb backup ~/my/important/documents
...

This will create deduplication backups like this:

~/PyHardLinkBackups
  └── documents
      ├── 2016-01-07-085247
      │   ├── spreadsheet.ods
      │   ├── brief.odt
      │   └── important_files.ext
      └── 2016-01-07-102310
          ├── spreadsheet.ods
          ├── brief.odt
          └── important_files.ext
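
To check that an unchanged file in two snapshot trees really shares its data, you can compare the two paths from Python. This is a small sketch using only the standard library; the helper names are made up for this example.

```python
import os


def is_same_file(path_a, path_b):
    """True if both paths refer to the same file on disk.

    Note: os.path.samefile follows symlinks, so a symlink to the file
    also counts as "same".
    """
    return os.path.samefile(path_a, path_b)


def link_count(path):
    """Number of directory entries (hardlinks) that point to this file."""
    return os.stat(path).st_nlink
```

For the tree above, `is_same_file()` called with the two `spreadsheet.ods` paths should return True if the file did not change between the two runs. Deleting one snapshot tree only lowers the link count; the data stays on disk as long as one link remains.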

Try out:

on Windows:

  1. install Python 3: https://www.python.org/downloads/

  2. Download the file boot_pyhardlinkbackup.cmd

  3. run boot_pyhardlinkbackup.cmd

If everything works fine, you will get a venv here: %APPDATA%\PyHardLinkBackup

After the venv is created, call these scripts to finalize the setup:

  1. %APPDATA%\PyHardLinkBackup\phlb_edit_config.cmd - Create a config .ini file

  2. %APPDATA%\PyHardLinkBackup\phlb_migrate_database.cmd - Create Database tables

To upgrade PyHardLinkBackup, call:

  1. %APPDATA%\PyHardLinkBackup\phlb_upgrade_PyHardLinkBackup.cmd

To start the django webserver, call:

  1. %APPDATA%\PyHardLinkBackup\phlb_run_django_webserver.cmd

on Linux:

  1. Download the file boot_pyhardlinkbackup.sh

  2. call boot_pyhardlinkbackup.sh

Note: If you do not use Python 3.5+, you must install ‘scandir’, e.g.:

~ $ cd PyHardLinkBackup
~/PyHardLinkBackup $ source bin/activate
(PyHardLinkBackup) ~/PyHardLinkBackup $ pip install scandir

(You need the python3-dev package installed)
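
The reason for this note: `os.scandir` is part of the standard library only since Python 3.5, and older versions need the `scandir` backport from PyPI. A common way to support both is a fallback import, sketched here (the `count_files` helper is only an illustration and not part of phlb):

```python
try:
    from os import scandir  # standard library since Python 3.5
except ImportError:
    from scandir import scandir  # backport package, installed via pip


def count_files(path):
    """Count regular files directly inside `path` (no recursion)."""
    return sum(1 for entry in scandir(path) if entry.is_file())
```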

If everything works fine, you will get a venv here: ~/PyHardLinkBackup

After the venv is created, call these scripts to finalize the setup:

  • ~/PyHardLinkBackup/phlb_edit_config.sh - Create a config .ini file

  • ~/PyHardLinkBackup/phlb_migrate_database.sh - Create Database tables

To upgrade PyHardLinkBackup, call:

  • ~/PyHardLinkBackup/phlb_upgrade_PyHardLinkBackup.sh

To start the django webserver, call:

  • ~/PyHardLinkBackup/phlb_run_django_webserver.sh

start backup run

To start a backup run, use this helper script:

  • Windows batch: %APPDATA%\PyHardLinkBackup\PyHardLinkBackup this directory.cmd

  • Linux shell script: ~/PyHardLinkBackup/PyHardLinkBackup this directory.sh

Copy this file to a location that should be backed up and just call it to run a backup.

configuration

phlb uses a configuration file named: PyHardLinkBackup.ini

Search order is:

  1. current directory

  2. user directory

To open the user directory .ini file in an editor, just call:

(PyHardLinkBackup) ~/PyHardLinkBackup $ phlb config

The defaults are stored here: /phlb/config_defaults.ini
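
The search order described in this section (current directory first, then user directory) can be sketched like this. The function `find_config` and its parameters are hypothetical and only illustrate the lookup; phlb's own code may work differently.

```python
import os

INI_NAME = "PyHardLinkBackup.ini"


def find_config(cwd=None, home=None):
    """Return the path of the first .ini file found, or None.

    Search order: 1. current directory, 2. user directory.
    """
    cwd = cwd or os.getcwd()
    home = home or os.path.expanduser("~")
    for directory in (cwd, home):
        candidate = os.path.join(directory, INI_NAME)
        if os.path.isfile(candidate):
            return candidate
    return None
```

With this order, a PyHardLinkBackup.ini placed next to the data takes precedence over the one in the user directory.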

run unittests

$ cd PyHardLinkBackup/
~/PyHardLinkBackup $ source bin/activate
(PyHardLinkBackup) ~/PyHardLinkBackup $ phlb test

some notes

What is ‘phlb’ ?!?

The phlb executable is similar to django's manage.py, but it always uses the PyHardLinkBackup settings.

Why in hell do you use django?!?

  • Well, just because of the great database ORM and the Admin Site ;)

How to go into the django admin?

$ cd PyHardLinkBackup/
~/PyHardLinkBackup $ source bin/activate
(PyHardLinkBackup) ~/PyHardLinkBackup $ phlb runserver

Then just open ‘localhost’ in your browser (Note: --noreload is needed under Windows with venv!)

TODO

  • copy file meta data like owner, mode etc.

  • handle symlinks

  • Quick Backup: Don’t check the content, just compare file size + modification date

  • create boot_pyhardlinkbackup.sh script for linux

  • write docs

  • write more tests

  • activate CI

  • Far future: Add a GUI
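
The "Quick Backup" item from the list above is not implemented yet. As a rough sketch of the idea only (the function name is an assumption), a file could be treated as unchanged when size and modification time match the previously stored copy, without reading the content:

```python
import os


def probably_unchanged(src, previous):
    """Heuristic: same size and same modification time.

    This does not read the file content, so it is fast but can miss a
    change that keeps both size and mtime. The timestamp precision also
    depends on the filesystem.
    """
    a = os.stat(src)
    b = os.stat(previous)
    return a.st_size == b.st_size and a.st_mtime_ns == b.st_mtime_ns
```

Such a check only makes sense if the backup keeps the original modification time, e.g. by copying with `shutil.copy2`.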

