HardLink/Deduplication Backups with Python

PyHardLinkBackup

Hardlink/Deduplication Backups with Python.

  • Backups should be saved as normal files in filesystem:

    • accessible without any extra software or extra meta files

    • non-proprietary format

  • Create backups with versioning

    • every backup run creates a complete filesystem snapshot tree

    • every snapshot tree can be deleted, without affecting the other snapshots

  • Deduplication with hardlinks:

    • Store only changed files; all others are hardlinked

    • find duplicate files everywhere (even if files were renamed or moved)

  • usable under Windows and Linux
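
To illustrate the hardlink deduplication idea described above, here is a minimal sketch. It is not PyHardLinkBackup's actual code: the function names, the choice of SHA-512 and the in-memory `known_files` dict are assumptions made for this example (the real tool keeps its bookkeeping in a database).

```python
import hashlib
import os
import shutil
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-512 hex digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_tree(src_dir, dst_dir, known_files):
    """Copy src_dir to dst_dir, hardlinking files whose content is already known.

    known_files maps content digest -> path of an existing backup copy.
    It is a plain dict here (hypothetical); a real tool would persist it.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    for src in sorted(src_dir.rglob("*")):
        if not src.is_file():
            continue
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        digest = file_digest(src)
        if digest in known_files:
            # Same content already stored: add a second name for it.
            os.link(known_files[digest], dst)
        else:
            # New content: store a real copy and remember it.
            shutil.copy2(src, dst)
            known_files[digest] = dst
```

Because the lookup key is the content digest and not the path, a renamed or moved file is still recognised as a duplicate. Hardlinks only work within one filesystem, so all snapshots have to live on the same volume.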

current state:

  • Python 3 only

  • Alpha state

Please try it, fork it and contribute! ;)

Example

$ phlb backup ~/my/important/documents
...start backup, some time later...
$ phlb backup ~/my/important/documents
...

This creates deduplicated backups like this:

~/PyHardLinkBackups
  └── documents
      ├── 2016-01-07-085247
      │   ├── spreadsheet.ods
      │   ├── brief.odt
      │   └── important_files.ext
      └── 2016-01-07-102310
          ├── spreadsheet.ods
          ├── brief.odt
          └── important_files.ext
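
If you want to verify that two snapshot trees like the ones above really share storage, you can compare the files by inode. The following is a small sketch for that; `shared_files` is a hypothetical helper name, not part of phlb.

```python
import os
from pathlib import Path


def shared_files(snapshot_a, snapshot_b):
    """Return relative paths that are the same hardlinked file in both snapshots."""
    snapshot_a, snapshot_b = Path(snapshot_a), Path(snapshot_b)
    shared = []
    for file_a in sorted(snapshot_a.rglob("*")):
        if not file_a.is_file():
            continue
        rel = file_a.relative_to(snapshot_a)
        file_b = snapshot_b / rel
        # samefile() is true when both names point to the same inode.
        if file_b.is_file() and os.path.samefile(file_a, file_b):
            shared.append(rel.as_posix())
    return shared
```

Deleting one snapshot directory only removes its names for the shared files. The data stays on disk as long as another snapshot still links to it, which is why every snapshot can be deleted independently.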

Try out:

on Windows:

  1. install Python 3: https://www.python.org/downloads/

  2. Download the file boot_pyhardlinkbackup.cmd

  3. run boot_pyhardlinkbackup.cmd

This creates a virtual env in this path: %APPDATA%\PyHardLinkBackup

Then call these batch files:

  1. %APPDATA%\PyHardLinkBackup\pyhlb config.cmd

  2. %APPDATA%\PyHardLinkBackup\pyhlb migrate.cmd

There is also a helper batch file:

  • %APPDATA%\PyHardLinkBackup\PyHardLinkBackup this directory.cmd

Copy this file into a directory that should be backed up and just call it to run a backup.

on Linux, follow these steps:

1. Create a virtual env and install:

~$ virtualenv -p python3 PyHardLinkBackupEnv
~$ cd PyHardLinkBackupEnv/
~/PyHardLinkBackupEnv $ source bin/activate
(PyHardLinkBackupEnv) ~/PyHardLinkBackupEnv $ pip install -U pip
(PyHardLinkBackupEnv) ~/PyHardLinkBackupEnv $ pip install -e git+https://github.com/jedie/PyHardLinkBackup.git#egg=PyHardLinkBackup

Note: If you do not use Python 3.5+, the ‘scandir’ package will be installed, so you need the python3-dev package…

2. setup

Create a .ini config file and edit it:

(PyHardLinkBackupEnv) ~/PyHardLinkBackupEnv $ phlb config

Initialize the database:

(PyHardLinkBackupEnv) ~/PyHardLinkBackupEnv $ phlb migrate

3. start a backup run

~$ ./PyHardLinkBackupEnv/bin/phlb backup ~/Photo

or:

~$ source ./PyHardLinkBackupEnv/bin/activate
(PyHardLinkBackupEnv) ~$ phlb backup ~/documents

configuration

phlb uses a configuration file named: PyHardLinkBackup.ini

Search order is:

  1. current directory

  2. user directory

You can open the .ini file from the user directory in an editor with:

(PyHardLinkBackupEnv) ~/PyHardLinkBackupEnv $ phlb config
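
The search order can be pictured with the following sketch. It is only an illustration, not phlb's real lookup code: `find_config` is a hypothetical name, and it assumes that "user directory" means the home directory.

```python
from pathlib import Path

CONFIG_NAME = "PyHardLinkBackup.ini"


def find_config(search_dirs=None, name=CONFIG_NAME):
    """Return the first existing config file in search order, or None."""
    if search_dirs is None:
        # Assumption: "user directory" is the user's home directory.
        search_dirs = [Path.cwd(), Path.home()]
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None
```

With this order, a PyHardLinkBackup.ini in the current directory wins over the one in the user directory, which is handy for per-project settings.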

run unittests

~$ cd PyHardLinkBackupEnv/
~/PyHardLinkBackupEnv $ source bin/activate
(PyHardLinkBackupEnv) ~/PyHardLinkBackupEnv $ phlb test

some notes

What is ‘phlb’ ?!?

The phlb executable is similar to Django's manage.py, but it always uses the PyHardLinkBackup settings.

Why in hell do you use Django?!?

  • Well, just because of the great database ORM and the Admin Site ;)

How do I get into the Django admin?

~$ cd PyHardLinkBackupEnv/
~/PyHardLinkBackupEnv $ source bin/activate
(PyHardLinkBackupEnv) ~/PyHardLinkBackupEnv $ phlb runserver

Then just open ‘localhost’ in your browser.

TODO

  • copy file meta data like owner, mode etc.

  • handle symlinks

  • Quick Backup: Don’t check the content, just compare file size + modification date

  • use: https://github.com/jedie/bootstrap_env (to make the installation under Windows easier)

  • Add some helper files to start a backup (.sh / .cmd scripts)

  • write docs

  • write more tests

  • activate CI

  • Far future: Add a GUI
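
The "Quick Backup" item from the TODO list could look roughly like this. This is just a sketch of the idea, nothing of it exists in phlb yet, and `looks_unchanged` is a made-up name.

```python
import os


def looks_unchanged(src_path, backup_path):
    """Cheap check without reading content: same size and same modification time."""
    src = os.stat(src_path)
    old = os.stat(backup_path)
    return src.st_size == old.st_size and src.st_mtime_ns == old.st_mtime_ns
```

This only works if the backup copy keeps the modification time of the source file (e.g. when it was copied with shutil.copy2). It is faster than hashing, but it will miss a file whose content changed while size and timestamp stayed the same.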

