PALHM

License
- OSI Approved :: Apache Software License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

Periodic Automatic Live Host Maintenance (PALHM)

This is a script that automates periodic maintenance of a machine. PALHM runs routine command sequences and makes "hot" (live) backups of the running host to a backend of your choice.

PALHM addresses the problems of the traditional lazy method of copying entire drives by:

- Using high-level data dump tools like mysqldump and slapcat
- Excluding data that can be obtained from a modern package manager, such as the contents of /usr, to reduce cost
- Dumping metadata that is crucial when restoring from backup, using tools like lsblk

The safest way to back up has always been to take the system offline and tar the file system or make an image of the storage device. This may not be practical in setups where downtime is unacceptable or where allocating more resources to a backup task is not cost-efficient. This is where this script comes into play.

TL;DR

Go to Getting Started.

Routine Task

A Routine Task is a set of routines that are executed sequentially. It can consist of commands (Execs) and other previously defined tasks. Routine Tasks are deliberately basic: you may incorporate custom shell scripts or other executables to do complex routines.

Backup Task

PALHM supports backup on different storage backends. It also automates rotation of backup copies on the supported storage backends. aws-s3 and localfs are currently implemented. You may incorporate localfs to store backups on NFS or Samba mount points. The special null backend is for testing purposes.

The files produced as the end product of a backup are called "Backup Objects". A Backup Object has two essential attributes.

- pipeline: commands used to generate the backup output file
- path: path to the output file on the backend

For example, this object definition describes a MySQL data dump that is compressed with zstd, encrypted for the public key ID "backup-pub-key" and stored as "all-db.sql.zstd.pgp".

{
  "path": "all-db.sql.zstd.pgp",
  "pipeline": [
    { "type": "exec-inline", "argv": [ "/bin/mysqldump", "-uroot", "--all-databases" ] },
    { "type": "exec-inline", "argv": [ "/bin/zstd" ] },
    { "type": "exec-inline", "argv": [ "/bin/gpg", "-e", "-r", "backup-pub-key", "--compress-algo", "none" ] }
  ]
}

This is equivalent to running the following from the shell

mysqldump -uroot --all-databases | zstd | gpg -e -r backup-pub-key --compress-algo none > all-db.sql.zstd.pgp

except that the output file can be placed on a cloud service, depending on the backend used. Frequently used commands such as compression filters are defined in the core config (conf.d/core.json) as Exec definitions.
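To make the pipeline idea concrete, here is a minimal Python sketch of chaining commands so that each command's stdout feeds the next command's stdin and the last one writes into a "sink". It is not PALHM's actual implementation; the function name run_pipeline and the use of a plain file object as the sink are assumptions made for illustration.

```python
import subprocess


def run_pipeline(argvs, sink):
    """Run the commands in `argvs` as a shell-like pipeline.

    `argvs` is a list of argv lists (hypothetical stand-in for the
    "pipeline" attribute). `sink` is a binary file object backed by a
    real file descriptor; the last command writes its stdout there.
    Returns the exit codes in pipeline order.
    """
    procs = []
    prev_out = None
    for i, argv in enumerate(argvs):
        is_last = i == len(argvs) - 1
        proc = subprocess.Popen(
            argv,
            stdin=prev_out,
            stdout=sink if is_last else subprocess.PIPE,
        )
        if prev_out is not None:
            # The child holds its own copy of the pipe; closing ours lets
            # the upstream command notice if the downstream one exits early.
            prev_out.close()
        prev_out = proc.stdout
        procs.append(proc)
    return [proc.wait() for proc in procs]
```

With a local file opened in "wb" mode as the sink, passing the three argv lists from the object definition above would produce the same bytes as the shell one-liner. A backend other than a local file would need its own sink, which this sketch does not cover.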

Backup Object Path

The final path for a Backup Object is formulated as follows.

localfs:
 /media/backup/localhost/2022-05-01T06:59:17+00:00/all-db.sql.zstd.pgp
|         ROOT          |         PREFIX          |       PATH        |

aws-s3:
 s3://your-s3-bucket/backup/your-host/2022-05-01T06:59:17+00:00/all-db.sql.zstd.pgp
     |    BUCKET    |      ROOT      |         PREFIX          |       PATH        |

ATTR	DESC
ROOT	The root directory for backup
PREFIX	The name of the backup
PATH	The output path of the backup object

The default format of PREFIX is the output of date --utc --iso-8601=seconds. Backup rotation is performed using PREFIX, so PREFIX must be based on values that place the oldest backup first when sorted in ascending order.
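As an illustration of how ROOT, PREFIX and PATH combine, the following Python sketch builds a prefix in the same format as date --utc --iso-8601=seconds and joins the three parts. The helper names default_prefix and object_path are hypothetical and are not part of PALHM.

```python
import posixpath
from datetime import datetime, timezone


def default_prefix(now=None):
    """Return a UTC timestamp such as 2022-05-01T06:59:17+00:00."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now.astimezone(timezone.utc).isoformat(timespec="seconds")


def object_path(root, prefix, path):
    """Join ROOT, PREFIX and PATH with "/" as the separator."""
    return posixpath.join(root, prefix, path)
```

Because every prefix in this format has the same width and UTC offset, sorting the strings in ascending order also sorts the backups from oldest to newest, which is the property rotation relies on.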

PATH may contain the directory separator ("/" or "\"). The backend may or may not support this. The localfs backend handles it by doing mkdir -p on the path before creating a "sink" for output files. Using "/" for PATH on Windows will fail as per NTFS limitation. The aws-s3 backend passes the directory separator "/" through to the Boto3 API, and subdirectory objects are created implicitly.

Backend-param

The parameters specific to backup backends can be set using backend-param. The following parameters are common across backends.

- root: (string) the path to the backup root
- nb-copy-limit: (decimal) the number of most recent backups to keep
- root-size-limit: (decimal) the total size of the backup root in bytes
- prefix: (TODO) reserved for future

The value of the decimal type is either a JSON number or a string that represents a decimal number. The IEEE 754 infinity representations ("inf", "Infinity", "-inf" or "-Infinity") can be used for nb-copy-limit and root-size-limit to disable either or both of the limits. The decimal type is not affected by the precision limit of the IEEE 754 double type (integers up to 2^53). The fractional part of a number is ignored because the values are compared against integers.
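The limit semantics described above can be sketched in Python with the standard decimal module. This is only an illustration under stated assumptions: parse_limit and expired_prefixes are hypothetical names, the input is a plain mapping of PREFIX to size in bytes, and any infinite value (positive or negative) is treated as "no limit", following the text.

```python
from decimal import ROUND_DOWN, Decimal


def parse_limit(value):
    """Parse a JSON number or a decimal string into a Decimal limit.

    Assumption: any infinity, regardless of sign, disables the limit.
    The fractional part of finite values is dropped.
    """
    d = Decimal(str(value))
    if d.is_infinite():
        return Decimal("Infinity")
    return d.to_integral_value(rounding=ROUND_DOWN)


def expired_prefixes(copies, nb_copy_limit, root_size_limit):
    """Return the prefixes to delete, oldest first.

    `copies` maps each PREFIX to the total size of that backup in bytes.
    """
    nb_limit = parse_limit(nb_copy_limit)
    size_limit = parse_limit(root_size_limit)
    remaining = sorted(copies)  # ascending order: oldest backup first
    total = sum(copies.values())
    expired = []
    while remaining and (len(remaining) > nb_limit or total > size_limit):
        oldest = remaining.pop(0)
        total -= copies[oldest]
        expired.append(oldest)
    return expired
```

Using Decimal rather than float keeps large byte counts exact, which is the point made about the 2^53 limit.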

Localfs

{
  "tasks": [
    {
      "id": "backup",
      "type": "backup",
      "backend": "localfs",
      "backend-param": {
        "root": "/media/backup/localhost", // (REQUIRED)
        "dmode": "755", // (optional) mode for new directories
        "fmode": "644", // (optional) mode for new files
        "nb-copy-limit": "Infinity", // (optional)
        "root-size-limit": "Infinity" // (optional)
      },
      "object-groups": [ /* ... */ ],
      "objects": [ /* ... */ ]
    }
  ]
}

aws-s3

{
  "tasks": [
    {
      "id": "backup",
      "type": "backup",
      "backend": "aws-s3",
      "backend-param": {
        "profile": "default", // (optional) AWS client profile. Defaults to "default"
        "bucket": "palhm.test", // (REQUIRED) S3 bucket name
        "root": "/palhm/backup", // (REQUIRED)
        "sink-storage-class": "STANDARD", // (optional) storage class for new uploads
        "rot-storage-class": "STANDARD", // (optional) storage class for final uploads
        "nb-copy-limit": "Infinity", // (optional)
        "root-size-limit": "Infinity" // (optional)
      },
      "object-groups": [ /* ... */ ],
      "objects": [ /* ... */ ]
    }
  ]
}

For profiles configured for root, see ~/.aws/config. Run aws configure help for more info.

For possible values for storage class, run aws s3 cp help.

If you wish to keep backup copies in Glacier, you may want to upload backup objects as STANDARD first and change the storage class to GLACIER at the rotate stage. In the event of failure, PALHM rolls back the process by deleting the objects already uploaded to the bucket, and you may still be charged for objects deleted from Glacier because the minimum storage duration is 90 days (as of 2022). The rot-storage-class attribute serves this very purpose. More info on the pricing page.

Backup Object Dependency Tree

Backup objects can be configured to form a dependency tree like Makefile objects. By default, PALHM builds backup files simultaneously (see nb-workers). In some environments this may not be desirable, especially on systems with HDDs[^1]. You can tune this behaviour by either:

- Setting nb-workers to 1
- Grouping the backup objects so that the objects from one storage device are built sequentially

Say the system has one storage device that holds all the data necessary for service and another on which the OS is installed. The system serves static HTTP, MySQL and OpenLDAP. All the backup tasks need to be grouped separately in order to reduce IO seek time.

{
  "object-groups": [
    { "id": "root" },
    { "id": "http" },
    { "id": "sql", "depends": [ "http" ] },
    { "id": "ldap", "depends": [ "sql" ] }
  ]
}

On start, the objects in the "root" and "http" groups will be built simultaneously. On completion of all the objects in "http", the objects in the "sql" group and then the "ldap" group will be built in order.
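The ordering that results from such "depends" lists can be sketched with graphlib.TopologicalSorter from the Python standard library (available since Python 3.9). This is an illustration rather than PALHM's scheduler, and build_waves is a hypothetical helper name.

```python
from graphlib import TopologicalSorter


def build_waves(object_groups):
    """Group the object-group ids into waves that can start together.

    Each wave contains the groups whose dependencies were all satisfied
    by earlier waves. Ids inside a wave are sorted for stable output.
    """
    graph = {g["id"]: set(g.get("depends", [])) for g in object_groups}
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    groups = [
        {"id": "root"},
        {"id": "http"},
        {"id": "sql", "depends": ["http"]},
        {"id": "ldap", "depends": ["sql"]},
    ]
    print(build_waves(groups))
```

The wave model is a simplification: a real scheduler would likely start "sql" as soon as "http" completes, without waiting for "root" to finish.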

Boot Report Mail

PALHM supports sending the "Boot Report Mail", which contains information about the current boot. The mail is meant to be sent on boot so that system admins can ensure that no unexpected reboot goes uninvestigated. This feature is used in conjunction with the systemd service or an rc.d script on SysVinit-based systems.

{
  "boot-report": {
    // (REQUIRED) MUA for sending mail
    /* stdout MUA
     * For testing. Print contents to stdout. Doesn't actually send mail
     */
    // "mua": "stdout",
    "mua": "mailx", // mailx command MUA
    // (REQUIRED) List of recipients
    "mail-to": [ "root" ],
    // The mail subject (optional)
    "subject": "Custom Boot Report Subject from {hostname}",
    /*
     * The mail body header (leading YAML comments). Use line break (\n) for
     * multi-line header (optional)
     */
    "header": "Custom header content with {hostname} substitution.",
    "uptime-since": true, // Include output of `uptime --since` (optional)
    "uptime": true, // Include output of `uptime -p` (optional)
    "bootid": true // Include kernel boot_id (optional)
  }
}
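To show what the {hostname} substitution and the "leading YAML comments" header might look like in practice, here is a rough Python sketch. It is not taken from PALHM: the function name render_boot_report, the fallback subject and the exact body layout are all assumptions.

```python
def render_boot_report(conf, hostname, facts):
    """Return (subject, body) for a boot report.

    `conf` is the "boot-report" object from the config. `facts` maps a
    field name to an already collected value, for example the output of
    `uptime -p`. The fallback subject below is hypothetical.
    """
    subject = conf.get("subject", "Boot Report from {hostname}")
    subject = subject.replace("{hostname}", hostname)

    lines = []
    header = conf.get("header")
    if header:
        # Each header line becomes a YAML comment at the top of the body.
        for line in header.replace("{hostname}", hostname).split("\n"):
            lines.append("# " + line)
    for key, value in facts.items():
        lines.append(f"{key}: {value}")
    return subject, "\n".join(lines) + "\n"
```

Plain string replacement is used instead of str.format so that other braces in a custom header do not raise an error.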

AWS SNS MUA

The boot report can be sent to an AWS SNS topic. The aws-sns MUA comes with the aws module.

{
	"modules": [ "aws" ],
	"boot-report": {
		"mua": "aws-sns",
		"mua-param": {
			// "profile": "default",
			// If the profile does not have the default region.
			"region": "us-east-1"
		},
		// Target ARNs. Any ARN recognised by the SNS can be used.
		"mail-to": [ "arn:aws:sns:us-east-1:NNNNNNNNNNNN:topic-test" ]
	}
}

DNSSEC Check

If your domain is configured with DNSSEC[^2], PALHM can be used to check the reachability of your RRs. Your domain will become unavailable when the keys are misconfigured or you have missed the mandatory key rollover event.

The DNSSEC Check task can be built as a backup task that uses the null backend. This replaces the original palhm-dnssec-check.sh script. The upstream name servers must support DNSSEC. The task can be run from crontab. On failure, PALHM produces stderr output and returns a non-zero exit code, causing crond to send mail.

{
  "tasks": [
    {
      "id": "check-dnssec",
      "type": "backup",
      "backend": "null",
      "objects": [
        {
          "path": "example.com", // Placeholder
          "pipeline": [
            /*
             * Check if dig can query the record with the DNSSEC
             * validation flag. Empty stdout with zero return code
             * means SERVFAIL.
             */
            {
              "type": "exec-append",
              "exec-id": "dig-dnssec",
              "argv": [ "ANY", "example.com" ]
            },
            /*
             * Trap for empty dig output. grep will return non-zero if
             * dig has not produced any output.
             */
            { "type": "exec", "exec-id": "grep-any" }
          ]
        }
      ]
    }
  ]
}
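The "trap for empty output" step in the pipeline above can be sketched in Python as follows. The function name produces_output is hypothetical, and the actual dig flags live in the dig-dnssec Exec definition of the core config, so they are not reproduced here.

```python
import subprocess


def produces_output(argv):
    """Return True only if the command exits with 0 and prints something.

    A command that exits with 0 but writes nothing to stdout is treated
    as a failure, mirroring the grep step of the DNSSEC check.
    """
    result = subprocess.run(argv, stdout=subprocess.PIPE, check=False)
    return result.returncode == 0 and len(result.stdout.strip()) > 0
```

A cron job could call this with the dig command line and exit with a non-zero code when it returns False, which is what causes crond to send mail.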

Here's the example crontab.

0  *  *  *  *   root systemd-run -qP -p User=palhm -p Nice=15 -p ProtectSystem=strict -p ReadOnlyPaths=/ -p PrivateDevices=true --wait /var/lib/PALHM/src/palhm.py -q run check-dnssec

Config JSON Format

See doc/config-fmt.md.

Getting Started

Prerequisites

Python 3.9 or higher

INSTALL (basic)

python -m pip install palhm

INSTALL (with AWS support)

python -m pip install 'palhm[aws]'

Examples

USAGE

The tasks can be run with the "run" subcommand. Run python -m palhm help for more.

python -m palhm -q run
python -m palhm -q run check-dnssec
palhm.py run
# For crontab job
palhm.py -q run
palhm.py -q run check-dnssec

Files

Path	Desc
/etc/palhm/palhm.conf	The default config path
/etc/palhm/conf.d/core.json	Commonly used Exec and Prefix definitions

Advanced

Testing Config

When writing a backup task, if you're worried about data loss caused by misconfiguration or vulnerabilities, you can use systemd's sandboxing to test out your config. The distro must be running systemd in order for this to work.

systemd-run -qP -p Nice=15 -p ProtectSystem=strict -p ReadOnlyPaths=/ -p PrivateDevices=true --wait /usr/local/bin/palhm.py run backup

If your config runs on a read-only file system, it's safe to assume that the config does not require a read-write file system in order to run. This means your config does not modify the file system.

Also, you can always do a dry run of your backup task by setting the backend to "null".

TODO

JSON schema validation

AWS S3 Replication Daemon

To prepare for the very unlikely event of a disaster affecting an entire AWS region, you may wish to implement cross-region replication of S3 objects. Contrary to the documentation's recommendation, the replication that S3 provides does not work on very large objects. Replication of large objects across AWS regions therefore has to be done manually by a client, which requires another implementation.

Cross-region data transfer is costly, so this idea came to a halt.

Footnotes

[^1]: Even with SSDs, disrupting sequential reads decreases overall performance
[^2]: You really should if it's not

Release history

0.0.3

Apr 14, 2025

This version

0.0.2

Apr 14, 2025

Download files

Source Distribution

palhm-0.0.2.tar.gz (25.1 kB view details)

Uploaded Apr 14, 2025 Source

Built Distribution

palhm-0.0.2-py3-none-any.whl (28.9 kB view details)

Uploaded Apr 14, 2025 Python 3

File details

Details for the file palhm-0.0.2.tar.gz.

File metadata

Download URL: palhm-0.0.2.tar.gz
Upload date: Apr 14, 2025
Size: 25.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for palhm-0.0.2.tar.gz
Algorithm	Hash digest
SHA256	`7b0b2cd6c75cf3be8944fbaab951c5d4c0d16a8638d46ac828533f3a34e5edac`
MD5	`b78f558260facfbb461ad67101799286`
BLAKE2b-256	`e84d2da6d3ee5a5c39719f0a7089913856bcf4ac5f0a30e947aa50f12ed6c5c0`

File details

Details for the file palhm-0.0.2-py3-none-any.whl.

File metadata

Download URL: palhm-0.0.2-py3-none-any.whl
Upload date: Apr 14, 2025
Size: 28.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.2

File hashes

Hashes for palhm-0.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`88b2b9b9ca286ccc1fb3d7bc93959d1fe3118f38b4f866bb6002097a0fa384e1`
MD5	`c0d1ef70f642e7a0f1dccc6384864bd1`
BLAKE2b-256	`6018bf0dad095ce9789ece8b70e4e306db05a3756954076b155ce60014daa3b7`
