
A simple tool to back up and restore data as tgz files.

ezbak

A simple backup management tool that automates backup creation, management, and restoration, with support for multiple destinations and intelligent retention policies.

Use ezbak as a Python package in your code, run it from the command line, or deploy it as a Docker container.

Features

  • Create tar-gzipped (.tgz) compressed backups of files and directories
  • Create MongoDB backups (via mongodump)
  • Support for local filesystems and AWS S3 storage locations
  • File filtering with regex patterns
  • Intelligent retention policies (time-based and count-based)
  • Automatic cleanup of old backups
  • Time-based backup labeling (yearly, monthly, weekly, daily, hourly, minutely)
  • Restore backups to any location
  • Python package for integration into your projects
  • Command-line interface for scripts and automation
  • Docker container for containerized environments

Installation

ezbak can be used as a Python package, CLI script, or Docker container.

Note: ezbak requires Python 3.11 or higher.

Python Package

# with uv
uv add ezbak

# with pip
pip install ezbak

CLI Script

# With uv
uv tool install ezbak

# With pip
python -m pip install --user ezbak

Quick Start

Create your first backup

from pathlib import Path
from ezbak import ezbak

# Simple backup with 7-day retention
backup_manager = ezbak(
    name="my-documents",
    # pathlib does not expand "~" on its own, so expand it explicitly
    source_paths=[Path("~/Documents").expanduser()],
    storage_paths=[Path("~/Backups").expanduser()],
    retention_daily=7,
)

# Create the backup
backup_files = backup_manager.create_backup()
print(f"Created backup: {backup_files}")

Or, using the CLI:

# Create a backup
ezbak create --name my-documents --source ~/Documents --storage ~/Backups --daily 7

# List your backups
ezbak list --name my-documents --storage ~/Backups

Usage

Python Package

ezbak is primarily designed to be used as a Python package in your projects:

from pathlib import Path
from ezbak import ezbak

# Initialize backup manager with retention policy
backup_manager = ezbak(
    name="my-backup",
    source_paths=[Path("/path/to/source")],
    storage_paths=[Path("/path/to/destination")],
    # Keep: 1 yearly, 12 monthly, 4 weekly, 7 daily, 24 hourly, 60 minutely
    retention_yearly=1,
    retention_monthly=12,
    retention_weekly=4,
    retention_daily=7,
    retention_hourly=24,
    retention_minutely=60,
)

# Create a backup
backup_files = backup_manager.create_backup()
print(f"Backup created: {backup_files}")

# List existing backups
backups = backup_manager.list_backups()
print(f"Found {len(backups)} backups")

# Clean up old backups based on retention policy
deleted_files = backup_manager.prune_backups()
print(f"Deleted {len(deleted_files)} old backups")

# Restore latest backup (with optional cleanup)
backup_manager.restore_backup(
    destination=Path("/path/to/restore"),
    clean_before_restore=True
)

CLI Script

# Get help for any command
ezbak --help
ezbak create --help

# Create a backup with 7-day retention
ezbak create --name my-documents \
    --source ~/Documents \
    --storage ~/Backups \
    --daily 7

# List all backups for a specific backup name
ezbak list --name my-documents --storage ~/Backups

# Clean up old backups (keep only 10 most recent)
ezbak prune --name my-documents \
    --storage ~/Backups \
    --max-backups 10

# Restore the latest backup
ezbak restore --name my-documents \
    --storage ~/Backups \
    --destination ~/Restored \
    --clean-before-restore  # Optional: clear destination before restore

Docker Container

# Create a backup using Docker
docker run -it \
    -v /path/to/source:/source:ro \
    -v /path/to/backups:/backups \
    -e EZBAK_ACTION=backup \
    -e EZBAK_NAME=my-backup \
    -e EZBAK_SOURCE_PATHS=/source \
    -e EZBAK_STORAGE_PATHS=/backups \
    -e EZBAK_RETENTION_DAILY=7 \
    ghcr.io/natelandau/ezbak:latest

# Run backups on a schedule (daily at 2 AM)
docker run -d \
    --name ezbak-scheduled \
    --restart unless-stopped \
    -v /path/to/source:/source:ro \
    -v /path/to/backups:/backups \
    -e EZBAK_ACTION=backup \
    -e EZBAK_NAME=my-backup \
    -e EZBAK_SOURCE_PATHS=/source \
    -e EZBAK_STORAGE_PATHS=/backups \
    -e EZBAK_RETENTION_DAILY=7 \
    -e EZBAK_CRON="0 2 * * *" \
    -e TZ=America/New_York \
    ghcr.io/natelandau/ezbak:latest

# Restore a backup
docker run -it \
    -v /path/to/backups:/backups:ro \
    -v /path/to/restore:/restore \
    -e EZBAK_ACTION=restore \
    -e EZBAK_NAME=my-backup \
    -e EZBAK_STORAGE_PATHS=/backups \
    -e EZBAK_DESTINATION=/restore \
    ghcr.io/natelandau/ezbak:latest

Core Concepts

Key concepts and configuration options for ezbak.

Backup Names

Each backup needs a unique name to identify it in logs and organize backup files. ezbak automatically adds timestamps and labels.

Filename Format: {name}-{timestamp}-{period_label}.tgz

Examples:

  • my-documents-20241215T143022-daily.tgz
  • database-backup-20241215T020000-weekly.tgz

Key Points:

  • Multiple backup sets can share the same storage location
  • Timestamps use the ISO 8601 basic format: YYYYMMDDTHHMMSS
  • Period labels (daily, weekly, etc.) can be disabled with label_time_units=False
  • Duplicate names get a UUID suffix to prevent conflicts

If desired, you can rename existing backup files with the rename_files option so that naming stays consistent across backups.
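
The filename format above can be sketched with the standard library. This is only an illustration: build_filename and parse_filename are hypothetical helpers, not part of ezbak's API, and the sketch assumes the timestamp is a naive local time and does not handle the UUID suffix added to duplicate names.

```python
import re
from datetime import datetime
from typing import Optional

# Hypothetical pattern for {name}-{timestamp}-{period_label}.tgz.
# The period label is optional because it can be disabled.
FILENAME_RE = re.compile(
    r"^(?P<name>.+)-(?P<timestamp>\d{8}T\d{6})(?:-(?P<label>[a-z]+))?\.tgz$"
)
TIMESTAMP_FORMAT = "%Y%m%dT%H%M%S"


def build_filename(name: str, when: datetime, label: Optional[str] = None) -> str:
    """Compose a backup filename from its three parts."""
    suffix = f"-{label}" if label else ""
    return f"{name}-{when.strftime(TIMESTAMP_FORMAT)}{suffix}.tgz"


def parse_filename(filename: str) -> tuple:
    """Split a backup filename into (name, timestamp, label or None)."""
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"not a recognized backup filename: {filename}")
    when = datetime.strptime(match["timestamp"], TIMESTAMP_FORMAT)
    return match["name"], when, match["label"]
```

Because the name itself may contain hyphens (as in database-backup), the pattern anchors on the fixed-width timestamp rather than splitting on "-".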

Retention Policies

Control how many backups to keep using one of two approaches. Note: The two methods cannot be combined. If you set max_backups, time-based retention is ignored.

Simple Count-Based Retention

# Keep only the 10 most recent backups
backup_manager = ezbak(
    name="my-backup",
    source_paths=[Path("/path/to/source")],
    storage_paths=[Path("/path/to/destination")],
    max_backups=10
)

Time-Based Retention (Recommended)

# Keep different numbers of backups for different time periods
# Unspecified time periods (hourly, minutely) default to keeping 1 backup each
backup_manager = ezbak(
    name="my-backup",
    source_paths=[Path("/path/to/source")],
    storage_paths=[Path("/path/to/destination")],
    retention_daily=7,    # Keep 7 daily backups
    retention_weekly=4,   # Keep 4 weekly backups
    retention_monthly=12, # Keep 12 monthly backups
    retention_yearly=3    # Keep 3 yearly backups
)
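
The interaction between the two approaches can be sketched as follows. This is a simplified model rather than ezbak's actual pruning logic: select_for_deletion is a hypothetical function, and it assumes each backup already carries its period label and that the list is sorted newest first.

```python
from collections import defaultdict
from typing import Optional


def select_for_deletion(
    backups: list,
    max_backups: Optional[int] = None,
    retention: Optional[dict] = None,
) -> list:
    """Return the filenames a retention policy would delete.

    backups is a list of (filename, period_label) pairs, newest first.
    """
    # Count-based retention takes precedence: time-based settings are ignored.
    if max_backups is not None:
        return [filename for filename, _ in backups[max_backups:]]

    # Time-based retention: keep the newest N per label, defaulting to 1.
    retention = retention or {}
    seen = defaultdict(int)
    to_delete = []
    for filename, label in backups:
        seen[label] += 1
        if seen[label] > retention.get(label, 1):
            to_delete.append(filename)
    return to_delete
```

With retention={"daily": 2}, a third daily backup is selected for deletion, while a label with no explicit setting keeps only its newest backup.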

Including and Excluding Files

By default, all files in your source paths are backed up, except for these automatically excluded files:

  • .DS_Store
  • @eaDir
  • .Trashes
  • __pycache__
  • Thumbs.db
  • IconCache.db

Include by Regex

When set, only files matching the regex pattern will be included in the backup.

Exclude by Regex

When set, files matching the regex pattern will be excluded from the backup.
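
A minimal sketch of how the default exclusions and the two patterns might combine. The should_include helper is hypothetical, and the sketch assumes patterns are applied with re.search against the full path; ezbak may match differently (for example, against the file name only).

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# The automatically excluded names listed above.
DEFAULT_EXCLUDES = {
    ".DS_Store", "@eaDir", ".Trashes", "__pycache__", "Thumbs.db", "IconCache.db",
}


def should_include(
    path: PurePosixPath,
    include_regex: Optional[str] = None,
    exclude_regex: Optional[str] = None,
) -> bool:
    """Decide whether a file belongs in the backup."""
    # Skip the file if any path component is an automatically excluded name.
    if DEFAULT_EXCLUDES.intersection(path.parts):
        return False
    text = str(path)
    # With an include pattern set, only matching files are kept.
    if include_regex and not re.search(include_regex, text):
        return False
    # With an exclude pattern set, matching files are dropped.
    if exclude_regex and re.search(exclude_regex, text):
        return False
    return True
```

When both patterns are set, a file must match the include pattern and must not match the exclude pattern.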

Common Use Cases

Daily Document Backup

from pathlib import Path
from ezbak import ezbak

backup_manager = ezbak(
    name="documents",
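    # pathlib does not expand "~" on its own, so expand it explicitly
    source_paths=[Path("~/Documents").expanduser(), Path("~/Pictures").expanduser()],
    storage_paths=[Path("~/Backups").expanduser()],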
    retention_daily=30,  # Keep 30 days of daily backups
    retention_monthly=12  # Keep 12 monthly backups
)
backup_manager.create_backup()

Selective File Backup

backup_manager = ezbak(
    name="logs",
    source_paths=[Path("/var/log")],
    storage_paths=[Path("/backups")],
    include_regex=r"\.log$",  # Only .log files
    exclude_regex=r"debug",   # Exclude debug logs
    max_backups=10
)

Database Backup with Pre/Post Scripts

import subprocess
from pathlib import Path
from ezbak import ezbak

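# Dump database before backup; check=True raises CalledProcessError if pg_dump fails,
# so a missing or partial dump is never archived
subprocess.run(["pg_dump", "-f", "/tmp/db_backup.sql", "mydb"], check=True)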

backup_manager = ezbak(
    name="database",
    source_paths=[Path("/tmp/db_backup.sql")],
    storage_paths=[Path("/backups/database")],
    retention_hourly=24,  # Keep 24 hourly backups
    retention_daily=7,
    retention_weekly=4
)

backup_manager.create_backup()

# Cleanup temp file
Path("/tmp/db_backup.sql").unlink()

Configuration Options

Configure ezbak using any combination of Python parameters, CLI arguments, or environment variables:

Core Settings

backup_manager = ezbak(
    name="my-backup",                    # Backup identifier
    source_paths=[Path("/path/to/src")], # What to back up
    storage_paths=[Path("/backups")],    # Where to store backups
    storage_location="local",            # Optional: Where to store backups.
                                         # One of "local", "aws", or "all" (default: "local")
)

Retention Settings

# Option 1: Keep a maximum number of backups
max_backups=10

# Option 2: Time-based retention (recommended)
retention_daily=7,      # Keep 7 daily backups
retention_weekly=4,     # Keep 4 weekly backups
retention_monthly=12,   # Keep 12 monthly backups
retention_yearly=3,     # Keep 3 yearly backups
retention_hourly=24,    # Keep 24 hourly backups
retention_minutely=60,  # Keep 60 minutely backups

File Filtering

include_regex=r"\.txt$",     # Optional: Only include .txt files
exclude_regex=r"temp|cache", # Optional: Exclude temp and cache files

Backup Options

compression_level=9,         # Compression level (1-9, default: 9)
label_time_units=True,       # Include time labels in filenames (default: True)
rename_files=False,          # Rename existing files (default: False)
strip_source_paths=False,    # Optional: Strip source paths from directory sources to flatten
                             #            the tarfile (e.g. /source/foo.txt -> foo.txt)
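
The effect of strip_source_paths and compression_level can be illustrated with the standard tarfile module. This is a sketch only: archive_names is a hypothetical helper, and the exact member names ezbak writes into its archives are an assumption here.

```python
import tarfile
import tempfile
from pathlib import Path


def archive_names(strip_source_paths: bool, compression_level: int = 9) -> list:
    """Archive a sample directory and return the member names in the tarball."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "source"
        source.mkdir()
        (source / "foo.txt").write_text("hello")

        archive = Path(tmp) / "demo.tgz"
        with tarfile.open(archive, "w:gz", compresslevel=compression_level) as tar:
            for file in sorted(source.rglob("*")):
                if not file.is_file():
                    continue
                # Stripped: name relative to the source directory itself.
                # Unstripped: keep the source directory as a leading component.
                base = source if strip_source_paths else source.parent
                tar.add(file, arcname=file.relative_to(base).as_posix())

        with tarfile.open(archive, "r:gz") as tar:
            return tar.getnames()
```

Calling archive_names(False) yields a member named source/foo.txt, while archive_names(True) yields foo.txt, which is the flattening described above.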

MongoDB Backup Options

# Setting these will disable file backups and only back up the MongoDB database
mongo_uri="mongodb://[username]:[password]@localhost:27017",        # MongoDB URI
mongo_db_name="my-database",                                        # MongoDB database name

Restore Options

restore_path=Path("/restore"),           # Optional: Where to restore files.
                                         #           Can be an arg to ezbak.restore_backup()
clean_before_restore=True,               # Optional: Clear destination first
chown_uid=1000,                          # Optional: Set file owner of all restored files
chown_gid=1000,                          # Optional: Set file group of all restored files

Logging

log_level="INFO",                       # One of: TRACE, DEBUG, INFO, WARNING, ERROR. (default: INFO)
log_file=Path("/var/log/ezbak.log"),    # Optional: Log file path.
                                        #           If not set, logs are only printed to stderr
log_prefix="BACKUP",                    # Optional: Log message prefix added to all log messages

AWS S3 Configuration

aws_access_key="your-access-key",
aws_secret_key="your-secret-key",
aws_s3_bucket_name="your-bucket-name",
aws_s3_bucket_path="your-bucket-path", # Optional: Path within the bucket

Environment Variables

All options can be set via environment variables using the EZBAK_ prefix. For example:

export EZBAK_NAME="my-backup"
export EZBAK_SOURCE_PATHS="/path/to/source"
export EZBAK_STORAGE_PATHS="/path/to/backups"
export EZBAK_RETENTION_DAILY=7
# etc.
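
Judging from the examples above, each variable name is the option name upper-cased with the EZBAK_ prefix. A small sketch of that mapping, where env_name and read_option are hypothetical helpers and values are returned as raw strings (how ezbak parses lists and integers is not shown here):

```python
import os
from typing import Mapping, Optional

PREFIX = "EZBAK_"


def env_name(option: str) -> str:
    """Map a Python option name to its assumed environment variable name."""
    return PREFIX + option.upper()


def read_option(option: str, environ: Optional[Mapping] = None) -> Optional[str]:
    """Look up an option in the environment; return None when it is unset."""
    environ = os.environ if environ is None else environ
    return environ.get(env_name(option))
```

For example, env_name("retention_daily") gives EZBAK_RETENTION_DAILY, matching the export shown above.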

Docker-Only Options

EZBAK_ACTION=backup           # Action: backup or restore
EZBAK_CRON="0 2 * * *"        # Cron schedule (daily at 2 AM)
EZBAK_TZ="America/New_York"   # Timezone for timestamps

CLI Equivalents

Most Python options have CLI equivalents. Use --help for details:

ezbak create --help     # See all create options
ezbak restore --help    # See all restore options

Contributing

See CONTRIBUTING.md for more information.
