Backup and restore orchestration for AI-Hub data services

swiss-ai-hub-backup

The centralized backup, restore, and PostgreSQL-maintenance service for Swiss AI Hub — a self-contained Dagster instance that snapshots every stateful service to S3.


What is Swiss AI Hub?

Swiss AI Hub is an open-source, self-hosted AI platform for enterprises. One docker compose up starts ~30 integrated containers across several stateful stores — PostgreSQL, FerretDB, Milvus, Neo4j, ClickHouse, Valkey, and NATS JetStream. This package keeps that data safe.

What is this package?

swiss-ai-hub-backup is the platform's backup/restore and database-maintenance plane. It runs as its own independent Dagster instance (separate from the data pipelines, with its own SQLite storage) and:

  • Backs up PostgreSQL (×2), Milvus, Neo4j, ClickHouse, Valkey, and NATS JetStream to S3 (SeaweedFS) on a schedule — gracefully stopping and restarting the managed containers around each run for consistent snapshots.
  • Restores any service from a chosen backup timestamp.
  • Maintains the platform PostgreSQL online: prunes verbose Dagster event_logs, tunes autovacuum, and runs pg_repack — so deployments stay bounded over time without downtime.

Each stateful service has a BackupHandler (postgres, milvus, neo4j, clickhouse, valkey, nats); the whole thing is wired into a Dagster asset graph by backup_definitions(). Because it operates on the storage layer and needs to stop containers, it requires read access to the Docker socket (/var/run/docker.sock), which it uses to discover platform containers via their com.docker.compose.project label.
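To make the container handling concrete, here is a hedged sketch of label-based discovery and a stop/restart wrapper. It is written against the Docker SDK for Python's containers.list(all=..., filters=...), Container.stop(), and Container.start() calls, but takes the client as a parameter so it carries no import of its own. The function names find_project_containers and stopped are hypothetical and are not the package's DockerManager API.

```python
from contextlib import contextmanager

COMPOSE_PROJECT_LABEL = "com.docker.compose.project"


def find_project_containers(client, project: str) -> list:
    """List all containers (running or stopped) that carry the project label."""
    return client.containers.list(
        all=True,
        filters={"label": f"{COMPOSE_PROJECT_LABEL}={project}"},
    )


@contextmanager
def stopped(containers):
    """Stop the given containers, then start them again even if the body fails."""
    stopped_so_far = []
    try:
        for container in containers:
            container.stop()
            stopped_so_far.append(container)
        yield
    finally:
        # Restart in reverse order; only containers we actually stopped.
        for container in reversed(stopped_so_far):
            container.start()
```

With the real SDK, the client would typically come from docker.from_env(), and a snapshot would run inside a with stopped(...) block so the services come back up even when the backup raises.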

Unlike the other Swiss AI Hub packages, this is an operational service, not a library you build agents/APIs on. It is licensed AGPL-3.0-or-later (the rest of the SDK is Apache-2.0).

Should you use this package?

Most operators don't install it directly — it ships with the platform as the backup-* containers (a gRPC code server, a daemon, and a webserver UI on :3004). You'd reach for this PyPI package to run the backup plane standalone, embed its logic, or extend it — for example, adding a BackupHandler for a stateful service of your own.

What it does

Job | Schedule | Stops containers?
Full backup (all services → S3) | daily | Yes (consistent snapshots)
Restore (service ← chosen timestamp) | on demand | Yes
event_logs cleanup + autovacuum tuning | weekly | No (online-safe)
pg_repack (reclaim disk) | monthly | No (online-safe)

Installation

pip install swiss-ai-hub-backup
# or
uv add swiss-ai-hub-backup

Requires Python 3.13.


Quick start

The backup plane is a Dagster code location built by backup_definitions():

# my_backup/__init__.py
from swiss_ai_hub.backup.dagster.definitions import backup_definitions

defs = backup_definitions()   # 26 assets, 4 jobs: backup, restore, cleanup, repack

Inspect and run it with the Dagster UI (it keeps its own state in DAGSTER_HOME):

export DAGSTER_HOME=/tmp/backup-dagster && mkdir -p "$DAGSTER_HOME"
set -a && source .env && set +a          # S3 + DB credentials, BACKUP_* settings
dagster dev -m my_backup                 # http://localhost:3000

From the UI you can materialize the online-safe maintenance jobs (cleanup, pg_repack) against a running stack without disruption. The full backup/restore jobs stop and restart containers, so run those deliberately; they also need access to the Docker socket and to all the stateful services.

dagster definitions validate -m my_backup loads the whole code location without running anything (a fast CI/sanity check).

Settings are not picked up automatically. Connection and BACKUP_* settings are read from the environment only at the moment the settings objects are constructed, so export them in the process that runs Dagster (set -a && source .env && set +a).


How it's deployed

In production the backup plane runs as three containers from one image, forming a self-contained Dagster instance:

Container | Role | Notes
backup-code | Dagster gRPC code server (dagster api grpc … :4266) | mounts /var/run/docker.sock:ro to stop/start containers; on data + storage
backup-daemon | Dagster daemon | runs the schedules and sensors; on data
backup-webserver | Dagster UI (:3004) | inspect runs, trigger restores; on proxy + data

Because it needs the Docker socket and the platform's stateful services, the canonical deployment is the platform's own backup compose. See infra/deployment and the documentation for the full container setup, retention config, and the BACKUP_* environment variables. If you run your own variant, mirror that three-container shape and grant the code server read access to the Docker socket.

Extending — add a service to back up

Implement the BackupHandler ABC for your service in services/, then register it in HANDLER_FACTORIES — the Dagster asset wiring picks it up automatically (handlers are synchronous by design):

from swiss_ai_hub.backup.services.base import BackupHandler

class MyServiceHandler(BackupHandler):
    def backup(self, context) -> ...:
        ...   # dump your service's state to S3
    def restore(self, context) -> ...:
        ...   # restore it from a backup

If the handler needs Docker access, type-hint a DockerManager parameter in __init__ and the factory injects it. The maintenance subsystem follows the same pattern (MaintenanceHandler + CLEANUP_HANDLER_NAMES). See the documentation for the full handler contract.
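One way a factory can inject a dependency based on a type hint is sketched below. Everything here is a stand-in written for illustration: the DockerManager and BackupHandler classes are local substitutes for the package's own, and build_handler and EXAMPLE_HANDLER_FACTORIES are hypothetical names, so the real registry may be shaped differently.

```python
from abc import ABC, abstractmethod
from typing import get_type_hints


class DockerManager:
    """Local stand-in for the package's DockerManager."""


class BackupHandler(ABC):
    """Local stand-in for the package's BackupHandler ABC."""

    @abstractmethod
    def backup(self, context): ...

    @abstractmethod
    def restore(self, context): ...


class MyServiceHandler(BackupHandler):
    def __init__(self, docker: DockerManager):
        self.docker = docker  # injected by the factory below

    def backup(self, context):
        return f"backed up {context}"

    def restore(self, context):
        return f"restored {context}"


def build_handler(handler_cls, docker_manager: DockerManager):
    """Instantiate handler_cls, filling every DockerManager-typed parameter."""
    hints = get_type_hints(handler_cls.__init__)
    kwargs = {
        name: docker_manager
        for name, hint in hints.items()
        if hint is DockerManager
    }
    return handler_cls(**kwargs)


# Hypothetical registry shape: service name -> factory taking a DockerManager.
EXAMPLE_HANDLER_FACTORIES = {
    "my_service": lambda dm: build_handler(MyServiceHandler, dm),
}
```

Handlers that declare no DockerManager parameter are constructed with no arguments, so only the handlers that ask for Docker access receive it.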


License

AGPL-3.0-or-later — see packages/backup/LICENSE. Note this differs from the Apache-2.0 SDK packages; for the full per-package license matrix, see LICENSES.md.


Part of Swiss AI Hub. Built in Switzerland by bbv Software Services.
