StarRocks Backup and Restore automation tool

StarRocks Backup & Restore

Full and incremental backup automation for StarRocks shared-nothing clusters.

Requirements: StarRocks 3.5+ (shared-nothing mode)

📋 Release Notes & Changelog

Documentation

Why This Tool?

StarRocks provides native BACKUP and RESTORE commands, but they only support full backups. For large deployments holding petabytes of data, running a full backup every time is impractical because of the time, storage, and network bandwidth it consumes.

This tool adds incremental backup capabilities to StarRocks by leveraging native partition-based backup features.
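To make "partition-based backup" concrete, here is a minimal sketch of how a partition-scoped statement could be assembled. It follows the documented `BACKUP SNAPSHOT ... TO ... ON (table PARTITION (...))` form; the function name, the `changed` input shape, and the `mydb_20251119_incr` label are illustrative assumptions, not the tool's actual internals. Check the exact syntax against your StarRocks version.

```python
def build_partition_backup_sql(database, label, repository, changed):
    """Build a BACKUP SNAPSHOT statement limited to specific partitions.

    `changed` maps table name -> list of partition names (hypothetical input).
    Names are assumed to be validated identifiers; no quoting is applied here.
    """
    if not changed:
        raise ValueError("no changed partitions to back up")
    objects = []
    for table, partitions in sorted(changed.items()):
        objects.append(f"{table} PARTITION ({', '.join(partitions)})")
    return (
        f"BACKUP SNAPSHOT {database}.{label} "
        f"TO {repository} "
        f"ON ({', '.join(objects)})"
    )


sql = build_partition_backup_sql(
    "mydb",
    "mydb_20251119_incr",  # hypothetical label
    "your_repo_name",
    {"orders": ["p20251118", "p20251119"]},
)
print(sql)
```

Only the listed partitions are uploaded to the repository, which is what keeps an incremental run small compared with a full backup.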

What StarRocks doesn't provide:

  • No incremental backups - You must manually identify changed partitions and build complex backup commands
  • No backup history - No built-in way to track what was backed up, when, or which backups succeeded/failed
  • No restore intelligence - You manually determine which backups are needed for point-in-time recovery
  • No organization - No way to group tables or manage different backup strategies
  • No concurrency control - Multiple backup operations can conflict

What this tool provides:

  • Automatic incremental backups - The tool automatically detects partitions that changed since the last full backup
  • Complete operation tracking - Every backup and restore is logged with status, timestamps, and error details
  • Intelligent restore - Automatically resolves backup chains (full + incremental) for you
  • Inventory groups - Organize tables into groups with different backup strategies
  • Backup lifecycle management - Prune old backups with flexible retention policies (keep-last, older-than, specific snapshots)
  • Job concurrency control - Prevents conflicting operations
  • Safe restores - Atomic rename mechanism prevents data loss during restore
  • Metadata management - Dedicated ops database tracks all backup metadata and partition manifests

In short: this tool transforms StarRocks's basic backup/restore commands into a production-ready incremental backup solution.

Installation

Option 1: PyPI

python3 -m venv .venv
source .venv/bin/activate
pip install starrocks-br

Option 2: Standalone Executable

Download from releases:

# Linux
chmod +x starrocks-br-linux-x86_64
mv starrocks-br-linux-x86_64 starrocks-br
./starrocks-br --help

See Installation Guide for all options.

Configuration

Create a config.yaml file pointing to your StarRocks cluster:

host: "127.0.0.1"       # StarRocks FE node address
port: 9030              # MySQL protocol port
user: "root"            # Database user with backup/restore privileges
database: "your_database"   # Database containing tables to back up
repository: "your_repo_name"  # Repository created via CREATE REPOSITORY in StarRocks

# Optional: Define table inventory groups directly in config
table_inventory:
  - group: "production"
    tables:
      - database: "mydb"
        table: "users"
      - database: "mydb"
        table: "orders"

Set password:

export STARROCKS_PASSWORD="your_password"

See Configuration Reference for TLS and advanced options.
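Before running the tool, it can help to sanity-check the settings. The sketch below is a hypothetical pre-flight check, not part of starrocks-br: it assumes the five keys shown in the example above are all required, and it takes the config as an already-parsed dictionary (for example, the result of loading config.yaml with a YAML library).

```python
import os

# Assumption: these are the keys shown in the example config above.
REQUIRED_KEYS = ("host", "port", "user", "database", "repository")


def check_config(config, env=os.environ):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    port = config.get("port")
    if port is not None and not (isinstance(port, int) and 0 < port < 65536):
        problems.append(f"port must be an integer in 1..65535, got {port!r}")
    if "STARROCKS_PASSWORD" not in env:
        problems.append("STARROCKS_PASSWORD is not set")
    return problems


config = {
    "host": "127.0.0.1",
    "port": 9030,
    "user": "root",
    "database": "your_database",
    "repository": "your_repo_name",
}
print(check_config(config, env={"STARROCKS_PASSWORD": "your_password"}))
print(check_config({"host": "127.0.0.1"}, env={}))
```

Passing `env` explicitly keeps the check easy to test without touching the real environment.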

Basic Usage

Initialize:

starrocks-br init --config config.yaml

This creates the ops database and automatically populates table inventory from your config (if defined).

Note: If you modify the table_inventory in your config file, rerun starrocks-br init --config config.yaml to update the database.

Alternative: Define inventory groups manually (in StarRocks):

INSERT INTO ops.table_inventory (inventory_group, database_name, table_name)
VALUES
  ('production', 'mydb', 'users'),
  ('production', 'mydb', 'orders');

Backup:

# Full backup
starrocks-br backup full --config config.yaml --group production

# Incremental backup (tool detects changed partitions automatically)
starrocks-br backup incremental --config config.yaml --group production

Restore:

# Tool automatically resolves backup chains
starrocks-br restore --config config.yaml --target-label mydb_20251118_full
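As a rough illustration of what "resolving a backup chain" means, the sketch below assumes each incremental is taken relative to the most recent successful full backup, so restoring an incremental requires that full backup plus the incremental itself. The `BackupRecord` type, its field names, the status values, and the `_incr` labels are hypothetical; the tool's actual ops schema may differ.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BackupRecord:
    label: str
    kind: str              # "full" or "incremental"
    finished_at: datetime
    status: str            # "FINISHED" or "FAILED" (hypothetical values)


def resolve_chain(history, target_label):
    """Return the backups to restore, in order, to reach `target_label`."""
    ok = sorted(
        (r for r in history if r.status == "FINISHED"),
        key=lambda r: r.finished_at,
    )
    target = next((r for r in ok if r.label == target_label), None)
    if target is None:
        raise LookupError(f"no successful backup labelled {target_label!r}")
    if target.kind == "full":
        return [target]
    # Assumption: an incremental depends only on the latest earlier full backup.
    fulls = [r for r in ok if r.kind == "full" and r.finished_at < target.finished_at]
    if not fulls:
        raise LookupError(f"no full backup precedes {target_label!r}")
    return [fulls[-1], target]


history = [
    BackupRecord("mydb_20251118_full", "full", datetime(2025, 11, 18, 1), "FINISHED"),
    BackupRecord("mydb_20251119_incr", "incremental", datetime(2025, 11, 19, 1), "FINISHED"),
    BackupRecord("mydb_20251120_incr", "incremental", datetime(2025, 11, 20, 1), "FINISHED"),
]
chain = resolve_chain(history, "mydb_20251120_incr")
print([r.label for r in chain])
```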

Prune old backups:

# Keep only the last 5 backups
starrocks-br prune --config config.yaml --keep-last 5

# Delete backups older than a date
starrocks-br prune --config config.yaml --older-than "2024-01-01 00:00:00"
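The two retention policies above amount to simple selections over the backup history. The following sketch shows that selection logic only; the function name and the `(label, finished_at)` tuple shape are assumptions, and a real implementation would also need to consider whether a full backup is still required by later incrementals.

```python
from datetime import datetime


def select_for_pruning(backups, keep_last=None, older_than=None):
    """Return labels to delete, oldest first.

    `backups` is a list of (label, finished_at) tuples (hypothetical shape).
    Exactly one of `keep_last` / `older_than` is expected to be set.
    """
    if keep_last is not None and keep_last < 0:
        raise ValueError("keep_last must be non-negative")
    ordered = sorted(backups, key=lambda b: b[1])
    if keep_last is not None:
        # Everything except the newest `keep_last` entries.
        doomed = ordered[:-keep_last] if keep_last > 0 else ordered
    elif older_than is not None:
        doomed = [b for b in ordered if b[1] < older_than]
    else:
        doomed = []
    return [label for label, _ in doomed]


backups = [
    ("a", datetime(2023, 12, 1)),
    ("b", datetime(2024, 1, 15)),
    ("c", datetime(2024, 2, 1)),
]
print(select_for_pruning(backups, keep_last=2))
print(select_for_pruning(backups, older_than=datetime(2024, 1, 1)))
```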

See Commands Reference for all options.

How It Works

  1. Inventory Groups: Define collections of tables that share the same backup strategy
  2. ops Database: The tool creates an ops database to track all operations and metadata
  3. Automatic Incrementals: The tool queries partition metadata and compares it with the baseline from the last full backup to detect changes
  4. Intelligent Restore: Automatically resolves backup chains (full + incremental) for point-in-time recovery
  5. Safe Operations: All restores use temporary tables with atomic rename for safety
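Step 3 can be pictured as a comparison of two partition manifests. The sketch below assumes each manifest maps a (table, partition) pair to a version number, and treats a partition as changed when it is new or its version differs from the baseline. The manifest shape and the use of a version number (rather than, say, an update timestamp) are assumptions for illustration.

```python
def changed_partitions(baseline, current):
    """Return (table, partition) pairs that are new or modified since baseline.

    Both arguments map (table, partition) -> version (hypothetical shape).
    Partitions dropped since the baseline are ignored: nothing to back up.
    """
    return sorted(
        key for key, version in current.items() if baseline.get(key) != version
    )


baseline = {("orders", "p20251117"): 3, ("orders", "p20251118"): 5}
current = {
    ("orders", "p20251117"): 3,   # unchanged
    ("orders", "p20251118"): 7,   # modified since the full backup
    ("orders", "p20251119"): 2,   # created after the full backup
}
print(changed_partitions(baseline, current))
```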

Read Core Concepts for detailed explanations.

Contributing

We welcome contributions! See issues for areas that need help or create a new issue to report a bug or request a feature.

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
