StarRocks Backup and Restore automation tool
StarRocks Backup & Restore - CLI Usage Guide
Overview
The StarRocks Backup & Restore tool provides production-grade automation for backup and restore operations.
Important: This tool requires StarRocks 3.5 or later. Earlier versions are not supported due to differences in the SHOW FRONTENDS and SHOW BACKENDS command output formats, which are used for cluster health checks.
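The version requirement above can be checked before running anything else. The following is a minimal sketch, not the tool's own check: `parse_version` and `is_supported` are hypothetical helper names, and the version string is assumed to begin with `major.minor`, however you obtain it from your cluster.

```python
import re

# Minimum StarRocks version stated in this guide.
MIN_VERSION = (3, 5)


def parse_version(text: str) -> tuple:
    """Extract (major, minor) from a string such as '3.5.2-abc123'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"cannot parse a version from {text!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(version_text: str) -> bool:
    """True when the reported version is 3.5 or later."""
    return parse_version(version_text) >= MIN_VERSION


if __name__ == "__main__":
    for candidate in ("3.4.9", "3.5.0", "4.0.1-rc1"):
        print(candidate, "supported" if is_supported(candidate) else "not supported")
```

Comparing tuples rather than strings avoids the trap where "3.10" sorts before "3.5" lexicographically.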
📋 View Release Notes & Changelog
Summary
- Installation
- Quick Start
- Configuration
- Commands
- Example Usage Scenarios
- Error Handling
- Monitoring
- Changelog
Installation
Option 1: Install from PyPI (Recommended for Production)
We recommend using a virtual environment to ensure proper script availability and dependency isolation:
# Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate # On Linux/Mac
# .venv\Scripts\activate # On Windows
# Install the package from PyPI
pip install starrocks-br
# Verify the installation
starrocks-br --help
Note: Always activate the virtual environment before using the tool. The starrocks-br command will only be available when the virtual environment is activated.
Option 2: Download Pre-built Standalone Executable
If you prefer not to manage Python environments, you can download a bundled executable that includes the Python runtime and all dependencies.
- Download the artifact for your platform from the latest Build Executables workflow run (Artifacts section):
  - starrocks-br-linux-x86_64 → Linux (Intel/AMD)
  - starrocks-br-windows-x86_64 → Windows (Intel/AMD)
  - starrocks-br-macos-arm64 → macOS on Apple Silicon (M1/M2/M3)
  - starrocks-br-macos-x86_64 → macOS on Intel chips
- Extract the ZIP file (artifacts are delivered as ZIPs).
- Make the file executable (Linux/macOS):
chmod +x starrocks-br
- Run it directly:
./starrocks-br --help # Linux/macOS
.\starrocks-br.exe --help # Windows (PowerShell)
- Keep it updated: Download the latest artifact whenever a new release is published. (Future releases will bundle executables automatically.)
Need to build it yourself? Clone the repo and run ./build_executable.sh to recreate the executable locally (see script for details).
Option 3: Using Devbox (Recommended for Development)
Note: This requires cloning the repository first.
Devbox is a reproducible development environment that installs all required tools (Python, dependencies, virtualenv) in one step.
# Clone the repository
git clone https://github.com/deep-bi/starrocks-br
cd starrocks-br
# Install devbox (if not already installed)
curl -fsSL https://get.jetpack.io/devbox | bash
# Start devbox shell - this automatically:
# - Installs Python 3.11 and dependencies
# - Creates a virtual environment (.venv)
# - Installs the package in editable mode
# - Installs development dependencies
devbox shell
# Once inside the devbox shell, you're ready to go:
starrocks-br --help
pytest
Option 4: Manual Development Setup
# Clone the repository
git clone https://github.com/deep-bi/starrocks-br
cd starrocks-br
# Create and activate virtual environment
python3 -m venv .venv
source .venv/bin/activate
# Install in editable mode with development dependencies
pip install -e ".[dev]"
# The CLI is now available as: starrocks-br
Quick Start
After installing the CLI (via PyPI, executable download, Devbox, or manual setup), follow these steps:
- Activate your virtual environment (if not already active):
source .venv/bin/activate # On Linux/Mac
# .venv\Scripts\activate # On Windows
- Verify installation:
starrocks-br --help
- Create your config.yaml file (see Configuration section below).
- Set your password as an environment variable:
export STARROCKS_PASSWORD="your_password"
On Windows (PowerShell):
$env:STARROCKS_PASSWORD="your_password"
On Windows (Command Prompt):
set STARROCKS_PASSWORD=your_password
- Initialize the ops schema:
starrocks-br init --config config.yaml
- Start using the tool (see Commands section below for details).
Configuration
Important: After installing the package, you need to create your own config.yaml file. This file is not included in the package - each user creates it with their own StarRocks connection details. You can place it anywhere and reference it using the --config parameter.
Create a config.yaml file in your working directory (or any location you prefer) with your StarRocks connection details:
host: "127.0.0.1"
port: 9030
user: "root"
database: "your_database"
repository: "your_repo_name"
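Before pointing the CLI at a new file, it can help to sanity-check the parsed settings. The sketch below is illustrative only: it assumes all five keys shown above are required and works on an already-parsed dictionary (how you load the YAML is up to you). `validate_config` is a hypothetical helper, not part of starrocks-br.

```python
# Keys that appear in the example config.yaml in this guide
# (assumed here to be mandatory).
REQUIRED_KEYS = ("host", "port", "user", "database", "repository")


def validate_config(config: dict) -> list:
    """Return a list of problems; an empty list means the config looks complete."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    port = config.get("port")
    if port is not None and not (isinstance(port, int) and 0 < port < 65536):
        problems.append(f"port must be an integer between 1 and 65535, got {port!r}")
    return problems


if __name__ == "__main__":
    example = {
        "host": "127.0.0.1",
        "port": 9030,
        "user": "root",
        "database": "your_database",
        "repository": "your_repo_name",
    }
    print(validate_config(example) or "config looks complete")
```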
Password Management
The database password must be provided via the STARROCKS_PASSWORD environment variable. This is a security measure to prevent storing credentials in configuration files.
export STARROCKS_PASSWORD="your_password"
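For wrapper scripts that call the CLI, a fail-fast check on the variable avoids a confusing connection error later. This is a small sketch under the assumption that an empty value should be treated the same as a missing one; `read_password` is a hypothetical helper name.

```python
import os
from typing import Mapping, Optional


def read_password(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the password from STARROCKS_PASSWORD or raise if it is unset or empty."""
    env = os.environ if env is None else env
    password = env.get("STARROCKS_PASSWORD")
    if not password:
        raise RuntimeError("STARROCKS_PASSWORD is not set")
    return password


if __name__ == "__main__":
    try:
        read_password()
        print("password found in environment")
    except RuntimeError as exc:
        print(f"error: {exc}")
```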
Connecting with TLS/SSL
The tool can make secure connections to StarRocks using TLS. Add an optional tls section to your config.yaml when you need encryption.
Scenario 1: Server Authentication (Most Common)
Use this setup when the client only needs to verify the StarRocks server certificate.
host: "127.0.0.1"
port: 9030
user: "root"
database: "your_database"
repository: "your_repo_name"
tls:
  enabled: true
  ca_cert: "/path/to/ca.pem"
- enabled: Turns TLS on or off.
- ca_cert: Certificate Authority file used to validate the server certificate.
- verify_server_cert (optional, default true): Disable only if you need to skip certificate validation.
Scenario 2: Mutual TLS (mTLS)
Use this when both the client and server must present certificates.
host: "127.0.0.1"
port: 9030
user: "root"
database: "your_database"
repository: "your_repo_name"
tls:
  enabled: true
  ca_cert: "/path/to/ca.pem"
  client_cert: "/path/to/client-cert.pem"
  client_key: "/path/to/client-key.pem"
- client_cert: Client certificate presented to the server.
- client_key: Private key paired with the client certificate.
Regardless of the scenario, the connection defaults to modern TLS versions (TLSv1.2, TLSv1.3). Provide a tls_versions list if you need different protocol settings.
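To make the meaning of the tls options concrete, here is a rough sketch of how such a section could map onto Python's standard `ssl` module. It is an assumption about the mapping, not the tool's actual code: `build_ssl_context` is a hypothetical function, and the tls_versions option is left out for brevity.

```python
import ssl
from typing import Optional


def build_ssl_context(tls: dict) -> Optional[ssl.SSLContext]:
    """Translate a tls config section into an ssl.SSLContext (None when TLS is off)."""
    if not tls.get("enabled", False):
        return None

    # PROTOCOL_TLS_CLIENT starts with hostname checking and CERT_REQUIRED enabled.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.minimum_version = ssl.TLSVersion.TLSv1_2

    # Scenario 1: trust the given CA when validating the server certificate.
    if tls.get("ca_cert"):
        context.load_verify_locations(cafile=tls["ca_cert"])

    # verify_server_cert defaults to true; hostname checking must be
    # switched off before verification can be disabled.
    if not tls.get("verify_server_cert", True):
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE

    # Scenario 2 (mTLS): present a client certificate and key.
    if tls.get("client_cert") and tls.get("client_key"):
        context.load_cert_chain(certfile=tls["client_cert"], keyfile=tls["client_key"])

    return context


if __name__ == "__main__":
    ctx = build_ssl_context({"enabled": True})
    print(ctx.verify_mode, ctx.minimum_version)
```

Setting the minimum version to TLSv1.2 leaves TLSv1.3 available, which matches the defaults described above.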
Note: The repository must be created in StarRocks using the CREATE REPOSITORY command before running backups. For example:
CREATE REPOSITORY `your_repo_name`
WITH S3
ON LOCATION "s3://your-backup-bucket/backups/"
PROPERTIES (
"aws.s3.access_key" = "your-access-key",
"aws.s3.secret_key" = "your-secret-key",
"aws.s3.endpoint" = "https://s3.amazonaws.com"
);
Commands
Initialize Schema
Before running backups, initialize the ops database and control tables:
starrocks-br init --config config.yaml
What it does:
- Creates the ops database
- Creates ops.table_inventory: Inventory groups mapping to databases/tables
- Creates ops.backup_history: Backup operation history
- Creates ops.restore_history: Restore operation history
- Creates ops.run_status: Job concurrency control
- Creates ops.backup_partitions: Partition manifest for each backup (enables intelligent restore)
Next step: Populate ops.table_inventory with your backup groups. For example:
INSERT INTO ops.table_inventory (inventory_group, database_name, table_name)
VALUES
('daily_facts', 'your_db', 'fact_sales'),
('weekly_dims', 'your_db', 'dim_users'),
('weekly_dims', 'your_db', 'dim_products'),
('full_db_backup', 'your_db', '*'); -- Wildcard for all tables
Note: If you skip this step, the ops schema will be auto-created on your first backup/restore operation (with a warning).
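The inventory rows, including the '*' wildcard, can be thought of as a lookup from group name to concrete tables. The sketch below illustrates that idea on in-memory data; `resolve_group` and the `tables_by_database` mapping are hypothetical, and the real tool's resolution logic may differ.

```python
def resolve_group(inventory, group, tables_by_database):
    """Expand one inventory group into a list of (database, table) pairs.

    inventory: iterable of (inventory_group, database_name, table_name) rows.
    tables_by_database: mapping of database name to the tables it contains,
        used only to expand the '*' wildcard.
    """
    resolved = []
    for inv_group, database, table in inventory:
        if inv_group != group:
            continue
        if table == "*":
            resolved.extend((database, name) for name in tables_by_database.get(database, []))
        else:
            resolved.append((database, table))
    # Drop duplicates while keeping the first-seen order.
    return list(dict.fromkeys(resolved))


if __name__ == "__main__":
    rows = [
        ("daily_facts", "your_db", "fact_sales"),
        ("weekly_dims", "your_db", "dim_users"),
        ("weekly_dims", "your_db", "dim_products"),
        ("full_db_backup", "your_db", "*"),
    ]
    catalog = {"your_db": ["fact_sales", "dim_users", "dim_products"]}
    print(resolve_group(rows, "weekly_dims", catalog))
    print(resolve_group(rows, "full_db_backup", catalog))
```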
Backup Commands
Backups are managed through "inventory groups" defined in ops.table_inventory. This provides a flexible way to schedule different backup strategies for different sets of tables.
1. Full Backup
Runs a full backup for all tables within a specified inventory group.
starrocks-br backup full --config config.yaml --group <group_name>
Parameters:
--group: The inventory group to back up.
Internal flow:
- Load config → verify cluster health → ensure repository exists
- Reserve job slot (prevent concurrent backups)
- Query ops.table_inventory for all tables in the specified group.
- Generate a unique backup label.
- Build and execute the BACKUP command for the resolved tables.
- Poll SHOW BACKUP until completion and log results.
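The "build the BACKUP command" step above can be pictured as simple string assembly. This sketch follows the general `BACKUP SNAPSHOT ... TO ... ON (...)` shape of StarRocks backup statements; the exact SQL the tool emits is not shown in this guide, so treat `build_backup_sql` and its output as an illustrative assumption.

```python
def quote(identifier: str) -> str:
    """Wrap an identifier in backticks, doubling any embedded backtick."""
    return "`" + identifier.replace("`", "``") + "`"


def build_backup_sql(database: str, label: str, repository: str, tables: list) -> str:
    """Assemble a BACKUP statement for the given tables (sketch only)."""
    if not tables:
        raise ValueError("at least one table is required")
    table_list = ", ".join(quote(table) for table in tables)
    return (
        f"BACKUP SNAPSHOT {quote(database)}.{quote(label)} "
        f"TO {quote(repository)} ON ({table_list})"
    )


if __name__ == "__main__":
    print(build_backup_sql(
        "sales_db", "sales_db_20251014_full", "your_repo_name",
        ["dim_customers", "dim_products"],
    ))
```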
2. Incremental Backup
Backs up only the partitions that have changed since the last successful full backup for a given inventory group.
starrocks-br backup incremental --config config.yaml --group <group_name>
Parameters:
--group: The inventory group to back up.
--baseline-backup (Optional): Specify a backup label to use as the baseline instead of the latest full backup.
Internal flow:
- Load config → verify cluster health → ensure repository exists
- Reserve job slot
- Find the latest successful full backup for the group to use as a baseline.
- Find recent partitions from information_schema.partitions for tables in the group.
- Generate a unique backup label.
- Build and execute the BACKUP command for the new partitions.
- Poll SHOW BACKUP until completion and log results.
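The partition-selection step above can be sketched as a filter against the baseline. The guide does not say how "recent" is decided, so this example assumes each partition carries a last-update timestamp that is compared with the time the baseline full backup finished; `changed_partitions` and the row layout are hypothetical.

```python
from datetime import datetime


def changed_partitions(partitions, baseline_time):
    """Group partitions updated after the baseline by table name.

    partitions: iterable of (table_name, partition_name, last_update_time) rows.
    baseline_time: datetime at which the baseline full backup finished.
    """
    changed = {}
    for table, partition, updated_at in partitions:
        if updated_at > baseline_time:
            changed.setdefault(table, []).append(partition)
    return changed


if __name__ == "__main__":
    rows = [
        ("fact_orders", "p20251014", datetime(2025, 10, 14, 2, 0)),
        ("fact_orders", "p20251015", datetime(2025, 10, 15, 2, 0)),
        ("dim_customers", "p_all", datetime(2025, 10, 1, 0, 0)),
    ]
    baseline = datetime(2025, 10, 14, 12, 0)
    print(changed_partitions(rows, baseline))
```

Tables with no changed partitions are simply absent from the result, so they would not appear in the incremental BACKUP at all.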
Restore Commands
Intelligent Point-in-Time Restore
Restores data to a specific point in time using intelligent backup chain resolution. This command automatically determines the correct sequence of backups needed for restore.
starrocks-br restore \
--config config.yaml \
--target-label my_db_20251016_inc \
--group daily_facts \
--rename-suffix _restored
Parameters:
--config: Path to config YAML file (required)
--target-label: Backup label to restore to (required)
--group: Optional inventory group to filter tables to restore (cannot be used with --table)
--table: Optional table name to restore (table name only, database comes from config). Cannot be used with --group
--rename-suffix: Suffix for temporary tables during restore (default: _restored)
How it works:
- For full backups: Restores directly from the target backup
- For incremental backups: Automatically restores the base full backup first, then applies the incremental
- Safety mechanism: Uses temporary tables with the specified suffix, then performs atomic rename to make restored data live
Three Restore Modes:
- Disaster Recovery: Restore all tables from a backup (omit both --group and --table parameters)
- Surgical Restore by Group: Restore only specific table groups (use --group parameter)
- Single Table Restore: Restore a specific table (use --table parameter). The table name should not include the database prefix - the database comes from the config file.
Table Name Format:
When using --table, provide only the table name (e.g., fact_sales), not database.table_name. The database is taken from the database field in your config file. For multiple tables, set up an inventory group and use --group instead.
Purpose of --rename-suffix:
The restore process creates temporary tables with the specified suffix (e.g., table_restored) to avoid conflicts with existing tables. Once the restore is complete and verified, the tool performs atomic renames to swap the original tables with the restored data. This ensures data safety and allows for rollback if needed.
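As an illustration of the suffix mechanism, the sketch below derives the temporary table name for each restored table and a statement that would exchange it with the live table. The guide does not show the statements the tool runs; `ALTER TABLE ... SWAP WITH ...` is used here only as one StarRocks statement that exchanges two tables atomically, and `plan_swap` is a hypothetical helper.

```python
def plan_swap(tables, suffix="_restored"):
    """Return (temporary_name, swap_statement) pairs for each restored table."""
    plan = []
    for table in tables:
        temporary = f"{table}{suffix}"
        # Assumed statement shape; the tool's real rename logic may differ.
        plan.append((temporary, f"ALTER TABLE `{table}` SWAP WITH `{temporary}`"))
    return plan


if __name__ == "__main__":
    for temporary, statement in plan_swap(["fact_sales", "dim_users"]):
        print(temporary, "->", statement)
```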
Internal flow:
- Load config → verify cluster health → ensure repository exists
- Find the correct restore sequence (full backup + optional incremental)
- Get tables from backup manifest (optionally filtered by group)
- Execute restore flow with atomic renames
- Log to ops.restore_history
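The chain resolution described in this section (full backup first, then the incremental) can be sketched over rows shaped like ops.backup_history. The column values used here ('full', 'incremental', 'FINISHED') are assumptions for illustration, group filtering is omitted, and `resolve_restore_chain` is a hypothetical function rather than the tool's implementation.

```python
def resolve_restore_chain(target_label, history):
    """Return the ordered list of backup labels needed to restore target_label.

    history: list of dicts with keys label, backup_type, status, finished_at.
    finished_at values only need to be comparable (for example ISO timestamps).
    """
    by_label = {row["label"]: row for row in history}
    target = by_label.get(target_label)
    if target is None or target["status"] != "FINISHED":
        raise LookupError(f"no successful backup with label {target_label!r}")

    # A full backup restores on its own.
    if target["backup_type"] == "full":
        return [target_label]

    # An incremental needs the latest successful full backup taken before it.
    candidates = [
        row for row in history
        if row["backup_type"] == "full"
        and row["status"] == "FINISHED"
        and row["finished_at"] <= target["finished_at"]
    ]
    if not candidates:
        raise LookupError(f"no full backup found before {target_label!r}")
    base = max(candidates, key=lambda row: row["finished_at"])
    return [base["label"], target_label]


if __name__ == "__main__":
    history = [
        {"label": "sales_db_20251012_full", "backup_type": "full",
         "status": "FINISHED", "finished_at": "2025-10-12T01:30:00"},
        {"label": "sales_db_20251014_full", "backup_type": "full",
         "status": "FINISHED", "finished_at": "2025-10-14T01:30:00"},
        {"label": "sales_db_20251015_inc", "backup_type": "incremental",
         "status": "FINISHED", "finished_at": "2025-10-15T01:10:00"},
    ]
    print(resolve_restore_chain("sales_db_20251015_inc", history))
```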
Example Usage Scenarios
Initial Setup
# 1. Initialize ops schema (run once)
starrocks-br init --config config.yaml
# 2. Populate table inventory with your groups (in StarRocks)
INSERT INTO ops.table_inventory (inventory_group, database_name, table_name)
VALUES
('daily_incrementals', 'sales_db', 'fact_orders'),
('weekly_full', 'sales_db', 'dim_customers'),
('weekly_full', 'sales_db', 'dim_products');
Daily Incremental Backup (Mon-Sat)
# Run via cron at 01:00 (cron runs commands with /bin/sh by default, so "." is used instead of "source")
0 1 * * 1-6 cd /path/to/starrocks-br && . .venv/bin/activate && starrocks-br backup incremental --config config.yaml --group daily_incrementals
Weekly Full Backup (Sunday)
# Run via cron at 01:00 on Sundays
0 1 * * 0 cd /path/to/starrocks-br && . .venv/bin/activate && starrocks-br backup full --config config.yaml --group weekly_full
Disaster Recovery - Point-in-Time Restore
# Restore to a specific backup point (automatically handles full + incremental chain)
starrocks-br restore \
--config config.yaml \
--target-label sales_db_20251015_inc \
--group daily_incrementals
# Restore all tables from a full backup
starrocks-br restore \
--config config.yaml \
--target-label sales_db_20251014_full
# Restore a single table from a backup
starrocks-br restore \
--config config.yaml \
--target-label sales_db_20251015_inc \
--table fact_orders
Error Handling
The CLI automatically handles:
- Job slot conflicts: Prevents overlapping backups/restores via ops.run_status
- Label collisions: Automatically appends _r# suffix if label exists
- Cluster health: Verifies FE/BE status before starting operations
- Repository validation: Ensures repository exists and is accessible
- Graceful failures: All errors are logged to history tables with proper status
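The label-collision rule above can be illustrated with a few lines of code. This sketch assumes the `#` in `_r#` is a counter that starts at 1 and increases until an unused label is found; `unique_label` is a hypothetical helper, not the tool's function.

```python
def unique_label(base: str, existing) -> str:
    """Return base unchanged if unused, else base with the first free _r<N> suffix."""
    taken = set(existing)
    if base not in taken:
        return base
    attempt = 1
    while f"{base}_r{attempt}" in taken:
        attempt += 1
    return f"{base}_r{attempt}"


if __name__ == "__main__":
    used = {"sales_db_20251015_inc", "sales_db_20251015_inc_r1"}
    print(unique_label("sales_db_20251015_inc", used))
    print(unique_label("sales_db_20251016_inc", used))
```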
Monitoring
All operations are logged to:
- ops.backup_history: Tracks all backup attempts with status, timestamps, and error messages
- ops.restore_history: Tracks all restore operations with verification checksums
- ops.run_status: Tracks active jobs to prevent conflicts
Query examples:
-- Check recent backup status
SELECT label, backup_type, status, started_at, finished_at
FROM ops.backup_history
ORDER BY started_at DESC
LIMIT 10;
-- Check for failed backups
SELECT label, backup_type, error_message, started_at
FROM ops.backup_history
WHERE status = 'FAILED'
ORDER BY started_at DESC;
-- Check active jobs
SELECT scope, label, state, started_at
FROM ops.run_status
WHERE state = 'ACTIVE';
Download files
File details
Details for the file starrocks_br-0.3.0.tar.gz.
File metadata
- Download URL: starrocks_br-0.3.0.tar.gz
- Upload date:
- Size: 58.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f5263d36f0ec82bfe8fcc635b97311dfec78cfcdb15a8109e257d945adec3ef6 |
| MD5 | 85acb2b86f5569131831a6c8893d98db |
| BLAKE2b-256 | 5f35194a0632e1e8ba220634f5ea4b751f0048d6ef17a0225526fa3756cdc593 |
File details
Details for the file starrocks_br-0.3.0-py3-none-any.whl.
File metadata
- Download URL: starrocks_br-0.3.0-py3-none-any.whl
- Upload date:
- Size: 33.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5cae2e7c64bb31a33b65ea6a3a743b6ab9645971d217467cff0494f336a4fbcd |
| MD5 | 6121e383aa693b84e7ae50e5d4d396a4 |
| BLAKE2b-256 | e99dee8e26c6b27e8ecc2a596ce8d34c7f8301c626ea2d6c24cff6309e0a8dbd |