AWS CDK construct library for Aurora backup and restore using ECS on a schedule, storing backups in S3.
cdk-library-aurora-native-backup
A CDK construct library that creates and manages Docker images for Aurora PostgreSQL native backups using pg_dump.
The resulting images are designed for use with Amazon ECS Fargate for scalable, serverless backup operations.
Features
- Multi-Database Support: Back up multiple databases from the same Aurora cluster in a single service
- Pre-built Docker Image: Amazon Linux 2023 base with PostgreSQL 17 client tools and AWS CLI v2
- ECR Repository Management: Automatically creates and manages ECR repositories with security best practices
- Complete Backup Service: Ready-to-use ECS Fargate service for scheduled Aurora backups
- EFS and S3 Support: Built-in support for backing up to EFS with S3 sync
- Comprehensive Backup: Uses pg_dump directory format for efficient storage and simplified restore
- Production Ready: Includes proper error handling, logging, and cleanup mechanisms
- Secure Authentication: Uses AWS Secrets Manager for database password management
API Doc
See API
Interface Structure
The library provides two main constructs, each with its own configuration interface:
- AuroraBackupRepository (AuroraBackupRepositoryProps): Manages the ECR repository and Docker image for backups.
- AuroraNativeBackupService (AuroraNativeBackupServiceProps): Manages the backup service infrastructure (VPC, Aurora cluster, S3 bucket, compute resources, etc.) and uses AuroraBackupConnectionProps for database connection settings (username, database names array, password secret).
This separation allows for cleaner organization of image/repository management, connection credentials, and infrastructure settings.
Multi-Database Support
The library supports backing up multiple databases from the same Aurora PostgreSQL cluster in a single backup service. Simply provide an array of database names in the databaseNames property (defaults to ['postgres'] if not specified). Each database will be backed up separately and stored in its own S3 folder structure.
Database User Setup
Create a dedicated database user with read-only backup permissions on ALL databases to be backed up.
For PostgreSQL 14+ (recommended), use the built-in pg_read_all_data role for comprehensive read access:
-- Connect to each database and grant permissions
\c your_database_1;
GRANT CONNECT ON DATABASE your_database_1 TO backup_user;
GRANT pg_read_all_data TO backup_user;
-- Repeat for each additional database
\c your_database_2;
GRANT CONNECT ON DATABASE your_database_2 TO backup_user;
GRANT pg_read_all_data TO backup_user;
The pg_read_all_data role automatically provides:
- SELECT on all tables and views
- USAGE on all schemas
- SELECT and USAGE on all sequences
- Access to future objects without requiring additional grants
Note: This library requires PostgreSQL 14 or newer for the pg_read_all_data role.
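The grants above assume the backup_user role already exists. It can be created once per cluster with standard PostgreSQL SQL; the role name and password below are placeholders:

```sql
-- Create the dedicated backup role (replace the placeholder password)
CREATE ROLE backup_user WITH LOGIN PASSWORD 'change-me';
```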
Shortcomings
- The backup service requires password-based authentication (no IAM database authentication for now)
- The backup container runs as a scheduled task, not continuously, so it cannot capture incremental changes
- Custom backup scripts are not currently supported, only the built-in pg_dump functionality
- When backing up multiple databases, if one database backup fails, the task continues with the remaining databases but the overall task does not fail; individual database backup failures must be monitored through CloudWatch logs
Examples
Prerequisites
To use this construct, you must have:
- An AWS CDK stack with a defined environment (account and region)
- An existing VPC for the backup service
- An existing Aurora PostgreSQL database cluster
- An AWS Secrets Manager secret containing database credentials (recommended)
- A database user with the required backup permissions (see above)
Complete Backup Service (Recommended)
For most use cases, use the AuroraNativeBackupService which provides a complete, ready-to-use backup solution:
TypeScript
import { Stack, StackProps, Duration, aws_ec2 as ec2, aws_rds as rds, aws_scheduler as scheduler, aws_secretsmanager as secretsmanager } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { AuroraNativeBackupService, AuroraBackupRepository } from '@renovosolutions/cdk-library-aurora-native-backup';
export class BackupServiceStack extends Stack {
constructor(scope: Construct, id: string, props: StackProps) {
super(scope, id, props);
// Your existing Aurora PostgreSQL database cluster and VPC
const vpc = ec2.Vpc.fromLookup(this, 'Vpc', { isDefault: true });
const dbCluster = rds.DatabaseCluster.fromDatabaseClusterAttributes(this, 'DbCluster', {
clusterIdentifier: 'my-production-cluster',
clusterEndpointAddress: 'cluster.xyz.region.rds.amazonaws.com',
port: 5432,
});
// First create the backup repository
const backupRepository = new AuroraBackupRepository(this, 'BackupRepository', {
repositoryName: 'aurora-postgres-backup',
});
// Secret containing the backup user's password
const backupUserSecret = secretsmanager.Secret.fromSecretAttributes(this, 'BackupUserSecret', {
secretArn: 'arn:aws:secretsmanager:region:account:secret:backup-user-password-abc123',
});
// Create the complete backup service
const backupService = new AuroraNativeBackupService(this, 'BackupService', {
cluster: dbCluster,
vpc,
backupBucketName: 'my-aurora-production-backups',
ecrRepository: backupRepository.repository,
connection: {
username: 'backup_user',
databaseNames: ['production', 'analytics', 'reporting'],
passwordSecret: backupUserSecret,
},
retentionDays: 30,
backupSchedule: scheduler.ScheduleExpression.cron({ minute: '0', hour: '2' }), // Daily at 2 AM UTC
cpu: 1024, // Override default of 256
memoryLimitMiB: 2048, // Override default of 512
});
}
}
Python
from aws_cdk import (
Stack,
Duration,
aws_ec2 as ec2,
aws_rds as rds,
aws_scheduler as scheduler,
aws_secretsmanager as secretsmanager
)
from constructs import Construct
from cdk_library_aurora_native_backup import AuroraNativeBackupService, AuroraBackupRepository
class BackupServiceStack(Stack):
def __init__(self, scope: Construct, id: str, **kwargs):
super().__init__(scope, id, **kwargs)
# Your existing Aurora PostgreSQL database cluster and VPC
vpc = ec2.Vpc.from_lookup(self, "Vpc", is_default=True)
db_cluster = rds.DatabaseCluster.from_database_cluster_attributes(self, "DbCluster",
cluster_identifier="my-production-cluster",
cluster_endpoint_address="cluster.xyz.region.rds.amazonaws.com",
port=5432
)
# First create the backup repository
backup_repository = AuroraBackupRepository(self, "BackupRepository",
repository_name="aurora-postgres-backup"
)
# Secret containing the backup user's password
backup_user_secret = secretsmanager.Secret.from_secret_attributes(self, "BackupUserSecret",
secret_arn="arn:aws:secretsmanager:region:account:secret:backup-user-password-abc123"
)
# Create the complete backup service
backup_service = AuroraNativeBackupService(self, "BackupService",
cluster=db_cluster,
vpc=vpc,
backup_bucket_name="my-aurora-production-backups",
ecr_repository=backup_repository.repository,
connection={
"username": "backup_user",
"database_names": ["production", "analytics", "reporting"],
"password_secret": backup_user_secret
},
retention_days=30,
backup_schedule=scheduler.ScheduleExpression.cron(minute='0', hour='2'), # Daily at 2 AM UTC
cpu=1024, # Override default of 256
memory_limit_mib=2048  # Override default of 512
)
Environment Variables
All environment variables used by the backup container are set automatically by the constructs. You do not need to set them manually.
| Environment Variable | Description | CDK Prop / Source |
|---|---|---|
| DB_HOST | Aurora PostgreSQL database cluster endpoint | cluster.clusterEndpoint.hostname |
| DB_NAMES | Array of database names to back up | connection.databaseNames |
| DB_USER | Database username | connection.username |
| DB_PASSWORD | Database password | connection.passwordSecret |
| AWS_REGION | AWS region | Stack.region |
| CLUSTER_IDENTIFIER | Cluster ID used as S3 path prefix (backups/{CLUSTER_IDENTIFIER}/) | cluster.clusterIdentifier |
| DB_PORT | Database port (default: 5432) | cluster.clusterEndpoint.port |
| BACKUP_ROOT | Backup directory (default: /mnt/aurora-backups) | (internal default) |
| S3_BUCKET | S3 bucket for backup sync | backupBucketName |
| S3_PREFIX | S3 prefix (default: backups) | (internal default) |
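The defaults from the table can be summarized in a short sketch. This is illustrative only: in particular, the way the container serializes DB_NAMES (assumed comma-separated here) is our assumption, not confirmed by the library.

```python
import os

def backup_config(env=None):
    """Resolve the backup container's settings, applying the documented
    defaults for DB_PORT, BACKUP_ROOT, and S3_PREFIX."""
    env = os.environ if env is None else env
    return {
        "db_host": env["DB_HOST"],
        # Serialization of DB_NAMES is an assumption (comma-separated here)
        "db_names": env["DB_NAMES"].split(","),
        "db_user": env["DB_USER"],
        "db_port": int(env.get("DB_PORT", "5432")),
        "backup_root": env.get("BACKUP_ROOT", "/mnt/aurora-backups"),
        "s3_bucket": env["S3_BUCKET"],
        "s3_prefix": env.get("S3_PREFIX", "backups"),
    }
```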
Backup Process
- Validation: Checks AWS credentials and creates backup directories
- Database Backup: For each database in the DB_NAMES array:
  - Uses pg_dump --format=directory with gzip compression (level 9) for each data file
  - Creates a separate backup directory per database with a date stamp
  - If one database backup fails, continues with the remaining databases
- Verification: Validates that each backup contains a toc.dat file
- S3 Sync: Syncs each database backup to the S3 bucket under separate database folders
- Cleanup: Removes local backups after a successful S3 sync
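The continue-on-failure behavior described above can be sketched as follows. This is an illustrative Python model of the container's logic, not the actual implementation; dump_fn stands in for the pg_dump invocation.

```python
from datetime import date
from pathlib import Path

def run_backups(db_names, backup_root, dump_fn):
    """Back up each database into <backup_root>/<db>/<YYYY-MM-DD>/,
    verify toc.dat exists, and collect failures instead of aborting."""
    failed = []
    for db in db_names:
        target = Path(backup_root) / db / date.today().isoformat()
        target.mkdir(parents=True, exist_ok=True)
        try:
            dump_fn(db, target)  # stand-in for pg_dump --format=directory
            if not (target / "toc.dat").exists():
                raise RuntimeError("missing toc.dat")
        except Exception as exc:
            failed.append((db, str(exc)))  # continue with remaining databases
    return failed
```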
Security Considerations
- ECR repositories created with image scanning enabled
- EFS encryption in transit supported
- IAM permissions follow principle of least privilege
- Use AWS Secrets Manager for database passwords in production
- Consider VPC endpoints for S3 to avoid internet traffic
Backup Storage Structure
Local EFS structure (per database):
/mnt/aurora-backups/
├── production/
│ └── YYYY-MM-DD/
│ ├── toc.dat # PostgreSQL table of contents
│ ├── ####.dat.gz # Compressed table data files
│ └── ####.dat.gz # Additional data files
├── analytics/
│ └── YYYY-MM-DD/
│ ├── toc.dat
│ └── ####.dat.gz
└── reporting/
└── YYYY-MM-DD/
├── toc.dat
└── ####.dat.gz
S3 structure:
s3://my-backup-bucket/
└── backups/
└── {CLUSTER_IDENTIFIER}/
├── production/
│ └── YYYY-MM-DD/
│ ├── toc.dat
│ └── ####.dat.gz
├── analytics/
│ └── YYYY-MM-DD/
│ ├── toc.dat
│ └── ####.dat.gz
└── reporting/
└── YYYY-MM-DD/
├── toc.dat
└── ####.dat.gz
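For scripting against this layout, the key prefix for one database's dated backup can be derived directly; the defaults mirror the environment-variable table, and the helper name is ours, not part of the library.

```python
def backup_prefix(cluster_identifier, database, backup_date, s3_prefix="backups"):
    """Return the S3 key prefix for one database's dated backup,
    matching the structure shown above."""
    return f"{s3_prefix}/{cluster_identifier}/{database}/{backup_date}/"
```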
Restoration
Interactive Restore CLI (Recommended)
This library includes an interactive TypeScript CLI that simplifies the restore process with auto-discovery and guided prompts:
npx ts-node restore_script/aurora-restore-cli.ts
Features:
- Auto-discovery: Automatically finds S3 backup buckets using the aurora_native_backup_bucket=true tag
- Interactive selection: Guided prompts for cluster, database, backup date, and tables
- Table-level restore: Select specific tables or restore the entire database
- Optimized downloads: Only downloads required backup files
- Ready-to-run commands: Generates and optionally executes pg_restore commands
Prerequisites:
- Node.js and TypeScript installed
- AWS credentials configured (via AWS CLI, environment variables, or IAM role)
- pg_restore command available in your PATH
- Network access to the target PostgreSQL database
- Database user with restore permissions on the target database:
  - CREATE privilege (for creating tables, indexes, constraints)
  - INSERT privilege (for loading data)
  - USAGE and CREATE on schemas
  - For full database restore: CREATEDB privilege or superuser role
Setup and Execution:
First, install dependencies:
cd restore_script
yarn install
Then run the interactive CLI:
npx ts-node aurora-restore-cli.ts
The CLI will guide you through selecting your backup source, target database, and specific tables to restore.
Workflow:
- S3 Configuration: Auto-discovers backup bucket or prompts for manual entry
- Source Selection: Choose cluster, database, and backup date
- Table Selection: Select specific tables or full database restore
- Target Configuration: Enter target database connection details
- Execution: Downloads backup files and generates restore command
Manual Restoration
For advanced users or automation, backups are stored in S3 under organized paths:
s3://my-backup-bucket/backups/{CLUSTER_IDENTIFIER}/{DATABASE_NAME}/YYYY-MM-DD/
Download backup files:
aws s3 cp --recursive s3://my-backup-bucket/backups/{CLUSTER_IDENTIFIER}/production/YYYY-MM-DD/ /path/to/backup/directory/
Restore commands:
Full database restore (with -C, pg_restore first connects to the existing database named by -d, typically postgres, then creates the backed-up database and restores into it):
pg_restore -h target-host -U username -d postgres -v -C /path/to/backup/directory/
List backup contents:
pg_restore --list /path/to/backup/directory/
Selective table restore:
pg_restore -h target-host -U username -d target_db -v -t table_name /path/to/backup/directory/
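When automating the manual path, the pg_restore invocation can be assembled programmatically. A minimal sketch mirroring the examples above (the helper is ours, not part of the library):

```python
def restore_command(host, user, db, backup_dir, tables=None, create_db=False):
    """Build a pg_restore argument list; tables=None means a full restore,
    create_db=True adds -C to recreate the database from the backup."""
    cmd = ["pg_restore", "-h", host, "-U", user, "-d", db, "-v"]
    if create_db:
        cmd.append("-C")
    for table in tables or []:
        cmd += ["-t", table]
    cmd.append(backup_dir)
    return cmd
```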
Contributing
Contributions are welcome! Please follow these guidelines to help us maintain and improve the project:
Code Structure and Interfaces
- The main user-facing interfaces are:
  - AuroraBackupRepositoryProps in src/aurora-backup-repository.ts
  - AuroraNativeBackupServiceProps and AuroraBackupConnectionProps in src/aurora-native-backup-service.ts
- All constructs and their configuration interfaces are defined in the src/ directory.
Code Generation and Project Tasks
- This project uses projen for project management and code generation.
- If you change the project configuration (.projenrc.ts), run npx projen to regenerate all managed files, including package.json and other configuration files.
Building and Testing
- To build the project and run all tests, run yarn build. This compiles the code, runs unit tests, and ensures everything is up to date.
License
This project is licensed under the Apache License, Version 2.0 - see the LICENSE file for details.