Serverless video processing using AWS ECS Fargate
CloudBurst Fargate - Serverless Video Processing
My second open source project, now powered by AWS ECS Fargate!
Author: Leo Wang (leowang.net)
Email: me@leowang.net
License: MIT
Related Projects:
- Original CloudBurst (EC2): https://github.com/preangelleo/cloudburst
- Video Generation API: https://github.com/preangelleo/video-generation-docker
- Chinese documentation: README_CN.md
What is this?
A production-ready Python framework that uses AWS ECS Fargate for serverless, on-demand video generation with parallel processing capabilities.
Core Value: When your application needs to generate videos (using our Video Generation API), this framework:
- Starts Fargate containers in about 30 seconds (vs roughly 75 seconds for EC2)
- Parallel processing: Handle multiple scenes across concurrent containers
- Processes your video generation requests with zero infrastructure management
- Downloads completed videos automatically with "process one → download one" efficiency
- Containers terminate automatically after processing
- Pay-per-second billing with no idle costs
Perfect for: Production applications that need scalable serverless video processing without the complexity of managing EC2 instances.
Installation
Install from PyPI
pip install cloudburst-fargate
Install from GitHub
pip install git+https://github.com/preangelleo/cloudburst-fargate.git
Install from Source
git clone https://github.com/preangelleo/cloudburst-fargate.git
cd cloudburst-fargate
pip install -e .
CloudBurst Evolution: EC2 → Fargate
| Feature | CloudBurst EC2 (v1) | CloudBurst Fargate (v2) |
|---|---|---|
| Startup Time | ~75 seconds | ~30 seconds |
| Infrastructure | Manage EC2 instances | Fully serverless |
| Parallel Processing | Single instance only | Multiple concurrent tasks |
| Availability | Subject to quota limits | Near 100% availability |
| Scaling | Limited by EC2 capacity | Unlimited concurrent tasks |
| Cost Model | Per-minute billing | Per-second billing |
| Idle Costs | Risk of forgotten instances | Zero idle costs |
Quick Start
1. Install Package
pip install cloudburst-fargate
2. Setup AWS Permissions (CRITICAL)
CloudBurst Fargate requires specific IAM permissions to manage ECS tasks, access VPC resources, and handle container operations. You'll need to add permissions 4 times during setup:
Required IAM Permissions
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecs:RunTask",
        "ecs:StopTask",
        "ecs:DescribeTasks",
        "ecs:DescribeClusters",
        "ecs:ListTasks",
        "ecs:ListTagsForResource",
        "ecs:TagResource"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeNetworkInterfaces",
        "ec2:DescribeSubnets",
        "ec2:DescribeSecurityGroups",
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:DescribeVpcs"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogGroups",
        "logs:DescribeLogStreams"
      ],
      "Resource": "arn:aws:logs:*:*:log-group:/ecs/cloudburst*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "iam:PassRole"
      ],
      "Resource": "arn:aws:iam::*:role/ecsTaskExecutionRole"
    }
  ]
}
Step-by-Step Permission Setup
- First Permission: ECS Task Management
  # Add ECS permissions for running and managing Fargate tasks
  aws iam attach-user-policy --user-name YOUR_USER --policy-arn arn:aws:iam::aws:policy/AmazonECS_FullAccess
- Second Permission: VPC and Network Access
  # Add EC2 permissions for VPC, subnets, and security groups
  # Note: this managed policy is read-only and does not include ec2:AuthorizeSecurityGroupIngress from the policy above
  aws iam attach-user-policy --user-name YOUR_USER --policy-arn arn:aws:iam::aws:policy/AmazonEC2ReadOnlyAccess
- Third Permission: CloudWatch Logs
  # Add CloudWatch permissions for container logging
  aws iam attach-user-policy --user-name YOUR_USER --policy-arn arn:aws:iam::aws:policy/CloudWatchLogsFullAccess
- Fourth Permission: IAM Role Passing
  # Add permission to pass execution roles to ECS tasks
  aws iam put-user-policy --user-name YOUR_USER --policy-name ECSTaskRolePass --policy-document file://pass-role-policy.json
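The fourth step reads its policy from pass-role-policy.json, which this README does not show. Below is a minimal sketch of generating that file, assuming it only needs the iam:PassRole statement from the JSON policy above. The file contents and the helper name build_pass_role_policy are assumptions for illustration, not part of the project.

```python
import json


def build_pass_role_policy(role_name="ecsTaskExecutionRole"):
    """Return an inline policy allowing iam:PassRole on one role (assumed shape)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["iam:PassRole"],
                "Resource": f"arn:aws:iam::*:role/{role_name}",
            }
        ],
    }


if __name__ == "__main__":
    # Write the file that the `aws iam put-user-policy` command points at.
    with open("pass-role-policy.json", "w", encoding="utf-8") as fh:
        json.dump(build_pass_role_policy(), fh, indent=2)
```

Run it once in the directory where you issue the aws command, then check the generated file before attaching it.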
3. Setup Environment
# Copy and customize configuration
cp .env.example .env
# Edit .env with your AWS credentials and VPC settings:
# - AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY (with permissions above)
# - AWS_SUBNET_ID (your VPC subnet with internet access)
# - AWS_SECURITY_GROUP_ID (allows port 5000 and outbound HTTPS)
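Before launching a task, it can help to confirm that these settings are actually visible to your process. The sketch below checks the four variable names listed above. Treating all four as required, and the helper name missing_env_vars, are assumptions of this example rather than documented behavior.

```python
import os

# Variable names taken from the .env notes above; treating all four as
# required is an assumption of this sketch.
REQUIRED_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SUBNET_ID",
    "AWS_SECURITY_GROUP_ID",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present")
```

This only inspects the process environment. If the values live in a .env file, load that file first (for example with the python-dotenv package) or export the variables in your shell.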
4. Test Your Setup
from cloudburst_fargate import FargateOperationV1
# Quick single-scene test
processor = FargateOperationV1(config_priority=1)
scenes = [{
"scene_name": "test_scene",
"image_path": "path/to/image.png",
"audio_path": "path/to/audio.mp3",
"subtitle_path": "path/to/subtitle.srt" # Optional
}]
result = processor.execute_batch(scenes, language="english", enable_zoom=True)
print(f"Generated {result['successful_scenes']} videos")
5. Parallel Processing (Production Ready!)
from cloudburst_fargate.fargate_operation import execute_parallel_batches
# Process multiple scenes across parallel Fargate containers
scenes = [
{"scene_name": "scene_001", "image_path": "...", "audio_path": "..."},
{"scene_name": "scene_002", "image_path": "...", "audio_path": "..."},
{"scene_name": "scene_003", "image_path": "...", "audio_path": "..."},
{"scene_name": "scene_004", "image_path": "...", "audio_path": "..."}
]
# Automatically distribute across 2 parallel tasks (2 scenes each)
result = execute_parallel_batches(
scenes=scenes,
scenes_per_batch=2, # 2 scenes per Fargate container
max_parallel_tasks=2, # 2 concurrent containers
language="english",
enable_zoom=True,
config_priority=1, # CPU configuration (1-5, default: 4)
watermark_path=None, # Optional watermark image
is_portrait=False, # Portrait mode (default: False)
saving_dir="./output", # Output directory
background_box=True, # Subtitle background (default: True)
background_opacity=0.2 # Background transparency 0-1 (default: 0.2)
)
print(f"Efficiency: {result['efficiency']['speedup_factor']:.2f}x speedup")
print(f"Total cost: ${result['total_cost_usd']:.4f}")
print(f"{len(result['downloaded_files'])} videos downloaded")
Parallel Processing Architecture
CloudBurst Fargate v2 introduces true parallel processing:
Architecture Benefits
- Concurrent Tasks: Multiple Fargate containers running simultaneously
- Intelligent Distribution: Scenes automatically distributed across tasks
- Efficient Workflow: Each task processes scenes → downloads → terminates
- Cost Optimized: Pay only for actual processing time across all containers
Example: 4 Scenes, 2 Tasks
Task 1: Start → Process scene_001 → Download → Process scene_002 → Download → Terminate
Task 2: Start → Process scene_003 → Download → Process scene_004 → Download → Terminate
Result: 1.8x speedup, all videos downloaded automatically
Fargate Configuration Options
Choose the right performance level for your workload:
# Economy: 1 vCPU, 2GB RAM (~$0.044/hour) - Light workloads
processor = FargateOperationV1(config_priority=5)
# Standard: 2 vCPU, 4GB RAM (~$0.088/hour) - Most common choice
processor = FargateOperationV1(config_priority=1)
# High Performance: 4 vCPU, 8GB RAM (~$0.175/hour) - Heavy scenes
processor = FargateOperationV1(config_priority=2)
# Ultra Performance: 8 vCPU, 16GB RAM (~$0.351/hour) - Maximum speed
processor = FargateOperationV1(config_priority=3)
# Maximum Performance: 16 vCPU, 32GB RAM (~$0.702/hour) - Enterprise
processor = FargateOperationV1(config_priority=4)
Complete Example (Production Ready)
See example_usage.py for comprehensive examples including:
- All CPU configuration options
- Complete API parameter reference
- Single scene processing
- Batch processing examples
- Parallel processing configurations
- Cost optimization strategies
# Quick parallel processing example
from cloudburst_fargate import FargateOperationV1
from cloudburst_fargate.fargate_operation import execute_parallel_batches
result = execute_parallel_batches(
scenes=your_scenes,
scenes_per_batch=3, # Scenes per container
max_parallel_tasks=4, # Concurrent containers
language="chinese", # or "english"
enable_zoom=True, # Add zoom effects
config_priority=2, # High performance config (1-5)
min_scenes_per_batch=5, # Min scenes to justify startup (default: 5)
watermark_path=None, # Optional watermark
is_portrait=False, # Portrait video mode
saving_dir="./videos", # Output directory
background_box=True, # Show subtitle background
background_opacity=0.2 # Subtitle transparency
)
# Automatic results:
# - All videos processed and downloaded
# - Optimal cost distribution across parallel tasks
# - Detailed efficiency and timing metrics
Key Advantages
1. True Serverless with Parallel Scale
- Per-second billing from container start to finish
- Multiple concurrent containers for faster processing
- No risk of forgotten running instances
- Automatic cleanup guaranteed
2. Zero Infrastructure Management
- No EC2 instances to monitor
- No SSH keys or security patches
- AWS handles all infrastructure and scaling
3. Production Performance
- 30-second startup vs 75+ seconds for EC2
- Parallel processing across multiple containers
- Intelligent scene distribution and load balancing
- Consistent performance (no "noisy neighbor" issues)
4. Enterprise Ready
- Built-in high availability and auto-retry
- Integrated with AWS CloudWatch logging
- VPC networking support
- Cost tracking and optimization
Cost Comparison
Example: Processing 8 video scenes
| Approach | Configuration | Time | Cost | Efficiency |
|---|---|---|---|---|
| Sequential (Single Task) | 2 vCPU | 16 min | $0.024 | 1.0x |
| Parallel (4 Tasks × 2 Scenes) | 2 vCPU each | 9 min | $0.026 | 1.8x faster |
| 24/7 GPU Server | Always on | - | ~$500/month | - |
Key Insight: Minimal cost increase (about 8%) for a 1.8x speedup (about 44% less wall-clock time)!
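The percentages follow directly from the two rows of the table. A short sketch of the arithmetic, using only the times and costs shown there (the helper name percent_change is made up for this example):

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Figures copied from the table above.
sequential_minutes, sequential_cost = 16, 0.024
parallel_minutes, parallel_cost = 9, 0.026

speedup = sequential_minutes / parallel_minutes
time_change = percent_change(sequential_minutes, parallel_minutes)
cost_change = percent_change(sequential_cost, parallel_cost)

print(f"Speedup: {speedup:.1f}x")
print(f"Wall-clock time change: {time_change:.0f}%")
print(f"Cost change: {cost_change:+.0f}%")
```

With these inputs the speedup is 16/9, or about 1.8x, which corresponds to roughly 44% less wall-clock time for roughly 8% more cost.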
Advanced Features
Intelligent Scene Distribution
The framework automatically:
- Distributes scenes evenly across parallel tasks
- Handles remainder scenes when batch sizes don't divide evenly
- Optimizes for cost vs speed based on your configuration
Real-time Monitoring
# Built-in cost tracking and performance metrics
result = execute_parallel_batches(scenes=scenes)  # pass any other options as keyword arguments
print(f"Tasks used: {result['tasks_used']}")
print(f"Processing efficiency: {result['efficiency']['processing_efficiency']:.1f}%")
print(f"Speedup factor: {result['efficiency']['speedup_factor']:.2f}x")
print(f"Cost per scene: ${result['total_cost_usd']/len(scenes):.4f}")
Flexible Configuration
# Environment variables or .env file
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AWS_SUBNET_ID=subnet-xxxxxxxxx
AWS_SECURITY_GROUP_ID=sg-xxxxxxxxx
ECS_CLUSTER_NAME=cloudburst-cluster
ECS_TASK_DEFINITION=cloudburst-task
File Structure
After cleanup, the project structure is:
cloudburst_fargate/
├── fargate_operation_v1.py   # Core Fargate operations and parallel processing
├── example_usage.py          # Complete usage examples and API reference
├── README.md                 # This file
├── README_CN.md              # Chinese documentation
├── requirements.txt          # Python dependencies
├── .env.example              # Environment template
├── Docs/                     # Technical documentation
└── backup_test_files/        # Test files (Git ignored)
Troubleshooting
Common Issues
Task fails to start:
- Check subnet and security group IDs in .env
- Ensure subnet has internet access (public subnet or NAT gateway)
- Verify AWS credentials with correct permissions
Network errors:
- Security group must allow outbound HTTPS (port 443) for Docker pulls
- Security group must allow inbound TCP port 5000 for API access
Permission errors:
- Verify AWS credentials: aws sts get-caller-identity
- Required IAM permissions: ECS, ECR, CloudWatch, EC2 (for VPC)
Debug Mode
# Enable detailed AWS logging
import logging
logging.basicConfig(level=logging.DEBUG)
# Or check CloudWatch logs: /ecs/cloudburst
Task Monitoring and Management (New in v2)
CloudBurst Fargate now includes advanced task monitoring and cleanup capabilities to ensure reliable production operations:
List Running Tasks
from cloudburst_fargate import FargateOperationV1
# Initialize the operation
fargate_op = FargateOperationV1()
# List all running Fargate tasks created by animagent
running_tasks = fargate_op.list_running_tasks(filter_animagent_only=True)
for task in running_tasks:
    print(f"Task: {task['task_arn']}")
    print(f"Status: {task['status']}")
    print(f"Started: {task['started_at']}")
    print(f"Public IP: {task['public_ip']}")
    print(f"Tags: {task['tags']}")
Cleanup Stale Tasks
# Clean up all animagent-created tasks (a second safety net on top of automatic termination)
cleanup_result = fargate_op.cleanup_all_tasks(
reason="Scheduled cleanup",
filter_animagent_only=True # Only cleanup tasks tagged with CreatedBy=animagent
)
print(f"Cleanup result: {cleanup_result['message']}")
print(f"Tasks terminated: {cleanup_result['terminated_count']}")
print(f"Failed cleanups: {cleanup_result['failed_count']}")
Task Identification
All tasks created by CloudBurst Fargate are automatically tagged for easy identification:
- CreatedBy:animagent - Identifies tasks created by this framework
- Purpose:video-generation - Marks the task purpose
- Scene: Scene name being processed
- Language: Processing language (english/chinese)
This tagging system ensures that cleanup operations only affect tasks created by your application, preventing interference with other services using the same ECS cluster.
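The tag-based filtering described above can be illustrated with a small stand-alone sketch. It assumes tags arrive as a list of {"key": ..., "value": ...} dictionaries, which is how the ECS API reports them. The format actually returned by list_running_tasks() is not documented here, so adjust the lookup if it differs. The task records and the helper name is_animagent_task are hypothetical.

```python
def is_animagent_task(task):
    """True if the task carries the CreatedBy=animagent tag."""
    tags = task.get("tags") or []
    return any(
        tag.get("key") == "CreatedBy" and tag.get("value") == "animagent"
        for tag in tags
    )


# Hypothetical task records; real ones would come from list_running_tasks().
tasks = [
    {"task_arn": "arn:example:task/a", "tags": [
        {"key": "CreatedBy", "value": "animagent"},
        {"key": "Purpose", "value": "video-generation"},
    ]},
    {"task_arn": "arn:example:task/b", "tags": [
        {"key": "CreatedBy", "value": "other-service"},
    ]},
    {"task_arn": "arn:example:task/c", "tags": []},
]

owned = [t["task_arn"] for t in tasks if is_animagent_task(t)]
print(owned)
```

Only the first record is selected, so tasks owned by other services and untagged tasks in the same cluster are left alone.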
API Reference: execute_parallel_batches()
Complete Parameter List
execute_parallel_batches(
scenes: List[Dict], # Required: List of scene dictionaries
scenes_per_batch: int = 10, # Scenes per Fargate container
max_parallel_tasks: int = 10, # Maximum concurrent containers
language: str = "chinese", # Language: "chinese" or "english"
enable_zoom: bool = True, # Enable zoom in/out effects
config_priority: int = 4, # CPU config (1-5, see table below)
min_scenes_per_batch: int = 5, # Minimum scenes to justify container startup
watermark_path: str = None, # Optional watermark image path
is_portrait: bool = False, # Portrait video mode
saving_dir: str = None, # Output directory (default: ./cloudburst_fargate_results/)
background_box: bool = True, # Show subtitle background
background_opacity: float = 0.2 # Background transparency (0=opaque, 1=transparent)
) -> Dict
Scene Dictionary Format
Each scene in the scenes list must contain:
{
"scene_name": "unique_name", # Required: Unique identifier for the scene
"image_path": "path/to/image", # Required: Path to image file
"audio_path": "path/to/audio", # Required: Path to audio file
"subtitle_path": "path/to/srt" # Optional: Path to subtitle file
}
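Checking scene dictionaries before starting any containers avoids paying for a task that fails on bad input. The sketch below validates the required keys and the uniqueness of scene_name as described above. The function name validate_scenes is an assumption for this example, and the framework may perform its own checks.

```python
REQUIRED_KEYS = ("scene_name", "image_path", "audio_path")


def validate_scenes(scenes):
    """Return a list of readable problems; an empty list means the input looks valid."""
    problems = []
    seen = set()
    for index, scene in enumerate(scenes):
        # Every required key must be present and non-empty.
        for key in REQUIRED_KEYS:
            if not scene.get(key):
                problems.append(f"scene {index}: missing '{key}'")
        # scene_name must be unique across the whole list.
        name = scene.get("scene_name")
        if name in seen:
            problems.append(f"scene {index}: duplicate scene_name '{name}'")
        elif name:
            seen.add(name)
    return problems
```

Call it on your scenes list and only submit the batch when the returned list is empty. It does not check that the files exist on disk.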
CPU Configuration Priority
| Priority | vCPU | Memory | Name | Cost/Hour | Best For |
|---|---|---|---|---|---|
| 1 | 2 | 4GB | STANDARD | $0.088 | Most tasks |
| 2 | 4 | 8GB | HIGH_PERFORMANCE | $0.175 | Faster processing |
| 3 | 8 | 16GB | ULTRA_PERFORMANCE | $0.351 | Very fast |
| 4 | 16 | 32GB | MAXIMUM_PERFORMANCE | $0.702 | Fastest (default) |
| 5 | 1 | 2GB | ECONOMY | $0.044 | Cost-sensitive |
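Because billing is per second, a rough cost for one task can be estimated from the hourly rates in this table. The following is a sketch only: the rates are copied from the table, the function name estimate_task_cost is hypothetical, and real charges depend on region and on how long the container actually runs.

```python
# Hourly rates copied from the table above (USD per hour).
HOURLY_RATE_USD = {1: 0.088, 2: 0.175, 3: 0.351, 4: 0.702, 5: 0.044}


def estimate_task_cost(config_priority, seconds):
    """Estimate one task's cost under per-second billing."""
    if config_priority not in HOURLY_RATE_USD:
        raise ValueError(f"unknown config_priority: {config_priority}")
    return HOURLY_RATE_USD[config_priority] * seconds / 3600.0


# A 16-minute run on the STANDARD configuration (priority 1).
print(f"${estimate_task_cost(1, 16 * 60):.4f}")
```

Multiply by the number of parallel tasks to approximate a whole batch, keeping in mind that each task also pays for its own startup time.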
Return Value Structure
{
"success": bool, # Overall success status
"total_scenes": int, # Total number of input scenes
"successful_scenes": int, # Successfully processed scenes
"failed_scenes": int, # Failed scenes count
"total_cost_usd": float, # Total cost in USD
"total_duration": float, # Total time in seconds
"downloaded_files": List[str], # Paths to downloaded videos
"task_results": List[Dict], # Individual task results
"tasks_used": int, # Number of Fargate tasks used
"efficiency": {
"speedup_factor": float, # Speedup vs sequential processing
"processing_efficiency": float, # Percentage of time spent processing
"cost_per_scene": float # Average cost per scene
}
}
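A result dictionary with this shape can be condensed into a single log line. The sketch below uses a hand-written sample containing only the keys it reads. The values are illustrative, not real output, and summarize_result is a hypothetical helper.

```python
def summarize_result(result):
    """Build a one-line summary from a result dictionary."""
    status = "OK" if result["success"] else "FAILED"
    return (
        f"{status}: {result['successful_scenes']}/{result['total_scenes']} scenes, "
        f"{result['tasks_used']} tasks, "
        f"${result['total_cost_usd']:.4f} total, "
        f"{result['efficiency']['speedup_factor']:.2f}x speedup"
    )


# Hand-written sample values for illustration only.
sample = {
    "success": True,
    "total_scenes": 8,
    "successful_scenes": 8,
    "total_cost_usd": 0.026,
    "tasks_used": 4,
    "efficiency": {"speedup_factor": 1.8},
}
print(summarize_result(sample))
```

In practice you would pass the dictionary returned by execute_parallel_batches() instead of the sample.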
Smart Distribution Examples
# Example 1: Even distribution
# 50 scenes, batch=10, max_tasks=10 โ 5 tasks ร 10 scenes each
# Example 2: Redistribution for efficiency
# 120 scenes, batch=10, max_tasks=10 โ 10 tasks ร 12 scenes each
# Example 3: Handling remainders
# 101 scenes, batch=10, max_tasks=10 โ 9 tasks ร 10 scenes + 1 task ร 11 scenes
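One distribution rule that reproduces all three examples is sketched below: cap the task count at max_parallel_tasks, then spread scenes as evenly as possible. This is an assumption about the behavior, not the framework's actual code. It ignores min_scenes_per_batch, and the name plan_batches is made up for the example.

```python
import math


def plan_batches(total_scenes, scenes_per_batch, max_parallel_tasks):
    """Return the number of scenes assigned to each task."""
    if total_scenes <= 0:
        return []
    # Tasks needed at the requested batch size, capped by the concurrency limit.
    wanted = math.ceil(total_scenes / scenes_per_batch)
    tasks = min(wanted, max_parallel_tasks)
    base, extra = divmod(total_scenes, tasks)
    # The first `extra` tasks take one additional scene each.
    return [base + 1 if i < extra else base for i in range(tasks)]


for total in (50, 120, 101):
    print(total, plan_batches(total, scenes_per_batch=10, max_parallel_tasks=10))
```

For 101 scenes this gives one task with 11 scenes and nine tasks with 10, matching the third example.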
Roadmap
- Parallel Processing: Multiple concurrent Fargate tasks (completed)
- Fargate Spot: Up to 70% cost reduction for non-critical workloads
- Auto-scaling: Dynamic resource allocation based on queue size
- S3 Integration: Direct file transfer without local downloads
- Webhook Support: Real-time notifications when processing completes
- GPU Support: Fargate GPU instances for AI-intensive workloads
License
MIT License - Same as the original CloudBurst project
From single-task processing to parallel serverless scale - CloudBurst Fargate is production ready!
Stop managing infrastructure, start processing videos at scale.