Serverless video processing using AWS ECS Fargate

CloudBurst Fargate - Serverless Video Processing

PyPI version Python 3.8+ License: MIT AWS ECS

My second open-source project, now powered by AWS ECS Fargate! 🚀

Author: Leo Wang (leowang.net)
Email: me@leowang.net
License: MIT

What is this?

A production-ready Python framework that uses AWS ECS Fargate for serverless, on-demand video generation with parallel processing capabilities.

Core Value: When your application needs to generate videos (using our Video Generation API), this framework:

  • 🚀 Starts Fargate containers in about 30 seconds (vs. roughly 75 seconds for EC2)
  • ⚡ Processes multiple scenes in parallel across concurrent containers
  • 🎬 Processes your video generation requests with zero infrastructure management
  • 📥 Downloads completed videos automatically with "process one → download one" efficiency
  • 🛑 Terminates containers automatically after processing
  • 💰 Bills per second, with no idle costs

Perfect for: Production applications that need scalable serverless video processing without the complexity of managing EC2 instances.

📦 Installation

Install from PyPI

pip install cloudburst-fargate

Install from GitHub

pip install git+https://github.com/preangelleo/cloudburst-fargate.git

Install from Source

git clone https://github.com/preangelleo/cloudburst-fargate.git
cd cloudburst-fargate
pip install -e .

🆚 CloudBurst Evolution: EC2 → Fargate

Feature              CloudBurst EC2 (v1)          CloudBurst Fargate (v2)
Startup Time         ~75 seconds                  ~30 seconds ⚡
Infrastructure       Manage EC2 instances         Fully serverless 🎯
Parallel Processing  Single instance only         Multiple concurrent tasks 🔄
Availability         Subject to quota limits      Near 100% availability ✅
Scaling              Limited by EC2 capacity      Unlimited concurrent tasks 📈
Cost Model           Per-minute billing           Per-second billing 💰
Idle Costs           Risk of forgotten instances  Zero idle costs 🔥

🚀 Quick Start

1. Install Package

pip install cloudburst-fargate

2. Setup AWS Permissions (CRITICAL)

CloudBurst Fargate requires specific IAM permissions to manage ECS tasks, access VPC resources, and handle container operations. The full policy is listed first, followed by the four steps that grant it:

Required IAM Permissions

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecs:RunTask",
        "ecs:StopTask", 
        "ecs:DescribeTasks",
        "ecs:DescribeClusters",
        "ecs:ListTasks",
        "ecs:ListTagsForResource",
        "ecs:TagResource"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeNetworkInterfaces",
        "ec2:DescribeSubnets",
        "ec2:DescribeSecurityGroups",
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:DescribeVpcs"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogGroups",
        "logs:DescribeLogStreams"
      ],
      "Resource": "arn:aws:logs:*:*:log-group:/ecs/cloudburst*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "iam:PassRole"
      ],
      "Resource": "arn:aws:iam::*:role/ecsTaskExecutionRole"
    }
  ]
}

Step-by-Step Permission Setup

  1. First Permission: ECS Task Management

    # Add ECS permissions for running and managing Fargate tasks
    aws iam attach-user-policy --user-name YOUR_USER --policy-arn arn:aws:iam::aws:policy/AmazonECS_FullAccess
    
  2. Second Permission: VPC and Network Access

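    # Add EC2 read permissions for VPC, subnets, and security groups.
    # Note: this managed policy is read-only, so it does not cover the
    # ec2:AuthorizeSecurityGroupIngress action listed in the policy above;
    # grant that action separately if your setup needs it.
    aws iam attach-user-policy --user-name YOUR_USER --policy-arn arn:aws:iam::aws:policy/AmazonEC2ReadOnlyAccess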
    
  3. Third Permission: CloudWatch Logs

    # Add CloudWatch permissions for container logging
    aws iam attach-user-policy --user-name YOUR_USER --policy-arn arn:aws:iam::aws:policy/CloudWatchLogsFullAccess
    
  4. Fourth Permission: IAM Role Passing

    # Add permission to pass execution roles to ECS tasks
    aws iam put-user-policy --user-name YOUR_USER --policy-name ECSTaskRolePass --policy-document file://pass-role-policy.json
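
The command above loads its policy from pass-role-policy.json, which is not shown in this README. A minimal sketch that writes such a file, assuming it should hold only the iam:PassRole statement from the policy listed earlier; the helper names build_pass_role_policy and write_policy are illustrative and not part of the package:

```python
import json
from pathlib import Path


def build_pass_role_policy(role_name: str = "ecsTaskExecutionRole") -> dict:
    """Return an IAM policy that allows iam:PassRole on a single role name."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["iam:PassRole"],
                "Resource": f"arn:aws:iam::*:role/{role_name}",
            }
        ],
    }


def write_policy(path: str = "pass-role-policy.json") -> Path:
    """Write the policy as JSON so the AWS CLI can load it via file://."""
    target = Path(path)
    target.write_text(json.dumps(build_pass_role_policy(), indent=2) + "\n")
    return target


if __name__ == "__main__":
    print(f"Wrote {write_policy()}")
```

Run the script in the directory where you will call aws iam put-user-policy, so that the relative file:// path resolves.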
    

3. Setup Environment

# Copy and customize configuration
cp .env.example .env

# Edit .env with your AWS credentials and VPC settings:
# - AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY (with permissions above)
# - AWS_SUBNET_ID (your VPC subnet with internet access) 
# - AWS_SECURITY_GROUP_ID (allows port 5000 and outbound HTTPS)

4. Test Your Setup

from cloudburst_fargate import FargateOperationV1

# Quick single-scene test
processor = FargateOperationV1(config_priority=1)
scenes = [{
    "scene_name": "test_scene",
    "image_path": "path/to/image.png",
    "audio_path": "path/to/audio.mp3",
    "subtitle_path": "path/to/subtitle.srt"  # Optional
}]

result = processor.execute_batch(scenes, language="english", enable_zoom=True)
print(f"✅ Generated {result['successful_scenes']} videos")

5. Parallel Processing (Production Ready!)

from cloudburst_fargate.fargate_operation import execute_parallel_batches

# Process multiple scenes across parallel Fargate containers
scenes = [
    {"scene_name": "scene_001", "image_path": "...", "audio_path": "..."},
    {"scene_name": "scene_002", "image_path": "...", "audio_path": "..."},
    {"scene_name": "scene_003", "image_path": "...", "audio_path": "..."},
    {"scene_name": "scene_004", "image_path": "...", "audio_path": "..."}
]

# Automatically distribute across 2 parallel tasks (2 scenes each)
result = execute_parallel_batches(
    scenes=scenes,
    scenes_per_batch=2,        # 2 scenes per Fargate container
    max_parallel_tasks=2,      # 2 concurrent containers
    language="english",
    enable_zoom=True,
    config_priority=1,         # CPU configuration (1-5, default: 4)
    watermark_path=None,       # Optional watermark image
    is_portrait=False,         # Portrait mode (default: False)
    saving_dir="./output",     # Output directory
    background_box=True,       # Subtitle background (default: True)
    background_opacity=0.2     # Background transparency 0-1 (default: 0.2)
)

print(f"🚀 Efficiency: {result['efficiency']['speedup_factor']:.2f}x speedup")
print(f"💰 Total cost: ${result['total_cost_usd']:.4f}")
print(f"📁 {len(result['downloaded_files'])} videos downloaded")

⚡ Parallel Processing Architecture

CloudBurst Fargate v2 introduces true parallel processing:

Architecture Benefits

  • Concurrent Tasks: Multiple Fargate containers running simultaneously
  • Intelligent Distribution: Scenes automatically distributed across tasks
  • Efficient Workflow: Each task processes scenes → downloads → terminates
  • Cost Optimized: Pay only for actual processing time across all containers

Example: 4 Scenes, 2 Tasks

Task 1: Start → Process scene_001 → Download → Process scene_002 → Download → Terminate
Task 2: Start → Process scene_003 → Download → Process scene_004 → Download → Terminate

Result: 1.8x speedup, all videos downloaded automatically
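
The workflow above can be imitated locally without touching AWS. This is only a sketch of the pattern, not the framework's implementation: run_task is a stand-in for a Fargate container, threads stand in for concurrent tasks, and the .mp4 file names are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_batches(scenes, scenes_per_batch):
    """Slice the scene list into consecutive batches of the given size."""
    return [
        scenes[i:i + scenes_per_batch]
        for i in range(0, len(scenes), scenes_per_batch)
    ]


def run_task(batch):
    """Stand-in for one container: handle each scene, then 'download' it."""
    downloaded = []
    for scene in batch:
        # A real task would render the video here, then fetch the file.
        downloaded.append(f"{scene['scene_name']}.mp4")
    return downloaded


def run_parallel(scenes, scenes_per_batch=2, max_parallel_tasks=2):
    """Run one stand-in task per batch, at most max_parallel_tasks at a time."""
    batches = split_into_batches(scenes, scenes_per_batch)
    with ThreadPoolExecutor(max_workers=max_parallel_tasks) as pool:
        results = list(pool.map(run_task, batches))  # map keeps batch order
    return [name for task_files in results for name in task_files]


scenes = [{"scene_name": f"scene_{i:03d}"} for i in range(1, 5)]
print(run_parallel(scenes))
# ['scene_001.mp4', 'scene_002.mp4', 'scene_003.mp4', 'scene_004.mp4']
```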

📊 Fargate Configuration Options

Choose the right performance level for your workload:

# Economy: 1 vCPU, 2GB RAM (~$0.044/hour) - Light workloads
processor = FargateOperationV1(config_priority=5)

# Standard: 2 vCPU, 4GB RAM (~$0.088/hour) - Most common choice
processor = FargateOperationV1(config_priority=1)  # Default

# High Performance: 4 vCPU, 8GB RAM (~$0.175/hour) - Heavy scenes
processor = FargateOperationV1(config_priority=2)

# Ultra Performance: 8 vCPU, 16GB RAM (~$0.351/hour) - Maximum speed
processor = FargateOperationV1(config_priority=3)

# Maximum Performance: 16 vCPU, 32GB RAM (~$0.702/hour) - Enterprise
processor = FargateOperationV1(config_priority=4)

🎬 Complete Example (Production Ready)

See example_usage.py for comprehensive examples including:

  • All CPU configuration options
  • Complete API parameter reference
  • Single scene processing
  • Batch processing examples
  • Parallel processing configurations
  • Cost optimization strategies

# Quick parallel processing example
from cloudburst_fargate import FargateOperationV1
from cloudburst_fargate.fargate_operation import execute_parallel_batches

result = execute_parallel_batches(
    scenes=your_scenes,
    scenes_per_batch=3,          # Scenes per container
    max_parallel_tasks=4,        # Concurrent containers  
    language="chinese",          # or "english"
    enable_zoom=True,            # Add zoom effects
    config_priority=2,           # High performance config (1-5)
    min_scenes_per_batch=5,      # Min scenes to justify startup (default: 5)
    watermark_path=None,         # Optional watermark
    is_portrait=False,           # Portrait video mode
    saving_dir="./videos",       # Output directory
    background_box=True,         # Show subtitle background
    background_opacity=0.2       # Subtitle transparency
)

# Automatic results:
# ✅ All videos processed and downloaded
# 💰 Optimal cost distribution across parallel tasks
# 📈 Detailed efficiency and timing metrics

💡 Key Advantages

1. True Serverless with Parallel Scale

  • Per-second billing from container start to finish
  • Multiple concurrent containers for faster processing
  • No risk of forgotten running instances
  • Automatic cleanup guaranteed

2. Zero Infrastructure Management

  • No EC2 instances to monitor
  • No SSH keys or security patches
  • AWS handles all infrastructure and scaling

3. Production Performance

  • 30-second startup vs 75+ seconds for EC2
  • Parallel processing across multiple containers
  • Intelligent scene distribution and load balancing
  • Consistent performance (no "noisy neighbor" issues)

4. Enterprise Ready

  • Built-in high availability and auto-retry
  • Integrated with AWS CloudWatch logging
  • VPC networking support
  • Cost tracking and optimization

💰 Cost Comparison

Example: Processing 8 video scenes

Approach                          Configuration  Time    Cost         Efficiency
Sequential (Single Task)          2 vCPU         16 min  $0.024       1.0x
🏆 Parallel (4 Tasks × 2 Scenes)  2 vCPU each    9 min   $0.026       1.8x faster
24/7 GPU Server                   Always on      -       ~$500/month  -

Key Insight: Minimal cost increase (about 8%) for a 1.8x speedup, which is roughly 44% less wall-clock time (16 min down to 9 min).
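
As a quick check, the ratios can be recomputed from the table's own figures. This is plain arithmetic on the numbers shown above, not output from the framework:

```python
# Figures copied from the comparison table above.
sequential_minutes, sequential_cost = 16, 0.024
parallel_minutes, parallel_cost = 9, 0.026

speedup = sequential_minutes / parallel_minutes
time_saved = 1 - parallel_minutes / sequential_minutes
extra_cost = parallel_cost / sequential_cost - 1

print(f"Speedup: {speedup:.1f}x")                  # Speedup: 1.8x
print(f"Wall-clock time saved: {time_saved:.0%}")  # Wall-clock time saved: 44%
print(f"Extra cost: {extra_cost:.0%}")             # Extra cost: 8%
```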

🔧 Advanced Features

Intelligent Scene Distribution

The framework automatically:

  • Distributes scenes evenly across parallel tasks
  • Handles remainder scenes when batch sizes don't divide evenly
  • Optimizes for cost vs speed based on your configuration

Real-time Monitoring

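# Built-in cost tracking and performance metrics
from cloudburst_fargate.fargate_operation import execute_parallel_batches

# Pass any other options you need; every parameter except scenes has a default
result = execute_parallel_batches(scenes=scenes)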

print(f"Tasks used: {result['tasks_used']}")
print(f"Processing efficiency: {result['efficiency']['processing_efficiency']:.1f}%")
print(f"Speedup factor: {result['efficiency']['speedup_factor']:.2f}x")
print(f"Cost per scene: ${result['total_cost_usd']/len(scenes):.4f}")

Flexible Configuration

# Environment variables or .env file
AWS_ACCESS_KEY_ID=your_access_key
AWS_SECRET_ACCESS_KEY=your_secret_key
AWS_SUBNET_ID=subnet-xxxxxxxxx
AWS_SECURITY_GROUP_ID=sg-xxxxxxxxx
ECS_CLUSTER_NAME=cloudburst-cluster
ECS_TASK_DEFINITION=cloudburst-task

๐Ÿ› ๏ธ File Structure

After cleanup, the project structure is:

cloudburst_fargate/
├── fargate_operation_v1.py    # Core Fargate operations and parallel processing
├── example_usage.py           # Complete usage examples and API reference
├── README.md                  # This file
├── README_CN.md               # Chinese documentation
├── requirements.txt           # Python dependencies
├── .env.example               # Environment template
├── Docs/                      # Technical documentation
└── backup_test_files/         # Test files (Git ignored)

🔧 Troubleshooting

Common Issues

Task fails to start:

  • Check subnet and security group IDs in .env
  • Ensure subnet has internet access (public subnet or NAT gateway)
  • Verify AWS credentials with correct permissions

Network errors:

  • Security group must allow outbound HTTPS (port 443) for Docker pulls
  • Security group must allow inbound TCP port 5000 for API access
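
To tell a security group problem apart from an application problem, a plain TCP check against port 5000 is often enough. A minimal sketch using only the standard library; port_is_open is a hypothetical helper, and the host would be the task's public IP:

```python
import socket


def port_is_open(host: str, port: int = 5000, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A False result for a task that is running suggests the inbound rule or the subnet routing rather than the container itself. The public IP can be read from the public_ip field returned by list_running_tasks.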

Permission errors:

  • Verify AWS credentials: aws sts get-caller-identity
  • Required IAM permissions: ECS, ECR, CloudWatch, EC2 (for VPC)

Debug Mode

# Enable detailed AWS logging
import logging
logging.basicConfig(level=logging.DEBUG)

# Or check CloudWatch logs: /ecs/cloudburst
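
Root-level DEBUG logging can be very noisy. If the framework talks to AWS through boto3 (an assumption; this README does not say which client library it uses), a narrower setup keeps other loggers at INFO and raises only the AWS SDK loggers to DEBUG. The function name is illustrative:

```python
import logging


def enable_aws_debug_logging() -> None:
    """Show DEBUG output from boto3/botocore while other loggers stay at INFO."""
    logging.basicConfig(level=logging.INFO)
    for name in ("boto3", "botocore"):
        logging.getLogger(name).setLevel(logging.DEBUG)


enable_aws_debug_logging()
```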

Task Monitoring and Management (New in v2)

CloudBurst Fargate now includes advanced task monitoring and cleanup capabilities to ensure reliable production operations:

List Running Tasks

from cloudburst_fargate import FargateOperationV1

# Initialize the operation
fargate_op = FargateOperationV1()

# List all running Fargate tasks created by animagent
running_tasks = fargate_op.list_running_tasks(filter_animagent_only=True)

for task in running_tasks:
    print(f"Task: {task['task_arn']}")
    print(f"Status: {task['status']}")
    print(f"Started: {task['started_at']}")
    print(f"Public IP: {task['public_ip']}")
    print(f"Tags: {task['tags']}")

Cleanup Stale Tasks

# Cleanup all animagent-created tasks (double security mechanism)
cleanup_result = fargate_op.cleanup_all_tasks(
    reason="Scheduled cleanup",
    filter_animagent_only=True  # Only cleanup tasks tagged with CreatedBy=animagent
)

print(f"Cleanup result: {cleanup_result['message']}")
print(f"Tasks terminated: {cleanup_result['terminated_count']}")
print(f"Failed cleanups: {cleanup_result['failed_count']}")

Task Identification

All tasks created by CloudBurst Fargate are automatically tagged for easy identification:

  • CreatedBy: animagent - Identifies tasks created by this framework
  • Purpose: video-generation - Marks the task purpose
  • Scene: Scene name being processed
  • Language: Processing language (english/chinese)

This tagging system ensures that cleanup operations only affect tasks created by your application, preventing interference with other services using the same ECS cluster.
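
The same ownership check can be applied to task descriptions you fetch yourself. The sketch below assumes tags in the shape the ECS API uses, a list of {"key": ..., "value": ...} entries; the shape of task['tags'] returned by list_running_tasks is not documented here and may differ. The helper names and example ARNs are made up:

```python
def has_tag(tags, key, value):
    """Return True if an ECS-style tag list contains the given key/value pair."""
    return any(t.get("key") == key and t.get("value") == value for t in tags or [])


def owned_by_framework(task):
    """Check the CreatedBy=animagent tag on a task description dict."""
    return has_tag(task.get("tags"), "CreatedBy", "animagent")


tasks = [
    {"taskArn": "arn:example:task/a", "tags": [{"key": "CreatedBy", "value": "animagent"}]},
    {"taskArn": "arn:example:task/b", "tags": [{"key": "CreatedBy", "value": "other-service"}]},
    {"taskArn": "arn:example:task/c", "tags": []},
]
print([t["taskArn"] for t in tasks if owned_by_framework(t)])
# ['arn:example:task/a']
```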

📚 API Reference: execute_parallel_batches()

Complete Parameter List

execute_parallel_batches(
    scenes: List[Dict],              # Required: List of scene dictionaries
    scenes_per_batch: int = 10,      # Scenes per Fargate container
    max_parallel_tasks: int = 10,    # Maximum concurrent containers
    language: str = "chinese",       # Language: "chinese" or "english"
    enable_zoom: bool = True,        # Enable zoom in/out effects
    config_priority: int = 4,        # CPU config (1-5, see table below)
    min_scenes_per_batch: int = 5,   # Minimum scenes to justify container startup
    watermark_path: str = None,      # Optional watermark image path
    is_portrait: bool = False,       # Portrait video mode
    saving_dir: str = None,          # Output directory (default: ./cloudburst_fargate_results/)
    background_box: bool = True,     # Show subtitle background
    background_opacity: float = 0.2  # Background transparency (0=opaque, 1=transparent)
) -> Dict

Scene Dictionary Format

Each scene in the scenes list must contain:

{
    "scene_name": "unique_name",     # Required: Unique identifier for the scene
    "image_path": "path/to/image",   # Required: Path to image file
    "audio_path": "path/to/audio",   # Required: Path to audio file
    "subtitle_path": "path/to/srt"   # Optional: Path to subtitle file
}
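
Checking scenes before a container starts avoids paying for a task that fails on bad input. A minimal sketch of such a check, based only on the required keys listed above; validate_scenes is a hypothetical helper and does not verify that the files exist:

```python
REQUIRED_KEYS = ("scene_name", "image_path", "audio_path")


def validate_scenes(scenes):
    """Return a list of problems found; an empty list means the input looks fine."""
    problems = []
    seen = set()
    for index, scene in enumerate(scenes):
        for key in REQUIRED_KEYS:
            if not scene.get(key):
                problems.append(f"scene {index}: missing '{key}'")
        name = scene.get("scene_name")
        if name in seen:
            problems.append(f"scene {index}: duplicate scene_name '{name}'")
        elif name:
            seen.add(name)
    return problems
```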

CPU Configuration Priority

Priority  vCPU  Memory  Name                 Cost/Hour  Best For
1         2     4GB     STANDARD             $0.088     Most tasks
2         4     8GB     HIGH_PERFORMANCE     $0.175     Faster processing
3         8     16GB    ULTRA_PERFORMANCE    $0.351     Very fast
4         16    32GB    MAXIMUM_PERFORMANCE  $0.702     Fastest (default)
5         1     2GB     ECONOMY              $0.044     Cost-sensitive
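
For rough budgeting, the table can be turned into a lookup plus a simple estimate. This sketch assumes cost is just the hourly rate times runtime times task count; actual billing may differ (for example, startup time is billed too), and CONFIGS and estimate_cost are illustrative names rather than part of the package:

```python
# Values copied from the table above.
CONFIGS = {
    1: {"name": "STANDARD", "vcpu": 2, "memory_gb": 4, "usd_per_hour": 0.088},
    2: {"name": "HIGH_PERFORMANCE", "vcpu": 4, "memory_gb": 8, "usd_per_hour": 0.175},
    3: {"name": "ULTRA_PERFORMANCE", "vcpu": 8, "memory_gb": 16, "usd_per_hour": 0.351},
    4: {"name": "MAXIMUM_PERFORMANCE", "vcpu": 16, "memory_gb": 32, "usd_per_hour": 0.702},
    5: {"name": "ECONOMY", "vcpu": 1, "memory_gb": 2, "usd_per_hour": 0.044},
}


def estimate_cost(config_priority: int, seconds: float, tasks: int = 1) -> float:
    """Estimate USD cost as hourly rate x runtime x task count."""
    rate = CONFIGS[config_priority]["usd_per_hour"]
    return rate * seconds / 3600 * tasks


# One STANDARD task running for 10 minutes.
print(f"${estimate_cost(1, 10 * 60):.4f}")  # $0.0147
```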

Return Value Structure

{
    "success": bool,                    # Overall success status
    "total_scenes": int,                # Total number of input scenes
    "successful_scenes": int,           # Successfully processed scenes
    "failed_scenes": int,               # Failed scenes count
    "total_cost_usd": float,            # Total cost in USD
    "total_duration": float,            # Total time in seconds
    "downloaded_files": List[str],      # Paths to downloaded videos
    "task_results": List[Dict],         # Individual task results
    "tasks_used": int,                  # Number of Fargate tasks used
    "efficiency": {
        "speedup_factor": float,        # Speedup vs sequential processing
        "processing_efficiency": float,  # Percentage of time spent processing
        "cost_per_scene": float         # Average cost per scene
    }
}
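
A small helper can condense this structure into one log line. The sketch below relies only on the keys listed above; summarize is a hypothetical function and the example values are made up for illustration:

```python
def summarize(result: dict) -> str:
    """Build a one-line summary from a result dict shaped like the one above."""
    total = result["total_scenes"]
    done = result["successful_scenes"]
    line = (
        f"{done}/{total} scenes in {result['total_duration']:.0f}s "
        f"on {result['tasks_used']} task(s), ${result['total_cost_usd']:.4f}"
    )
    if result["failed_scenes"]:
        line += f", {result['failed_scenes']} failed"
    return line


# Made-up values for illustration only.
example = {
    "success": True,
    "total_scenes": 4,
    "successful_scenes": 4,
    "failed_scenes": 0,
    "total_cost_usd": 0.0123,
    "total_duration": 310.0,
    "tasks_used": 2,
}
print(summarize(example))
# 4/4 scenes in 310s on 2 task(s), $0.0123
```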

Smart Distribution Examples

# Example 1: Even distribution
# 50 scenes, batch=10, max_tasks=10 → 5 tasks × 10 scenes each

# Example 2: Redistribution for efficiency
# 120 scenes, batch=10, max_tasks=10 → 10 tasks × 12 scenes each

# Example 3: Handling remainders
# 101 scenes, batch=10, max_tasks=10 → 9 tasks × 10 scenes + 1 task × 11 scenes
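
One rule that reproduces all three examples is: use as many tasks as the batch size requires, cap that at max_tasks, then spread the scenes as evenly as possible. The sketch below implements that rule as an assumption; it is not taken from the framework's source, it ignores min_scenes_per_batch, and distribute is an illustrative name:

```python
import math


def distribute(total_scenes: int, scenes_per_batch: int, max_parallel_tasks: int):
    """Return the number of scenes assigned to each task."""
    if total_scenes <= 0:
        return []
    # Use as many tasks as full batches require, capped at the task limit.
    tasks = min(max_parallel_tasks, math.ceil(total_scenes / scenes_per_batch))
    base, extra = divmod(total_scenes, tasks)
    # The first `extra` tasks take one additional scene each.
    return [base + 1 if i < extra else base for i in range(tasks)]


print(distribute(50, 10, 10))   # [10, 10, 10, 10, 10]
print(distribute(120, 10, 10))  # [12, 12, 12, 12, 12, 12, 12, 12, 12, 12]
print(distribute(101, 10, 10))  # [11, 10, 10, 10, 10, 10, 10, 10, 10, 10]
```

The per-task counts match the examples above; which task receives the extra scene in the third case is arbitrary in this sketch.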

🎯 Roadmap

  • ✅ Parallel Processing: Multiple concurrent Fargate tasks
  • Fargate Spot: Up to 70% cost reduction for non-critical, interruption-tolerant workloads
  • Auto-scaling: Dynamic resource allocation based on queue size
  • S3 Integration: Direct file transfer without local downloads
  • Webhook Support: Real-time notifications when processing completes
  • GPU Support: Fargate GPU instances for AI-intensive workloads

📄 License

MIT License - Same as the original CloudBurst project


From single-task processing to parallel serverless scale - CloudBurst Fargate is production ready! 🚀

Stop managing infrastructure, start processing videos at scale.
