
Library to manage AWS EMR clusters.

Project description

The EMR Helper library helps you set up and manage an EMR cluster.

AWS EMR has three distinct objects:

  • Cluster

  • Fleet

  • Step

This library wraps the most common types of these objects in Python classes.

Dependencies

  • Boto3

Classes

Step

A general class, Step, wraps all step subtypes. Currently only CommandRunnerStep is implemented; it launches a Command Runner step.

You can create a step as follows:

import boto3
from emrhelper import CommandRunnerStep

# process_arguments holds the arguments for the command; define it before this call.
step = CommandRunnerStep(
    name='StepName',
    args=process_arguments)

A step can be added to a cluster before the cluster starts (see Cluster) or appended to a cluster that is already running:

step.run_on_cluster('clusterID')

If the cluster ID is None, the step is added to any available cluster.
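
For orientation, the sketch below shows roughly what a Command Runner step looks like when it is submitted directly through boto3's add_job_flow_steps. It is not emrhelper's implementation: the helper names (build_command_runner_step, pick_available_cluster, submit_step), the CONTINUE failure action, and the reading of "available" as a cluster in the WAITING or RUNNING state are all assumptions made for illustration.

```python
from typing import Optional


def build_command_runner_step(name, args, action_on_failure="CONTINUE"):
    """Build the step dict that boto3's add_job_flow_steps expects."""
    return {
        "Name": name,
        "ActionOnFailure": action_on_failure,  # assumed default
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": list(args),
        },
    }


def pick_available_cluster(list_clusters_response) -> Optional[str]:
    """Return the Id of the first WAITING or RUNNING cluster, or None.

    Treating these two states as "available" is an assumption.
    """
    for cluster in list_clusters_response.get("Clusters", []):
        if cluster["Status"]["State"] in ("WAITING", "RUNNING"):
            return cluster["Id"]
    return None


def submit_step(step, cluster_id=None):
    """Submit the step with boto3; needs AWS credentials and a region."""
    import boto3  # imported lazily so the builders above work without it

    emr = boto3.client("emr")
    if cluster_id is None:
        cluster_id = pick_available_cluster(
            emr.list_clusters(ClusterStates=["WAITING", "RUNNING"])
        )
    if cluster_id is None:
        raise RuntimeError("no running or waiting cluster found")
    response = emr.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
    return response["StepIds"]
```

Only submit_step talks to AWS; the two builder functions are plain data transformations and can be checked locally.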

Fleet

A cluster's instance fleet configuration describes the instances that provide its compute capacity. As with steps, there is a main class, Fleet, and two subclasses, OnDemandFleet and SpotFleet, depending on whether you want to launch on-demand or spot instances.

You can create a fleet as follows:

from emrhelper import SpotFleet

fleet = SpotFleet(name='My Fleet', capacity=4, fleet_type='CORE')

fleet.add_instance_config(instance_type='r5d.xlarge', capacity=2)
fleet.add_instance_config(instance_type='r5d.2xlarge', capacity=4)
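
As a rough guide to what such a fleet corresponds to on the AWS side, the sketch below builds the instance fleet dict that boto3's run_job_flow accepts under Instances.InstanceFleets. The function build_fleet_config is hypothetical, and the mapping is an assumption: the fleet's capacity is taken to be the target capacity and each instance config's capacity is taken to be its weighted capacity. emrhelper may translate these values differently.

```python
def build_fleet_config(name, capacity, fleet_type, instance_configs, spot=True):
    """Build one entry for Instances['InstanceFleets'] in run_job_flow.

    instance_configs is a list of (instance_type, weighted_capacity) pairs.
    """
    # Spot and on-demand fleets use different target-capacity keys.
    target_key = "TargetSpotCapacity" if spot else "TargetOnDemandCapacity"
    return {
        "Name": name,
        "InstanceFleetType": fleet_type,  # 'MASTER', 'CORE' or 'TASK'
        target_key: capacity,
        "InstanceTypeConfigs": [
            {"InstanceType": instance_type, "WeightedCapacity": weight}
            for instance_type, weight in instance_configs
        ],
    }


spot_fleet = build_fleet_config(
    "My Fleet", 4, "CORE",
    [("r5d.xlarge", 2), ("r5d.2xlarge", 4)],
)
```

Under this reading, EMR fills the target capacity of 4 units with whichever mix of the listed instance types it can provision, counting each instance at its weight.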

Cluster

You can create a cluster, add steps and fleets to it, and then run it:

from emrhelper import Cluster

cluster = Cluster(
    name='my-cluster',
    key_pair='keypair',
    subnets='...',
    sg_master='...',
    sg_slave='...',
    sg_service='...',
    instance_profile='...',
    service_role='...',
    log_uri='...'
)

cluster.add_step(step)
cluster.add_fleet(fleet)

cluster.run_cluster()

You can add as many steps and fleets as you need.
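
To show how the constructor arguments above might line up with the underlying AWS request, here is a minimal sketch that assembles keyword arguments for boto3's run_job_flow. The function build_run_job_flow_request is hypothetical, the parameter-to-field mapping is a guess based on the argument names, and release_label does not appear in the Cluster example above; it is added here only because a release label is normally needed to launch a cluster.

```python
def build_run_job_flow_request(name, key_pair, subnets, sg_master, sg_slave,
                               sg_service, instance_profile, service_role,
                               log_uri, fleets, steps, release_label):
    """Assemble keyword arguments for boto3's emr.run_job_flow (assumed mapping)."""
    return {
        "Name": name,
        "LogUri": log_uri,
        "ReleaseLabel": release_label,  # not part of the original example
        "Instances": {
            "InstanceFleets": list(fleets),
            "Ec2KeyName": key_pair,
            "Ec2SubnetIds": list(subnets),
            "EmrManagedMasterSecurityGroup": sg_master,
            "EmrManagedSlaveSecurityGroup": sg_slave,
            "ServiceAccessSecurityGroup": sg_service,
            # Assumed: terminate the cluster once all steps have finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": list(steps),
        "JobFlowRole": instance_profile,  # EC2 instance profile
        "ServiceRole": service_role,
    }


# Launching requires AWS credentials and real resource identifiers:
#   import boto3
#   request = build_run_job_flow_request(...)
#   cluster_id = boto3.client("emr").run_job_flow(**request)["JobFlowId"]
```

Because fleets and steps are plain lists in the request, adding more of either is just a matter of appending further entries before the call.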

Installation

pip install emrhelper

