
Library to manage AWS EMR clusters.


The EMR Helper library simplifies setting up and managing an AWS EMR cluster.

AWS EMR has three distinct objects:

  • Cluster

  • Fleet

  • Step

This library wraps the most common types of these objects in Python classes.

Dependencies

  • Boto3

Classes

Step

There is a general class, Step, that wraps all subtypes of steps. Currently only CommandRunnerStep is implemented; it launches a Command Runner step.

You can create a step as follows:

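import boto3
from emrhelper import CommandRunnerStep

# process_arguments holds the arguments for the command to run;
# define it before creating the step.
step = CommandRunnerStep(
    name='StepName',
    args=process_arguments)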

The step can be added to a cluster before it starts (see Cluster) or appended to a cluster that is already running:

step.run_on_cluster('clusterID')

If the cluster ID is None, the step is added to any available cluster.
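
For orientation, the EMR API itself describes a Command Runner step as a plain dictionary, which boto3 sends through add_job_flow_steps. The sketch below builds that dictionary by hand. It is not emrhelper's implementation: the helper name command_runner_step_config and the example arguments are hypothetical.

```python
def command_runner_step_config(name, args, action_on_failure="CONTINUE"):
    """Build the step dictionary the EMR API expects for a Command Runner step.

    Hypothetical helper for illustration; emrhelper may build this differently.
    """
    return {
        "Name": name,
        "ActionOnFailure": action_on_failure,
        "HadoopJarStep": {
            # command-runner.jar is the EMR-provided jar that runs commands on the cluster.
            "Jar": "command-runner.jar",
            "Args": list(args),
        },
    }


# Example arguments (placeholders, not taken from the library).
step_config = command_runner_step_config("StepName", ["spark-submit", "--version"])

# With AWS credentials configured, the dictionary could be submitted with boto3:
#   boto3.client("emr").add_job_flow_steps(JobFlowId="<cluster id>", Steps=[step_config])
```

Keeping the builder free of AWS calls makes it easy to inspect the request before anything is sent.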

Fleet

The instance fleet configuration of a cluster describes the instances that provide the cluster's compute capacity. As with steps, there is a base class, Fleet, and two subclasses, OnDemandFleet and SpotFleet, depending on whether you want to launch on-demand or Spot instances.

You can create a fleet as follows:

from emrhelper import SpotFleet

fleet = SpotFleet(name='My Fleet', capacity=4, fleet_type='CORE')

fleet.add_instance_config(instance_type='r5d.xlarge', capacity=2)
fleet.add_instance_config(instance_type='r5d.2xlarge', capacity=4)
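
At the EMR API level, an instance fleet is a dictionary with a target capacity and a list of instance types, each with a weighted capacity that says how many capacity units one instance of that type contributes. The sketch below builds such a dictionary for the fleet above. It assumes that the library's capacity arguments correspond to TargetSpotCapacity and WeightedCapacity, which the source does not confirm; the helper name spot_fleet_config is hypothetical.

```python
def spot_fleet_config(name, fleet_type, target_capacity, instance_weights):
    """Build an EMR instance fleet dictionary that requests Spot capacity.

    Hypothetical helper; the mapping to emrhelper's arguments is an assumption.
    instance_weights maps an instance type to its weighted capacity.
    """
    return {
        "Name": name,
        "InstanceFleetType": fleet_type,  # 'MASTER', 'CORE' or 'TASK'
        "TargetSpotCapacity": target_capacity,
        "InstanceTypeConfigs": [
            {"InstanceType": instance_type, "WeightedCapacity": weight}
            for instance_type, weight in instance_weights.items()
        ],
    }


fleet_config = spot_fleet_config(
    name="My Fleet",
    fleet_type="CORE",
    target_capacity=4,
    instance_weights={"r5d.xlarge": 2, "r5d.2xlarge": 4},
)
```

With these weights, EMR could meet the target of 4 units with two r5d.xlarge instances or a single r5d.2xlarge instance.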

Cluster

You can create a cluster, add steps and fleets to it, and run it:

from emrhelper import Cluster

cluster = Cluster(
    name='my-cluster',
    key_pair='keypair',
    subnets='...',
    sg_master='...',
    sg_slave='...',
    sg_service='...',
    instance_profile='...',
    service_role='...',
    log_uri='...'
)

cluster.add_step(step)
cluster.add_fleet(fleet)

cluster.run_cluster()

You can add as many steps and fleets as you need.
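
To show how these pieces fit together at the API level, the sketch below assembles the keyword arguments for boto3's run_job_flow call from the same settings. The mapping from the Cluster arguments to API fields (for example, instance_profile to JobFlowRole) is an assumption, not something the source states, and the function name run_job_flow_request is hypothetical.

```python
def run_job_flow_request(name, key_pair, subnets, sg_master, sg_slave, sg_service,
                         instance_profile, service_role, log_uri, steps, fleets):
    """Assemble keyword arguments for boto3's EMR run_job_flow call.

    Hypothetical helper; the argument-to-field mapping is an assumption.
    """
    return {
        "Name": name,
        "LogUri": log_uri,
        "JobFlowRole": instance_profile,  # EC2 instance profile
        "ServiceRole": service_role,
        "Instances": {
            "Ec2KeyName": key_pair,
            "Ec2SubnetIds": list(subnets),
            "EmrManagedMasterSecurityGroup": sg_master,
            "EmrManagedSlaveSecurityGroup": sg_slave,
            "ServiceAccessSecurityGroup": sg_service,
            "InstanceFleets": list(fleets),
        },
        "Steps": list(steps),
    }


# Placeholder values ('...') stand in for real AWS identifiers.
request = run_job_flow_request(
    name="my-cluster", key_pair="keypair", subnets=["..."],
    sg_master="...", sg_slave="...", sg_service="...",
    instance_profile="...", service_role="...", log_uri="...",
    steps=[{"Name": "StepName", "ActionOnFailure": "CONTINUE",
            "HadoopJarStep": {"Jar": "command-runner.jar", "Args": ["..."]}}],
    fleets=[{"Name": "My Fleet", "InstanceFleetType": "CORE", "TargetSpotCapacity": 4,
             "InstanceTypeConfigs": [{"InstanceType": "r5d.xlarge", "WeightedCapacity": 2}]}],
)

# With AWS credentials configured, the request could be sent with:
#   boto3.client("emr").run_job_flow(**request)
```

A real request would typically also set ReleaseLabel and include a MASTER fleet; both are omitted here to keep the sketch short.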

Installation

pip install emrhelper


