Launches an AWS Elastic MapReduce cluster using templated configuration files written in JSON. Meant to make deployments consistent and reproducible.

# EMR Launcher

Launches EMR clusters using config files for consistent run-time behavior when setting up a cluster.

## Installing

```
pip install emr_launcher
```

## Usage

Starting a new cluster:
```
emr_launcher launch /path/to/config/<my_config>.json
```

Adding steps to an existing cluster:
```
emr_launcher launch /path/to/config/<my_config>.json --job-id <job_id_of_existing_cluster>
```
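The two commands above can also be driven from a Python script, for instance from a scheduler. The following is a minimal sketch, not part of emr_launcher itself: `build_launch_command` and `launch` are hypothetical helper names, and the sketch assumes the `emr_launcher` executable is on the `PATH`. Only the `launch` subcommand and the `--job-id` flag come from the usage shown above.

```python
import os
import subprocess


def build_launch_command(config_path, job_id=None):
    """Build the argument list for `emr_launcher launch` (hypothetical helper)."""
    command = ["emr_launcher", "launch", config_path]
    if job_id is not None:
        # --job-id adds the configured steps to an existing cluster.
        command += ["--job-id", job_id]
    return command


def launch(config_path, job_id=None, extra_env=None):
    """Run emr_launcher as a child process, optionally with extra environment variables."""
    env = dict(os.environ)
    env.update(extra_env or {})
    # check=True raises CalledProcessError if emr_launcher exits with a non-zero status.
    return subprocess.run(build_launch_command(config_path, job_id), env=env, check=True)


# Only builds and prints the command; call launch(...) to actually run it.
print(build_launch_command("/path/to/config/my_config.json"))
```

Keeping the command construction separate from the process call makes the arguments easy to inspect or log before anything is launched.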

## Creating configs

The JSON file maps directly to boto3's `run_job_flow` function, documented [here](http://boto3.readthedocs.io/en/latest/reference/services/emr.html#EMR.Client.run_job_flow). Use that documentation as a guide to build your config, or build off the [Example Config](https://github.com/tuneinc/emr_launcher/blob/master/example_config.json).
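Because the config mirrors the keyword arguments of `run_job_flow`, a starting point can be generated from a plain dictionary. The sketch below is illustrative only: the key names are `run_job_flow` parameters, but every value (cluster name, release label, instance types, roles, step arguments) is a placeholder assumption, and the linked Example Config remains the authoritative reference for what emr_launcher expects.

```python
import json
import os
import tempfile

# Keys follow boto3's run_job_flow parameters; all values are placeholders.
config = {
    "Name": "my-cluster",
    "ReleaseLabel": "emr-5.36.0",
    "Applications": [{"Name": "Spark"}],
    "Instances": {
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 3,
        "KeepJobFlowAliveWhenNoSteps": False,
    },
    "Steps": [
        {
            "Name": "example-step",
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit", "s3://mybucket/jobs/job.py"],
            },
        }
    ],
    "JobFlowRole": "EMR_EC2_DefaultRole",
    "ServiceRole": "EMR_DefaultRole",
}

# Write the config to a temporary directory so the sketch has no side effects elsewhere.
config_path = os.path.join(tempfile.mkdtemp(), "my_config.json")
with open(config_path, "w") as handle:
    json.dump(config, handle, indent=2)

print(config_path)
```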

## Template functions

emr_launcher supports templating within the JSON configuration to call useful functions, for example to generate a unique output location:

```
...
"--conf", "spark.output=s3://mybucket/output/{{ emr_launcher.uuid() }}/"
...
```
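To see what the rendered argument might look like, here is a toy stand-in for the template step. It is not how emr_launcher renders templates (the source does not say which engine it uses). It only substitutes the `{{ emr_launcher.uuid() }}` placeholder with a UUID4 hex string, which is what the function is documented to return. `render_uuid_placeholders` is a hypothetical name.

```python
import re
import uuid

# Matches the literal template call, tolerating whitespace inside the braces.
UUID_CALL = re.compile(r"\{\{\s*emr_launcher\.uuid\(\)\s*\}\}")


def render_uuid_placeholders(text):
    """Replace each uuid() placeholder with a fresh UUID4 hex string (toy renderer)."""
    return UUID_CALL.sub(lambda _match: uuid.uuid4().hex, text)


prefix = "spark.output=s3://mybucket/output/"
rendered = render_uuid_placeholders(prefix + "{{ emr_launcher.uuid() }}/")
print(rendered)
```

Each run yields a different 32-character hex segment, so repeated launches of the same config write to distinct output paths.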

A full set of usable template functions can be found by running:

```
emr_launcher list-template-functions

emr_launcher
============
emr_launcher.get_environ
Return the environment variables dictionary,
Example: {{ get_environ()['USER'] }}
A parent python program can use "os.environ[key] = value" before calling the emr launcher.

emr_launcher.get_relative_date
Returns a formatted datetime string,
relative to the current time,
as adjusted by the timedelta arguments.
Example:
{{ emr_launcher.get_relative_date(format='%Y-%m-01 00:00:00', timedelta_args=dict(days=-2)) }}

emr_launcher.millis_to_iso
converts a given milliseconds since epoch into an iso date string
Args:
ms_epoch - int
Return
string - formatted date string

emr_launcher.uuid
returns a UUID4 hex string
```
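The date helpers listed above can be approximated with the standard library. This is a sketch of the described behavior under stated assumptions, not emr_launcher's actual implementation: the `now` parameter is a hypothetical addition to make the result reproducible, and the choice of local time for `get_relative_date` and UTC for `millis_to_iso` is an assumption the help text does not confirm.

```python
from datetime import datetime, timedelta, timezone


def get_relative_date(format, timedelta_args=None, now=None):
    """Format a datetime offset from `now` (default: current local time)."""
    base = now if now is not None else datetime.now()
    shifted = base + timedelta(**(timedelta_args or {}))
    return shifted.strftime(format)


def millis_to_iso(ms_epoch):
    """Convert milliseconds since the Unix epoch to an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(ms_epoch / 1000.0, tz=timezone.utc).isoformat()


# Two days before 2020-03-01 is 2020-02-28; the format pins the day to the 1st.
print(get_relative_date("%Y-%m-01 00:00:00", dict(days=-2), now=datetime(2020, 3, 1, 12, 0)))
print(millis_to_iso(0))
```

Note that a format such as `%Y-%m-01 00:00:00` hard-codes the day and time, so the result is the first day of whichever month the shifted date falls in.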

## Plugins

Plugins are discovered by the naming convention `emr_launcher_<plugin-name>` (e.g. `emr_launcher_consul`). To install a plugin, run:
```
pip install emr_launcher_<plugin-name>
```

Available plugins:

[emr_launcher_aws](https://github.com/tuneinc/emr_launcher_aws)

[emr_launcher_consul](https://github.com/tuneinc/emr_launcher_consul)

[emr_launcher_artifactory](https://github.com/tuneinc/emr_launcher_artifactory)
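Discovery by naming convention is commonly done by scanning importable modules for a shared prefix. The sketch below shows that general pattern with `pkgutil`. Whether emr_launcher discovers its plugins this way is an assumption, and `filter_plugin_names` and `discover_plugins` are hypothetical names.

```python
import pkgutil

PLUGIN_PREFIX = "emr_launcher_"


def filter_plugin_names(module_names):
    """Keep only module names that follow the emr_launcher_<plugin-name> convention."""
    return sorted(name for name in module_names if name.startswith(PLUGIN_PREFIX))


def discover_plugins():
    """List installed top-level modules whose names match the plugin prefix."""
    return filter_plugin_names(module.name for module in pkgutil.iter_modules())


# Prints an empty list when no plugins are installed.
print(discover_plugins())
```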
