Launches an AWS Elastic MapReduce (EMR) cluster from templated JSON configuration files, making deployments consistent and reproducible.
# EMR Launcher
Launches EMR clusters using config files for consistent run-time behavior when setting up a cluster.
## Installing
```
pip install emr_launcher
```
## Usage
Starting a new cluster:
```
emr_launcher launch /path/to/config/<my_config>.json
```
Adding steps to an existing cluster:
```
emr_launcher launch /path/to/config/<my_config>.json --job-id <job_id_of_existing_cluster>
```
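For scripted use, the two invocations above can be wrapped from Python. This is a minimal sketch and not part of emr_launcher: the helper names `build_launch_command` and `launch` are hypothetical, and it assumes the `emr_launcher` executable is on `PATH`.

```python
import os
import subprocess


def build_launch_command(config_path, job_id=None):
    """Build the argv list for `emr_launcher launch` (hypothetical helper)."""
    cmd = ["emr_launcher", "launch", config_path]
    if job_id is not None:
        # Add steps to an existing cluster instead of starting a new one.
        cmd += ["--job-id", job_id]
    return cmd


def launch(config_path, job_id=None, extra_env=None):
    """Run the CLI, optionally exposing extra environment variables to it."""
    env = dict(os.environ)
    env.update(extra_env or {})
    return subprocess.run(
        build_launch_command(config_path, job_id), env=env, check=True
    )


if __name__ == "__main__":
    # Only prints the command; call launch(...) to actually run it.
    print(build_launch_command("/path/to/config/my_config.json"))
```

Passing `extra_env` is one way to make values available to the config templates through environment variables without changing the parent process environment.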
## Creating configs
The JSON file maps directly to boto3's [`run_job_flow`](http://boto3.readthedocs.io/en/latest/reference/services/emr.html#EMR.Client.run_job_flow) function. Use the boto3 documentation as a guide to build your config, or start from the [Example Config](https://github.com/tuneinc/emr_launcher/blob/master/example_config.json).
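As a rough illustration of that mapping, the sketch below builds a small config as a Python dict and serializes it to JSON. The top-level keys are `run_job_flow` parameters from the boto3 documentation. Every value (cluster name, release label, instance types, role names, S3 path) is a placeholder chosen for this example and is not taken from the project's example config.

```python
import json

# Keys follow boto3's EMR.Client.run_job_flow parameters.
# All values below are illustrative placeholders.
config = {
    "Name": "example-cluster",
    "ReleaseLabel": "emr-5.36.0",
    "Instances": {
        "InstanceGroups": [
            {
                "Name": "master",
                "InstanceRole": "MASTER",
                "InstanceType": "m5.xlarge",
                "InstanceCount": 1,
            },
            {
                "Name": "core",
                "InstanceRole": "CORE",
                "InstanceType": "m5.xlarge",
                "InstanceCount": 2,
            },
        ],
        "KeepJobFlowAliveWhenNoSteps": False,
    },
    "Applications": [{"Name": "Spark"}],
    "Steps": [
        {
            "Name": "example-step",
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit", "s3://mybucket/jobs/job.py"],
            },
        }
    ],
    "JobFlowRole": "EMR_EC2_DefaultRole",
    "ServiceRole": "EMR_DefaultRole",
}

# This string is what would be saved as <my_config>.json.
config_json = json.dumps(config, indent=2)
print(config_json)
```

Because the keys match the boto3 parameters one-to-one, the same dict could in principle be passed as `run_job_flow(**config)` on a boto3 EMR client.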
## Template functions
emr_launcher uses templating within the JSON configuration to call useful functions, for example to generate a unique output location:
```
...
"--conf", "spark.output=s3://mybucket/output/{{ emr_launcher.uuid() }}/"
...
```
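To make the idea concrete, here is a small stand-alone sketch of how such a placeholder could be expanded before the JSON is parsed. It is an illustration only: it uses a regular expression and a hypothetical `render` helper, handles only zero-argument calls, and is not how emr_launcher itself renders templates.

```python
import json
import re
import uuid

# Hypothetical registry of template functions, keyed by the name used
# inside the {{ ... }} placeholder.
FUNCTIONS = {"emr_launcher.uuid": lambda: uuid.uuid4().hex}

# Matches placeholders of the form {{ some.name() }} with no arguments.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\(\)\s*\}\}")


def render(text, functions=FUNCTIONS):
    """Replace each placeholder with the result of the named function."""
    def replace(match):
        return str(functions[match.group(1)]())
    return PLACEHOLDER.sub(replace, text)


raw = '["--conf", "spark.output=s3://mybucket/output/{{ emr_launcher.uuid() }}/"]'
args = json.loads(render(raw))
print(args[1])  # the output path now ends in a random 32-character hex string
```

Rendering happens on the raw text, so the result must still be valid JSON once the placeholders are filled in.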
The full set of available template functions can be listed by running:
```
emr_launcher list-template-functions
emr_launcher
============
emr_launcher.get_environ
Return the environment variables dictionary,
Example: {{ get_environ()['USER'] }}
A parent python program can use "os.environ[key] = value" before calling the emr launcher.
emr_launcher.get_relative_date
Returns a formatted datetime string,
relative to the current time,
as adjusted by the timedelta arguments.
Example:
{{ emr_launcher.get_relative_date(format='%Y-%m-01 00:00:00', timedelta_args=dict(days=-2)) }}
emr_launcher.millis_to_iso
converts a given milliseconds since epoch into an iso date string
Args:
ms_epoch - int
Return
string - formatted date string
emr_launcher.uuid
returns a UUID4 hex string
```
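The descriptions of the two date helpers can be approximated in plain Python. The sketch below is a guess at their behavior based only on the help text above, not the library's source. The `now` parameter and the choice of UTC in `millis_to_iso` are assumptions added for the example.

```python
from datetime import datetime, timedelta, timezone


def get_relative_date(format, timedelta_args, now=None):
    """Format the current time shifted by timedelta(**timedelta_args).

    `now` is an assumption added here so the result can be reproduced.
    """
    base = now if now is not None else datetime.now()
    return (base + timedelta(**timedelta_args)).strftime(format)


def millis_to_iso(ms_epoch):
    """Convert milliseconds since the epoch to an ISO 8601 string (UTC assumed)."""
    return datetime.fromtimestamp(ms_epoch / 1000, tz=timezone.utc).isoformat()


# Two days before 2024-03-01 falls in February, so the format string
# '%Y-%m-01 00:00:00' yields the first day of that month.
print(get_relative_date(
    format="%Y-%m-01 00:00:00",
    timedelta_args=dict(days=-2),
    now=datetime(2024, 3, 1, 12, 0),
))
print(millis_to_iso(0))
```

The literal `01 00:00:00` in the format string is what turns a relative date into "start of that month", which is useful for partitioned input paths.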
## Plugins
Plugins are discovered by the naming convention `emr_launcher_<plugin-name>` (e.g. `emr_launcher_consul`). To install a plugin, run:
```
pip install emr_launcher_<plugin-name>
```
Available plugins:

- [emr_launcher_aws](https://github.com/tuneinc/emr_launcher_aws)
- [emr_launcher_consul](https://github.com/tuneinc/emr_launcher_consul)
- [emr_launcher_artifactory](https://github.com/tuneinc/emr_launcher_artifactory)
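
Discovery by naming convention can be sketched with the standard library. This is one plausible way to find installed modules with the `emr_launcher_` prefix, assuming plugins are importable top-level modules. The helper names are hypothetical and this is not necessarily how emr_launcher performs discovery.

```python
import pkgutil

PLUGIN_PREFIX = "emr_launcher_"


def filter_plugin_names(module_names, prefix=PLUGIN_PREFIX):
    """Keep only names that follow the emr_launcher_<plugin-name> convention."""
    return sorted(
        name for name in module_names
        if name.startswith(prefix) and len(name) > len(prefix)
    )


def discover_plugins():
    """List importable top-level modules whose names match the convention."""
    return filter_plugin_names(m.name for m in pkgutil.iter_modules())


if __name__ == "__main__":
    # Prints an empty list when no matching plugins are installed.
    print(discover_plugins())
```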