Launches an AWS Elastic MapReduce cluster using templated configuration files written in JSON. Meant to make deployments consistent and reproducible.
# EMR Launcher
Launches EMR clusters using config files for consistent run-time behavior when setting up a cluster.
## Installing
```
pip install emr_launcher
```
## Usage
Starting a new cluster:
```
emr_launcher launch /path/to/config/<my_config>.json
```
Adding steps to an existing cluster:
```
emr_launcher launch /path/to/config/<my_config>.json --job-id <job_id_of_existing_cluster>
```
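If the launcher needs to be driven from another Python program, the two invocations above can be wrapped in a small helper. This is only a sketch: the function names `build_launch_command` and `launch` are hypothetical, and the only things taken from this README are the `launch` subcommand and the `--job-id` flag.

```python
import subprocess


def build_launch_command(config_path, job_id=None):
    """Build the argument list for the emr_launcher CLI (hypothetical helper)."""
    cmd = ["emr_launcher", "launch", config_path]
    if job_id is not None:
        # With --job-id, steps are added to an existing cluster
        # instead of starting a new one.
        cmd += ["--job-id", job_id]
    return cmd


def launch(config_path, job_id=None):
    """Run the CLI; check=True raises CalledProcessError on a non-zero exit."""
    return subprocess.run(build_launch_command(config_path, job_id), check=True)
```

Calling `launch("/path/to/config/my_config.json")` would start a new cluster, while passing `job_id` targets an existing one.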
## Creating configs
The JSON file maps directly to the arguments of boto3's `run_job_flow` function, documented [here](http://boto3.readthedocs.io/en/latest/reference/services/emr.html#EMR.Client.run_job_flow). Use that documentation as a guide to build your config, or start from the [Example Config](https://github.com/tuneinc/emr_launcher/blob/master/example_config.json).
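To make the mapping concrete, here is a minimal sketch of a config built in Python and serialized to JSON. The top-level keys are real `run_job_flow` parameters, but every value (cluster name, release label, instance types, bucket, roles) is a placeholder, and `run_from_config` is a hypothetical helper showing how such a file could be forwarded to boto3. How emr_launcher itself makes the call is not shown in this README.

```python
import json

# Keys mirror keyword arguments of boto3's EMR.Client.run_job_flow.
# All values below are placeholders, not recommendations.
config = {
    "Name": "example-cluster",
    "ReleaseLabel": "emr-5.36.0",
    "Instances": {
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 3,
        "KeepJobFlowAliveWhenNoSteps": False,
    },
    "Applications": [{"Name": "Spark"}],
    "Steps": [
        {
            "Name": "example-step",
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit", "s3://mybucket/jobs/job.py"],
            },
        }
    ],
    "JobFlowRole": "EMR_EC2_DefaultRole",
    "ServiceRole": "EMR_DefaultRole",
}

# The text that would be saved as <my_config>.json.
config_text = json.dumps(config, indent=2)


def run_from_config(emr_client, text):
    """Hypothetical helper: pass the parsed JSON straight to run_job_flow."""
    return emr_client.run_job_flow(**json.loads(text))
```

With boto3 installed and credentials configured, `run_from_config(boto3.client("emr"), config_text)` would submit the cluster request.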
## Template functions
emr_launcher supports templating within the JSON configuration, so a config can call helper functions. For example, to generate a unique output location:
```
...
"--conf", "spark.output=s3://mybucket/output/{{ emr_launcher.uuid() }}/"
...
```
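The effect of such a placeholder can be illustrated with a tiny stand-in renderer. This is not emr_launcher's templating engine, which this README does not name. It is a regex-based sketch that handles only the `{{ emr_launcher.uuid() }}` expression, and `render_uuid_placeholders` is a hypothetical name.

```python
import json
import re
import uuid

# Matches only the literal expression {{ emr_launcher.uuid() }}.
PLACEHOLDER = re.compile(r"\{\{\s*emr_launcher\.uuid\(\)\s*\}\}")


def render_uuid_placeholders(text):
    """Replace each placeholder with a fresh UUID4 hex string."""
    return PLACEHOLDER.sub(lambda match: uuid.uuid4().hex, text)


raw = '["--conf", "spark.output=s3://mybucket/output/{{ emr_launcher.uuid() }}/"]'

# Render first, then parse: the placeholder sits inside a JSON string,
# so the rendered text is still valid JSON.
args = json.loads(render_uuid_placeholders(raw))
```

After rendering, `args[1]` ends in a 32-character hex directory name, so each launch writes to its own output path.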
The full set of available template functions can be listed by running:
```
emr_launcher list-template-functions

emr_launcher
============
emr_launcher.get_environ
    Return the environment variables dictionary,
    Example: {{ get_environ()['USER'] }}
    A parent python program can use "os.environ[key] = value" before calling the emr launcher.

emr_launcher.get_relative_date
    Returns a formatted datetime string,
    relative to the current time,
    as adjusted by the timedelta arguments.
    Example:
    {{ emr_launcher.get_relative_date(format='%Y-%m-01 00:00:00', timedelta_args=dict(days=-2)) }}

emr_launcher.millis_to_iso
    converts a given milliseconds since epoch into an iso date string
    Args:
        ms_epoch - int
    Return
        string - formatted date string

emr_launcher.uuid
    returns a UUID4 hex string
```
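For readers who want to predict what these helpers produce, the following sketch re-implements them from their descriptions using only the standard library. It is not the package's source. The `now` parameter is added here for testability, the name `new_uuid_hex` is hypothetical, and the use of UTC in `millis_to_iso` is an assumption.

```python
import os
import uuid
from datetime import datetime, timedelta, timezone


def get_environ():
    """Return the environment variables mapping."""
    return os.environ


def get_relative_date(format, timedelta_args, now=None):
    """Format the current time shifted by timedelta(**timedelta_args).

    `now` is an addition for this sketch so the result can be reproduced.
    """
    base = now if now is not None else datetime.now()
    return (base + timedelta(**timedelta_args)).strftime(format)


def millis_to_iso(ms_epoch):
    """Convert milliseconds since the epoch to an ISO 8601 string (UTC assumed)."""
    return datetime.fromtimestamp(ms_epoch / 1000, tz=timezone.utc).isoformat()


def new_uuid_hex():
    """Return a UUID4 as a 32-character hex string."""
    return uuid.uuid4().hex
```

Under these assumptions, the documented example `get_relative_date(format='%Y-%m-01 00:00:00', timedelta_args=dict(days=-2))` yields the first day of whichever month it was two days ago.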
## Plugins
Plugins are discovered by the naming convention `emr_launcher_<plugin-name>` (e.g. `emr_launcher_consul`). To install a plugin, run:
```
pip install emr_launcher_<plugin-name>
```
Available plugins:

- [emr_launcher_aws](https://github.com/tuneinc/emr_launcher_aws)
- [emr_launcher_consul](https://github.com/tuneinc/emr_launcher_consul)
- [emr_launcher_artifactory](https://github.com/tuneinc/emr_launcher_artifactory)
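Discovery by naming convention is a common Python plugin pattern, and a minimal version can be sketched with `pkgutil`. The exact mechanism emr_launcher uses is not shown in this README, so treat the code below as an illustration of the general technique. `PLUGIN_PREFIX`, `plugin_names`, and `discover_installed_plugins` are hypothetical names.

```python
import pkgutil

PLUGIN_PREFIX = "emr_launcher_"


def plugin_names(module_names):
    """Keep only module names that follow the emr_launcher_<plugin-name> convention."""
    return sorted(name for name in module_names if name.startswith(PLUGIN_PREFIX))


def discover_installed_plugins():
    """Scan importable top-level modules for names matching the prefix."""
    return plugin_names(module.name for module in pkgutil.iter_modules())
```

Each discovered name could then be imported with `importlib.import_module`. The core package `emr_launcher` is excluded because it lacks the trailing underscore.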