Optimize AWS EMR spark settings (spark-config-cheatsheet)
# Spark-optimizer
[![Build Status](https://api.travis-ci.org/delijati/spark-optimizer.svg?branch=master)](https://travis-ci.org/delijati/spark-optimizer)
Optimize Spark settings for cluster runs (i.e. jobs running on YARN).
Original source: http://c2fo.io/c2fo/spark/aws/emr/2016/07/06/apache-spark-config-cheatsheet/
## Usage
Install:

    $ virtualenv env
    $ env/bin/pip install spark-optimizer

Dev install:

    $ virtualenv env
    $ env/bin/pip install -e .
Generate settings for `c4.4xlarge` with `4` nodes:

    $ env/bin/spark-optimizer c4.4xlarge 4
    {'spark.default.parallelism': '108',
     'spark.driver.cores': '2',
     'spark.driver.maxResultSize': '3481m',
     'spark.driver.memory': '3481m',
     'spark.driver.memoryOverhead': '614m',
     'spark.executor.cores': '2',
     'spark.executor.instances': '27',
     'spark.executor.memory': '3481m',
     'spark.executor.memoryOverhead': '614m'}
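
The generated dictionary still has to be handed to Spark. One way to do that, sketched below, is to turn each entry into a `--conf key=value` argument for `spark-submit`. The helper name `to_spark_submit_args` and the script name `my_job.py` are hypothetical and not part of spark-optimizer; the settings are copied from the example output above.

```python
import shlex

# Settings copied from the example output above (c4.4xlarge, 4 nodes).
settings = {
    'spark.default.parallelism': '108',
    'spark.driver.cores': '2',
    'spark.driver.maxResultSize': '3481m',
    'spark.driver.memory': '3481m',
    'spark.driver.memoryOverhead': '614m',
    'spark.executor.cores': '2',
    'spark.executor.instances': '27',
    'spark.executor.memory': '3481m',
    'spark.executor.memoryOverhead': '614m',
}


def to_spark_submit_args(conf):
    """Hypothetical helper: flatten a settings dict into --conf arguments."""
    args = []
    for key in sorted(conf):  # sorted so the command line is reproducible
        args.extend(['--conf', '{}={}'.format(key, conf[key])])
    return args


# my_job.py is a placeholder for your own Spark application.
command = (['spark-submit', '--master', 'yarn']
           + to_spark_submit_args(settings)
           + ['my_job.py'])

# Print a shell-safe command line instead of executing it.
print(' '.join(shlex.quote(part) for part in command))
```

The sketch only prints the command, so it can be reviewed before anything is submitted to the cluster.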
Update instance info:

    $ env/bin/python spark_optimizer/emr_update.py
# CHANGES

0.1.1 (2018-09-12)
------------------

- fix email

0.1.0 (2018-09-12)
------------------

- initial release