# FromConfig Yarn

A fromconfig Launcher for yarn execution.

## Install

pip install fromconfig_yarn

## Quickstart

Once installed, the launcher is available with the name yarn.

Given the following module, importable as foo (the config below references foo.Model)

class Model:
    def __init__(self, learning_rate: float):
        self.learning_rate = learning_rate

    def train(self):
        print(f"Training model with learning_rate {self.learning_rate}")

and config files

# config.yaml
model:
  _attr_: foo.Model
  learning_rate: "${params.learning_rate}"

# params.yaml
params:
  learning_rate: 0.001

# launcher.yaml
yarn:
  name: test-fromconfig

logging:
  level: 20

launcher:
  run: yarn

Run (assuming you are in a Hadoop environment)

fromconfig config.yaml params.yaml launcher.yaml - model - train

Which prints

INFO:fromconfig.launcher.logger:- yarn.name: test-fromconfig
INFO:fromconfig.launcher.logger:- logging.level: 20
INFO:fromconfig.launcher.logger:- params.learning_rate: 0.001
INFO:fromconfig.launcher.logger:- model._attr_: foo.Model
INFO:fromconfig.launcher.logger:- model.learning_rate: 0.001
INFO skein.Driver: Driver started, listening on 12345
INFO:fromconfig_yarn.launcher:Uploading pex to viewfs://root/user/path/to/pex
INFO:cluster_pack.filesystem:Resolved base filesystem: <class 'pyarrow.hdfs.HadoopFileSystem'>
INFO:cluster_pack.uploader:Zipping and uploading your env to viewfs://root/user/path/to/pex
INFO skein.Driver: Uploading application resources to viewfs://root/user/...
INFO skein.Driver: Submitting application...
INFO impl.YarnClientImpl: Submitted application application_12345
INFO:fromconfig_yarn.launcher:TRACKING_URL: http://12.34.56/application_12345
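
The log above shows model.learning_rate resolved to 0.001, the value that params.yaml supplies for the "${params.learning_rate}" reference. The following is a minimal, pure-Python sketch of that kind of reference resolution, followed by the equivalent of the `model - train` call. It is not fromconfig's parser: the `lookup` and `resolve` helpers are hypothetical names introduced for illustration, and the sketch only handles values that consist of a single whole-string reference.

```python
import re

# Merged content of config.yaml and params.yaml, written as a Python dict.
config = {
    "model": {"_attr_": "foo.Model", "learning_rate": "${params.learning_rate}"},
    "params": {"learning_rate": 0.001},
}

# Matches a string that is exactly one reference, e.g. "${params.learning_rate}".
_REFERENCE = re.compile(r"^\$\{([^}]+)\}$")


def lookup(root, dotted_key):
    """Follow a dotted key such as 'params.learning_rate' through nested dicts."""
    value = root
    for part in dotted_key.split("."):
        value = value[part]
    return value


def resolve(node, root):
    """Return a copy of node with whole-string ${a.b} references replaced."""
    if isinstance(node, dict):
        return {key: resolve(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve(item, root) for item in node]
    if isinstance(node, str):
        match = _REFERENCE.match(node)
        if match:
            return resolve(lookup(root, match.group(1)), root)
    return node


class Model:
    """Same class as in the module above, repeated to keep the sketch runnable."""

    def __init__(self, learning_rate: float):
        self.learning_rate = learning_rate

    def train(self):
        print(f"Training model with learning_rate {self.learning_rate}")


resolved = resolve(config, config)
model = Model(learning_rate=resolved["model"]["learning_rate"])
model.train()  # prints "Training model with learning_rate 0.001"
```

In the real workflow the instantiation and the call to train happen inside the Yarn container rather than in the local process.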

If you do not have access to a Hadoop environment, you can monkeypatch the relevant functions to "fake" it with

python monkeypatch_fromconfig.py config.yaml params.yaml launcher.yaml - model - train

This example can be found in docs/examples/quickstart.

## Usage Reference

### Options

To configure Yarn, add a yarn entry to your config.

You can set the following parameters.

  • env_vars: A list of environment variables to forward to the container(s)
  • hadoop_file_systems: The list of available filesystems
  • ignored_packages: The list of packages to exclude from the environment
  • jvm_memory_in_gb: The JVM memory, in GB (default: 8)
  • memory: The executor's memory (default: 32 GiB)
  • num_cores: The executor's number of cores (default: 8)
  • package_path: The HDFS location in which to save the environment
  • zip_file: The path to an existing pex file, either local or on HDFS
  • name: The application name
  • queue: The yarn queue to submit the application to
  • node_label: The label of the Hadoop nodes on which to schedule the application
  • pre_script_hook: A script to execute before python is invoked
  • extra_env_vars: A mapping of extra environment variables to forward to the container(s)
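
As a rough illustration of the shape of a yarn entry, the sketch below writes one as a Python dict (the equivalent of the yarn: block in launcher.yaml) and checks its keys against the option names listed above. Only name and num_cores take values from this document; the queue, env_vars, and extra_env_vars values are made up, and `unknown_options` is a hypothetical helper, not part of fromconfig_yarn.

```python
# Option names documented in the list above.
KNOWN_OPTIONS = {
    "env_vars",
    "hadoop_file_systems",
    "ignored_packages",
    "jvm_memory_in_gb",
    "memory",
    "num_cores",
    "package_path",
    "zip_file",
    "name",
    "queue",
    "node_label",
    "pre_script_hook",
    "extra_env_vars",
}


def unknown_options(yarn_config):
    """Return the sorted keys of yarn_config that are not documented options."""
    return sorted(set(yarn_config) - KNOWN_OPTIONS)


# Equivalent of a `yarn:` entry in launcher.yaml.
yarn_config = {
    "name": "test-fromconfig",           # application name from the quickstart
    "num_cores": 8,                      # documented default
    "queue": "default",                  # illustrative value
    "env_vars": ["USER"],                # list of variable names to forward
    "extra_env_vars": {"MY_FLAG": "1"},  # mapping of extra variables (made up)
}

print(unknown_options(yarn_config))    # prints []
print(unknown_options({"nam": "typo"}))  # prints ['nam']
```

A check like this catches misspelled option names early. Note that env_vars is a list while extra_env_vars is a mapping.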
