
Train multiple programs on multiple servers without pain


Training Noodles


A simple and powerful tool that helps a single person train multiple programs on multiple servers.

Features

  • Automatically deploys experiments to available servers
  • No need to change any existing code
  • Considers CPU usage, GPU usage, memory usage, disk usage, and more
  • Uses only the SSH protocol
  • Relies on minimal dependencies
  • Allows fast prototyping

Use Case

Suppose we want to run 4 experiments on 3 servers. More specifically, for each experiment we need to

  1. Upload the code to a server with low CPU usage
  2. Run the code on that server
  3. Download the experimental results when they're ready

[Image: deployment round 1]

In the first deployment round (See image above), Noodles will use the user-defined commands to check CPU usage on the servers.

The CPU usage on Server 1 is high because other programs are already running on it, so Noodles uses scp to upload Code 1 and Code 2 to Server 2 and Server 3 respectively, and runs them there.

As for how the code is uploaded: it's just a list of commands written by us, and Noodles simply follows them.

[Image: deployment round 2]

In the second deployment round (See image above), we tell Noodles to check experimental results on all servers.

Noodles finds that Server 3 has just finished running Code 2, so it downloads the experimental results and processes the data on the local machine, as we told it to.

[Image: deployment round 3]

In the third deployment round (See image above), Code 3 and Code 4 still need to be deployed. Noodles checks the CPU usage on all servers again. Since Server 1 has just become free, Noodles can deploy Code 3 and Code 4 to Server 1 and Server 3 respectively.

Deployment rounds continue until all experiments have been successfully deployed. In this case, Noodles will try to download and process the experimental results of Code 1, Code 3, and Code 4 in later rounds.

How Noodles Works

The general procedure is as follows:

  1. Initialize the list of experiments in E
  2. For each deployment round:
    1. Initialize the list of servers in S
    2. For each experiment in E:
      1. Noodles runs user-defined requirements on each server in S
      2. Noodles compares the metrics (results from the above step) to the user-defined expression
      3. If the expression is satisfied:
        1. Noodles runs the user-defined commands on the satisfied server
        2. Remove the current experiment from E
        3. Remove the satisfied server from S
        4. If S is empty, break
    3. If E is empty, break
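
The procedure above can be sketched as a short Python loop. Note that this is only an illustration of the algorithm described here, not the actual Noodles implementation or API; the callables `list_servers`, `measure`, `satisfies`, and `deploy` are hypothetical stand-ins for the user-defined requirement commands, expression check, and deployment commands.

```python
def run_deployment_rounds(experiments, list_servers, measure,
                          satisfies, deploy, max_rounds=10):
    """Illustrative sketch of Noodles' deployment rounds (not the real API)."""
    pending = list(experiments)                    # E
    for _ in range(max_rounds):                    # one iteration = one round
        if not pending:
            break                                  # E is empty
        servers = list_servers()                   # (re)initialize S each round
        for exp in list(pending):
            for server in list(servers):
                metrics = measure(exp, server)     # run requirements on server
                if satisfies(exp, metrics):        # compare to the expression
                    deploy(exp, server)            # run user-defined commands
                    pending.remove(exp)            # remove experiment from E
                    servers.remove(server)         # remove server from S
                    break
            if not servers:
                break                              # S is empty
    return pending                                 # experiments not yet deployed
```

An experiment that finds no satisfying server in one round simply stays in `E` and is retried in the next round, which matches the use case above where Code 3 and Code 4 wait for Server 1 to become free.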

The implementation of Noodles complies with the following rules:

  1. Simple (users can understand the code and spec without consulting the documentation)
  2. Easy to debug (Noodles can take different actions when different errors occur)
  3. Stateless (the only state Noodles cares about is whether a deployment succeeded; the states of the experiments must be handled by the user)

Documentation

See full documentation here.

Prerequisites

  1. Linux-based terminals (For Windows, I recommend using git-sdk)
  2. Python 3.5 or higher

Installation

Run the following command:

pip install training-noodles

Usage

noodles <command_type> <path_to_spec>

It's just that simple.

Examples

Here are some examples showing how Noodles is used:

noodles run my_training.yml
noodles status my_training.yml
noodles monitor my_training.yml
noodles stop my_training.yml
noodles download my_training.yml
noodles upload my_training.yml
...

You can also choose only some experiments:

noodles run "my_training.yml:Experiment 1,Experiment 2"
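
The argument above is a spec path, optionally followed by a colon and a comma-separated list of experiment names. A minimal sketch of how such an argument could be split (hypothetical helper, not the actual Noodles parser):

```python
def parse_spec_arg(arg):
    """Split 'spec.yml:Exp 1,Exp 2' into (spec_path, experiment_names)."""
    path, sep, names = arg.partition(":")
    if not sep:
        return path, []  # no selection given: use all experiments
    return path, [name.strip() for name in names.split(",")]
```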

See the example Two Locals to get started. See Train TensorFlow Examples for a more complex example.

Default Spec

Noodles will use properties from the default spec if the user spec doesn't specify them. See training_noodles/specs/defaults.yml for the default spec.
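
The fallback behavior can be sketched as a merge where user-spec values override defaults. The key names below are hypothetical placeholders, and the real Noodles merge may be deeper than this shallow sketch; see defaults.yml for the actual keys.

```python
def merge_specs(defaults, user_spec):
    # Start from the defaults, then overwrite any key the user spec sets.
    merged = dict(defaults)
    merged.update(user_spec)
    return merged

# Hypothetical keys for illustration only
defaults = {"round_interval": 10, "deploy_command": "scp"}
user_spec = {"round_interval": 30}
merged = merge_specs(defaults, user_spec)
```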
