Script running tool for optimizing GPU memory utilization.

TRIAGE

Intended use: running many GPU-intensive scripts in a way that optimizes GPU memory utilization. Well suited to ML/DL experiments on servers shared by several users.

Installation

pip install triage-runner

Usage

See the --help option for the full list of available arguments.
Running one config:

triage run_config.json

Running several configs:

triage run_config1.json run_config2.json run_config3.json 

Patterns can be used for config discovery as well:

triage run_config*.json

More on pattern syntax can be found here: https://docs.python.org/3.10/library/pathlib.html#pathlib.Path.glob
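
To preview which files a pattern would pick up, the matching rules from the linked pathlib documentation can be tried directly. The following is a minimal sketch, assuming TRIAGE resolves patterns with pathlib.Path.glob as the link suggests; the helper discover_configs and the temporary files are hypothetical and exist only for illustration.

```python
import tempfile
from pathlib import Path


def discover_configs(directory, pattern):
    """Return the sorted names of files in `directory` matching `pattern`.

    Hypothetical helper: it only mirrors pathlib's glob rules and is not
    part of TRIAGE.
    """
    return sorted(p.name for p in Path(directory).glob(pattern))


# Build a throwaway directory with two matching files and one non-matching file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["run_config1.json", "run_config2.json", "other.json"]:
        (Path(tmp) / name).write_text("{}")
    matched = discover_configs(tmp, "run_config*.json")
    print(matched)
```

Here only the two run_config files are matched; other.json is skipped because it does not start with run_config.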

Run configurations

Run configurations are stored in JSON format. A sample run configuration looks like this:

{
  "memory_needed": 10.0,
  "config_name": "sample_config",
  "command": "python3 train.py",
  "args": [
    "arg1",
    "--arg2",
    ["--seed=1", "--seed=2", "--seed=3"],
    "--arg3=3"
  ]
}

Every entry in the args list is an argument to command. An entry can itself be a list, in which case TRIAGE iterates over all possible combinations of the values in the list entries. The sample configuration above is run 3 times, with --seed set to 1, 2, and 3.
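
The combination rule can be illustrated with a Cartesian product over the list entries. This is a sketch of the behavior described above, not TRIAGE's actual implementation; the function name expand_args is an assumption made for this example.

```python
from itertools import product


def expand_args(args):
    """Return one flat argument list per combination of list-valued entries.

    Hypothetical helper illustrating the expansion rule; not TRIAGE's code.
    """
    # Wrap plain strings in one-element lists so fixed and varying entries
    # can be handled uniformly by product().
    choices = [entry if isinstance(entry, list) else [entry] for entry in args]
    return [list(combo) for combo in product(*choices)]


args = ["arg1", "--arg2", ["--seed=1", "--seed=2", "--seed=3"], "--arg3=3"]
runs = expand_args(args)
for run in runs:
    # Each line is one command line that the sample config would produce.
    print("python3 train.py " + " ".join(run))
```

With two list entries of sizes 3 and 2, the same rule would yield 6 runs, so the number of launched scripts grows multiplicatively with each list added.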

The config_name parameter is optional and is used for logging the results (see the --logfile option, TODO). TRIAGE also sets the TASK_NAME environment variable based on this parameter so that the running script can use it.
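
A launched script can read TASK_NAME from its environment, for example to name its output files. The sketch below assumes only that the variable is present when the script runs under TRIAGE; the exact value derived from config_name is not specified here, and the function get_task_name and the fallback "unnamed_task" are hypothetical.

```python
import os


def get_task_name(default="unnamed_task"):
    """Return the task name from the environment.

    Falls back to `default` (a hypothetical placeholder) when TASK_NAME is
    not set, e.g. when the script is started by hand rather than by TRIAGE.
    """
    return os.environ.get("TASK_NAME", default)


print(f"Running task: {get_task_name()}")
```

Using a fallback keeps the script usable outside TRIAGE, where TASK_NAME is typically not set.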
