Parallelize any program and aggregate results from stdout.

Patas 🐾

Patas is a command-line utility that runs any program in parallel and collects its output, varying the program's input parameters and starting each run automatically. Runs can be parallelized on your local machine or across a cluster. The only requirement for running on a cluster is an SSH connection between the machines. Assuming the program is present on each worker machine, patas can start and manage the parallel runs with one command. Parsing the outputs is done with a second command. The name stands for PArser and TAsk Scheduler.

When should I use Patas? ⭐

Use patas when you want to evaluate your model over many parameter combinations and measure your own performance metrics. It is a quick way to parallelize an experiment across several machines. It is also handy when you don't want to, or can't, change the original program.

When should I not use Patas? 🚧

Patas is designed to be a simple command-line utility. It does not manage or constrain resource usage, such as the amount of RAM, the number of cores, or the disk space used by a process. The only control available is the number of workers on each machine, which sets how many processes run on that machine at the same time. The reason is that when a process is constrained to a fixed amount of a resource, such as RAM, it can run out of memory even though the system as a whole has plenty available. The workaround is to estimate the number of workers from the resources your program needs. If a machine crashes, that is fine: stop patas and execute it again. It will skip completed tasks and continue from where it stopped.
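
As a rough illustration of that estimate, the sketch below divides a machine's resources by what a single run needs and keeps the smaller result. The function name, its parameters, and the machine sizes are hypothetical and are not part of patas; the per-run requirements must come from your own measurements.

```python
def estimate_workers(total_cores, total_ram_gb, cores_per_task, ram_per_task_gb):
    """Return how many copies of a task fit on one machine at the same time.

    A result of 0 means a single task does not fit on the machine.
    """
    by_cores = total_cores // cores_per_task
    by_ram = int(total_ram_gb // ram_per_task_gb)
    # The scarcer resource decides how many workers are safe.
    return min(by_cores, by_ram)


# Hypothetical machine: 16 cores and 64 GB of RAM.
# Hypothetical task: 2 cores and 6 GB of RAM per run.
print(estimate_workers(16, 64, 2, 6))  # prints 8 (cores are the limit here)
```

Leaving some headroom for the operating system, for example by passing less than the machine's full RAM, is usually a sensible precaution.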

Basic usage 🐣

Consider a use case where we need to find the optimal configuration for a neural network by varying the number of hidden neurons and the activation function in the hidden layer. The following is a basic mockup script that receives these two input parameters. It pretends to train a model and prints relevant information to stdout. We will assume it is saved in the file $HOME/Sources/patas/examples/sanity/main.py.

#!/usr/bin/env python3

import sys

# Read input parameters
hidden_neurons      = sys.argv[1]
activation_function = sys.argv[2]

# Pretend we have done something for a long time
print("Loading dataset...")
print("Applying transformations...")
print("Training model...")
print("Evaluating model...")

w = abs(int(hidden_neurons) + len(activation_function)) / 10
train_accuracy = 0.9 + 0.1 / (1+  w)
test_accuracy  = 0.9 + 0.1 / (1+2*w)

# Print relevant results
print("Results:")
print(f"    Train accuracy: {train_accuracy:.3f}")
print(f"    Test accuracy:  {test_accuracy:.3f}")

Assuming we want to vary the number of hidden neurons over the values [10, 20, 30] and the activation function over ['sigmoid', 'relu'], we can parallelize the script above and collect its output using patas exec grid. For example, when the following command is executed from $HOME/Sources/patas/, patas creates an output folder named tmp that holds the experiment's outputs.

patas exec grid \
    --cmd './main.py {hidden_neurons} {activation_function}' \
    --vl hidden_neurons 10 20 30 \
    --vl activation_function sigmoid relu \
    --workdir '$HOME/Sources/patas/examples/sanity' \
    --repeat 2 \
    -o tmp
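
To make the idea of a grid concrete, here is a minimal sketch of how a command template and two variable lists expand into one command per combination. It is an illustration only, not patas's implementation: the function name expand_grid is hypothetical, and repeats are left out.

```python
import itertools


def expand_grid(cmd_template, variables):
    """Fill the template once for every combination of variable values."""
    names = list(variables)
    commands = []
    # itertools.product yields the Cartesian product of the value lists.
    for values in itertools.product(*(variables[name] for name in names)):
        commands.append(cmd_template.format(**dict(zip(names, values))))
    return commands


commands = expand_grid(
    "./main.py {hidden_neurons} {activation_function}",
    {"hidden_neurons": [10, 20, 30], "activation_function": ["sigmoid", "relu"]},
)
for combination_id, command in enumerate(commands):
    print(combination_id, command)
```

The three values of hidden_neurons and the two activation functions give six combinations, from `./main.py 10 sigmoid` to `./main.py 30 relu`.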

When the experiment is done, we can parse the outputs and collect desired values using patas parse.

patas parse \
    -e 'tmp/quick_experiment/' \
    -p TRAIN_ACC  'Train accuracy: (@float@)' \
    -p TEST_ACC   'Test accuracy:  (@float@)'
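
The sketch below shows one way such a pattern could pull a number out of a program's stdout. It assumes that @float@ stands for a regular expression matching a decimal number and that the surrounding parentheses form a capture group; both are assumptions made for illustration, and the function name extract and the constant FLOAT_REGEX are hypothetical rather than taken from patas.

```python
import re

# Assumption: "@float@" is shorthand for a regex that matches a decimal number.
FLOAT_REGEX = r"[-+]?\d+(?:\.\d+)?"


def extract(stdout, patterns):
    """Return {name: value} for each pattern, or None when nothing matches."""
    results = {}
    for name, pattern in patterns.items():
        regex = pattern.replace("@float@", FLOAT_REGEX)
        match = re.search(regex, stdout)
        results[name] = float(match.group(1)) if match else None
    return results


# Tail of the output printed by main.py for 10 hidden neurons and sigmoid.
stdout = (
    "Results:\n"
    "    Train accuracy: 0.937\n"
    "    Test accuracy:  0.923\n"
)
print(extract(stdout, {
    "TRAIN_ACC": "Train accuracy: (@float@)",
    "TEST_ACC": "Test accuracy:  (@float@)",
}))
# prints {'TRAIN_ACC': 0.937, 'TEST_ACC': 0.923}
```

Note that the whitespace in each pattern has to match the program's output exactly, which is why the second pattern keeps two spaces after the colon.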

The parse command generates the file $HOME/Sources/patas/tmp/quick_experiment/output.csv, which contains a table with the collected results, the input variables, and many other variables associated with the experiment. Its contents should be similar to the table below.

VAR_activation_function VAR_hidden_neurons OUT_TRAIN_ACC OUT_TEST_ACC BREAK_ID TASK_ID REPEAT_ID COMBINATION_ID EXPERIMENT_ID EXPERIMENT_NAME DURATION STARTED_AT ENDED_AT MAX_TRIES TRIES CLUSTER_ID CLUSTER_NAME NODE_ID NODE_NAME WORKER_ID OUTPUT_DIR WORK_DIR
sigmoid 10 0.937 0.923 0 1 1 0 0 quick_experiment 0.018904 2023-05-09 18:05:18.012184 2023-05-09 18:05:18.031088 3 1 0 quick_cluster 0 local_machine 16 /home/ubuntu/Sources/patas/tmp/quick_experiment/1 $HOME/Sources/patas/examples/sanity
sigmoid 20 0.927 0.916 0 8 2 2 0 quick_experiment 0.025697 2023-05-09 18:05:18.008116 2023-05-09 18:05:18.033813 3 1 0 quick_cluster 0 local_machine 9 /home/ubuntu/Sources/patas/tmp/quick_experiment/8 $HOME/Sources/patas/examples/sanity
sigmoid 30 0.921 0.912 0 13 1 4 0 quick_experiment 0.027901 2023-05-09 18:05:18.005924 2023-05-09 18:05:18.033825 3 1 0 quick_cluster 0 local_machine 4 /home/ubuntu/Sources/patas/tmp/quick_experiment/13 $HOME/Sources/patas/examples/sanity
relu 10 0.942 0.926 0 5 2 1 0 quick_experiment 0.027528 2023-05-09 18:05:18.010084 2023-05-09 18:05:18.037612 3 1 0 quick_cluster 0 local_machine 12 /home/ubuntu/Sources/patas/tmp/quick_experiment/5 $HOME/Sources/patas/examples/sanity
relu 30 0.923 0.913 0 17 2 5 0 quick_experiment 0.020922 2023-05-09 18:05:18.004639 2023-05-09 18:05:18.025561 3 1 0 quick_cluster 0 local_machine 0 /home/ubuntu/Sources/patas/tmp/quick_experiment/17 $HOME/Sources/patas/examples/sanity
relu 30 0.923 0.913 0 15 0 5 0 quick_experiment 0.02699 2023-05-09 18:05:18.005428 2023-05-09 18:05:18.032418 3 1 0 quick_cluster 0 local_machine 2 /home/ubuntu/Sources/patas/tmp/quick_experiment/15 $HOME/Sources/patas/examples/sanity
sigmoid 30 0.921 0.912 0 14 2 4 0 quick_experiment 0.019452 2023-05-09 18:05:18.005693 2023-05-09 18:05:18.025145 3 1 0 quick_cluster 0 local_machine 3 /home/ubuntu/Sources/patas/tmp/quick_experiment/14 $HOME/Sources/patas/examples/sanity
sigmoid 10 0.937 0.923 0 0 0 0 0 quick_experiment 0.021054 2023-05-09 18:05:18.012634 2023-05-09 18:05:18.033688 3 1 0 quick_cluster 0 local_machine 17 /home/ubuntu/Sources/patas/tmp/quick_experiment/0 $HOME/Sources/patas/examples/sanity
relu 20 0.929 0.917 0 11 2 3 0 quick_experiment 0.021986 2023-05-09 18:05:18.006812 2023-05-09 18:05:18.028798 3 1 0 quick_cluster 0 local_machine 6 /home/ubuntu/Sources/patas/tmp/quick_experiment/11 $HOME/Sources/patas/examples/sanity
relu 30 0.923 0.913 0 16 1 5 0 quick_experiment 0.028839 2023-05-09 18:05:18.004890 2023-05-09 18:05:18.033729 3 1 0 quick_cluster 0 local_machine 1 /home/ubuntu/Sources/patas/tmp/quick_experiment/16 $HOME/Sources/patas/examples/sanity
relu 20 0.929 0.917 0 9 0 3 0 quick_experiment 0.021226 2023-05-09 18:05:18.007595 2023-05-09 18:05:18.028821 3 1 0 quick_cluster 0 local_machine 8 /home/ubuntu/Sources/patas/tmp/quick_experiment/9 $HOME/Sources/patas/examples/sanity
relu 10 0.942 0.926 0 3 0 1 0 quick_experiment 0.021257 2023-05-09 18:05:18.011202 2023-05-09 18:05:18.032459 3 1 0 quick_cluster 0 local_machine 14 /home/ubuntu/Sources/patas/tmp/quick_experiment/3 $HOME/Sources/patas/examples/sanity
sigmoid 20 0.927 0.916 0 7 1 2 0 quick_experiment 0.021902 2023-05-09 18:05:18.008670 2023-05-09 18:05:18.030572 3 1 0 quick_cluster 0 local_machine 10 /home/ubuntu/Sources/patas/tmp/quick_experiment/7 $HOME/Sources/patas/examples/sanity
sigmoid 10 0.937 0.923 0 2 2 0 0 quick_experiment 0.019055 2023-05-09 18:05:18.011684 2023-05-09 18:05:18.030739 3 1 0 quick_cluster 0 local_machine 15 /home/ubuntu/Sources/patas/tmp/quick_experiment/2 $HOME/Sources/patas/examples/sanity
relu 10 0.942 0.926 0 4 1 1 0 quick_experiment 0.053797 2023-05-09 18:05:18.010456 2023-05-09 18:05:18.064253 3 1 0 quick_cluster 0 local_machine 13 /home/ubuntu/Sources/patas/tmp/quick_experiment/4 $HOME/Sources/patas/examples/sanity
sigmoid 30 0.921 0.912 0 12 0 4 0 quick_experiment 0.020014 2023-05-09 18:05:18.006314 2023-05-09 18:05:18.026328 3 1 0 quick_cluster 0 local_machine 5 /home/ubuntu/Sources/patas/tmp/quick_experiment/12 $HOME/Sources/patas/examples/sanity
relu 20 0.929 0.917 0 10 1 3 0 quick_experiment 0.027261 2023-05-09 18:05:18.007250 2023-05-09 18:05:18.034511 3 1 0 quick_cluster 0 local_machine 7 /home/ubuntu/Sources/patas/tmp/quick_experiment/10 $HOME/Sources/patas/examples/sanity
sigmoid 20 0.927 0.916 0 6 0 2 0 quick_experiment 0.020901 2023-05-09 18:05:18.009292 2023-05-09 18:05:18.030193 3 1 0 quick_cluster 0 local_machine 11 /home/ubuntu/Sources/patas/tmp/quick_experiment/6 $HOME/Sources/patas/examples/sanity
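
Once the table exists, a few lines of Python are enough to summarize it. The sketch below averages the test accuracy of each parameter combination over its repeats. It assumes the file is comma-delimited and uses the column names shown above; the function name mean_by_combination is hypothetical, and the in-memory sample holds only three rows and four columns from the table.

```python
import csv
import io
from collections import defaultdict


def mean_by_combination(csv_file, metric="OUT_TEST_ACC"):
    """Average a metric over the repeats of each parameter combination."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        key = (row["VAR_activation_function"], int(row["VAR_hidden_neurons"]))
        groups[key].append(float(row[metric]))
    return {key: sum(values) / len(values) for key, values in groups.items()}


# Small in-memory stand-in for output.csv (assumed comma-delimited).
sample = io.StringIO(
    "VAR_activation_function,VAR_hidden_neurons,OUT_TRAIN_ACC,OUT_TEST_ACC\n"
    "sigmoid,10,0.937,0.923\n"
    "sigmoid,10,0.937,0.923\n"
    "relu,10,0.942,0.926\n"
)
means = mean_by_combination(sample)
for (activation, neurons), value in sorted(means.items()):
    print(f"{activation:8s} {neurons:3d} {value:.3f}")
```

To run it on a real experiment, pass an open file handle for output.csv instead of the sample, and adjust the delimiter if the file uses a different one.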

TL;DR 💻

# Parallelizing a program 
patas exec grid \
    --cmd './main.py {hidden_neurons} {activation_function}' \
    --vl hidden_neurons 10 20 30 \
    --vl activation_function sigmoid relu \
    --workdir '$HOME/Sources/patas/examples/sanity' \
    --repeat 2 \
    -o tmp

# Parsing the program output
patas parse \
    -e 'tmp/quick_experiment/' \
    -p TRAIN_ACC  'Train accuracy: (@float@)' \
    -p TEST_ACC   'Test accuracy:  (@float@)'

Source Code 🎼

The source code is available in the project's repository.
