ParaCuda - Parallel execution of CUDA hyperparameter sweeps in Python, for PyTorch and beyond.
ParaCuda
Overview
ParaCuda is a framework for running arbitrary command-line tools in parallel across CUDA GPUs. Consider it a (very) poor man's workflow management system. ParaCuda reads a config file that names the script to run and the parameters to pass to it, distributes the resulting runs over all specified GPUs, and logs each execution.
Motivation
In machine learning and modern bioinformatics, training and prediction often require GPUs. Sometimes the task is simple: run one script with different parameter settings across all available GPUs. ParaCuda addresses this case with easy set-up and concurrent execution of experiments.
Installation
For the moment, clone the repository and create the environment using mamba.
git clone https://github.com/gieses/paracuda.git
mamba env create -f environment.yml
Usage
Usage is simple: run the paracuda command together with a configuration file (see below).
usage: paracuda [-h] --config CONFIG --gpus GPUS [--control_dir CONTROL_DIR] [--log-level {TRACE,DEBUG,INFO,SUCCESS,WARNING,ERROR,CRITICAL}] [--dry-run]
Run parallel grid search across multiple GPUs
options:
  -h, --help            show this help message and exit
  --config CONFIG       Path to JSON configuration file (default: None)
  --gpus GPUS           Number of GPUs to use (default: None)
  --control_dir CONTROL_DIR
                        Directory to store control files (default: control_dir)
  --log-level {TRACE,DEBUG,INFO,SUCCESS,WARNING,ERROR,CRITICAL}
                        Set the logging level (default: INFO)
  --dry-run             Only print commands without executing them (default: False)
Example
Please check the example directory for a simple demonstration of how to write the config and the Python script. Note that by default, ParaCuda generates all possible hyperparameter combinations from the config file and runs the defined script once per combination.
{
  "base_command": "python example/example_script.py",
  "output_dir": "results_directory",
  "param_grid": {
    "number": [1, 2, 3]
  }
}
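The expansion of param_grid into individual runs can be sketched as a Cartesian product over the listed values. This is a minimal illustration, not ParaCuda's actual implementation: the helper name expand_grid is hypothetical, and the "--key value" flag format is an assumption inferred from the dry-run output shown further down.

```python
import itertools
import json


def expand_grid(param_grid):
    """Yield one dict per combination of values in param_grid."""
    keys = list(param_grid)
    # itertools.product builds the Cartesian product of all value lists.
    for values in itertools.product(*(param_grid[k] for k in keys)):
        yield dict(zip(keys, values))


# The example config from above, inlined so the sketch is self-contained.
config = json.loads("""
{
  "base_command": "python example/example_script.py",
  "output_dir": "results_directory",
  "param_grid": {"number": [1, 2, 3]}
}
""")

combos = list(expand_grid(config["param_grid"]))

# Assumption: each parameter is passed to the script as "--name value".
commands = [
    config["base_command"] + "".join(f" --{k} {v}" for k, v in combo.items())
    for combo in combos
]

for command in commands:
    print(command)
```

With a single parameter holding three values, this yields three commands. A grid with two parameters of two and three values would yield six.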
paracuda --config example/example_config.json --gpus 2 --dry-run
Output:
2025-06-18 23:48:35 | INFO | DRY RUN MODE: Commands will be printed but not executed
2025-06-18 23:48:35 | INFO | Starting grid search with 2 GPUs
2025-06-18 23:48:35 | SUCCESS | Successfully loaded configuration from example/example_config.json
2025-06-18 23:48:35 | INFO | Generated 3 parameter combinations
2025-06-18 23:48:35 | INFO | Storing results in: control_dir
2025-06-18 23:48:37 | INFO | Starting parallel execution
Starting...
0 | 0: 0%| | 0/3 [00:02<?]
2025-06-18 23:48:40.455 | INFO | paracuda.paracuda_run:run_task:148 - [DRY RUN] Task 1 on GPU 0: c5b39f71
2025-06-18 23:48:40.455 | INFO | paracuda.paracuda_run:run_task:149 - Command: CUDA_VISIBLE_DEVICES=0 python example/example_script.py --number 1 > control_dir/c5b39f7159270a92961b45aaf8ea44fa.log 2>&1
2025-06-18 23:48:40.455 | INFO | paracuda.paracuda_run:run_task:148 - [DRY RUN] Task 2 on GPU 1: e360de0c
2025-06-18 23:48:40.455 | INFO | paracuda.paracuda_run:run_task:149 - Command: CUDA_VISIBLE_DEVICES=1 python example/example_script.py --number 2 > control_dir/e360de0cffb59c8b5711943068cee5e2.log 2>&1
2025-06-18 23:48:40.456 | INFO | paracuda.paracuda_run:run_task:148 - [DRY RUN] Task 3 on GPU 0: ba21f795
2025-06-18 23:48:40.456 | INFO | paracuda.paracuda_run:run_task:149 - Command: CUDA_VISIBLE_DEVICES=0 python example/example_script.py --number 3 > control_dir/ba21f7957fd020b1c4473607144bf4ac.log 2>&1
Task 3 on GPU 0: ba21f795
3 | 0: 100%| | 3/3 [00:02<00:00]
2025-06-18 23:48:40 | SUCCESS | Dry run completed! Printed 3 commands
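The shape of the commands in the dry-run output can be reproduced with a short sketch. It is written under stated assumptions and is not ParaCuda's actual code: the function names build_command and assign_round_robin are hypothetical, round-robin GPU assignment is inferred only from the 0, 1, 0 pattern in the log, and the MD5 digest is a stand-in for whatever 32-character task ID ParaCuda really computes, so the IDs it produces are not expected to match the ones above.

```python
import hashlib


def assign_round_robin(param_sets, n_gpus):
    """Pair each parameter set with a GPU index, cycling through the GPUs.

    Assumption: the real scheduler may hand tasks to whichever GPU is free.
    """
    return [(i % n_gpus, params) for i, params in enumerate(param_sets)]


def build_command(base_command, params, gpu_id, control_dir="control_dir"):
    """Return (task_id, shell command) for one run.

    CUDA_VISIBLE_DEVICES restricts the process to a single GPU, and stdout
    and stderr are redirected to a per-task log file in control_dir.
    """
    args = " ".join(f"--{key} {value}" for key, value in params.items())
    core = f"{base_command} {args}".strip()
    # Assumption: MD5 of the command text serves as a stable task ID.
    task_id = hashlib.md5(core.encode("utf-8")).hexdigest()
    log_path = f"{control_dir}/{task_id}.log"
    command = f"CUDA_VISIBLE_DEVICES={gpu_id} {core} > {log_path} 2>&1"
    return task_id, command


param_sets = [{"number": 1}, {"number": 2}, {"number": 3}]
assignments = assign_round_robin(param_sets, n_gpus=2)

results = [
    build_command("python example/example_script.py", params, gpu_id)
    for gpu_id, params in assignments
]

for task_number, (task_id, command) in enumerate(results, start=1):
    # Only the first 8 characters of the ID are shown, as in the log above.
    print(f"Task {task_number}: {task_id[:8]}")
    print(command)
```

Setting CUDA_VISIBLE_DEVICES per command means the launched script does not need any GPU-selection logic of its own; it sees its assigned GPU as device 0.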