A lightweight, single-node job scheduler written in Rust.
gflow - A lightweight, single-node job scheduler
gflow is a lightweight, single-node job scheduler written in Rust, inspired by Slurm. It is designed for efficiently managing and scheduling tasks, especially on machines with GPU resources.
Core Features
- Daemon-based Scheduling: A persistent daemon (gflowd) manages the job queue and resource allocation.
- Rich Job Submission: Supports dependencies, priorities, job arrays, and time limits via the gbatch command.
- Time Limits: Set maximum runtime for jobs (similar to Slurm's --time) to prevent runaway processes.
- Service and Job Control: Provides clear commands to inspect the scheduler state (ginfo), query the job queue (gqueue), and control job states (gcancel).
- tmux Integration: Uses tmux for robust, background task execution and session management.
- Output Logging: Automatic capture of job output to log files via tmux pipe-pane.
- Simple Command-Line Interface: Offers a user-friendly and powerful set of command-line tools.
Component Overview
The gflow suite consists of several command-line tools:
- gflowd: The scheduler daemon that runs in the background, managing jobs and resources.
- ginfo: Displays scheduler and GPU information.
- gbatch: Submits jobs to the scheduler, similar to Slurm's sbatch.
- gqueue: Lists and filters jobs in the queue, similar to Slurm's squeue.
- gcancel: Cancels jobs and manages job states (internal use).
Installation
Install via PyPI (Recommended)
Install gflow using pipx (recommended for CLI tools):
pipx install gflow
Or using uv:
uv tool install gflow
Or using pip:
pip install gflow
This will install pre-built binaries for Linux (x86_64, ARM64, ARMv7) with both GNU and MUSL libc support.
Quick Install Script (Linux x86_64)
Install gflow with a single command:
curl -fsSL https://gflow-releases.puqing.work/install.sh | sh
Or use GitHub:
curl -fsSL https://raw.githubusercontent.com/AndPuQing/gflow/main/install.sh | sh
This will download and install the latest release binaries to ~/.cargo/bin.
You can customize the installation directory by setting the GFLOW_INSTALL_DIR environment variable:
curl -fsSL https://gflow-releases.puqing.work/install.sh | GFLOW_INSTALL_DIR=/usr/local/bin sh
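Whichever directory the script installs to, the shell only finds the gflow binaries if that directory is on PATH. The following is a minimal sketch of how to check this, not part of gflow itself: path_contains is a hypothetical helper name, and the default directory is the one stated above.

```shell
#!/bin/sh
# path_contains DIR [PATHSTRING]: succeed if DIR is one of the colon-separated
# entries in PATHSTRING (defaults to $PATH).
# Hypothetical helper for illustration; it is not shipped with gflow.
path_contains() {
    case ":${2:-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Default install location described above, unless GFLOW_INSTALL_DIR overrides it.
install_dir="${GFLOW_INSTALL_DIR:-$HOME/.cargo/bin}"

if path_contains "$install_dir"; then
    echo "$install_dir is on PATH"
else
    echo "$install_dir is not on PATH; add it in your shell profile"
fi
```

Wrapping the PATH string in colons before matching means only whole entries match, so /usr/local does not count as a match for /usr/local/bin.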
Install via cargo
cargo install gflow
Install via cargo (main branch)
cargo install --git https://github.com/AndPuQing/gflow.git --locked
This will install all the necessary binaries (gflowd, ginfo, gbatch, gqueue, gcancel, gjob).
Install via Conda
You can install gflow using Conda from the conda-forge channel:
conda install -c conda-forge gflow
Build Manually
- Clone the repository:
    git clone https://github.com/AndPuQing/gflow.git
    cd gflow
- Build the project:
    cargo build --release
  The executables will be available in the target/release/ directory.
Quick Start
- Start the scheduler daemon:
    gflowd up
  Run this in a dedicated terminal or tmux session and leave it running. You can check its health at any time with gflowd status and inspect resources with ginfo.
- Submit a job: Create a script my_job.sh:
    #!/bin/bash
    echo "Starting job on GPU: $CUDA_VISIBLE_DEVICES"
    sleep 30
    echo "Job finished."
  Submit it using gbatch:
    gbatch --gpus 1 ./my_job.sh
- Check the job queue:
    gqueue
  You can also watch the queue update live:
    watch gqueue
- Stop the scheduler:
    gflowd down
  This shuts down the daemon and cleans up the tmux session.
Usage Guide
Submitting Jobs with gbatch
gbatch provides flexible options for job submission.
- Submit a command directly:
    gbatch --gpus 1 python train.py --epochs 10
- Set a job name and priority:
    gbatch --gpus 1 --name "training-run-1" --priority 10 ./my_job.sh
- Create a job that depends on another:
    # First job
    gbatch --gpus 1 --name "job1" ./job1.sh
    # Get job ID from gqueue, e.g., 123
    # Second job depends on the first
    gbatch --gpus 1 --name "job2" --depends-on 123 ./job2.sh
- Set a time limit for a job:
    # 30-minute limit
    gbatch --time 30 python train.py
    # 2-hour limit (HH:MM:SS format)
    gbatch --time 2:00:00 python long_training.py
    # 5 minutes 30 seconds
    gbatch --time 5:30 python quick_task.py
  See docs/TIME_LIMITS.md for detailed documentation on time limits.
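The three --time formats in the examples above can be made concrete with a small conversion helper. This is only a sketch under the assumption that a bare number means minutes, two fields mean MM:SS, and three fields mean HH:MM:SS, as the example comments suggest. time_to_seconds is a hypothetical name and not part of gflow, whose own parser may accept other forms (see docs/TIME_LIMITS.md).

```shell
#!/bin/sh
# time_to_seconds SPEC: print the number of seconds a time-limit string stands for.
# Hypothetical helper, not part of gflow. Assumed formats, following the
# gbatch examples above: "M" (minutes), "M:S", and "H:M:S".
time_to_seconds() {
    printf '%s\n' "$1" | awk -F: '
        /[^0-9:]/ { exit 1 }                 # reject anything but digits and colons
        NF == 1 { print $1 * 60; next }      # bare number: minutes
        NF == 2 { print $1 * 60 + $2; next } # MM:SS
        NF == 3 { print $1 * 3600 + $2 * 60 + $3; next } # HH:MM:SS
        { exit 1 }                           # empty input or too many fields
    '
}

time_to_seconds 30        # prints 1800
time_to_seconds 2:00:00   # prints 7200
time_to_seconds 5:30      # prints 330
```

awk is used rather than shell arithmetic so that zero-padded fields such as 08 are read as decimal numbers instead of being rejected as invalid octal.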
Querying Jobs with gqueue
gqueue allows you to filter and format the job list.
- Filter by job state:
    gqueue --states Running,Queued
- Filter by job ID or name:
    gqueue --jobs 123,124
    gqueue --names "training-run-1"
- Customize output format:
    gqueue --format "ID,Name,State,GPUs"
Configuration
gflowd is configured through a TOML file. The default configuration file is located at ~/.config/gflow/gflowd.toml.
Contributing
If you find a bug or have a feature request, feel free to open an issue or submit a pull request.
License
gflow is licensed under the MIT License. See LICENSE for more details.