Wait for a free GPU, claim it, and run a command on it.
gpu-gate
Wait for a free GPU, claim it, set CUDA_VISIBLE_DEVICES, and run your command.
On a shared multi-GPU box without a cluster scheduler, starting a job usually
means watching nvidia-smi, picking a card by hand, exporting the env var, and
remembering to actually launch. gpu-gate is the small wait-pick-export-run
loop that does this for you, with a cooperative lock so two invocations on the
same host do not grab the same just-freed card. No daemon, no server, nothing
to administer.
$ gpu-gate run --min-free-mb 8000 -- python train.py
gpu-gate: waiting for a free GPU ...
# ... blocks until a card has >= 8 GB free, then runs train.py with
# CUDA_VISIBLE_DEVICES set to the chosen index
Install
$ pip install gpu-gate # from PyPI, once released
$ pip install git+https://github.com/jmweb-org/gpu-gate # latest, available now
gpu-gate requires an NVIDIA driver at run time. The NVML binding
(nvidia-ml-py) is pulled in automatically; the package still installs and
imports on machines without a GPU, so it is safe to add to shared requirements.
Usage
Run a command on a free GPU
$ gpu-gate run -n 1 --min-free-mb 8000 -- python train.py --epochs 50
Everything after -- is the command. gpu-gate blocks until the requirements
are met, claims the chosen device(s), exports CUDA_VISIBLE_DEVICES, and execs
the command. Its own exit code is the command's exit code, so it drops cleanly
into scripts and CI.
Common options:
| Option | Meaning |
|---|---|
| -n, --count | Number of GPUs to claim (default 1) |
| --min-free-mb | Require at least this much free memory, in MB |
| --max-util | Skip cards busier than this percent |
| --only 0,1 | Restrict the search to these indices |
| --exclude 2,3 | Never pick these indices |
| --poll | Seconds between checks (default 5) |
| --timeout | Give up after N seconds (exit 124) |
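The interaction between --poll and --timeout can be pictured as a simple polling loop. The following is a minimal sketch, not gpu-gate's actual code: `wait_for_gpus` and its `pick` callback are hypothetical names, and the clock and sleep functions are injectable only so the loop can be exercised without real waiting.

```python
import time
from typing import Callable, Optional, Sequence

EXIT_TIMEOUT = 124  # exit code documented above for a wait that times out


def wait_for_gpus(
    pick: Callable[[], Optional[Sequence[int]]],
    poll: float = 5.0,
    timeout: Optional[float] = None,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Sequence[int]]:
    """Call pick() every `poll` seconds until it returns indices.

    `pick` is a hypothetical callback that returns the chosen GPU indices,
    or None when no card currently qualifies. Returns None if `timeout`
    seconds pass first; a caller would then exit with EXIT_TIMEOUT.
    """
    deadline = None if timeout is None else clock() + timeout
    while True:
        chosen = pick()
        if chosen:
            return chosen
        if deadline is None:
            sleep(poll)
            continue
        remaining = deadline - clock()
        if remaining <= 0:
            return None
        # Never sleep past the deadline, so the timeout is honoured closely.
        sleep(min(poll, remaining))
```

Using a monotonic clock keeps the deadline stable if the wall clock is adjusted while a job is waiting.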
Just wait, then use the result yourself
$ export CUDA_VISIBLE_DEVICES=$(gpu-gate wait --min-free-mb 8000)
Inspect the current state
$ gpu-gate status
idx name free total util
0 NVIDIA L40S 44211 MiB 46068 MiB 3%
1 NVIDIA L40S 812 MiB 46068 MiB 97%
$ gpu-gate status --json
Exit codes
| Code | Meaning |
|---|---|
| 0 | Command ran and exited 0 (a nonzero exit code from the command is forwarded as-is) |
| 2 | Bad invocation (for example, no command after --) |
| 3 | Requirements could never be met |
| 4 | Could not read GPU state (no driver / NVML error) |
| 124 | Timed out waiting for a GPU |
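A script that wraps gpu-gate may want to tell these codes apart. The sketch below assumes only what the table states; `describe_exit` and `run_gated` are hypothetical helper names, not part of gpu-gate. Because the command's own exit code is forwarded, a command that itself exits with 2, 3, 4, or 124 cannot be distinguished from gpu-gate's own failures this way.

```python
import subprocess
from typing import List

# Codes gpu-gate uses for its own failures, as listed in the table above.
GATE_CODES = {
    2: "bad invocation",
    3: "requirements could never be met",
    4: "could not read GPU state",
    124: "timed out waiting for a GPU",
}


def describe_exit(code: int) -> str:
    """Map an exit code from `gpu-gate run` to a short description."""
    if code == 0:
        return "ok"
    # Any other value is assumed to come from the wrapped command itself.
    return GATE_CODES.get(code, f"command exited with {code}")


def run_gated(command: List[str], min_free_mb: int = 8000) -> str:
    """Run `command` through gpu-gate and describe how it ended."""
    argv = ["gpu-gate", "run", "--min-free-mb", str(min_free_mb), "--", *command]
    result = subprocess.run(argv)
    return describe_exit(result.returncode)
```

For example, `run_gated(["python", "train.py"])` would block until a card is free and return `"ok"` if training exits cleanly.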
How selection works
A GPU is eligible when it has enough free memory, is below the utilization
ceiling, is not excluded, and is not currently locked by another gpu-gate
caller. Eligible cards are ranked by most free memory, then lowest
utilization, then index, and the top --count are chosen. The ordering is
fully deterministic.
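The filter-then-rank rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the package's source: the `Gpu` record, the `select` function, and its parameter names are hypothetical, and a card whose utilization equals the ceiling is assumed to still be eligible.

```python
from dataclasses import dataclass
from typing import AbstractSet, Iterable, List, Optional


@dataclass(frozen=True)
class Gpu:
    index: int
    free_mb: int
    util_pct: int


def select(
    gpus: Iterable[Gpu],
    count: int = 1,
    min_free_mb: int = 0,
    max_util: int = 100,
    only: Optional[AbstractSet[int]] = None,
    exclude: AbstractSet[int] = frozenset(),
    locked: AbstractSet[int] = frozenset(),
) -> Optional[List[int]]:
    """Return the indices of the `count` best eligible GPUs, or None."""
    eligible = [
        g for g in gpus
        if g.free_mb >= min_free_mb
        and g.util_pct <= max_util  # assumption: the ceiling is inclusive
        and (only is None or g.index in only)
        and g.index not in exclude
        and g.index not in locked
    ]
    if len(eligible) < count:
        return None
    # Most free memory first, then lowest utilization, then lowest index.
    eligible.sort(key=lambda g: (-g.free_mb, g.util_pct, g.index))
    return [g.index for g in eligible[:count]]
```

Because the sort key ends with the device index, two cards with identical memory and utilization always come out in the same order, which is what makes the choice deterministic.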
Locking
While a command runs, gpu-gate holds an advisory file lock per claimed
device under $GPU_GATE_LOCK_DIR (a per-user directory by default). Other
gpu-gate invocations skip locked devices, which avoids the classic race where
two jobs both see the same card free at the same instant. The lock is advisory:
it coordinates gpu-gate callers, not arbitrary CUDA programs.
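One common way to build a per-device advisory lock on Linux is `flock` on a file per GPU. The sketch below shows that technique under assumptions: the lock file name `gpu<index>.lock` and the helpers `try_claim` and `release` are hypothetical, and the README does not say which locking primitive gpu-gate actually uses.

```python
import fcntl
import os
from typing import Optional


def try_claim(lock_dir: str, index: int) -> Optional[int]:
    """Try to take an exclusive advisory lock for GPU `index`.

    Returns an open file descriptor that holds the lock, or None if
    another holder already has it. The file name is a made-up convention.
    """
    os.makedirs(lock_dir, exist_ok=True)
    path = os.path.join(lock_dir, f"gpu{index}.lock")
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # LOCK_NB makes the call fail immediately instead of blocking.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def release(fd: int) -> None:
    """Drop the lock and close the descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)
```

A lock taken this way is released by the kernel when the holding process exits, so a crashed job does not leave a device permanently claimed. As with any advisory lock, a program that never checks the lock file is unaffected by it.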
License
MIT. See LICENSE.