Watch for idle GPUs and run your jobs: GPUSitter launches jobs in tmux, keeps logs and status, and sends start/finish emails.
GPUSitter
Features
- Real-time GPU usage monitoring
- Command-line interface, easy to integrate into workflows
- Email notifications
- Scheduled automatic job running
Dependencies
- tmux
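Because jobs are launched inside tmux, it can be worth confirming that the `tmux` binary is on your `PATH` before installing. The following is a minimal sketch using only the standard library; `tmux_path` is a hypothetical helper and not part of GPUSitter.

```python
import shutil


def tmux_path(which=shutil.which):
    """Return the full path to the tmux executable, or None if it is not on PATH.

    `which` is injectable only so the lookup can be tested without tmux installed.
    """
    return which("tmux")


if __name__ == "__main__":
    path = tmux_path()
    if path is None:
        print("tmux not found; install it with your system package manager")
    else:
        print(f"tmux found at {path}")
```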
Installation
pip install gpusitter
Usage
# One job with 1 GPU
gpust --job="python train.py"
# One job with 4 GPUs
gpust --job="python train.py:4"
# Two jobs, with 1 GPU and 4 GPUs respectively
gpust --job="python train.py" --job="python train.py --epoch=12 --lr=0.001:4"
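The examples above suggest that each `--job` value is a shell command with an optional `:N` suffix giving the number of GPUs, defaulting to 1. As an illustration of that convention only, here is a sketch of how such a value could be split. `JobSpec` and `parse_job` are hypothetical names, and GPUSitter's real parsing may differ.

```python
from typing import NamedTuple


class JobSpec(NamedTuple):
    command: str  # shell command to run
    gpus: int     # number of GPUs requested


def parse_job(spec: str, default_gpus: int = 1) -> JobSpec:
    """Split 'command[:N]' into the command and a GPU count (assumed format)."""
    command, sep, suffix = spec.rpartition(":")
    # Treat the suffix as a GPU count only when it is a positive integer;
    # otherwise the colon is assumed to belong to the command itself.
    if sep and suffix.isdigit() and int(suffix) > 0:
        return JobSpec(command.strip(), int(suffix))
    return JobSpec(spec.strip(), default_gpus)


if __name__ == "__main__":
    for raw in ["python train.py", "python train.py:4"]:
        print(parse_job(raw))
```

Splitting on the last colon keeps any earlier colons in the command intact.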
After starting your job, you can monitor its progress using tmux.
# List all running tmux sessions
tmux ls
# Attach to your job session (replace GPUSitter_xxx_xx with your session name)
tmux a -t GPUSitter_xxx_xx
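If you want to find your job sessions from a script rather than by reading `tmux ls` by hand, something along these lines may work. It assumes session names begin with `GPUSitter_`, as the placeholder above suggests; `gpusitter_sessions` and `list_sessions` are hypothetical helpers, not part of the package.

```python
import subprocess


def gpusitter_sessions(tmux_output: str, prefix: str = "GPUSitter_") -> list[str]:
    """Pick session names starting with `prefix` out of tmux session listings."""
    names = []
    for line in tmux_output.splitlines():
        # Plain `tmux ls` prints "<name>: <n> windows (...)"; keep only the name.
        name = line.split(":", 1)[0].strip()
        if name.startswith(prefix):
            names.append(name)
    return names


def list_sessions() -> list[str]:
    """Ask tmux for its sessions; returns [] when no tmux server is running.

    Raises FileNotFoundError if tmux is not installed.
    """
    result = subprocess.run(
        ["tmux", "list-sessions", "-F", "#{session_name}"],
        capture_output=True,
        text=True,
        check=False,  # tmux exits non-zero when there are no sessions
    )
    return gpusitter_sessions(result.stdout)
```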
Parameter description:
class ConfigData:
    """Configuration data for GPU Snatcher."""
    gpu_free_memory_ratio_threshold: float
    friendly_min: float
    email_host: str
    email_user: str
    email_pwd: str
    email_sender: str
    email_receivers: list[str]
- gpu_free_memory_ratio_threshold: The minimum free GPU memory ratio required to consider a GPU available. Only GPUs with free memory above this threshold will be used.
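- friendly_min: Time to wait (in seconds) before claiming GPUs that appear idle. The delay helps avoid out-of-memory (OOM) errors when a previous job has not finished releasing its memory.
- email_host: SMTP server host, e.g., smtp.qq.com
- email_user: Email address used to log in to the SMTP server
- email_pwd: SMTP authorization code
- email_sender: Sender address for the notification emails
- email_receivers: List of recipient addresses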
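To make the parameters concrete, the sketch below fills in a configuration with placeholder values and shows one plausible reading of the threshold rule: a GPU counts as available when free / total memory is strictly above the threshold. Treating `ConfigData` as a dataclass is an assumption, as are the helper names `available_gpus` and `build_notification` and every example value. This is not GPUSitter's actual implementation.

```python
from dataclasses import dataclass
from email.message import EmailMessage


@dataclass  # assumption: the real class may be declared differently
class ConfigData:
    gpu_free_memory_ratio_threshold: float
    friendly_min: float
    email_host: str
    email_user: str
    email_pwd: str
    email_sender: str
    email_receivers: list[str]


def available_gpus(memory: dict[int, tuple[float, float]], threshold: float) -> list[int]:
    """memory maps GPU index -> (free, total); keep GPUs with free/total > threshold."""
    return [
        idx
        for idx, (free, total) in sorted(memory.items())
        if total > 0 and free / total > threshold
    ]


def build_notification(cfg: ConfigData, subject: str, body: str) -> EmailMessage:
    """Assemble (but do not send) a notification email from the config fields."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = cfg.email_sender
    msg["To"] = ", ".join(cfg.email_receivers)
    msg.set_content(body)
    return msg


# Placeholder values for illustration only.
cfg = ConfigData(
    gpu_free_memory_ratio_threshold=0.9,
    friendly_min=10.0,
    email_host="smtp.qq.com",
    email_user="user@example.com",
    email_pwd="authorization-code",
    email_sender="user@example.com",
    email_receivers=["you@example.com"],
)

if __name__ == "__main__":
    # GPU 0 is mostly occupied, GPU 1 is almost entirely free (values in MiB).
    usage = {0: (1000.0, 24000.0), 1: (23000.0, 24000.0)}
    print(available_gpus(usage, cfg.gpu_free_memory_ratio_threshold))
    print(build_notification(cfg, "Job started", "python train.py"))
```

Actually sending the message would typically go through the standard `smtplib` module, with the port and TLS settings your mail provider requires.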
Contribution
Issues and pull requests are welcome. Please follow the project's code style guidelines.
License
This project is licensed under the MIT License.
File details
Details for the file gpusitter-2.0.3.tar.gz.
File metadata
- Download URL: gpusitter-2.0.3.tar.gz
- Upload date:
- Size: 8.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.17
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 68c67bc229ee7c629e62cb30d8d55f2ce8dfe17d2ab8b7ae2f5126b595f27e76 |
| MD5 | 411df5b9621bb2a0a4743baea1a2c973 |
| BLAKE2b-256 | 26981c12fa96e78be9962fe29d8c9b1590a39f4ee8ff7bd2fdc5f3da2cca81bc |
File details
Details for the file gpusitter-2.0.3-py3-none-any.whl.
File metadata
- Download URL: gpusitter-2.0.3-py3-none-any.whl
- Upload date:
- Size: 10.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.8.17
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b4eabf6a4b5c204275564b9cb22259cddd7cd88c778ac3572a4fb7e94f4baefa |
| MD5 | 5b0e16cfcbe06fd3e0c86be07b2fee1a |
| BLAKE2b-256 | 22894ba02d325e17fd3852eeb7b6ae9731d3447d7afb9f0f2d5e8007d12045d0 |