GPU Scheduler

A simple tool to schedule GPU resources.

Say you have a server with 4 GPUs and want to run 10 jobs, each requiring 1 GPU. The jobs can finish at any time, and you don't want to sit in front of the server launching them one after another. With this tool you queue the jobs, and it runs each one as soon as the GPUs it needs are available.
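The scheduling idea described above can be sketched in a few lines of standard-library Python. This is an illustration only and not gpu-scheduler's actual implementation: `TinyGpuPool`, `run_all`, and `demo_job` are hypothetical names, and the sketch assumes each job runs in its own thread and blocks until enough GPU IDs are free.

```python
import threading
import time


class TinyGpuPool:
    """Hypothetical helper (not part of gpu-scheduler): tracks free GPU IDs."""

    def __init__(self, gpu_ids):
        self._free = list(gpu_ids)
        self._cond = threading.Condition()

    def acquire(self, n):
        # Block until at least n GPU IDs are free, then hand them out.
        with self._cond:
            self._cond.wait_for(lambda: len(self._free) >= n)
            taken, self._free = self._free[:n], self._free[n:]
            return taken

    def release(self, gpu_ids):
        # Return the IDs to the pool and wake up any waiting jobs.
        with self._cond:
            self._free.extend(gpu_ids)
            self._cond.notify_all()


def run_all(jobs, gpu_ids):
    """Run (job_id, func, num_gpus) tuples and return {job_id: result}."""
    pool = TinyGpuPool(gpu_ids)
    results = {}
    results_lock = threading.Lock()

    def worker(job_id, func, num_gpus):
        ids = pool.acquire(num_gpus)
        try:
            out = func(ids)
        finally:
            pool.release(ids)  # always free the GPUs, even if the job fails
        with results_lock:
            results[job_id] = out

    threads = []
    for job_id, func, num_gpus in jobs:
        if num_gpus > len(gpu_ids):
            # Such a job could never start, so record an error instead of waiting.
            results[job_id] = f"Error: needs {num_gpus} GPUs, only {len(gpu_ids)} exist"
            continue
        threads.append(threading.Thread(target=worker, args=(job_id, func, num_gpus)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def demo_job(gpu_ids):
    time.sleep(0.05)  # stand-in for real GPU work
    return sorted(gpu_ids)


if __name__ == "__main__":
    jobs = [(i, demo_job, 1) for i in range(10)]  # 10 jobs, 1 GPU each
    print(run_all(jobs, [0, 1, 2, 3]))            # 4 GPUs
```

With 10 single-GPU jobs and 4 GPUs, at most four jobs hold a GPU at any moment; the rest wait in `acquire` until a running job releases its IDs.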

Installation

pip install gpu-scheduler

Usage

import time

from transformers import AutoModel  # required by hf_with_given_gpu_ids below
from gpu_scheduler import GPUScheduler

def hf_with_given_gpu_ids(model_id, gpu_ids: list):
    device_map = {str(i): f"cuda:{gpu_id}" for i, gpu_id in enumerate(gpu_ids)}
    model = AutoModel.from_pretrained(model_id, device_map=device_map)
    # ...
    return model

def func(model_id, gpu_ids: list):
    """
    The function must accept `gpu_ids` as an argument.
    This is the list of GPU IDs that the job will run on.
    You have to apply these GPU IDs manually in your own code, e.g. as in the
    `hf_with_given_gpu_ids` function above.
    """
    time.sleep(4)  # Simulate job running
    return model_id, gpu_ids

if __name__ == "__main__":
    # Initialize scheduler with available GPUs
    scheduler = GPUScheduler([0, 1, 2, 3])  # 4 GPUs numbered 0-3

    # Add example model training jobs with different GPU requirements
    scheduler.add_job(0, func, num_gpus=1, model_id="model_small")
    scheduler.add_job(1, func, num_gpus=2, model_id="model_medium")
    scheduler.add_job(2, func, num_gpus=4, model_id="model_large")
    scheduler.add_job(3, func, num_gpus=1, model_id="model_small_2")
    scheduler.add_job(4, func, num_gpus=2, model_id="model_medium_2")
    scheduler.add_job(5, func, num_gpus=3, model_id="model_large_2")

    # This job will fail because it requires more GPUs than are available
    scheduler.add_job(999, func, num_gpus=5, model_id="model_too_big")

    # Start processing jobs
    scheduler.start_scheduler()

    print("Job results:")
    for job_id, result in scheduler.results:
        print(f"{job_id=}: {result=}")

Output

Job results:
job_id=999: result='Error: Job 999 requires 5 GPUs, but only 4 are available in total'
job_id=1: result=('model_medium', [1, 2])
job_id=0: result=('model_small', [0])
job_id=3: result=('model_small_2', [3])
job_id=5: result=('model_large_2', [1, 2, 0])
job_id=2: result=('model_large', [3, 1, 2, 0])
job_id=4: result=('model_medium_2', [3, 1])
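Because the job function has to apply the assigned GPU IDs itself, one common approach is to launch the real work in a child process with the standard `CUDA_VISIBLE_DEVICES` environment variable set. The following is a minimal sketch under that assumption; `visible_devices_env` and `run_script` are hypothetical helpers and not part of gpu-scheduler.

```python
import os
import subprocess
import sys


def visible_devices_env(gpu_ids):
    """Hypothetical helper: copy the environment with CUDA_VISIBLE_DEVICES set."""
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def run_script(script_args, gpu_ids: list):
    """Job function that accepts gpu_ids and runs a Python child process."""
    proc = subprocess.run(
        [sys.executable, *script_args],
        env=visible_devices_env(gpu_ids),  # child only sees the assigned GPUs
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child exits non-zero
    )
    return proc.stdout.strip()


if __name__ == "__main__":
    # The child simply echoes the variable so the effect is visible without a GPU.
    child = ["-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]
    print(run_script(child, gpu_ids=[1, 2]))
```

Judging by how `add_job` is called in the usage example, such a function could presumably be queued as `scheduler.add_job(0, run_script, num_gpus=2, script_args=["train.py"])`, where `train.py` is a placeholder script name. Note that inside the child process CUDA renumbers the visible devices starting from 0.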
