watchmen for GPU scheduling

watchmen

A simple and easy-to-use toolkit for GPU scheduling.

Dependencies

  • Python >= 3.6
    • requests >= 2.24.0
    • pydantic >= 1.7.1
    • gpustat >= 0.6.0
    • flask >= 1.1.2
    • apscheduler >= 3.6.3

Installation

  1. Install dependencies.

$ pip install -r requirements.txt

  2. Install watchmen.

Install from source code:

$ pip install -e .

Alternatively, install the stable release from PyPI:

$ pip install gpu-watchmen -i https://pypi.org/simple

Quick Start

  1. Start the server

The server listens on port 62333 by default.

$ python -m watchmen.server

To keep the server running in the background, use:

$ nohup python -m watchmen.server &

The server accepts the following options:

usage: server.py [-h] [--host HOST] [--port PORT]
                 [--queue_timeout QUEUE_TIMEOUT]
                 [--request_interval REQUEST_INTERVAL]
                 [--status_queue_keep_time STATUS_QUEUE_KEEP_TIME]

optional arguments:
  -h, --help            show this help message and exit
  --host HOST           host address for api server
  --port PORT           port for api server
  --queue_timeout QUEUE_TIMEOUT
                        timeout for queue waiting (seconds)
  --request_interval REQUEST_INTERVAL
                        interval for gpu status requesting (seconds)
  --status_queue_keep_time STATUS_QUEUE_KEEP_TIME
                        hours for keeping the client status
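
The options above can also be assembled from a script. The following is a minimal sketch, assuming only the flags shown in the help output; `build_server_command` is a hypothetical helper (not part of watchmen), and the example values are illustrative rather than recommended settings.

```python
import sys
from typing import List, Optional


def build_server_command(
    host: Optional[str] = None,
    port: Optional[int] = None,
    queue_timeout: Optional[float] = None,
    request_interval: Optional[float] = None,
    status_queue_keep_time: Optional[float] = None,
) -> List[str]:
    """Build an argv list for `python -m watchmen.server`.

    Only options that are explicitly set are passed along, so the
    server's own defaults apply to everything left as None.
    """
    command = [sys.executable, "-m", "watchmen.server"]
    options = {
        "--host": host,
        "--port": port,
        "--queue_timeout": queue_timeout,
        "--request_interval": request_interval,
        "--status_queue_keep_time": status_queue_keep_time,
    }
    for flag, value in options.items():
        if value is not None:
            command += [flag, str(value)]
    return command


# Illustrative values only (assumptions, not documented defaults).
command = build_server_command(host="0.0.0.0", port=62333, request_interval=5)
print(" ".join(command))
```

The resulting list can be handed to `subprocess.Popen(command)` to start the server from Python instead of the shell.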
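
  2. Modify the source code in your project:

# Import Client from the watchmen package first;
# example/single_card_mnist.py shows a complete script.
# `id` is a short label for this run; `gpus` lists the GPU indices to request.
client = Client(id="short description of this running", gpus=[1],
                server_host="127.0.0.1", server_port=62333)
client.wait()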

Once the program continues past client.wait(), you are in the queue. For a complete example, see example/single_card_mnist.py.
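
A blocking call such as client.wait() can be pictured as a polling loop. The sketch below is a generic illustration of that pattern, not watchmen's actual implementation: `wait_until_ready`, `fake_gpu_is_free`, the poll interval, and the timeout are all hypothetical names and values chosen for the example.

```python
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],
    poll_interval: float = 1.0,
    timeout: float = 60.0,
) -> float:
    """Block until is_ready() returns True and return the seconds waited.

    Raises TimeoutError if the condition is still false after `timeout`.
    """
    start = time.monotonic()
    while not is_ready():
        if time.monotonic() - start >= timeout:
            raise TimeoutError("gave up waiting for a free GPU")
        time.sleep(poll_interval)
    return time.monotonic() - start


# Stand-in for a real availability check: reports "ready" on the third poll.
polls = {"count": 0}


def fake_gpu_is_free() -> bool:
    polls["count"] += 1
    return polls["count"] >= 3


waited = wait_until_ready(fake_gpu_is_free, poll_interval=0.01, timeout=1.0)
print(f"ready after {polls['count']} polls")
```

The caller only resumes once the check succeeds, which is why code placed after the wait can assume the resource is available.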

  3. Check the queue in the browser.

Open the following link in your browser: http://<server ip address>:<server port>, for example: http://192.168.126.143:62333.

You should see a page like the demo below. The page does not update dynamically, so refresh it manually to see the latest status.

Demo
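
Because the status page is static, it can also be fetched from a script instead of refreshing the browser. This is a minimal sketch using only the standard library; `status_page_url` and `fetch_status_page` are hypothetical helpers, and the only assumption about the server is the root URL described in step 3.

```python
from urllib.request import urlopen


def status_page_url(host: str, port: int = 62333) -> str:
    """Return the address of the queue page for a watchmen server."""
    return f"http://{host}:{port}"


def fetch_status_page(host: str, port: int = 62333, timeout: float = 5.0) -> str:
    """Download the queue page and return its HTML as text.

    Requires a running server; raises urllib.error.URLError otherwise.
    """
    with urlopen(status_page_url(host, port), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


# Building the URL needs no network access; fetching it does.
print(status_page_url("192.168.126.143"))
```

Calling `fetch_status_page` periodically gives the same effect as a manual refresh.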

UPDATE

  • v0.1.1: fix html package data

TODO

  • add reminders
  • add webui html support

