A caching qstat that reduces load on the PBS server
Project description
qstat-cache
A cached version of the PBS Pro qstat command that reduces load on the scheduler's database
Details
Most users run the qstat command at reasonable intervals and things work well. However, with the advent of workflow managers more users are running qstat at frequencies much too high for current versions of PBS Pro to support well. This utility creates a simple text-based cache of common qstat output and provides a script to serve that data to users. If an option is not cached (e.g., -Q output), the query is sent to PBS's version of qstat for processing. Usage:
usage: qstat [-h] [-1] [-a] [-D DELIMITER] [-f] [-F {json,dsv}]
[--format FORMAT] [-H] [-J] [--noheader] [-n] [-s]
[--status STATUS] [-t] [-T] [-u USER] [-w] [-x]
[filters ...]
This command provides a lightweight alternative to qstat. Data are queried and
updated every minute from the PBS job scheduler. Options not listed here will
be forwarded to the scheduler. Please use those options sparingly. If a
destination is provided, it should be a valid execution queue on the chosen
server.
options:
-h, --help show this help message and exit
Supported Original Options:
filters job IDs or queues
-1 display node or comment information on job line
-a display all jobs (default unless -f specified)
-D DELIMITER specify a delimiter if using -Fdsv (default = '|')
-f display full output for a job
-F {json,dsv} full output (-f) in custom format
-H all moved or finished jobs / specific job of any state
-J only show information for jobs (or subjobs with -t)
-n display a list of nodes at the end of the line
-s display administrator comment on the next line
-t show information for both jobs and array subjobs
-T displays estimated start time for queued jobs
-u USER filter jobs by the submitting user
-w use wide format output (120 columns)
-x all job records in recent history
Cached Version Options:
--format FORMAT column output in custom format (=help for more)
--noheader disable labels (no header)
--status STATUS filter jobs by specific single-character status code
Installation
There are two methods for installing qstat-cache - using the included
Makefile or with pip.
Makefile method
- Clone this repository on your system.
- Install at your desired path using
make install PREFIX=/path/to/qstat-cache.
pip method
- If desired, first create a virtual environment with
venvorconda/mambaand activate it. - Now run
python3 -m pip install qstat-cache.
Site setup
In either case, the following steps are required to finish configuration.
- In
$PREFIX/lib/qscache/cfgorlib/python3.x/site-packages/qscache/cfg, copy thesite.cfg.examplefile tosite.cfgand customize settings as described below. Alternatively, copy the example to<system>.cfgand then set the environment variableQSCACHE_SERVER=<system>. The latter approach allows you to cache multiple servers in a complex at the same time. - Schedule the
util/gen_data.shscript to run at regular intervals (typically every minute via a cron job). - Add the cached version of
qstatto your (and your users') environment PATH.
site.cfg settings
[paths]
# Temporary data path used by gen_data when creating
# cached output from qstat (fast file system is best)
Temp = ${install_dir}/temp/derecho
# Path where cached data will be stored and accessed
Data = ${install_dir}/data
# Optional path for logging qstat invocations
# If set, a log will be created for each user on each day
# that records calls to qstat along with arguments
# If blank, logging will be disabled
Logs = ${install_dir}/test/logs
[site]
# A custom name for your site / system. Currently this is
# only used for the section header of --help when custom
# arguments are defined.
name = Site
[filters]
# This section allows you to define a custom command-line
# argument which allows the cache to filter jobs by user-
# provided values of custom chunk resources. The argument
# will be of the form "--key", so "--cpu" for the example
# given below. The chunk resource searched for will be the
# value, so "cpu_type" below. User input will be either a
# single value or a comma-delimited inclusive list.
cpu = cpu_type
[filter_help]
# If filters are provided above, this section is used to
# define the argument help text. If this is not provided,
# qstat will report a warning to the user.
cpu = search for jobs with 1 or more chunks of cpu_type
[cache]
# The maximum wait time in seconds before the cache is
# bypassed and the real qstat is called
MaxWait = 20
# The maximum allowed age in seconds of cache data. Beyond
# this age we bypass the cache and call the true qstat
MaxAge = 300
# Delay in seconds to impose on qstat calls that bypass
# the cache due to aged data. Increasing this value can help
# the scheduler when under high load
AgeDelay = 5
# Specify the sub-minute frequency to generate data
# in seconds
Frequency = 10
# If querying data from a remote host, specify the list of
# available hosts here (space-delimited)
Hosts = login1 login2
[history]
# This section allows for some differing settings for caching
# historical data vs active job data. Typically you would
# want to use a slower frequency here since the data size
# can be large
Frequency = 60
[pbs]
# Specify the location of the actual qstat command
Qstat = /opt/pbs/bin/qstat
# Some sites may need to prefix calls to PBS with another
# command (e.g., a sudo operation). Use this variable to
# specify a prefix for PBS calls
Prefix = sudo -u adminuser
[servermap]
# Mapping of long-form server names for peer-scheduling
# user queries (use qstat -Bf to get server names)
casper = casper-pbs
derecho = desched1
[privileges]
# Enable privilege checking according to following user and
# group settings. If false, all queries allowed.
Active = True
[priv.all]
# Permit users and groups from these two lists respectively to
# see "full" job output from other users excluding the user
# environment contained in a job's Variable_List
Users = vanderwb
Groups = csgteam
[priv.env]
# Permit users and groups from these two lists to view all full
# job output, including the user environment
Users = vanderwb
Groups =
Example crontab
Here is a sample crontab that will run the gen_data script every minute
(sub-minute scheduled is recommended and enabled via the site.cfg). The idea
here is to run often enough that users and their workflows are satisfied, but
not so often that we put our own load on PBS.
Here, we use the QSCACHE_SERVER variable to specify a particular system in a
multi-server complex.
# Run qstat cache generation script every minute
# Added by Joe User on 4 Dec 2019
* * * * * QSCACHE_SERVER=sitename /path/to/qstat-cache/util/gen_data
Some sites have multiple login/admin nodes for redundency that can be used to
generate the cache. To ensure that two nodes are not caching at the same time,
add one or more hosts to the Hosts field in your site.cfg, and then use the
gen_data_remote utility script in your cron job:
# Run qstat cache generation script every minute
# Added by Joe User on 4 Dec 2019
* * * * * QSCACHE_SERVER=sitename /path/to/qstat-cache/util/gen_data_remote
Debugging
There are two environment variables you may set to assist in debugging. Setting
QSCACHE_DEBUG will cause qstat to print the age of the cache, assuming it can
be found.
If you set QSCACHE_BYPASS to true, the cache will be bypassed regardless of
which options are set, and the scheduler version of qstat will instead be
called.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file qstat_cache-3.1.1.tar.gz.
File metadata
- Download URL: qstat_cache-3.1.1.tar.gz
- Upload date:
- Size: 20.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
2ef360d7f092b31c49d0a8913b74024e6907df6afe3febaf221a93c1ca8cced0
|
|
| MD5 |
4fe68690814e5165e3ca8f44ed450fb7
|
|
| BLAKE2b-256 |
b7570cb1a266e6a7f08ff976c72e1c8da8a52731971cc69488dd74b42ac4de63
|
Provenance
The following attestation bundles were made for qstat_cache-3.1.1.tar.gz:
Publisher:
publish-to-pypi.yml on NCAR/qstat-cache
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
qstat_cache-3.1.1.tar.gz -
Subject digest:
2ef360d7f092b31c49d0a8913b74024e6907df6afe3febaf221a93c1ca8cced0 - Sigstore transparency entry: 945621814
- Sigstore integration time:
-
Permalink:
NCAR/qstat-cache@7da6cf07f6f22a8f3af939de42178a9670a4b847 -
Branch / Tag:
refs/tags/v3.1.1 - Owner: https://github.com/NCAR
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish-to-pypi.yml@7da6cf07f6f22a8f3af939de42178a9670a4b847 -
Trigger Event:
push
-
Statement type:
File details
Details for the file qstat_cache-3.1.1-py3-none-any.whl.
File metadata
- Download URL: qstat_cache-3.1.1-py3-none-any.whl
- Upload date:
- Size: 19.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0be480c115ae0ca771f60df4424e6235ccfb6b08057faa1894394eef3027d191
|
|
| MD5 |
9617613a4516f0308c948b10d7ac6faa
|
|
| BLAKE2b-256 |
5c79b96e7aac22b1340bbb3750b7148aa0e4aabe6f7b83ea1e20e0610674b544
|
Provenance
The following attestation bundles were made for qstat_cache-3.1.1-py3-none-any.whl:
Publisher:
publish-to-pypi.yml on NCAR/qstat-cache
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
qstat_cache-3.1.1-py3-none-any.whl -
Subject digest:
0be480c115ae0ca771f60df4424e6235ccfb6b08057faa1894394eef3027d191 - Sigstore transparency entry: 945621818
- Sigstore integration time:
-
Permalink:
NCAR/qstat-cache@7da6cf07f6f22a8f3af939de42178a9670a4b847 -
Branch / Tag:
refs/tags/v3.1.1 - Owner: https://github.com/NCAR
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish-to-pypi.yml@7da6cf07f6f22a8f3af939de42178a9670a4b847 -
Trigger Event:
push
-
Statement type: