ParProcCo

Project description

Parallel Processing Coordinator: splits dataset processing to run parallel cluster jobs and aggregates the outputs.

ParProcCo requires a YAML configuration file in the grandparent directory of the package, in CONDA_PREFIX/etc, or in /etc:
    --- !PPCConfig
    allowed_programs:
      rs_map: msmapper_utils
      blah1: whatever_package1
      blah2: whatever_package2
    url: https://slurm.local:8443
    extra_property_envs: # optional mapping of properties to pass to Slurm's JobDescMsg
      account: MY_ACCOUNT # env var that holds account
      comment: mega job
    valid_top_directories: # optional list of top directories accessible from cluster nodes
      # (used to check job scripts, log and working directories)
      - /cluster_home
      - /cluster_apps
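For illustration, a configuration like the one above could be parsed with PyYAML by registering a constructor for the custom !PPCConfig tag. This is a minimal sketch under that assumption, not ParProcCo's actual loader; the PPCConfig class below is hypothetical.

```python
# Sketch (not ParProcCo's loader): parse a PPCConfig YAML document by
# registering a SafeLoader constructor for the custom !PPCConfig tag.
import yaml  # PyYAML


class PPCConfig:
    """Hypothetical container mirroring the documented config keys."""

    def __init__(self, allowed_programs, url, extra_property_envs=None,
                 valid_top_directories=None):
        self.allowed_programs = allowed_programs
        self.url = url
        self.extra_property_envs = extra_property_envs or {}
        self.valid_top_directories = valid_top_directories or []


def _ppc_constructor(loader, node):
    # Build the config object from the tagged top-level mapping
    return PPCConfig(**loader.construct_mapping(node, deep=True))


yaml.SafeLoader.add_constructor('!PPCConfig', _ppc_constructor)

CONFIG_TEXT = """\
--- !PPCConfig
allowed_programs:
  rs_map: msmapper_utils
url: https://slurm.local:8443
valid_top_directories:
  - /cluster_home
"""

cfg = yaml.safe_load(CONFIG_TEXT)
```

Registering on SafeLoader keeps safe_load usable, so untrusted YAML cannot instantiate arbitrary Python objects.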
An entry point in the ParProcCo.allowed_programs group can be added to another package's setup.py:

    setup(
        ...
        entry_points={PPC_ENTRY_POINT: ['blah1 = whatever_package1']},
    )

which will make ParProcCo look for a module called blah1_wrapper in the whatever_package1 package.
Testing

Tests can be run with

    $ pytest tests

To test interactions with Slurm, set the following environment variables:

    SLURM_REST_URL   # URL for server and port where the REST endpoints are hosted
    SLURM_PARTITION  # Slurm cluster partition
    SLURM_JWT        # JSON web token for access to REST endpoints
The environment can be set up and managed by running the create_env task in VSCode. This will read the token from ~/.ssh/slurm.tkn but will not check or generate the key. The resulting file .vscode/.env is used by the python.envFile setting to propagate these values automatically. On the initial run, SLURM_REST_URL and SLURM_PARTITION will need to be given values manually (unless already set as environment variables). Those values are kept whenever the task is rerun, with only the token being updated. As .vscode/.env is ignored by git, it is safe to save these values in that file.
If you are not using VSCode, running .vscode/create_env.sh will create the env file, and the variables can be exported using:

    set -a
    source ".vscode/.env"
    set +a
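A .env file in this simple KEY=VALUE form can also be read directly from Python. The parser below is a sketch (not part of ParProcCo or VSCode) that skips blank lines and # comments and strips surrounding double quotes; the example values are made up.

```python
# Sketch: minimal reader for a .vscode/.env-style file (KEY=VALUE lines),
# as an alternative to sourcing it from the shell.
import os


def load_env(text):
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#') or '=' not in line:
            continue
        key, _, value = line.partition('=')  # split on the first '='
        values[key.strip()] = value.strip().strip('"')
    return values


# Illustrative content; real values come from .vscode/.env
env = load_env('SLURM_PARTITION=cs04r\nSLURM_JWT="abc123"\n')
os.environ.update(env)
```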
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file parprocco-2.1.2.tar.gz.
File metadata
- Download URL: parprocco-2.1.2.tar.gz
- Upload date:
- Size: 57.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | aa4691c0c1de7cbd5db9411a2a935a7da6f37d4d70e41a3e4175229f314f2c66
MD5 | dd3b529794296b7890e34ded240071b9
BLAKE2b-256 | 0810681cb90c0c9cd9fd6baa8da8082ca9dedc5068a8f5d2e3ace4d8adf05685
File details
Details for the file ParProcCo-2.1.2-py3-none-any.whl.
File metadata
- Download URL: ParProcCo-2.1.2-py3-none-any.whl
- Upload date:
- Size: 48.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/5.1.1 CPython/3.12.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2a760bc2cee1716527a01fffacf3dfafe21314cf6ad3304020bafc22156637a9
MD5 | fbd41c0de162e96b9ed044c692424ce9
BLAKE2b-256 | f17b7761e496623562f3dada9c9c754bff05e84212a002b25e40f4b0876dba14