
ParProcCo

Parallel Processing Coordinator: splits dataset processing to run parallel cluster jobs and aggregates their outputs.

Requires a YAML configuration file in the grandparent directory of the package, in CONDA_PREFIX/etc, or in /etc:

--- !PPCConfig
allowed_programs:
    rs_map: msmapper_utils
    blah1: whatever_package1
    blah2: whatever_package2
url: https://slurm.local:8443
extra_property_envs: # optional mapping of properties to pass to Slurm's JobDescMsg
    account: MY_ACCOUNT # env var that holds account
    comment: mega job
valid_top_directories: # optional mapping of top directories accessible from cluster nodes
                       # (used to check job scripts, log and working directories)
    - /cluster_home
    - /cluster_apps
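
The configuration search order above can be sketched as follows. This is an illustrative helper, not ParProcCo's actual loader, and the config file name used here is an assumption:

```python
# Sketch of the config search order described above: the grandparent
# directory of the installed package, then $CONDA_PREFIX/etc, then /etc.
# The file name "par_proc_co.yaml" is a placeholder assumption.
import os
from pathlib import Path


def candidate_config_paths(package_dir: Path, name: str = "par_proc_co.yaml"):
    """Yield possible configuration file locations in search order."""
    # grandparent directory of the package
    yield package_dir.resolve().parent.parent / name
    # $CONDA_PREFIX/etc, if running inside a conda environment
    conda = os.environ.get("CONDA_PREFIX")
    if conda:
        yield Path(conda) / "etc" / name
    # system-wide fallback
    yield Path("/etc") / name


def find_config(package_dir: Path):
    """Return the first existing candidate config file, or None."""
    for path in candidate_config_paths(package_dir):
        if path.is_file():
            return path
    return None
```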

An entry point called ParProcCo.allowed_programs can be added to other packages' setup.py:

setup(
...
    entry_points={PPC_ENTRY_POINT: ['blah1 = whatever_package1']},
)

which will look for a module called blah1_wrapper in the whatever_package1 package.

Testing

Tests can be run with

$ pytest tests

To test interactions with Slurm, set the following environment variables:

SLURM_REST_URL  # URL for server and port where the REST endpoints are hosted
SLURM_PARTITION # Slurm cluster partition
SLURM_JWT       # JSON web token for access to REST endpoints
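
As a quick connectivity check using these variables, something like the following can be used. The versioned /slurm/vX.Y.Z/ping path varies between slurmrestd releases, so the version string here is an assumption to adjust for your deployment:

```python
# Hypothetical connectivity check against the Slurm REST endpoints,
# using SLURM_REST_URL and SLURM_JWT from the environment.
import os
import urllib.request


def ping_url(base: str, version: str = "v0.0.40") -> str:
    """Build the ping endpoint URL from the server base URL."""
    return base.rstrip("/") + f"/slurm/{version}/ping"


def ping_slurm() -> int:
    """Return the HTTP status of a token-authenticated ping request."""
    req = urllib.request.Request(
        ping_url(os.environ["SLURM_REST_URL"]),
        headers={"X-SLURM-USER-TOKEN": os.environ["SLURM_JWT"]},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```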

The environment can be set up and managed by running the create_env task in VSCode. This reads the token from ~/.ssh/slurm.tkn but does not validate or generate the token. The resulting file .vscode/.env is used by the python.envFile setting to propagate these values automatically.

On the initial run, SLURM_REST_URL and SLURM_PARTITION will need to be given values manually (unless already set as environment variables). Those values will be kept whenever the task is rerun, with only the token being updated. As .vscode/.env is ignored by git, it is safe to save these values in that file.

If you are not using VSCode, running .vscode/create_env.sh will create the env file, and the variables can be exported using:

set -a
source ".vscode/.env"
set +a
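
If you need the same values inside Python (for example in a test fixture) without sourcing the file in a shell, a minimal loader can be sketched as below. It assumes simple KEY=VALUE lines, as written by create_env.sh, with no quoting or interpolation:

```python
# Minimal .env loader (illustrative): parses KEY=VALUE lines,
# skipping blanks and comments, and exports them into os.environ.
import os


def load_env(path: str) -> dict:
    """Parse a simple .env file and merge its values into the environment."""
    values = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    os.environ.update(values)
    return values
```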

Download files

Download the file for your platform.

Source Distribution

parprocco-2.1.2.tar.gz (57.6 kB)

Uploaded Source

Built Distribution

ParProcCo-2.1.2-py3-none-any.whl (48.8 kB)

Uploaded Python 3

File details

Details for the file parprocco-2.1.2.tar.gz.

File metadata

  • Download URL: parprocco-2.1.2.tar.gz
  • Upload date:
  • Size: 57.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for parprocco-2.1.2.tar.gz
  • SHA256: aa4691c0c1de7cbd5db9411a2a935a7da6f37d4d70e41a3e4175229f314f2c66
  • MD5: dd3b529794296b7890e34ded240071b9
  • BLAKE2b-256: 0810681cb90c0c9cd9fd6baa8da8082ca9dedc5068a8f5d2e3ace4d8adf05685


File details

Details for the file ParProcCo-2.1.2-py3-none-any.whl.

File metadata

  • Download URL: ParProcCo-2.1.2-py3-none-any.whl
  • Upload date:
  • Size: 48.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for ParProcCo-2.1.2-py3-none-any.whl
  • SHA256: 2a760bc2cee1716527a01fffacf3dfafe21314cf6ad3304020bafc22156637a9
  • MD5: fbd41c0de162e96b9ed044c692424ce9
  • BLAKE2b-256: f17b7761e496623562f3dada9c9c754bff05e84212a002b25e40f4b0876dba14

