Parallel Processing Coordinator: splits dataset processing into parallel cluster jobs and aggregates their outputs.

ParProcCo

Requires a YAML configuration file in the grandparent directory of the package, in $CONDA_PREFIX/etc or in /etc:

--- !PPCConfig
allowed_programs:
    rs_map: msmapper_utils
    blah1: whatever_package1
    blah2: whatever_package2
url: https://slurm.local:8443
extra_property_envs: # optional mapping of properties to pass to Slurm's JobDescMsg
    account: MY_ACCOUNT # env var that holds account
    comment: mega job
valid_top_directories: # optional list of top directories accessible from cluster nodes
                       # (used to check job scripts, log and working directories)
    - /cluster_home
    - /cluster_apps
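
As a rough illustration only (this is not ParProcCo's actual loader), a configuration like the one above could be located and parsed along these lines; the file name par_proc_co.yaml, the exact search order and the handling of the !PPCConfig tag are assumptions:

import os
from pathlib import Path

import yaml

# Turn the '!PPCConfig' tagged document into a plain dict when safe-loading
yaml.SafeLoader.add_constructor(
    '!PPCConfig',
    lambda loader, node: loader.construct_mapping(node, deep=True),
)

def find_ppc_config(package_dir: Path) -> dict:
    candidates = (
        package_dir.parent.parent,                         # grandparent directory of the package
        Path(os.environ.get('CONDA_PREFIX', '/')) / 'etc', # CONDA_PREFIX/etc
        Path('/etc'),
    )
    for directory in candidates:
        config_file = directory / 'par_proc_co.yaml'       # hypothetical file name
        if config_file.is_file():
            with open(config_file) as f:
                return yaml.safe_load(f)
    raise FileNotFoundError('no PPCConfig YAML found in the documented locations')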

An entry point in the ParProcCo.allowed_programs group can be added to other packages' setup.py:

setup(
    ...
    # PPC_ENTRY_POINT == 'ParProcCo.allowed_programs'
    entry_points={PPC_ENTRY_POINT: ['blah1 = whatever_package1']},
)

which will look for a module called blah1_wrapper in the whatever_package1 package.
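
A hedged sketch of how such a lookup could work using importlib.metadata (Python 3.10+ for the group keyword); the function name and error handling below are illustrative, not ParProcCo's actual discovery code:

import importlib
from importlib.metadata import entry_points

def load_wrapper(program: str):
    # Find the entry point a third-party package registered for this program
    for ep in entry_points(group='ParProcCo.allowed_programs'):
        if ep.name == program:
            # 'blah1 = whatever_package1' resolves to whatever_package1.blah1_wrapper
            return importlib.import_module(f'{ep.value}.{program}_wrapper')
    raise ValueError(f'{program!r} is not an allowed program')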

Testing

Tests can be run with

$ pytest tests

To test interactions with Slurm, set the following environment variables:

SLURM_REST_URL  # URL for server and port where the REST endpoints are hosted
SLURM_PARTITION # Slurm cluster partition
SLURM_JWT       # JSON web token for access to REST endpoints
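
For example, with placeholder values (a JWT can typically be generated with Slurm's scontrol token command):

$ export SLURM_REST_URL=https://slurm.local:8443
$ export SLURM_PARTITION=cs
$ export SLURM_JWT=eyJ...   # token obtained e.g. via scontrol token
$ pytest tests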
