A simple yet powerful library for Distributed Computing

HydraMPP

A massive parallel processing library for distributed processing in Python

HydraMPP is a library that makes it easier to create scalable distributed parallel processing applications.
It works the same way on a single computer as on a computing cluster with multiple nodes.

Requirements

HydraMPP is designed to be lightweight and has few dependencies.

Python >= 3.6

Install

pip

HydraMPP can be installed from PyPI with pip:

pip install hydraMPP

If you don't have administrative permission and get an error, try the --user flag to install HydraMPP in your home folder.

pip install --user hydraMPP

Anaconda

HydraMPP is available through the conda-forge channel on Anaconda.

conda install -c conda-forge HydraMPP

Usage

Step 1: Import the library

The HydraMPP library can be imported in Python using:

import hydraMPP

Step 2: Tag methods

Methods or functions that you would like to use with HydraMPP for parallel processing need to be tagged:

import time

# Tag the function so HydraMPP can schedule it as a job
@hydraMPP.remote
def my_slow_function():
    time.sleep(10)
    return

Step 3: Initialize the connection(s)

HydraMPP can run in 3 modes:

  1. local
  2. host
  3. client

Step 4: Call your methods

Once HydraMPP has been initialized, call a tagged method through its .remote attribute. The library queues the call and dispatches it when enough CPUs are available, either locally or on another node in your setup.
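To make the tag-then-call pattern concrete, here is a toy model of it built only on the Python standard library. It is not HydraMPP's implementation or API: the names toy_remote, _pool and _jobs are hypothetical, and a small thread pool stands in for "available CPUs". It only illustrates how a decorator can attach a .remote attribute that queues a call and hands back a job ID.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

# Toy stand-in, NOT HydraMPP: a two-worker pool plays the role of "available CPUs".
_pool = ThreadPoolExecutor(max_workers=2)
_jobs = {}                      # job ID -> Future
_next_id = itertools.count(1)   # simple job ID generator


def toy_remote(func):
    """Attach a .remote(...) attribute that queues the call and returns a job ID."""
    def submit(*args, **kwargs):
        job_id = next(_next_id)
        # The pool runs the call as soon as a worker is free; until then it waits in the queue.
        _jobs[job_id] = _pool.submit(func, *args, **kwargs)
        return job_id
    func.remote = submit
    return func


@toy_remote
def square(x):
    return x * x


# Queue four calls; only two run at a time, the rest wait their turn.
job_ids = [square.remote(n) for n in range(4)]
results = [_jobs[j].result() for j in job_ids]
_pool.shutdown()
```

The tagged function can still be called directly (square(3)) when no scheduling is wanted; only calls made through .remote go via the queue in this sketch.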

Step 5: Get return values

Use hydraMPP.wait to check the status of running jobs.
It returns two lists: the first contains the IDs of the jobs that have finished, and the second contains the jobs that are still queued or running.

Once a job has finished, use hydraMPP.get to retrieve its return value and some statistics about the job.
hydraMPP.get returns a list with the following values:

  1. Boolean value stating if the job has finished
  2. The method name
  3. The return value
  4. Number of CPUs used for the job
  5. Time to run the job, in seconds
  6. The hostname of the node that the job ran on
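Because the result is a positional list, it can help to give each position a name before using it. The following sketch wraps a list in the order described above in a NamedTuple. The class name JobResult, its field names, and the sample values are all hypothetical; the sample list is made-up data, not real HydraMPP output.

```python
from typing import Any, NamedTuple


class JobResult(NamedTuple):
    """Named view over the six-element list described above."""
    finished: bool   # 1. whether the job has finished
    method: str      # 2. the method name
    value: Any       # 3. the return value
    cpus: int        # 4. number of CPUs used
    seconds: float   # 5. run time in seconds
    hostname: str    # 6. node the job ran on


# Hypothetical example data in the documented order; not real output.
raw = [True, "my_slow_function", None, 1, 10.02, "node01"]

result = JobResult(*raw)
summary = f"{result.method} on {result.hostname}: {result.cpus} CPU(s), {result.seconds:.1f}s"
```

Unpacking with JobResult(*raw) fails loudly if the list does not have exactly six elements, which is a cheap guard against reading the wrong position.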

Status monitor

A script is included to monitor the status of HydraMPP while it is running.

usage: hydra-status.py [-h] [address] [port]

positional arguments:
  address     Address of the HydraMPP server to get status from [127.0.0.1]
  port        Port to connect to [24515]

options:
  -h, --help  show this help message and exit

This queries the status of HydraMPP and displays information on connected clients, available CPUs, and jobs in the queue.
The script exits immediately after displaying the status. For continuous monitoring, use a tool like watch:

watch -n1 hydra-status.py localhost

SLURM

HydraMPP has a built-in function to utilize a SLURM environment.

All you need to do is add the flag --hydraMPP-slurm $SLURM_JOB_NODELIST when executing your Python program, and HydraMPP will take care of configuring the host and clients.

Make sure to call hydraMPP.init() once all required methods have been tagged with @hydraMPP.remote.

The --hydraMPP-cpus flag sets the number of CPUs to use on each node. If it is set to 0 or omitted, HydraMPP will try to guess the number of CPUs available on each node.

#!/bin/bash
#SBATCH --job-name=My_Slurm_Job
#SBATCH --nodes=3
#SBATCH --tasks-per-node=1
#SBATCH --cpus-per-task=36
#SBATCH --mem=100G
#SBATCH --time=1-0
#SBATCH -o slurm-%x-%j.out

echo "====================================================="
echo "Start Time  : $(date)"
echo "Submit Dir  : $SLURM_SUBMIT_DIR"
echo "Job ID/Name : $SLURM_JOBID / $SLURM_JOB_NAME"
echo "Node List   : $SLURM_JOB_NODELIST"
echo "Num Tasks   : $SLURM_NTASKS total [$SLURM_NNODES nodes @ $SLURM_CPUS_ON_NODE CPUs/node]"
echo "======================================================"
echo ""

path/to/program.py --custom-args --hydraMPP-slurm $SLURM_JOB_NODELIST --hydraMPP-cpus $SLURM_CPUS_ON_NODE

Contact

The informatics point-of-contact for this project is Dr. Richard Allen White III.
If you have any questions or feedback, please feel free to get in touch by email.
Dr. Richard Allen White III
Jose Luis Figueroa III
Or open an issue.

Copyright 2024 Richard Allen White III, Jose Luis Figueroa III

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

  3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
