A simple yet powerful library for Distributed Computing
HydraMPP
A massively parallel processing library for distributed processing in Python
HydraMPP is a library that makes it easier to create scalable, distributed parallel processing applications.
It will function seamlessly from a single computer to a computing cluster environment with multiple nodes.
Requirements
HydraMPP is designed to be lightweight and has few dependencies.
Python >= 3.6
Install
pip
HydraMPP can be installed from PyPI with pip:
pip install hydraMPP
If you don't have administrative permission and get an error, try the --user flag to install HydraMPP in your home folder.
pip install --user hydraMPP
Anaconda
HydraMPP is available through the conda-forge channel on Anaconda.
conda install -c conda-forge HydraMPP
Usage
Step 1: Import the library
The HydraMPP library can be imported in Python using:
import hydraMPP
Step 2: Tag methods
Methods or functions that you would like to use with HydraMPP for parallel processing need to be tagged:
import time

@hydraMPP.remote
def my_slow_function():
    time.sleep(10)
    return
Step 3: Initialize the connection(s)
HydraMPP can run in three modes:
- local
- host
- client
Step 4: Call your methods
Once HydraMPP has been initialized, call any tagged method with the .remote tag. The library queues the job and dispatches it once enough CPUs are available, either locally or on another node in your setup.
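The queue-and-dispatch call pattern can be pictured with a small stand-in. The sketch below does not import HydraMPP and makes no claim about its internals: `remote`, `_pool`, and `slow_square` are hypothetical names, and a standard-library thread pool on one machine takes the place of the job queue.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a job queue; NOT HydraMPP's implementation (assumption for illustration).
_pool = ThreadPoolExecutor(max_workers=2)

def remote(func):
    """Hypothetical decorator: attach a .remote() that queues the call."""
    def submit(*args, **kwargs):
        return _pool.submit(func, *args, **kwargs)
    func.remote = submit
    return func

@remote
def slow_square(x):
    time.sleep(0.1)
    return x * x

# .remote() returns a handle right away; the work runs in the background.
handles = [slow_square.remote(n) for n in range(4)]
# Collect the return values in submission order.
results = [h.result() for h in handles]
```

With only two workers, the four calls are queued and run two at a time, which mirrors the idea of jobs waiting until CPUs are free.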
Step 5: Get return values
Use hydraMPP.wait to check the status of running jobs.
It returns two lists: the first holds the job IDs of the jobs that have finished, and the second holds the jobs that are queued or still running.
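The same "finished versus pending" split exists in Python's standard library, which makes for a convenient analogy. The sketch below uses `concurrent.futures.wait` rather than HydraMPP; `work` is a hypothetical job, and whether `hydraMPP.wait` blocks or accepts a timeout is not stated here, so the polling loop is an assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

def work(seconds):
    """Hypothetical job that sleeps and returns its argument."""
    time.sleep(seconds)
    return seconds

with ThreadPoolExecutor(max_workers=4) as pool:
    jobs = [pool.submit(work, s) for s in (0.05, 0.1, 0.2)]
    pending = set(jobs)
    polls = 0
    while pending:
        # timeout=0 makes this a non-blocking status check.
        finished, pending = wait(jobs, timeout=0)
        polls += 1
        time.sleep(0.02)
    finished_count = len(finished)
```

The loop ends once the second collection is empty, at which point every job appears in the first.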
Once jobs have finished running, use hydraMPP.get to get the return value and some stats on the job.
The return value of hydraMPP.get is a list with the following values:
- Boolean value stating if the job has finished
- The method name
- The return value
- Number of CPUs used for the job
- Time to run the job, in seconds
- The hostname of the node that the job ran on
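Because the result is a positional list, giving the fields names can make downstream code easier to read. The sketch below is one way to do that: `JobResult` and its field names are hypothetical, and the `raw` list is made-up sample data arranged in the order listed above, not real output.

```python
from typing import Any, NamedTuple

class JobResult(NamedTuple):
    """Hypothetical names for the six values, in the documented order."""
    finished: bool
    method: str
    value: Any
    cpus: int
    seconds: float
    hostname: str

# Made-up sample standing in for the list returned for one job.
raw = [True, "my_slow_function", None, 1, 10.0, "node01"]
result = JobResult(*raw)

summary = (
    f"{result.method} ran on {result.hostname} "
    f"using {result.cpus} CPU(s) in {result.seconds:.1f}s"
)
```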
Status monitor
A script is included to monitor the status of HydraMPP while it is running.
usage: hydra-status.py [-h] [address] [port]

positional arguments:
  address     Address of the HydraMPP server to get status from [127.0.0.1]
  port        Port to connect to [24515]

options:
  -h, --help  show this help message and exit
This will query the status of HydraMPP and display some information on connected clients, available CPUs, and jobs in queue.
It quits immediately after displaying the status. For continuous monitoring, use a tool like watch:
watch -n1 hydra-status.py localhost
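On systems without watch, a few lines of Python can do the same job. This is a minimal sketch, assuming hydra-status.py is on your PATH; `status_command` and `poll_status` are hypothetical helpers, and the default address and port are the ones shown in the usage text above.

```python
import subprocess
import time

def status_command(address="127.0.0.1", port=24515):
    """Build the argv for hydra-status.py using the documented defaults."""
    return ["hydra-status.py", address, str(port)]

def poll_status(address="127.0.0.1", port=24515, interval=1.0, count=3,
                run=subprocess.run):
    """Run the status script `count` times, pausing `interval` seconds between runs."""
    for i in range(count):
        run(status_command(address, port), check=False)
        if i < count - 1:
            time.sleep(interval)
```

Calling `poll_status("localhost")` would print the status three times, one second apart. The `run` parameter exists only so the loop can be exercised without the script installed.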
SLURM
HydraMPP has a built-in function to utilize a SLURM environment.
All you need to do is add the flag --hydraMPP-slurm $SLURM_JOB_NODELIST when executing your Python program, and HydraMPP will take care of configuring the host and clients.
Make sure to call hydraMPP.init() once all required methods have been tagged with @hydraMPP.remote.
The --hydraMPP-cpus flag sets the number of CPUs each node should use. If it is set to 0 or omitted, HydraMPP will try to guess the number of CPUs available on each node.
#!/bin/bash
#SBATCH --job-name=My_Slurm_Job
#SBATCH --nodes=3
#SBATCH --tasks-per-node=1
#SBATCH --cpus-per-task=36
#SBATCH --mem=100G
#SBATCH --time=1-0
#SBATCH -o slurm-%x-%j.out
echo "====================================================="
echo "Start Time : $(date)"
echo "Submit Dir : $SLURM_SUBMIT_DIR"
echo "Job ID/Name : $SLURM_JOBID / $SLURM_JOB_NAME"
echo "Node List : $SLURM_JOB_NODELIST"
echo "Num Tasks : $SLURM_NTASKS total [$SLURM_NNODES nodes @ $SLURM_CPUS_ON_NODE CPUs/node]"
echo "======================================================"
echo ""
path/to/program.py --custom-args --hydraMPP-slurm $SLURM_JOB_NODELIST --hydraMPP-cpus $SLURM_CPUS_ON_NODE
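How HydraMPP guesses the CPU count when hydraMPP-cpus is 0 or omitted is not documented here. The sketch below shows one plausible fallback order, purely as an assumption: `resolve_cpus` is a hypothetical helper that prefers an explicit value, then the SLURM_CPUS_ON_NODE environment variable, then `os.cpu_count()`.

```python
import os

def resolve_cpus(requested=0):
    """Return `requested` if positive, otherwise guess from the environment.

    Assumed fallback order for illustration; not HydraMPP's actual logic.
    """
    if requested and requested > 0:
        return requested
    slurm_value = os.environ.get("SLURM_CPUS_ON_NODE", "")
    if slurm_value.isdigit() and int(slurm_value) > 0:
        return int(slurm_value)
    # os.cpu_count() can return None, so fall back to a single CPU.
    return os.cpu_count() or 1
```

Passing $SLURM_CPUS_ON_NODE explicitly, as the batch script does, avoids relying on any guess.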
Contact
The informatics point-of-contact for this project is Dr. Richard Allen White III.
If you have any questions or feedback, please feel free to get in touch by email.
Dr. Richard Allen White III
Jose Luis Figueroa III
Or open an issue.
Copyright 2024 Richard Allen White III, Jose Luis Figueroa III
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.