Utils for interfacing to MPI libraries using mpi4py and dask
This package provides tools for interfacing to MPI libraries based on mpi4py and dask:

- mpi4py is a complete Python API of the MPI standard.
- dask is a flexible library for parallel computing in Python. In particular, we use the following sub-modules of the latter:
  - dask.distributed for distributed computing,
  - dask.array for distributed NumPy arrays, and
  - dask_mpi for interfacing to MPI.
Installation
NOTE: lyncs_mpi requires a working MPI installation. This can be installed via apt-get:

sudo apt-get install libopenmpi-dev openmpi-bin

OR using conda:

conda install -c anaconda mpi4py

The package can be installed via pip:

pip install [--user] lyncs_mpi
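After installation, a quick sanity check can confirm that the MPI bindings work. The snippet below is a minimal sketch, not part of the package documentation; it only uses mpi4py and a plain import of lyncs_mpi.

# check_install.py - verify that mpi4py finds the MPI library and that
# lyncs_mpi is importable (minimal sketch)
from mpi4py import MPI
import lyncs_mpi
print(MPI.Get_library_version())        # prints the detected MPI implementation
print("lyncs_mpi imported successfully")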
Documentation
In this package we implement several low-level tools for supporting classes distributed over MPI. These are described in this guide for developers. In the following we describe the high-level tools provided in this package.
Client
The Client is a wrapper of dask.distributed.Client made MPI-compatible following the instructions of the dask-mpi documentation.
from lyncs_mpi import Client
client = Client(num_workers=4)
If the above script is run in an interactive shell, the Client will start an MPI server in the background running over num_workers+1 processes. The workers are the effective processes involved in the calculation. The extra process (+1) is the scheduler that manages the task scheduling. The client, the interactive shell in this example, proceeds processing the script: it submits tasks to the scheduler, which runs them on the workers.
The same script can be run directly via mpirun. In this case one needs to execute

mpirun -n $((num_workers + 2)) python script.py

which will run on num_workers+2 processes (as above, +1 for the scheduler and +1 for the client that processes the script).
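As an illustration, a minimal script.py could look like the sketch below. It assumes that lyncs_mpi.Client exposes the usual dask.distributed.Client interface (submit, result), since it wraps it; the helper function square is purely hypothetical.

# script.py - minimal sketch; runs interactively or via
# mpirun -n $((num_workers + 2)) python script.py
from lyncs_mpi import Client
def square(x):
    return x * x
client = Client(num_workers=4)
# Tasks go to the scheduler, which runs them on the workers
# (submit/result are assumed to be inherited from dask.distributed.Client)
futures = [client.submit(square, i) for i in range(8)]
print([f.result() for f in futures])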
Communicators
Another feature that makes lyncs_mpi.Client MPI-compatible is the support of MPI communicators.
comm = client.comm
comm1 = client.create_comm(num_workers=2)
comm2 = client.create_comm(exclude=comm1.workers)
In the example, comm = client.comm is the main MPI communicator involving all the workers. The second comm1 and third comm2 communicators, instead, are communicators over 2 workers each. The first two workers have been optimally chosen by the client; the other two are the remaining ones, excluding the workers of comm1.
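For instance, assuming each communicator exposes its workers attribute (as comm1.workers above does), one can check how the four workers of the client have been split. This is only a sketch; the printed numbers follow from num_workers=4 used earlier.

print(len(comm.workers))   # 4: all the workers of the client
print(len(comm1.workers))  # 2: the two workers chosen for comm1
print(len(comm2.workers))  # 2: the remaining two workers
assert set(comm1.workers).isdisjoint(comm2.workers)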
Another kind of communicator is the Cartesian MPI communicator. It can be initialized doing

cart = comm.create_cart([2,2])

where [2,2] are the dimensions of the multi-dimensional grid over which the processes are distributed. Cartesian communicators directly support Dask arrays and, e.g., cart.zeros([4,4,3,2,1]) instantiates a distributed Dask array assigned to the workers of the communicator with local shape (chunks) (2,2,3,2,1).
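Since the result is a Dask array, standard dask.array operations should apply to it as usual. The following is a sketch under that assumption; only the global shape and the chunk shape stated above are taken from the documentation.

arr = cart.zeros([4,4,3,2,1])
print(arr.shape)      # (4, 4, 3, 2, 1): the global shape
print(arr.chunksize)  # (2, 2, 3, 2, 1): the local shape on each worker
total = (arr + 1).sum().compute()  # standard Dask operations; here 4*4*3*2*1 = 96
print(total)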