
Python bindings for MPI

Project description

mpi4py-ve is an extension to mpi4py, which provides Python bindings for the Message Passing Interface (MPI). This package also supports communication of NLCPy array objects (nlcpy.ndarray) between MPI processes on x86 servers of SX-Aurora TSUBASA systems. Combining NLCPy with mpi4py-ve enables Python scripts to utilize multi-VE computing power. The current version of mpi4py-ve is based on mpi4py version 3.0.3. For API details, please refer to the mpi4py manual.
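
As a minimal illustration (a sketch, not taken from the official examples; run it with the mpirun command of NEC MPI as described in the Execution section below), a script that combines the two packages looks like this:

from mpi4pyve import MPI  # drop-in counterpart of "from mpi4py import MPI"
import nlcpy as vp

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Each process allocates an array on its Vector Engine and prints a partial result.
a = vp.arange(10)
print(rank, size, a.sum())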

Requirements

Before installation, the following components must be installed on your x86 node of SX-Aurora TSUBASA.

Since December 2022, mpi4py-ve has been provided as part of NEC SDK (NEC Software Development Kit for Vector Engine). If NEC SDK on your machine has been installed or updated since then, mpi4py-ve is available through the /usr/bin/python3 command.

Install from wheel

You can install mpi4py-ve in either of the following ways.

  • Install from PyPI

    $ pip install mpi4py-ve
  • Install from your local computer

    1. Download the wheel package from GitHub.

    2. Put the wheel package in any directory.

    3. Install the local wheel package via the pip command.

      $ pip install <path_to_wheel>
The shared objects for the Vector Host, which are included in the wheel package, are compiled with gcc 4.8.5 and tested with the following software:

NEC MPI    v2.22.0 and v3.1.0
NumPy      v1.19.2
NLCPy      v2.2.0

Install from source (with building)

Before building this package, you need to execute the environment setup script necmpivars.sh or necmpivars.csh once in advance.

  • When using sh or its variant:

    $ source /opt/nec/ve/mpi/X.X.X/bin/necmpivars.sh
  • When using csh or its variant:

    $ source /opt/nec/ve/mpi/X.X.X/bin/necmpivars.csh

Here, X.X.X denotes the version number of NEC MPI.

After that, execute the following commands:

$ git clone https://github.com/SX-Aurora/mpi4py-ve.git
$ cd mpi4py-ve
$ python setup.py build --mpi=necmpi
$ python setup.py install
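
To check that the installation is usable (a minimal sketch that also works after a wheel install, assuming the NEC MPI environment script has been sourced as above), import the MPI module under mpirun and print the MPI standard version reported by MPI.Get_version():

$ mpirun -veo -np 1 $(which python) -c "from mpi4pyve import MPI; print(MPI.Get_version())"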

Example

Transfer Array

Transfers an NLCPy ndarray from MPI rank 0 to rank 1 by using comm.Send() and comm.Recv():

from mpi4pyve import MPI
import nlcpy as vp

comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()

if rank == 0:
    x = vp.array([1,2,3], dtype=int)
    comm.Send(x, dest=1)

elif rank == 1:
    y = vp.empty(3, dtype=int)
    comm.Recv(y, source=0)
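
To run this example (a usage sketch; transfer.py is a placeholder name for the script above), launch two MPI processes with the mpirun command of NEC MPI as described in the Execution section:

$ mpirun -veo -np 2 $(which python) transfer.py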

Sum of Numbers

Sums the numbers locally, and reduces all the local sums to the root rank (rank=0):

from mpi4pyve import MPI
import nlcpy as vp

comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()

N = 1000000000
begin = N * rank // size
end = N * (rank + 1) // size

sendbuf = vp.arange(begin, end).sum()
recvbuf = comm.reduce(sendbuf, MPI.SUM, root=0)
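
On the root rank, recvbuf holds the global sum of 0 through N-1, which equals N*(N-1)/2. A minimal check of the result (not part of the original script; it assumes the reduced value can be converted with int()) is:

if rank == 0:
    # The concatenation of all local ranges covers 0 .. N-1 exactly once.
    assert int(recvbuf) == N * (N - 1) // 2
    print(recvbuf)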

The following table shows the performance results [msec] on VE Type 20B:

np=1    np=2    np=3    np=4    np=5    np=6    np=7    np=8
35.8    19.0    12.6    10.1    8.1     7.0     6.0     5.5

Execution

When executing a Python script that uses mpi4py-ve, use the mpirun command of NEC MPI on an x86 server of SX-Aurora TSUBASA. Before running the Python script, you need to execute the following environment setup scripts once in advance.

  • When using sh or its variant:

    $ source /opt/nec/ve/mpi/X.X.X/bin/necmpivars.sh gnu 4.8.5
    $ source /opt/nec/ve/nlc/Y.Y.Y/bin/nlcvars.sh
  • When using csh or its variant:

    $ source /opt/nec/ve/mpi/X.X.X/bin/necmpivars.csh gnu 4.8.5
    $ source /opt/nec/ve/nlc/Y.Y.Y/bin/nlcvars.csh

Here, X.X.X and Y.Y.Y denote the version number of NEC MPI and NLC, respectively.

When using the mpirun command:

$ mpirun -veo -np N $(which python) sample.py
Here, N is the number of MPI processes that are created on an x86 server.
NEC MPI 2.21.0 or later supports the environment variable NMPI_USE_COMMAND_SEARCH_PATH.
If NMPI_USE_COMMAND_SEARCH_PATH is set to ON and the path of the Python command is added to the environment variable PATH, you do not have to specify the full path:
$ export NMPI_USE_COMMAND_SEARCH_PATH=ON
$ mpirun -veo -np N python sample.py
For details of the mpirun command, refer to the NEC MPI User's Guide.

Execution Examples

The following examples show how to launch MPI programs that use mpi4py-ve and NLCPy on the SX-Aurora TSUBASA.

ncore: Number of cores per VE.
a.py: Python script using mpi4py-ve and NLCPy.

  • Interactive Execution

    • Execution on one VE

      Example of using 4 processes on local VH and 4 VE processes (ncore / 4 OpenMP parallel per process) on VE#0 of local VH

      $ mpirun -veo -np 4 python a.py
    • Execution on multiple VEs on a VH

      Example of using 4 processes on local VH and 4 VE processes (1 process per VE, ncore OpenMP parallel per process) on VE#0 to VE#3 of local VH

      $ VE_NLCPY_NODELIST=0,1,2,3 mpirun -veo -np 4 python a.py

      Example of using 32 processes on local VH and 32 VE processes (8 processes per VE, ncore / 8 OpenMP parallel per process) on VE#0 to VE#3 of local VH

      $ VE_NLCPY_NODELIST=0,1,2,3 mpirun -veo -np 32 python a.py
    • Execution on multiple VEs on multiple VHs

      Example of using a total of 32 processes on two VHs host1 and host2, and a total of 32 VE processes on VE#0 and VE#1 of each VH (8 processes per VE, ncore / 8 OpenMP parallel per process)

      $ VE_NLCPY_NODELIST=0,1 mpirun -hosts host1,host2 -veo -np 32 python a.py
  • NQSV Request Execution

    • Execution on a specific VH, on a VE

      Example of using 32 processes on logical VH#0 and 32 VE processes on logical VE#0 to logical VE#3 on logical VH#0 (8 processes per VE, ncore / 8 OpenMP parallel per process)

      #PBS -T necmpi
      #PBS -b 2 # The number of logical hosts
      #PBS --venum-lhost=4 # The number of VEs per logical host
      #PBS --cpunum-lhost=32 # The number of CPUs per logical host
      
      source /opt/nec/ve/mpi/2.22.0/bin/necmpivars.sh
      export NMPI_USE_COMMAND_SEARCH_PATH=ON
      mpirun -host 0 -veo -np 32 python a.py
    • Execution on a specific VH, on a specific VE

      Example of using 16 processes on logical VH#0, 16 VE processes in total on logical VE#0 and logical VE#3 on logical VH#0 (8 processes per VE, ncore / 8 OpenMP parallel per process)

      #PBS -T necmpi
      #PBS -b 2 # The number of logical hosts
      #PBS --venum-lhost=4 # The number of VEs per logical host
      #PBS --cpunum-lhost=16 # The number of CPUs per logical host
      
      source /opt/nec/ve/mpi/2.22.0/bin/necmpivars.sh
      export NMPI_USE_COMMAND_SEARCH_PATH=ON
      VE_NLCPY_NODELIST=0,3 mpirun -host 0 -veo -np 16 python a.py
    • Execution on all assigned VEs

      Example of using 32 processes in total on 4 VHs and 32 VE processes in total from logical VE#0 to logical VE#7 on each of the VHs (1 process per VE, ncore OpenMP parallel per process).

      #PBS -T necmpi
      #PBS -b 4 # The number of logical hosts
      #PBS --venum-lhost=8 # The number of VEs per logical host
      #PBS --cpunum-lhost=8 # The number of CPUs per logical host
      #PBS --use-hca=2 # The number of HCAs
      
      source /opt/nec/ve/mpi/2.22.0/bin/necmpivars.sh
      export NMPI_USE_COMMAND_SEARCH_PATH=ON
      mpirun -veo -np 32 python a.py

Profiling

NEC MPI provides a facility for displaying MPI communication information. Two formats are available, as follows:

Reduced Format

The maximum, minimum, and average values of MPI communication information of all MPI processes are displayed.

Extended Format

MPI communication information of each MPI process is displayed in ascending order of rank in the communicator MPI_COMM_WORLD, following the information in the reduced format.

You can control the display and format of MPI communication information by setting the environment variable NMPI_COMMINF at runtime as shown in the following table.

The settings of NMPI_COMMINF:

NMPI_COMMINF    Displayed Information
NO              (Default) No Output
YES             Reduced Format
ALL             Extended Format

When using the mpirun command:

$ export NMPI_COMMINF=ALL
$ mpirun -veo -np N python sample.py

Use mpi4py-ve with homebrew classes (without NLCPy)

The following links are useful for using mpi4py-ve with homebrew classes (without NLCPy):

Other Documents

The following links are useful for understanding mpi4py-ve in more detail:

Restriction

  • The current version of mpi4py-ve does not support some functions, which are listed in the section “List of Unsupported Functions” of the mpi4py-ve tutorial.

  • Communication of bool-type data between NumPy and NLCPy will fail because their bool types have different sizes in bytes (see the workaround sketch below).
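
One possible workaround (an assumption, not taken from the mpi4py-ve documentation) is to exchange such data with an explicitly sized integer dtype on both sides and cast back to bool after the transfer:

from mpi4pyve import MPI
import nlcpy as vp
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    # Hypothetical workaround: send the flags as int32 instead of bool,
    # so that both sides agree on the element size.
    flags = vp.array([True, False, True])
    comm.Send(flags.astype('int32'), dest=1)
elif rank == 1:
    buf = np.empty(3, dtype='int32')
    comm.Recv(buf, source=0)
    flags = buf.astype(bool)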

Notices

  • If you import NLCPy before calling MPI_Init()/MPI_Init_thread(), a runtime error will be raised.

    Not recommended usage:

    $ mpirun -veo -np 1 $(which python) -c "import nlcpy; from mpi4pyve import MPI"
    RuntimeError: NLCPy must be import after MPI initialization

    Recommended usage:

    $ mpirun -veo -np 1 $(which python) -c "from mpi4pyve import MPI; import nlcpy"

    MPI_Init() or MPI_Init_thread() is called when you import the MPI module from the mpi4pyve package.

  • If you use the Lock/Lock_all functions for one-sided communication with NLCPy array data, you need to insert NLCPy synchronization control.

    Synchronization usage:

    import mpi4pyve
    from mpi4pyve import MPI
    import nlcpy as vp

    comm = MPI.COMM_WORLD
    size = comm.Get_size()
    rank = comm.Get_rank()

    array = vp.array(0, dtype=int)

    # Expose the NLCPy array on rank 0 as an RMA window.
    if rank == 0:
        win_n = MPI.Win.Create(array, comm=MPI.COMM_WORLD)
    else:
        win_n = MPI.Win.Create(None, comm=MPI.COMM_WORLD)
    if rank == 0:
        array.fill(1)
        # Ensure the update on the VE has completed before other ranks access it.
        array.venode.synchronize()
        comm.Barrier()
    if rank != 0:
        comm.Barrier()
        win_n.Lock(MPI.LOCK_EXCLUSIVE, 0)
        win_n.Get([array, MPI.INT], 0)
        win_n.Unlock(0)
        assert array == 1
    comm.Barrier()
    win_n.Free()

License

The 2-clause BSD license (see LICENSE file).
mpi4py-ve is derived from mpi4py (see LICENSE_DETAIL/LICENSE_DETAIL file).

Download files


Source Distributions

No source distribution files are available for this release.

Built Distributions

mpi4py_ve-1.0.0-cp38-cp38-manylinux1_x86_64.whl (2.5 MB, CPython 3.8)
mpi4py_ve-1.0.0-cp37-cp37m-manylinux1_x86_64.whl (2.4 MB, CPython 3.7m)
mpi4py_ve-1.0.0-cp36-cp36m-manylinux1_x86_64.whl (2.4 MB, CPython 3.6m)

File details

Details for the file mpi4py_ve-1.0.0-cp38-cp38-manylinux1_x86_64.whl.

File hashes

SHA256: ccd9a8119d2f696ae1164ac42ea1c78007467fefbdc52334fda78b740bda1c6a
MD5: f32c86dcbe9bdbec163b3c7502233cfd
BLAKE2b-256: a2eae8548eadcba2a54357f84675a9626f92ae0a775e40dd68cb8e49818f8944

File details

Details for the file mpi4py_ve-1.0.0-cp37-cp37m-manylinux1_x86_64.whl.

File hashes

SHA256: 49eb2410de0d68f19007f7a046c9ba2a01e6df99e41d3d27ef5b34a1e8e8229c
MD5: efa49040502c17879ba2d05066d58475
BLAKE2b-256: b30dd330310d3d2ad221bdf4c8986edd5ca51293c72344d42640cdcd0f51e819

File details

Details for the file mpi4py_ve-1.0.0-cp36-cp36m-manylinux1_x86_64.whl.

File metadata

  • Download URL: mpi4py_ve-1.0.0-cp36-cp36m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 2.4 MB
  • Tags: CPython 3.6m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.9 tqdm/4.63.0 importlib-metadata/4.8.3 keyring/23.4.1 rfc3986/1.5.0 colorama/0.4.4 CPython/3.6.15

File hashes

SHA256: afce58beb30c75ffd3e2cb0ec35799c079f92afd578bbfdc72828dcb6cb897bf
MD5: 2bdc6f50b115ad8dee142ff03afc8bd3
BLAKE2b-256: 0813a1c44ee56661fff88518361f5f0ffab76cfd2695e63931d8151e68130e94
