
Description

mympingpong is an mpi4py-based random pair pingpong network stress test.
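To make the idea of "random pair" concrete, here is a minimal sketch of one way ranks could be paired up for a round of pingpong. It is an illustration only, not mympingpong's actual pairing scheme: the function names `random_pairs` and `partner_of` and the shared-seed approach are assumptions made for this example.

```python
import random

def random_pairs(nr_ranks, seed):
    """Shuffle the ranks and pair neighbours in the shuffled order.

    With an odd number of ranks, the last rank in the shuffled order sits out.
    """
    rng = random.Random(seed)  # the same seed on every rank yields the same pairing
    ranks = list(range(nr_ranks))
    rng.shuffle(ranks)
    return [(ranks[i], ranks[i + 1]) for i in range(0, nr_ranks - 1, 2)]

def partner_of(rank, pairs):
    """Return the partner of `rank` in this round, or None if it sits out."""
    for a, b in pairs:
        if rank == a:
            return b
        if rank == b:
            return a
    return None

pairs = random_pairs(8, seed=42)
print(pairs)
```

Because every rank can compute the same pairing from the same seed, no communication is needed to agree on who talks to whom in a given round.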

Installation

We recommend using [EasyBuild][eb_url] to automatically install mympingpong and its dependencies.

The required steps involve building a patched mpi4py and a parallel-enabled h5py. Instructions for installing these manually can be found in manual/install_insructions.

Usage

After successful installation, simply submit a job that runs:

mympirun mympingpong -f output_dir -i nr_iterations -n nr_tests_per_rank

Dependencies

(including but not limited to)

  • numpy >= 1.8.2
  • vsc-base >= 1.8.6
  • matplotlib >= 1.3.1
  • h5py >= 2.5.0

Examples

The end result of a mympingponganalysis run is a visual representation of the pingpong round-trip time (RTT) of all the pairs. This can give insight into the architecture and/or topology of the system, or reveal issues with it.

Example output

Each plot contains up to 5 graphs:

  • largest plot: each data point is the average pingpong RTT between the pair (x,y), where x and y are MPI ranks
  • histogram of all pingpong RTTs
  • heatmap of the number of pingpongs run on the (x,y) pair
  • heatmap of the standard deviation in the data from running tests on the (x,y) pair
  • if a mask has been used, a histogram of all pingpong RTTs in the mask interval
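The three per-pair quantities in the list above (average RTT, number of pingpongs, standard deviation) can be sketched as a simple aggregation. This is an illustrative sketch, not the code mympingponganalysis uses: the `(x, y, rtt)` sample layout and the `summarize` helper are assumptions, and the population standard deviation is chosen here so that a pair with a single sample yields 0.

```python
from collections import defaultdict
from statistics import mean, pstdev

def summarize(samples):
    """Group (x, y, rtt) samples by pair and return {pair: (mean, count, stddev)}."""
    by_pair = defaultdict(list)
    for x, y, rtt in samples:
        by_pair[(x, y)].append(rtt)
    return {
        pair: (mean(rtts), len(rtts), pstdev(rtts))
        for pair, rtts in by_pair.items()
    }

# Hypothetical RTT samples: two for the pair (0, 1), one for the pair (1, 2).
samples = [(0, 1, 2.0), (0, 1, 4.0), (1, 2, 10.0)]
stats = summarize(samples)
for pair, (avg, count, std) in sorted(stats.items()):
    print(pair, avg, count, std)
```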

Every MPI rank on a unique node

The result of running pingpong on 128 nodes, with each rank pinned to core 0. The latency graph clearly shows which ranks are located on the same switch. The standard deviation graph shows that something unusual is going on with ranks 32-48.

Every MPI rank on a unique core

The result of running pingpong on 4 nodes with 16 cores per node. In this example the NUMA nodes are visible. Inter-node communication is clearly slower than intra-node communication, but only by a factor of 3-5. The histogram shows 3 regions: shared L2 cache, on-die and inter-die. On the latency graph the switch is also visible as a greenish shade.

Oversubscribing

The result of running 32 ranks per node on 4 nodes with 16 cores per node.

Using PingPong to its fullest potential

You should always take care to have enough samples per pair. In other words, the -n parameter should be high enough to ensure every pair has a consistent result. A quick way to check this is to look at the pair samples graph: a deep red color indicates that enough samples were collected.
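The same check can be done numerically on the per-pair sample counts. The sketch below assumes the counts are available as a square nested list indexed by rank, and the threshold of 5 is purely illustrative; neither the `undersampled_pairs` helper nor a specific minimum comes from mympingpong itself.

```python
def undersampled_pairs(counts, minimum):
    """Return (x, y, count) for every pair x != y with fewer than `minimum` samples."""
    return [
        (x, y, n)
        for x, row in enumerate(counts)
        for y, n in enumerate(row)
        if x != y and n < minimum
    ]

# Hypothetical sample counts for 3 ranks; the diagonal (a rank with itself) is ignored.
counts = [
    [0, 12, 3],
    [12, 0, 9],
    [3, 9, 0],
]
print(undersampled_pairs(counts, minimum=5))
```

If the returned list is not empty, rerunning with a higher -n is probably worthwhile.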

Knowing that there is a problem might be useful, but you're more than likely also going to want to know where the problem is located. Information on which rank is pinned to which core on which node is present in the output file, but this data is not plotted by mympingponganalysis. Open the file with h5dump or any other HDF5 file reader to get access to this data.
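If you prefer Python over h5dump, the sketch below lists every dataset in an HDF5 file with h5py, which is a convenient first step for finding where the pinning information is stored. The layout of the mympingpong output file is not described here, so the code makes no assumption about dataset names; the names in the demo dictionary are made up for illustration.

```python
from types import SimpleNamespace

def describe(node, prefix=""):
    """Recursively list the datasets below `node` as (path, shape, dtype).

    Works on anything that behaves like an h5py group (has .items());
    leaves are expected to carry .shape and .dtype, as h5py datasets do.
    """
    found = []
    for name, child in node.items():
        path = f"{prefix}/{name}"
        if hasattr(child, "items"):   # a group: descend into it
            found.extend(describe(child, path))
        else:                         # a dataset: record its shape and dtype
            found.append((path, getattr(child, "shape", None), getattr(child, "dtype", None)))
    return found

def describe_file(filename):
    """Open an HDF5 file read-only and list its datasets."""
    import h5py  # imported here so describe() can be tried without h5py installed
    with h5py.File(filename, "r") as f:
        return describe(f)

# Stand-in for an opened file, with made-up group and dataset names.
demo = {"group": {"data": SimpleNamespace(shape=(4, 4), dtype="float64")}}
print(describe(demo))
```

On a real run you would call `describe_file` with the path of the output file written via -f, then open the dataset of interest with h5py to read the rank, core and node information.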
