
AD-compatible implementation of several MPI functions for pytorch tensors

Project description



mpi4torch is an automatically differentiable wrapper of MPI functions for the pytorch tensor library.

MPI stands for Message Passing Interface and is the de facto standard communication interface on high-performance computing resources. To facilitate the usage of pytorch on these resources, an MPI wrapper that is transparent to pytorch's automatic differentiation (AD) engine is much needed. This library tries to bridge that gap.

Installation

mpi4torch is also hosted on PyPI. However, due to the ABI incompatibility between different MPI implementations, it is not provided as a binary wheel and needs to be built locally. Hence, you need an appropriate C++ compiler as well as the development files of your MPI library. The latter are usually provided either through the module system of your local cluster (consult your cluster's documentation) or through the package manager of your Linux distribution.

Once these dependencies are satisfied, the installation can be triggered by the usual

    pip install mpi4torch
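
If the build succeeds, a quick smoke test is to sum a tensor across all ranks. The script below is a minimal sketch (the file name check_install.py is arbitrary); it only uses the COMM_WORLD, Allreduce, MPI_SUM, and size members that also appear in the usage section:

    # check_install.py -- minimal smoke test, not part of the package
    import torch
    import mpi4torch

    comm = mpi4torch.COMM_WORLD
    # sum a tensor of ones over all ranks
    result = comm.Allreduce(torch.ones(1), mpi4torch.MPI_SUM)
    print(f"sum over {comm.size} ranks: {result.item()}")

Launched with mpirun -np 2 python check_install.py, every rank should print a sum of 2.0.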

Usage

It is highly advised to read the basic usage chapter of the documentation before jumping into action, since pytorch's AD design has some implications for how mpi4torch must be used. In other words, there are some footguns lurking!

You have been warned. But if you insist on an easy usage example, consider the following code snippet, which is an excerpt from examples/simple_linear_regression.py:

    import torch
    import mpi4torch

    comm = mpi4torch.COMM_WORLD

    # xinput, youtput and some_parametrized_function are defined earlier in the example file
    def lossfunction(params):
        # average initial params to bring all ranks on the same page
        params = comm.Allreduce(params, mpi4torch.MPI_SUM) / comm.size

        # compute local loss
        localloss = torch.sum(torch.square(youtput - some_parametrized_function(xinput, params)))

        # sum up the loss among all ranks
        return comm.Allreduce(localloss, mpi4torch.MPI_SUM)

Here we have parallelized a loss function simply by adding two calls to Allreduce. For a more thorough discussion of the example, see the documentation.
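
To make the AD transparency concrete, the following sketch shows how such a loss could be fed into a standard pytorch optimizer. The parameter shape, learning rate, and iteration count are illustrative and not taken from the example file:

    # Illustrative driver loop; assumes lossfunction, xinput and youtput from above.
    params = torch.zeros(2, requires_grad=True)
    optimizer = torch.optim.SGD([params], lr=1e-2)

    for _ in range(100):
        optimizer.zero_grad()
        loss = lossfunction(params)
        loss.backward()   # the backward pass runs through the Allreduce calls on every rank
        optimizer.step()

Since the Allreduce calls are part of the autograd graph, the backward pass keeps the gradients consistent across ranks without any additional communication code on the user's side.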

Tests

Running tests is as easy as

    mpirun -np 2 nose2
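
nose2 picks up ordinary unittest-style test cases. The following is an illustrative sketch in that spirit, not a file from the shipped test suite:

    # test_allreduce_sketch.py -- illustrative only
    import unittest

    import torch
    import mpi4torch


    class AllreduceTest(unittest.TestCase):
        def test_sum_equals_comm_size(self):
            comm = mpi4torch.COMM_WORLD
            # summing ones over all ranks must yield the communicator size
            out = comm.Allreduce(torch.ones(3), mpi4torch.MPI_SUM)
            self.assertTrue(torch.allclose(out, comm.size * torch.ones(3)))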

