
map and starmap implementations passing additional arguments and parallelizing if possible

Project description

This small Python module implements two functions: map and starmap.

What does parmap offer?

  • Provide an easy-to-use syntax for both map and starmap.

  • Parallelize transparently whenever possible.

  • Handle multiple (positional, for now) arguments as needed.

Installation:

pip install parmap

Usage:

Here are some examples of unparallelized code and its parallel equivalent using parmap:

import parmap
# You want to do:
y = [myfunction(x, argument1, argument2) for x in mylist]
# In parallel:
y = parmap.map(myfunction, mylist, argument1, argument2)

# You want to do:
z = [myfunction(x, y, argument1, argument2) for (x,y) in mylist]
# In parallel:
z = parmap.starmap(myfunction, mylist, argument1, argument2)

# You want to do:
listx = [1, 2, 3, 4, 5, 6]
listy = [2, 3, 4, 5, 6, 7]
param1 = 3.14
param2 = 42
listz = []
for (x, y) in zip(listx, listy):
    listz.append(myfunction(x, y, param1, param2))
# In parallel:
listz = parmap.starmap(myfunction, zip(listx, listy), param1, param2)
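
The snippets above use placeholder names (myfunction, mylist, argument1, ...). As a minimal self-contained sketch of the parmap.map case, assuming a made-up worker called scale, it might look like this (the worker is defined at module level so that multiprocessing can pickle it):

import parmap

def scale(x, factor, offset):
    # Hypothetical worker standing in for myfunction.
    return x * factor + offset

if __name__ == "__main__":
    mylist = [1, 2, 3, 4, 5]
    y = parmap.map(scale, mylist, 10, 0.5)
    print(y)  # [10.5, 20.5, 30.5, 40.5, 50.5]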

map (and starmap on Python 3.3) already exist. Why reinvent the wheel?

Please correct me if I am wrong, but from my point of view, existing functions have some usability limitations:

  • The built-in Python function map [1] is not able to parallelize.

  • multiprocessing.Pool().starmap [2] is only available in Python 3.3 and later versions.

  • multiprocessing.Pool().map [3] does not allow any additional argument to the mapped function.

  • multiprocessing.Pool().starmap allows passing multiple arguments, but in order to pass a constant argument to the mapped function you need to convert it to an iterator using itertools.repeat(your_parameter) [4] (see the comparison sketch below).

parmap aims to overcome these limitations in the simplest possible way.
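
For comparison, here is a rough sketch of the standard-library route next to the parmap call; myfunction is again a made-up worker, and the itertools.repeat wrapping is exactly what parmap hides:

import itertools
import multiprocessing

import parmap

def myfunction(x, y, param1, param2):
    # Hypothetical worker used only for this comparison.
    return x * y + param1 + param2

listx = [1, 2, 3, 4, 5, 6]
listy = [2, 3, 4, 5, 6, 7]
param1 = 3.14
param2 = 42

if __name__ == "__main__":
    # Standard library (Python 3.3+): constant arguments must be wrapped
    # in itertools.repeat so they can be zipped with the varying pairs.
    with multiprocessing.Pool() as pool:
        z1 = pool.starmap(myfunction,
                          zip(listx, listy,
                              itertools.repeat(param1),
                              itertools.repeat(param2)))

    # parmap: constant arguments are passed directly after the iterable.
    z2 = parmap.starmap(myfunction, zip(listx, listy), param1, param2)

    print(z1 == z2)  # True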

Additional features in parmap:

  • Create a pool for parallel computation automatically if possible.

  • parmap.map(..., ..., parallel=False) # disables parallelization

  • parmap.map(..., ..., chunksize=3) # size of chunks (see multiprocessing.Pool().map)

  • parmap.map(..., ..., pool=multiprocessing.Pool()) # use an existing pool
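
As a rough sketch of how these options combine (square is a made-up worker used only for illustration):

import multiprocessing

import parmap

def square(x):
    # Hypothetical worker used only to illustrate the keyword options.
    return x * x

if __name__ == "__main__":
    data = list(range(20))

    # Run serially, e.g. for debugging.
    serial = parmap.map(square, data, parallel=False)

    # Let parmap create its own pool, with an explicit chunk size.
    chunked = parmap.map(square, data, chunksize=5)

    # Reuse an existing pool across several calls.
    pool = multiprocessing.Pool(4)
    reused = parmap.map(square, data, pool=pool)
    pool.close()
    pool.join()

    print(serial == chunked == reused)  # True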

To do:

Pull requests and suggestions are welcome.

  • See if anyone is interested in this

  • Pass keyword arguments to functions?

  • Improve exception handling

  • Sphinx documentation?

Acknowledgments:

The original idea for this implementation was given by J.F. Sebastian at http://stackoverflow.com/a/5443941/446149

References

  [1] https://docs.python.org/3/library/functions.html#map
  [2] https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool.starmap
  [3] https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool.map
  [4] https://docs.python.org/3/library/itertools.html#itertools.repeat

Download files

Download the file for your platform.

Source Distribution

parmap-1.2.0.tar.gz (7.7 kB)

File details

Details for the file parmap-1.2.0.tar.gz.

File metadata

  • Download URL: parmap-1.2.0.tar.gz
  • Upload date:
  • Size: 7.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for parmap-1.2.0.tar.gz:

  • SHA256: 0e625b8c026a665e5339b954496ed71095d8e48d9c814fc98156c1d49c26bad8
  • MD5: 0df2bc147c852a2b116f7bce3eadab03
  • BLAKE2b-256: 180c062dfee2e27e0feb7f03b00a0f94bdbee86f5bb7f81f1ce8fad24bfcc91c

