map and starmap implementations passing additional arguments and parallelizing if possible
parmap
======
.. image:: https://travis-ci.org/zeehio/parmap.svg?branch=master
:target: https://travis-ci.org/zeehio/parmap
.. image:: https://readthedocs.org/projects/parmap/badge/?version=latest
:target: https://readthedocs.org/projects/parmap/?badge=latest
:alt: Documentation Status
.. image:: https://codecov.io/github/zeehio/parmap/coverage.svg?branch=master
:target: https://codecov.io/github/zeehio/parmap?branch=master
.. image:: https://codeclimate.com/github/zeehio/parmap/badges/gpa.svg
:target: https://codeclimate.com/github/zeehio/parmap
:alt: Code Climate
.. image:: https://img.shields.io/pypi/dm/parmap.svg
:target: https://pypi.python.org/pypi/parmap
:alt: Pypi downloads per month
This small Python module implements two functions, ``map`` and
``starmap``, plus asynchronous versions of each.
What does parmap offer?
-----------------------
- Provide an easy to use syntax for both ``map`` and ``starmap``.
- Parallelize transparently whenever possible.
- Handle multiple arguments, even keyword arguments!
- Show a progress bar (requires the optional ``tqdm`` package).
Installation:
-------------
::

    pip install tqdm  # for progress bar support
    pip install parmap
Usage:
------
The examples below show ordinary serial code next to its parmap
equivalent:
Simple parallelization example:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
::

    import parmap
    # You want to do:
    mylist = [1,2,3]
    argument1 = 3.14
    argument2 = True
    y = [myfunction(x, argument1, mykeyword=argument2) for x in mylist]
    # In parallel:
    y = parmap.map(myfunction, mylist, argument1, mykeyword=argument2)
Show a progress bar:
~~~~~~~~~~~~~~~~~~~~~
Requires ``pip install tqdm``
::

    # You want to do:
    y = [myfunction(x) for x in mylist]
    # In parallel, with a progress bar
    y = parmap.map(myfunction, mylist, pm_pbar=True)
Passing multiple arguments:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
::

    # You want to do:
    z = [myfunction(x, y, argument1, argument2, mykey=argument3) for (x,y) in mylist]
    # In parallel:
    z = parmap.starmap(myfunction, mylist, argument1, argument2, mykey=argument3)

    # You want to do:
    listx = [1, 2, 3, 4, 5, 6]
    listy = [2, 3, 4, 5, 6, 7]
    param1 = 3.14
    param2 = 42
    listz = []
    for (x, y) in zip(listx, listy):
        listz.append(myfunction(x, y, param1, param2))
    # In parallel:
    listz = parmap.starmap(myfunction, zip(listx, listy), param1, param2)
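The calling convention shown above can be summarised with a serial sketch. ``serial_map``, ``serial_starmap`` and ``weighted_sum`` are hypothetical names used only for illustration; they mirror the "You want to do" list comprehensions and are not parmap's implementation (no parallelism, no ``pm_`` options).

```python
def serial_map(function, iterable, *args, **kwargs):
    """Serial reference: each item becomes the first positional argument."""
    return [function(item, *args, **kwargs) for item in iterable]


def serial_starmap(function, iterable, *args, **kwargs):
    """Serial reference: each item is unpacked, then the extra arguments follow."""
    return [function(*item, *args, **kwargs) for item in iterable]


def weighted_sum(x, y, scale, offset=0):
    """Hypothetical worker function used for the demonstration."""
    return (x + y) * scale + offset


listx = [1, 2, 3]
listy = [2, 3, 4]

# Extra positional and keyword arguments are appended to every call.
print(serial_starmap(weighted_sum, zip(listx, listy), 10, offset=1))  # [31, 51, 71]
print(serial_map(pow, [1, 2, 3], 2))  # [1, 4, 9]
```

Under this reading, ``map`` suits iterables of single values and ``starmap`` suits iterables of tuples, with the constant arguments written once after the iterable.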
Advanced: Multiple parallel tasks running in parallel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In this example, ``task1`` uses 5 processes, while ``task2`` uses 3. Both tasks
start computing simultaneously, and a message is printed as soon as either task
finishes and its result is retrieved.
::

    import parmap

    def task1(item):
        return 2*item

    def task2(item):
        return 2*item + 1

    items1 = range(500000)
    items2 = range(500)

    with parmap.map_async(task1, items1, pm_processes=5) as result1:
        with parmap.map_async(task2, items2, pm_processes=3) as result2:
            data_task1 = None
            data_task2 = None
            task1_working = True
            task2_working = True
            while task1_working or task2_working:
                result1.wait(0.1)
                if task1_working and result1.ready():
                    print("Task 1 has finished!")
                    data_task1 = result1.get()
                    task1_working = False
                result2.wait(0.1)
                if task2_working and result2.ready():
                    print("Task 2 has finished!")
                    data_task2 = result2.get()
                    task2_working = False
    # Further work with data_task1 or data_task2
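The polling loop above could be factored into a small helper when more than two tasks are involved. The sketch below is an assumption-laden illustration: ``collect_as_ready`` is a hypothetical helper, not part of parmap, and it relies only on the ``wait()``, ``ready()`` and ``get()`` methods used in the example. A standard-library ``ThreadPool`` stands in for ``parmap.map_async`` so the sketch runs without parmap installed.

```python
from multiprocessing.pool import ThreadPool


def collect_as_ready(pending, poll_interval=0.1):
    """Yield (name, value) pairs as each asynchronous result completes.

    `pending` maps a label to any object exposing wait(), ready() and get().
    """
    pending = dict(pending)  # work on a copy so the caller's mapping is untouched
    while pending:
        for name in list(pending):
            result = pending[name]
            result.wait(poll_interval)
            if result.ready():
                del pending[name]
                yield name, result.get()


def double(item):
    return 2 * item


def double_plus_one(item):
    return 2 * item + 1


# ThreadPool is a stand-in here; it is not what parmap uses in the example above.
with ThreadPool(2) as pool:
    tasks = {
        "task1": pool.map_async(double, range(5)),
        "task2": pool.map_async(double_plus_one, range(3)),
    }
    finished = dict(collect_as_ready(tasks))

print(finished["task1"])  # [0, 2, 4, 6, 8]
print(finished["task2"])  # [1, 3, 5]
```

The generator is consumed inside the ``with`` block, so every result is fetched before the pool is shut down.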
map and starmap already exist. Why reinvent the wheel?
---------------------------------------------------------
The existing functions have some usability limitations:
- The built-in Python function ``map`` [#builtin-map]_
  cannot run in parallel.
- ``multiprocessing.Pool().starmap`` [#multiproc-starmap]_
  is only available in Python 3.3 and later.
- ``multiprocessing.Pool().map`` [#multiproc-map]_
  does not accept any additional arguments for the mapped function.
- ``multiprocessing.Pool().starmap`` allows passing multiple arguments,
  but to pass a constant argument to the mapped function you
  need to convert it to an iterator using
  ``itertools.repeat(your_parameter)`` [#itertools-repeat]_.

``parmap`` aims to overcome these limitations in the simplest possible way.
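For comparison, here is a minimal standard-library sketch of the two usual workarounds for these limitations. ``scale``, ``with_partial`` and ``with_repeat`` are hypothetical names introduced for illustration; only ``multiprocessing.Pool``, ``functools.partial`` and ``itertools.repeat`` are real library calls.

```python
import itertools
from functools import partial
from multiprocessing import Pool


def scale(x, factor, offset=0):
    """Hypothetical worker: multiply x by factor and add offset."""
    return x * factor + offset


def with_partial(pool, values, factor, offset):
    # Pool.map passes exactly one argument per call, so the constants
    # have to be bound to the function beforehand.
    return pool.map(partial(scale, factor=factor, offset=offset), values)


def with_repeat(pool, values, factor, offset):
    # Pool.starmap unpacks each tuple, so every constant is repeated
    # alongside the values; keyword arguments cannot be passed this way.
    arguments = zip(values, itertools.repeat(factor), itertools.repeat(offset))
    return pool.starmap(scale, arguments)


if __name__ == "__main__":
    with Pool(2) as pool:
        print(with_partial(pool, [1, 2, 3], 10, 1))  # [11, 21, 31]
        print(with_repeat(pool, [1, 2, 3], 10, 1))   # [11, 21, 31]
```

With parmap, the same call is written as ``parmap.map(scale, [1, 2, 3], 10, offset=1)``, following the calling convention described in the usage section.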
Additional features in parmap:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Create a pool for parallel computation automatically if possible.
- ``parmap.map(..., ..., pm_parallel=False)`` # disables parallelization
- ``parmap.map(..., ..., pm_processes=4)`` # use 4 parallel processes
- ``parmap.map(..., ..., pm_pbar=True)`` # show a progress bar (requires tqdm)
- ``parmap.map(..., ..., pm_pool=multiprocessing.Pool())`` # use an existing
  pool; in this case parmap will not close the pool.
- ``parmap.map(..., ..., pm_chunksize=3)`` # size of chunks (see
  ``multiprocessing.Pool().map``)
Limitations:
-------------
``parmap.map()`` and ``parmap.starmap()`` (and their async versions) have their own
arguments (``pm_parallel``, ``pm_pbar``...). Those arguments are never passed
to the underlying function. In the following example, ``myfun`` will receive
``myargument``, but not ``pm_parallel``. Do not write functions that require
keyword arguments starting with ``pm_``, as ``parmap`` may need them in the future.
::

    parmap.map(myfun, mylist, pm_parallel=True, myargument=False)
Additionally, for backwards compatibility reasons, the functions you write
should avoid a few other keyword argument names. The conflicting names are:
``parallel``, ``chunksize``, ``pool``, ``processes``, ``callback``,
``error_callback`` and ``parmap_progress``.
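A quick way to check your own functions against these rules is to inspect their signatures. The helper below is a hypothetical sketch, not something parmap provides; it only encodes the two rules stated above (the ``pm_`` prefix and the list of conflicting names).

```python
import inspect

# Names listed above as conflicting for backwards compatibility reasons.
RESERVED_NAMES = {
    "parallel", "chunksize", "pool", "processes",
    "callback", "error_callback", "parmap_progress",
}


def conflicting_parameters(function):
    """Return the parameter names of `function` that should be avoided."""
    names = inspect.signature(function).parameters
    return sorted(
        name for name in names
        if name.startswith("pm_") or name in RESERVED_NAMES
    )


def risky(x, pool, pm_pbar=False, scale=1):
    """Hypothetical function with two problematic parameter names."""
    return x * scale


print(conflicting_parameters(risky))  # ['pm_pbar', 'pool']
```

Renaming the reported parameters before passing the function to ``parmap.map`` avoids the clash.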
Acknowledgments:
----------------
This package started after `this question <https://stackoverflow.com/q/5442910/446149>`_,
when I offered this `answer <http://stackoverflow.com/a/21292849/446149>`__,
taking suggestions from J.F. Sebastian's `answer <http://stackoverflow.com/a/5443941/446149>`__.
Known works using parmap
---------------------------
- Davide Gerosa, Michael Kesden, "PRECESSION. Dynamics of spinning black-hole
binaries with python." `arXiv:1605.01067 <https://arxiv.org/abs/1605.01067>`_, 2016
References
-----------
.. [#builtin-map] http://docs.python.org/dev/library/functions.html#map
.. [#multiproc-starmap] http://docs.python.org/dev/library/multiprocessing.html#multiprocessing.pool.Pool.starmap
.. [#multiproc-map] http://docs.python.org/dev/library/multiprocessing.html#multiprocessing.pool.Pool.map
.. [#itertools-repeat] http://docs.python.org/2/library/itertools.html#itertools.repeat