map and starmap implementations passing additional arguments and parallelizing if possible
Project description
This small Python module implements two functions: map and starmap.
What does parmap offer?
Provide an easy to use syntax for both map and starmap.
Parallelize transparently whenever possible.
Handle multiple arguments, even keyword arguments!
Show a progress bar (requires tqdm as optional package)
Installation:
pip install tqdm  # for progress bar support
pip install parmap
Usage:
Each example below shows sequential code followed by its parallel equivalent using parmap:
Simple parallelization example:
import parmap

# You want to do:
mylist = [1,2,3]
argument1 = 3.14
argument2 = True
y = [myfunction(x, argument1, mykeyword=argument2) for x in mylist]

# In parallel:
y = parmap.map(myfunction, mylist, argument1, mykeyword=argument2)
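To run this as a script, the placeholders need concrete definitions. The sketch below assumes a hypothetical `myfunction` (it is not part of parmap) and a hypothetical helper `compute`, and it falls back to the plain list comprehension when parmap is not installed. The worker is defined at module level and the call sits under a `__main__` guard, which `multiprocessing`-based code generally needs on platforms that spawn worker processes.

```python
try:
    import parmap  # optional: the sketch still runs without it
except ImportError:
    parmap = None


def myfunction(x, factor, mykeyword=False):
    """Hypothetical worker: scale x, optionally negating the result."""
    result = x * factor
    return -result if mykeyword else result


def compute(mylist, argument1, argument2):
    """Evaluate myfunction(x, argument1, mykeyword=argument2) for each x."""
    if parmap is None:
        # Sequential fallback, identical to the "You want to do" form.
        return [myfunction(x, argument1, mykeyword=argument2) for x in mylist]
    return parmap.map(myfunction, mylist, argument1, mykeyword=argument2)


if __name__ == "__main__":
    print(compute([1, 2, 3], 2, True))
```

Both branches of `compute` are meant to return the same list, so the sequential form can serve as a reference when checking the parallel one.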
Show a progress bar:
Requires pip install tqdm
# You want to do:
y = [myfunction(x) for x in mylist]

# In parallel, with a progress bar
y = parmap.map(myfunction, mylist, pm_pbar=True)
Passing multiple arguments:
# You want to do:
z = [myfunction(x, y, argument1, argument2, mykey=argument3) for (x,y) in mylist]

# In parallel:
z = parmap.starmap(myfunction, mylist, argument1, argument2, mykey=argument3)

# You want to do:
listx = [1, 2, 3, 4, 5, 6]
listy = [2, 3, 4, 5, 6, 7]
param1 = 3.14
param2 = 42
listz = []
for (x, y) in zip(listx, listy):
    listz.append(myfunction(x, y, param1, param2))

# In parallel:
listz = parmap.starmap(myfunction, zip(listx, listy), param1, param2)
Advanced: Multiple parallel tasks running in parallel
In this example, Task1 uses 5 cores, while Task2 uses 3 cores. Both tasks start computing simultaneously, and we print a message as soon as each task finishes, retrieving its result.
import parmap

def task1(item):
    return 2*item

def task2(item):
    return 2*item + 1

items1 = range(500000)
items2 = range(500)

with parmap.map_async(task1, items1, pm_processes=5) as result1:
    with parmap.map_async(task2, items2, pm_processes=3) as result2:
        data_task1 = None
        data_task2 = None
        task1_working = True
        task2_working = True
        while task1_working or task2_working:
            result1.wait(0.1)
            if task1_working and result1.ready():
                print("Task 1 has finished!")
                data_task1 = result1.get()
                task1_working = False
            result2.wait(0.1)
            if task2_working and result2.ready():
                print("Task 2 has finished!")
                data_task2 = result2.get()
                task2_working = False
# Further work with data_task1 or data_task2
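With more than two tasks, the flag-per-task loop above gets repetitive. As a rough sketch, the polling can be folded into a generator. `collect_as_finished` is a hypothetical helper, not part of parmap; it only assumes that each result object exposes the `ready()` and `get()` methods used in the example above.

```python
import time


def collect_as_finished(pending, poll_interval=0.1):
    """Yield (name, value) pairs in the order the results become ready.

    `pending` maps a task name to an object exposing ready() and get().
    """
    remaining = dict(pending)  # copy so the caller's dict is not modified
    while remaining:
        finished = [name for name, result in remaining.items() if result.ready()]
        for name in finished:
            yield name, remaining.pop(name).get()
        if remaining and not finished:
            time.sleep(poll_interval)  # nothing ready yet: avoid a busy loop
```

Inside the two nested `with` blocks, the loop would then reduce to `for name, data in collect_as_finished({"task1": result1, "task2": result2}): print(name, "has finished!")`.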
map and starmap already exist. Why reinvent the wheel?
The existing functions have some usability limitations:
The built-in Python function map [1] cannot run in parallel.
multiprocessing.Pool().starmap [2] is only available in Python 3.3 and later.
multiprocessing.Pool().map [3] does not allow passing any additional arguments to the mapped function.
multiprocessing.Pool().starmap allows passing multiple arguments, but to pass a constant argument to the mapped function you need to convert it to an iterator using itertools.repeat(your_parameter) [4].
parmap aims to overcome these limitations in the simplest possible way.
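For comparison, here is a minimal sketch of the standard-library workarounds these limitations lead to. The worker `scale` and both helper functions are hypothetical names chosen for illustration. Constants are either bound with `functools.partial` before calling `Pool.map`, or repeated with `itertools.repeat` for `Pool.starmap`.

```python
import functools
import itertools
import multiprocessing


def scale(x, factor, offset=0):
    """Hypothetical worker; module-level so worker processes can import it."""
    return x * factor + offset


def pool_map_with_constants(values, factor, offset, processes=2):
    """Pool.map passes one argument per item, so bind the constants first."""
    bound = functools.partial(scale, factor=factor, offset=offset)
    with multiprocessing.Pool(processes) as pool:
        return pool.map(bound, values)


def pool_starmap_with_constants(values, factor, offset, processes=2):
    """Pool.starmap unpacks tuples, so repeat each constant next to the items."""
    args = zip(values, itertools.repeat(factor), itertools.repeat(offset))
    with multiprocessing.Pool(processes) as pool:
        return pool.starmap(scale, args)


if __name__ == "__main__":
    print(pool_map_with_constants([1, 2, 3], 10, 1))
    print(pool_starmap_with_constants([1, 2, 3], 10, 1))
```

Following the usage shown earlier, the parmap counterpart would be the single call `parmap.map(scale, [1, 2, 3], 10, offset=1)`, with no manual binding or repeating.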
Additional features in parmap:
Create a pool for parallel computation automatically if possible.
parmap.map(..., ..., pm_parallel=False) # disables parallelization
parmap.map(..., ..., pm_processes=4) # use 4 parallel processes
parmap.map(..., ..., pm_pbar=True) # show a progress bar (requires tqdm)
parmap.map(..., ..., pm_pool=multiprocessing.Pool()) # use an existing pool, in this case parmap will not close the pool.
parmap.map(..., ..., pm_chunksize=3) # size of chunks (see multiprocessing.Pool().map)
Limitations:
parmap.map() and parmap.starmap() (and their async versions) have their own arguments (pm_parallel, pm_pbar…). Those arguments are never passed to the underlying function. In the following example, myfun will receive myargument, but not pm_parallel. Do not write functions that require keyword arguments starting with pm_, as parmap may need them in the future.
parmap.map(myfun, mylist, pm_parallel=True, myargument=False)
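The rule above can be pictured with a small sequential stand-in. This is only an illustrative sketch, not parmap's actual implementation. `split_pm_kwargs` and `sequential_map` are hypothetical names, and the sketch simply discards the `pm_` options instead of acting on them.

```python
def split_pm_kwargs(kwargs, prefix="pm_"):
    """Separate wrapper options from the keyword arguments meant for the function."""
    options = {k: v for k, v in kwargs.items() if k.startswith(prefix)}
    forwarded = {k: v for k, v in kwargs.items() if not k.startswith(prefix)}
    return options, forwarded


def sequential_map(function, iterable, *args, **kwargs):
    """Hypothetical stand-in: pm_* options are consumed here, never forwarded."""
    options, forwarded = split_pm_kwargs(kwargs)
    # `options` would select parallelism, progress bars, etc.; unused in this sketch.
    return [function(item, *args, **forwarded) for item in iterable]


def myfun(x, myargument=True):
    """Hypothetical worker that reports which keyword value it received."""
    return (x, myargument)


if __name__ == "__main__":
    # myfun sees myargument=False but never sees pm_parallel.
    print(sequential_map(myfun, [1, 2], pm_parallel=True, myargument=False))
```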
Additionally, for backwards compatibility reasons, avoid the following keyword arguments in the functions you write: parallel, chunksize, pool, processes, callback, error_callback and parmap_progress.
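One way to catch such clashes early is to inspect a function's signature before mapping it. The helper below is a hypothetical sketch, not something parmap provides. It flags the names listed above together with any parameter starting with `pm_`.

```python
import inspect

# Names listed above as conflicting for backwards compatibility reasons.
LEGACY_NAMES = {
    "parallel", "chunksize", "pool", "processes",
    "callback", "error_callback", "parmap_progress",
}


def conflicting_parameters(function):
    """Return the sorted parameter names of `function` that clash with parmap keywords."""
    names = inspect.signature(function).parameters
    return sorted(n for n in names if n in LEGACY_NAMES or n.startswith("pm_"))


def ok(x, scale=1):
    """Hypothetical worker with safe parameter names."""
    return x * scale


def risky(x, pool=None, pm_mode="fast", chunksize=10):
    """Hypothetical worker whose keyword names would clash."""
    return x


if __name__ == "__main__":
    print(conflicting_parameters(ok))
    print(conflicting_parameters(risky))
```

A function that accepts arbitrary `**kwargs` is not flagged by this check, so it is a convenience rather than a guarantee.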
Acknowledgments:
This package started after this question, when I offered this answer, taking the suggestions of J.F. Sebastian for his answer
Known works using parmap
Davide Gerosa, Michael Kesden, “PRECESSION. Dynamics of spinning black-hole binaries with python.” arXiv:1605.01067, 2016
References