Thin MapReduce-like layer on top of the Python multiprocessing library.
Package Installation and Usage
The package is available on PyPI:
python -m pip install mr4mp
The library can be imported in the usual way:
import mr4mp
Examples
Word-Document Index
Suppose we have some functions that we can use to build an index of randomly generated words:
def word():
    # Generate a random 7-letter "word".
    return ''.join(choice(ascii_lowercase) for _ in range(7))

def index(id):
    # Build an index mapping some random words to an identifier.
    return {w: {id} for w in {word() for _ in range(100)}}

def merge(i, j):
    # Merge two index dictionaries i and j.
    return {k: (i.get(k, set()) | j.get(k, set())) for k in i.keys() | j.keys()}
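To make the behavior of merge concrete, here is a small worked example on two hand-written indexes. The words and identifiers are made up for illustration; only merge itself comes from the functions above.

```python
def merge(i, j):
    # Same merge as above: union the identifier sets key by key.
    return {k: (i.get(k, set()) | j.get(k, set())) for k in i.keys() | j.keys()}

# Hypothetical hand-written indexes for two documents (identifiers 1 and 2).
left = {"abcdefg": {1}, "hijklmn": {1}}
right = {"abcdefg": {2}, "opqrstu": {2}}

merged = merge(left, right)

# A word seen in both documents ends up with both identifiers.
print(merged == {"abcdefg": {1, 2}, "hijklmn": {1}, "opqrstu": {2}})  # prints True
```

Because merge is built from set union, it is associative and commutative, so partial indexes can be combined in any order and still produce the same final index.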
We can then construct an index in the following way:
from random import choice
from string import ascii_lowercase
from timeit import default_timer

start = default_timer()
pool = mr4mp.pool()
pool.mapreduce(index, merge, range(100))
print("Finished in " + str(default_timer()-start) + "s using " + str(len(pool)) + " process(es).")
The above might yield the following output:
Finished in 0.664681524217187s using 2 process(es).
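For reference, the same index can be built sequentially with only the standard library. This is a sketch under the assumption that mapreduce follows the usual MapReduce contract (apply index to every input, then fold the results together with merge); it has not been checked against mr4mp's source.

```python
from functools import reduce
from random import choice
from string import ascii_lowercase

def word():
    # Generate a random 7-letter "word".
    return ''.join(choice(ascii_lowercase) for _ in range(7))

def index(id):
    # Build an index mapping some random words to an identifier.
    return {w: {id} for w in {word() for _ in range(100)}}

def merge(i, j):
    # Merge two index dictionaries i and j.
    return {k: (i.get(k, set()) | j.get(k, set())) for k in i.keys() | j.keys()}

# Sequential reference: map index over the identifiers, then fold with merge.
result = reduce(merge, map(index, range(100)))

print(len(result), "distinct words indexed")
```

The words are random, so the exact contents differ between runs, but every identifier from 0 to 99 should appear somewhere in the resulting index.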
Suppose we had instead explicitly specified that only one process can be used:
pool = mr4mp.pool(1)
After the above modification, we might see the following output from the code block:
Finished in 2.23329004518571s using 1 process(es).
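The difference between the two runs comes from distributing the map step over worker processes. The sketch below shows one way such a thin layer could be written directly on multiprocessing. The helper name sketch_mapreduce and its signature are hypothetical, and this is not mr4mp's actual implementation: it parallelizes only the map step and performs the reduce step in the parent process.

```python
from functools import reduce
from multiprocessing import Pool
from operator import add

def sketch_mapreduce(m, r, xs, processes=None):
    """Hypothetical helper illustrating the idea; not mr4mp's implementation."""
    xs = list(xs)
    if processes == 1:
        # Single process requested: no pool is needed.
        return reduce(r, map(m, xs))
    # Map in worker processes (m and the items in xs must be picklable),
    # then reduce the collected results in the parent process.
    with Pool(processes) as workers:
        mapped = workers.map(m, xs)
    return reduce(r, mapped)

if __name__ == "__main__":
    # Built-in functions are picklable by name, so they can be sent to workers.
    words = ["thin", "mapreduce", "layer"]
    print(sketch_mapreduce(len, add, words, processes=2))
```

The `if __name__ == "__main__":` guard matters on platforms where multiprocessing starts workers by re-importing the main module; without it, each worker would try to start its own pool.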
Source Distribution
mr4mp-0.0.5.2.tar.gz (3.9 kB)