Parallel Stream Task for Python
Project description
Streamtask
Streamtask is a lightweight python parallel framework for parallelizing the computationally intensive pipelines. It is similar to Map/Reduce, while it is more lightweight. It parallelizes each module in the pipeline with a given processing number to make it possible to leverage the different speeds in different modules. It improves the performance especially there are some heavy I/O operations in the pipeline.
Example
Suppose we want to process the data in a pipline with 3 blocks, f1, f2 and f3. We can use the following code to parallelize the processing.
from streamtask import StreamTask
def f1():
for i in range(1000000):
yield i * 2
def f2(n, add, third = 0.01):
return n + add + third
def f3(n):
return n + 1
if __name__ == "__main__":
sl = StreamTask()
sl.add_module(f1, 2) # use 2 process to compute
sl.add_module(f2, 2, args = [0.5], kwargs = {'third' : 0.02})
sl.add_module(f3, 2)
#sl.run_serial()
sl.run()
sl.join()
print(sl.get_results())
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
streamtask-0.0.4.tar.gz
(3.9 kB
view hashes)
Built Distribution
Close
Hashes for streamtask-0.0.4-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | ad33a7a6518c77194b1b9fc6c4dbd6f6e7c3be9bd3a0ecb4542146336337244c |
|
MD5 | 5a881e84dd42fdbe9988535ec91e57df |
|
BLAKE2b-256 | 95e9d4981850d9fce5e8196899e6c22dc0f3c14069cea0c132c5feb2e0e58001 |