Parallel Stream Task for Python
Project description
Streamtask
Streamtask is a lightweight python parallel framework for parallelizing the computationally intensive pipelines. It is similar to Map/Reduce, while it is more lightweight. It parallelizes each module in the pipeline with a given processing number to make it possible to leverage the different speeds in different modules. It improves the performance especially there are some heavy I/O operations in the pipeline.
Example
Suppose we want to process the data in a pipline with 3 blocks, f1, f2 and f3. We can use the following code to parallelize the processing.
from streamtask import StreamTask
def f1():
for i in range(1000000):
yield i * 2
def f2(n, add, third = 0.01):
return n + add + third
def f3(n):
return n + 1
if __name__ == "__main__":
sl = StreamTask()
sl.add_module(f1, 2) # use 2 process to compute
sl.add_module(f2, 2, args = [0.5], third = 0.02)
sl.add_module(f3, 2)
#sl.run_serial()
sl.run()
sl.join()
print(sl.get_results())
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
streamtask-0.0.5.tar.gz
(3.8 kB
view hashes)
Built Distribution
Close
Hashes for streamtask-0.0.5-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 4e8a1e6935e601aef89eabb60eaf93bc12a51c26176b319c4a522f1459e1222f |
|
MD5 | af884c8765933360fc168f2c49669b5e |
|
BLAKE2b-256 | fe315d8de6667d0067921223edee0dd50dc324b1a3161dbb8f9291081169be11 |