Easy parallel processing in Python
Paraproc is a simple library that helps you easily parallelize your computation (over independent chunks of data) across multiple processes in Python, especially when you want to mix callings to external command line programs and hand brew Python functions together in your data processing pipeline.
Under the hood, it combines subprocess and multiprocessing, and uses a process pool to schedule the jobs. It also provides a numpy.ndarray interface to access shared-memory across multiple processes.
Paraproc supports both Python 2 and 3, with numpy as the only external dependency. It is contained in only one Python file, so it can be easily copied into your project. (The copyright and license notice must be retained.)
Code snippets that demonstrate the basic usage of the library can be found later in this documentation, and in the demo_*.py files.
Bugs can be reported to https://github.com/herrlich10/paraproc. The code can also be found there.
Execute commands in parallel
You can run both Python codes and command line programs in parallel:
import os import paraproc def my_job(): print(os.getpid()) pc = paraproc.PooledCaller() for k in range(5): pc.check_call(my_job) for k in range(5): pc.check_call('echo $$', shell=True) # For linux/mac pc.wait()
The pc.check_call() method will return immediatedly. The actual execution of the queued commands are delayed until you call pc.wait().
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
|Filename, size||File type||Python version||Upload date||Hashes|
|Filename, size paraproc-0.1.3-py3-none-any.whl (6.4 kB)||File type Wheel||Python version py3||Upload date||Hashes View hashes|
|Filename, size paraproc-0.1.3.tar.gz (6.2 kB)||File type Source||Python version None||Upload date||Hashes View hashes|