multiprocessing on row data using user defined functions

# Multi-Processing on Row Data

This package is a Python decorator that helps you perform multiprocessing on row data. It is especially useful when you are cleaning large data sets.

Suppose you have a numpy array called “data” (of any dimension) and you want to repeat some operation on each row (i.e. along the 0-th dimension). This decorator lets you perform that operation with multiprocessing painlessly.

The idea is simple. The decorator will automatically split your data into several subsets (by rows) and run the function using multiprocessing. Once it is done, the code will combine all results and return the modified numpy array.
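The split, apply, and combine steps described above can be sketched in a few lines. This is not the package's actual implementation: `run_by_rows`, `n_chunks`, and `map_fn` are hypothetical names chosen for illustration, and the built-in `map` stands in for a process pool's `map` so the snippet stays serial and runs anywhere.

```python
from functools import partial

import numpy as np


def run_by_rows(func, data, par, n_chunks=4, map_fn=map):
    """Split `data` by rows, apply `func` to each chunk, and recombine.

    Hypothetical helper, not part of mprows. `map_fn` defaults to the
    built-in serial `map`; passing a process pool's `map` method instead
    would run the chunks in parallel.
    """
    # Split along the 0-th dimension; chunks may differ in size by one row.
    chunks = np.array_split(data, n_chunks, axis=0)
    # Fix the extra parameters so each call only needs its chunk.
    worker = partial(func, par=par)
    results = list(map_fn(worker, chunks))
    # Stack the per-chunk results back together in the original row order.
    return np.concatenate(results, axis=0)


def add_offset(data, par):
    """Example row-independent operation: add a constant to every element."""
    return data + par["offset"]


data = np.arange(12).reshape(6, 2)
result = run_by_rows(add_offset, data, {"offset": 10}, n_chunks=3)
```

To run the chunks in parallel, `map_fn` could be the `map` method of a `multiprocessing.Pool` created under an `if __name__ == "__main__":` guard. The rest of the function would stay the same.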

# Requirements

To use it, you must install the following Python packages with pip: numpy and pathos. itertools is also required, but it is part of the Python standard library and does not need to be installed.

# Installation:

pip install mprows

# Usage:

## Function Format

When you build your own function, it must follow this form:

MyFunc(data, par={'par1': value1, 'par2': value2})

  • The function name is arbitrary.

  • The function must take exactly two arguments: one called “data” (your data, a numpy array) and one called “par” (a dict that contains all the other arguments of your function). The contents of “par” stay fixed for the whole multiprocessing run.

  • This decorator can only decorate a “function”, not a method in a class. If you want to perform multiprocessing in a method, you can define a function inside the method that wraps all of the method's code and use the decorator on that function.

  • The output of your function should be another numpy array with the same number of rows as the input.

  • Since this code splits your data by rows, you must make sure that the operation is “row independent”, i.e. the result for one row does not depend on any other row.
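A function in the required form, together with a quick way to spot-check the row-independence requirement, might look like the following. The names `clip_rows`, `center_columns`, and `is_row_independent`, and the keys `low` and `high`, are made up for illustration. The check compares the result on the full array with the result obtained by processing two halves separately.

```python
import numpy as np


def clip_rows(data, par):
    """Row independent: clip every value into [par['low'], par['high']]."""
    return np.clip(data, par["low"], par["high"])


def center_columns(data, par):
    """NOT row independent: each result depends on the column means."""
    return data - data.mean(axis=0)


def is_row_independent(func, data, par):
    """Compare processing all rows at once with processing two halves."""
    half = data.shape[0] // 2
    whole = func(data, par=par)
    pieces = np.concatenate(
        [func(data[:half], par=par), func(data[half:], par=par)], axis=0
    )
    return np.array_equal(whole, pieces)


sample = np.array([[-1.0, 0.5], [2.0, 0.25], [0.75, 3.0], [0.1, -0.2]])
clip_ok = is_row_independent(clip_rows, sample, {"low": 0.0, "high": 1.0})
center_ok = is_row_independent(center_columns, sample, {})
```

Here `clip_ok` is true and `center_ok` is false, because the column means change when the array is split. Passing this check for one split is only a spot check, not a proof of row independence.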

## Example

You must first import mprows:

from mprows import mprows

Then you can write your own function and use mprows as a decorator for multiprocessing.

To multiprocess a function:

[image: ./img/func.png]

To multiprocess a method in a class:

[image: ./img/class.png]
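The two patterns can be illustrated with a stand-in decorator. `rowwise` is a hypothetical, serial substitute written for this example. It is not the `mprows` decorator, whose exact call signature is not shown in this text. The second half shows the wrapping trick for methods: the method defines an inner function in the required `(data, par)` form and decorates that.

```python
import functools

import numpy as np


def rowwise(n_chunks=2):
    """Hypothetical stand-in decorator: process `data` chunk by chunk.

    It runs the chunks one after another; a real implementation would hand
    them to a pool of worker processes instead.
    """
    def decorate(func):
        @functools.wraps(func)
        def wrapper(data, par):
            chunks = np.array_split(data, n_chunks, axis=0)
            return np.concatenate([func(chunk, par=par) for chunk in chunks], axis=0)
        return wrapper
    return decorate


# Pattern 1: decorate a plain function.
@rowwise(n_chunks=2)
def normalize_rows(data, par):
    """Rescale each row so that it sums to par['total']."""
    return par["total"] * data / data.sum(axis=1, keepdims=True)


# Pattern 2: inside a method, wrap the work in an inner function.
class Cleaner:
    def __init__(self, total):
        self.total = total

    def clean(self, data):
        @rowwise(n_chunks=2)
        def _clean(data, par):
            return par["total"] * data / data.sum(axis=1, keepdims=True)

        # Pass instance state through `par` rather than through `self`.
        return _clean(data, par={"total": self.total})


table = np.array([[1.0, 3.0], [2.0, 2.0], [4.0, 1.0]])
from_function = normalize_rows(table, par={"total": 100.0})
from_method = Cleaner(total=100.0).clean(table)
```

One caveat for real multiprocessing: the standard `pickle` module cannot serialize functions defined inside another function, whereas pathos relies on `dill`, which can. That may be why the inner-function pattern is workable with this package, though the text does not say so.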


Source distribution: mprows-0.1.2.tar.gz (3.1 kB)

