Skip to main content

Simple object-based multiprocessing

Project description

ObjectProxyPool - simple object-based multiprocessing

This package provides the implementation of an ObjectProxyPool, a multiprocessing pool of instances of a specified class. The pool object features all methods that the original class provides. Whenever a methods is called, this call is applied to all remote copies; the results are computed in parallel, collected, and returned as an array. This makes it very easy to implement object-based parallelism.

Motivation

Object-based parallelism is useful if an object-based workflow should be repeated several times, e.g. to assess the impact of stochasticity. In modelling, for example, we may want to repeat a simulation with multiple instances of the model and compare the results.

The approach has significant advantages if the object used for the computations is expensive to initialize. As the instances can be reused for multiple tasks, initialization needs to happen only once for each process.

Example usage

from objectproxypool import ProxyPool, Unpacker, SharedArrayWrapper
import numpy as np
import os
import time
from itertools import repeat


# Define some class providing the
# the functionality we are interested in
class MyClass:
    def __init__(self) -> None:
        np.random.seed(os.getpid())
        self.property = None
    
    def get_normal_sample(self, mean, std):
        return mean + np.random.randn(10) * std
    
    def set_property(self, value):
        self.property = value
        
    def add_and_get_property(self, *args):
        return self.property, sum(args)

    def fill_and_compute_sum(self, arr, i, value):
        
        # add some delay for dempnstration purposes only
        time.sleep(0.2)
        
        # The with statment is needed to access and close
        # a version of the shared memory in this process.
        with Unpacker(arr) as arr:
            
            # Here we can use `arr` as any other numpy array. 
            arr[i] = value
            return arr.sum()


if __name__ == "__main__":
    # Create a pool of 4 instances of MyClass, each running
    # in a separate process. Set `separate_processes=False`
    # to work with threads instead of processes.
    # (Caution: if worker_count is larger than the number of
    # available CPUs, the performance can be bad!)
    with ProxyPool(MyClass, worker_count=4, separate_processes=True) as pool:
        # We can easily parallelize a task by letting
        # each remote instance do the same. For example,
        # we obtain a sample of normally distributed random
        # numbers in parallel.
        print(pool.get_normal_sample(10, 1))

        # We can change the state of the remote instances.
        # `map_args=True` makes that each worker receives
        # one particular number of the range 1:4.
        # Without this argument, each worker would receive the
        # argument `range(4)`, i.e., all four numbers.
        # The argument `synchronize=True` makes sure that
        # each worker gets a value before any worker
        # continues. This ensures that the property
        # is set in all workers.
        pool.set_property(range(4), map_args=True, synchronize=True)

        # Add numbers to the property
        # Even though we have only four workers, we can reuse
        # the workers to do multiple tasks until all the
        # work is done.
        print(pool.add_and_get_property(range(20), range(20), map_args=True))


        # We can also exchange data via shared arrays.
        # To that end, wrap an array that we want to share with a SharedArrayWrapper.
        # Note: this only works properly if we use separate processes.
        # If not, we do not need shared memory anyway!
        arr = SharedArrayWrapper(np.zeros(10))

        # Passing the shared memory is as easy as passing the array as argument. 
        # However, we need to use the `remoteArray` property of the array wrapper.
        sums = pool.fill_and_compute_sum(repeat(arr.remoteArray), range(10), range(10), map_args=True)
        
        # We see that the array has been filled as desired
        print(arr.array)
        
        # When the entries were filled, the array was not completely full yet. 
        # Here, we see that the method was indeed processes in parallel,
        # since the input array is similar in several instances, as 
        # indicated by equal sums of the entries.
        print(sums)

    # If it is required that the instances receive and return values
    # in a particular order, use the `dedicated` flag:
    with ProxyPool(MyClass, worker_count=4, separate_processes=True, dedicated=True) as pool:

        # Set each worker to a particular value
        # Note that if dedicated=True, the number of arguments must
        # match the number of workers!
        pool.set_property(range(4), map_args=True)

        # Now, the results are returned in the same order as the
        # arguments were passed. Note that for dedicated workers,
        # we do not need to use synchronize=True, as each worker
        # always receives exactly one argument.
        print(pool.add_and_get_property(1, 1))

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

objectproxypool-0.0.7.tar.gz (29.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

objectproxypool-0.0.7-py3-none-any.whl (15.4 kB view details)

Uploaded Python 3

File details

Details for the file objectproxypool-0.0.7.tar.gz.

File metadata

  • Download URL: objectproxypool-0.0.7.tar.gz
  • Upload date:
  • Size: 29.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for objectproxypool-0.0.7.tar.gz
Algorithm Hash digest
SHA256 c66ae775d7a5acfe2b99de238bb53e548dead36f4c67972023c8d1ee4abfcce9
MD5 e7026be76c845f96aeb520fb5733796f
BLAKE2b-256 d37f59de04c96bd4fd5d8963e0888dc2bf852e5e7f1e7387ef1d5e33e33f3469

See more details on using hashes here.

File details

Details for the file objectproxypool-0.0.7-py3-none-any.whl.

File metadata

File hashes

Hashes for objectproxypool-0.0.7-py3-none-any.whl
Algorithm Hash digest
SHA256 a9460257eda1abca196192aae90389eedc90e9e780fc7602c39fe40ae57d8cbf
MD5 4c8137998dcd66a34e2a593d1897871e
BLAKE2b-256 922dcad0268e1fda288d0613ccd4804f266505c8e4a4c86b722ed61257bd131d

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page