Interface for NumPy ndarray using multiprocessing SharedMemory
shared-ndarray2
SharedNDArray encapsulates a NumPy ndarray interface for using shared memory in
multiprocessing, using multiprocessing.shared_memory in Python 3.8+.
Quick Start
import multiprocessing
from multiprocessing.managers import SharedMemoryManager

import numpy as np
from shared_ndarray2 import SharedNDArray


def process_data(arr: SharedNDArray):
    # Work with data
    arr[:] += 1


with SharedMemoryManager() as mem_mgr:
    arr = SharedNDArray.from_array(mem_mgr, np.arange(1024))
    p = multiprocessing.Process(target=process_data, args=(arr,))
    p.start()
    p.join()
    assert np.all(arr[:] == np.arange(1, 1025))
Requirements
- Python 3.8+
- NumPy 1.21+
Similar Projects
- SharedArray - POSIX-only. Quite a different paradigm, uses pre-Python 3.8 memory-sharing constructs, requires building a C module with gcc.
- shared-ndarray - POSIX-only. Similar (uses NumPy ndarray buffer arg), uses pre-Python 3.8 memory-sharing constructs (requires posix_ipc).
Usage
Creation
There are three methods for constructing a SharedNDArray.
SharedNDArray()
To create a SharedNDArray object from existing shared memory that represents a NumPy
array, use the regular constructor providing shape and dtype, either with an existing
multiprocessing.shared_memory.SharedMemory object or the name of one:
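from multiprocessing.shared_memory import SharedMemory

import numpy as np
from shared_ndarray2 import SharedNDArray

# 1024 bytes of shared memory hold exactly 1024 uint8 elements
shm = SharedMemory(create=True, size=1024)
arr = SharedNDArray(shm, (1024,), np.uint8)
# -or-
arr = SharedNDArray(shm.name, (1024,), np.uint8)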
SharedNDArray.from_shape() or shared_ndarray.from_shape()
This method allocates shared memory managed by a SharedMemoryManager to represent a NumPy
ndarray with some shape and dtype.
with SharedMemoryManager() as mem_mgr:
    arr = SharedNDArray.from_shape(mem_mgr, (3, 1024, 1024), dtype=np.uint16)
    # ... Use arr with e.g. multiprocessing.Pool or multiprocessing.Process
    # ... Be sure process instances join/terminate before exiting SharedMemoryManager context manager
Note: shared_ndarray.from_shape() is a standalone function that correctly types the
SharedNDArray, whereas the classmethod might (in mypy) or might not (in pyright) do so.
SharedNDArray.from_array() or shared_ndarray.from_array()
This method allocates shared memory managed by a SharedMemoryManager to represent a
provided NumPy ndarray and copies that ndarray into the shared memory.
x = np.arange(100.0).reshape(2, 2, 25)
with SharedMemoryManager() as mem_mgr:
    arr = SharedNDArray.from_array(mem_mgr, x)
    assert np.all(arr[:] == x[:])
    # ... Use arr as above...
Note: shared_ndarray.from_array() is a standalone function that correctly types the
SharedNDArray, whereas the classmethod might (in mypy) or might not (in pyright) do so.
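For intuition, the allocate-and-copy step can be sketched with only the standard library and NumPy. This is not the package's implementation: it uses a plain SharedMemory block instead of a SharedMemoryManager, and copy_into_shared_memory is a hypothetical helper name chosen for illustration.

```python
from multiprocessing.shared_memory import SharedMemory

import numpy as np


def copy_into_shared_memory(x: np.ndarray) -> SharedMemory:
    """Hypothetical helper: allocate a block sized for x and copy x into it."""
    shm = SharedMemory(create=True, size=x.nbytes)
    # View the raw buffer as an ndarray with x's shape and dtype, then copy.
    view = np.ndarray(x.shape, dtype=x.dtype, buffer=shm.buf)
    view[:] = x
    del view  # drop the buffer export so shm.close() can succeed later
    return shm


x = np.arange(100.0).reshape(2, 2, 25)
shm = copy_into_shared_memory(x)
try:
    shared = np.ndarray(x.shape, dtype=x.dtype, buffer=shm.buf)
    matches = bool(np.array_equal(shared, x))
    del shared  # release the view before closing the block
finally:
    shm.close()
    shm.unlink()
```

Note that every ndarray view onto shm.buf has to be dropped before shm.close() is called; otherwise close() raises a BufferError.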
Using like np.ndarray
The point of SharedNDArray is to remove the boilerplate of creating shared memory,
passing around shapes and dtypes and reconstructing np.ndarray objects. SharedNDArray
does this last step with its .get() method, which creates a np.ndarray on-the-fly
using the shared memory buffer. The __getitem__() and __setitem__() methods use the
.get() method to get the np.ndarray to access the data, so multi-dimensional indexing
and slicing work the same as with an ndarray. Other np.ndarray methods are not
directly implemented but may be accessed by first calling .get(), e.g.
arr.get().mean().
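The boilerplate that this replaces looks roughly like the following sketch, which attaches to a block by name and rebuilds the ndarray from a shape and dtype that have to be passed around separately. It uses only the standard library and NumPy; get_view is a hypothetical function for illustration and is not the package's .get() implementation.

```python
from multiprocessing.shared_memory import SharedMemory

import numpy as np


def get_view(name, shape, dtype):
    """Hypothetical helper: attach by name and wrap the buffer as an ndarray."""
    shm = SharedMemory(name=name)
    return shm, np.ndarray(shape, dtype=dtype, buffer=shm.buf)


# The "parent" side creates a block large enough for a 2x3 int64 array.
owner = SharedMemory(create=True, size=6 * np.dtype(np.int64).itemsize)
try:
    # The "worker" side only needs the name, shape, and dtype.
    shm, arr = get_view(owner.name, (2, 3), np.int64)
    arr[:] = np.arange(6).reshape(2, 3)
    corner = int(arr[1, 2])           # multi-dimensional indexing
    row_mean = float(arr[0].mean())   # ordinary ndarray methods
    del arr                           # release the view before closing
    shm.close()
finally:
    owner.close()
    owner.unlink()
```

With SharedNDArray, the name, shape, and dtype travel together in one object, so the same access reduces to indexing the object or calling .get() on it.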
Releasing Shared Memory
SharedNDArray implements a __del__() method that calls the .close() method on the
SharedMemory when the instance is destroyed (i.e. at process exit). When the shared
memory is unlinked in the parent process (either manually with shm.unlink() or by
exiting a SharedMemoryManager context manager), the shared memory is properly released.
However, if a sub-process is not joined or terminated before the shared memory is
unlinked, a warning will be emitted about "leaked shared_memory objects to clean up at shutdown".
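The difference between .close() and .unlink() can be illustrated with the standard library alone. In this sketch, both handles live in one process for brevity; the handle named worker stands in for the one a sub-process would open by name, which is an assumption made for the example.

```python
from multiprocessing.shared_memory import SharedMemory

owner = SharedMemory(create=True, size=16)
owner.buf[0] = 42

# A second handle, opened by name as a worker process would do.
worker = SharedMemory(name=owner.name)
seen_by_worker = worker.buf[0]
worker.close()  # detaches this handle only; the block still exists

still_there = owner.buf[0]  # the owner can still read the data
owner.close()
owner.unlink()  # requests destruction of the block itself

# After unlinking, attaching by name is expected to fail.
try:
    SharedMemory(name=owner.name)
    attach_after_unlink = True
except FileNotFoundError:
    attach_after_unlink = False
```

The ordering matters: every handle is closed first, and only then does the owner unlink, which mirrors joining or terminating sub-processes before leaving the SharedMemoryManager context.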
.lock attribute
The __init__(), from_shape(), and from_array() methods may be given a lock=True
argument that also creates a multiprocessing.Lock object and includes it in the
SharedNDArray, accessible as the .lock attribute. Note, however, that passing a
multiprocessing.Lock as an argument to a multiprocessing.Pool function does not work
well, for reasons described here.
Thus by default .lock is set to None.
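One likely reason, sketched below with the standard library only, is that Pool pickles task arguments, and a multiprocessing.Lock refuses to be pickled outside of process creation. A common workaround is to hand the lock to each worker once through the Pool initializer. The names lock_is_picklable, init_worker, work, and run_pool are hypothetical and not part of this package.

```python
import multiprocessing
import pickle


def lock_is_picklable() -> bool:
    """Return True if a multiprocessing.Lock survives pickle.dumps()."""
    try:
        pickle.dumps(multiprocessing.Lock())
    except RuntimeError:
        # Locks may only be shared with child processes at creation time.
        return False
    return True


_worker_lock = None  # set once per worker process by init_worker()


def init_worker(lock):
    """Pool initializer: store the inherited lock in a module-level global."""
    global _worker_lock
    _worker_lock = lock


def work(i: int) -> int:
    """Example task that holds the shared lock while it runs."""
    with _worker_lock:
        return i * 2


def run_pool() -> list:
    """Share one lock with every worker through the Pool initializer."""
    lock = multiprocessing.Lock()
    with multiprocessing.Pool(2, initializer=init_worker, initargs=(lock,)) as pool:
        return pool.map(work, range(4))
```

When trying this out, call run_pool() from under an `if __name__ == "__main__":` guard so that it also works with the spawn start method.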
Typed SharedNDArray
SharedNDArray is able to be typed with NumPy types. When using the from_array()
constructor, it is also able to inherit the type of the ndarray if it is typed using
numpy.typing.NDArray (new in NumPy 1.21). Typing information does not pass through with
slicing (__getitem__), however.
import numpy.typing as npt

x: npt.NDArray[np.int_] = np.arange(1024)
arr = SharedNDArray.from_array(mem_mgr, x)  # type of arr is SharedNDArray[int_]
arr2 = arr[:]  # arr2 is typing.Any
MyPy and NumPy typing compatibility
This package includes type annotations and the py.typed marker file.
Due to the use of a private NumPy type annotation whose location moved in NumPy 1.23.0, mypy has to be configured differently when using NumPy < 1.23.0.
| NumPy Version | mypy config | mypy cli | Pyright config |
|---|---|---|---|
| < 1.23.0 | always_false = NUMPY_1_23 | --always-false NUMPY_1_23 | defineConstant = { "NUMPY_1_23" = false } |
| 1.23.0+ | always_true = NUMPY_1_23 | --always-true NUMPY_1_23 | defineConstant = { "NUMPY_1_23" = true } |