npshmex

ProcessPoolExecutor that passes numpyarrays through shared memory

Project description

npshmex
=======

Npshmex provides a drop-in replacement for concurrent.futures.ProcessPoolExecutor,
using shared memory provided by the [SharedArray](https://gitlab.com/tenzing/shared-array) package
(rather than pickle) to transfers numpy arrays between processes.

Synopsis:
```python
import numpy as np
from npshmex import ProcessPoolExecutor

def add_one(x):
return x + 1

ex = ProcessPoolExecutor()
big_data = np.ones(int(2e7))

f = ex.submit(add_one, big_data)
print(f.result()[0]) # 2.0
```
The last two lines take about ~290 ms on my laptop, but ~1250 ms using
`concurrent.futures.ProcessPoolExecutor`: more than a factor four difference.
The latter also requires about twice as much memory (based on the threshold at
which I get a MemoryError).

For this ridiculously simple task, the multiprocessing overhead is dominant even when
spawning a single child process (a bare `add_one(big_data)` would take only ~55 ms).
However, since part of this overhead is in the parent process, it will bottleneck
even more complex tasks when scaled over enough processes.

How it works
--------------

Python multiprocessing uses pickle to serialize data for transfer between processes.
When passing around large numpy arrays, this can quickly become a bottleneck.

Npshmex's ProcessPoolExecutor-replacement instead transfers input and output numpy arrays
using shared memory (`/dev/shm`).
Dictionary outputs with numpy arrays as values are also supported.
Only the shared-memory `filenames' are actually transferred between processes.

Note that npshmex copies data from numpy arrays into shared memory
to transfer them. It doesn't copy it again on retrieval; it just creates the
numpy array with the shared memory backing it.
Still, if you are transferring the same array back and forth,
this amounts to two unnecessary memory copies.
You can avoid these, and the use of npshmex, by managing the shared memory yourself:
```python
from concurrent.futures import ProcessPoolExecutor
import SharedArray

def add_one(shm_key):
x = SharedArray.attach(shm_key)
x += 1

shm_key = 'shm://test'
ex = ProcessPoolExecutor()

big_data = SharedArray.create(shm_key, int(2e7))
big_data += 1

f = ex.submit(add_one, shm_key)
f.result()
SharedArray.delete(shm_key)
print(x[0]) # 2.0
```
The last four lines now only take ~130 ms on my laptop, which is over
twice as fast as npshmex. However, as you can see, it involves
a more substantial rewrite of your code.

Clearing shared memory
------------------------

Npshmex tells SharedArray to mark shared memory for deletion as soon as it has created
numpy arrays back from it. As explained in the
[SharedArray](https://gitlab.com/tenzing/shared-array) documention,
you'll keep the numpy array until you lose the last reference to it
(as with regular python objects).

If your program exits while data is being transfered between processes,
some shared files will remain in `/dev/shm`. You can manually clear all npshmex-associated
shared memory from all processes on the machine with `npshmex.shm_clear()`.
Otherwise, it will be up to you, your operating system, or your system administrator
to clean up the mess...

v0.0.1
--------
* Initial release

Project details

Release history Release notifications | RSS feed

0.2.1

Dec 1, 2020

0.2.0

Mar 3, 2020

0.1.2

Apr 17, 2019

0.1.1

Apr 17, 2019

0.1.0

Apr 17, 2019

0.0.5

Apr 17, 2019

This version

0.0.4

Apr 17, 2019

0.0.3

Apr 17, 2019

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

npshmex-0.0.4.tar.gz (2.7 kB view hashes)

Uploaded Apr 17, 2019 Source

Hashes for npshmex-0.0.4.tar.gz

Hashes for npshmex-0.0.4.tar.gz
Algorithm	Hash digest
SHA256	`c0cbd7fada50890700870117991c29c9e9bb59956c789867061387df48150636`
MD5	`dcc23221a66bfac6e7d22fcf01976dc0`
BLAKE2b-256	`927c9435e13be746544fa087f7b78dc6c5442616dadc07d7fff6286260eb1a88`