Manage a pool of arrays using shared memory
Pyarraypool
Transfer NumPy arrays between processes using shared memory.
Why create this project?
This library aims to speed up parallel data processing with Python and NumPy ndarrays.
The Python GIL prevents multithreading from being used for parallel data processing. It can be released while C code, Cython code or I/O tasks run, but it stays locked during Python computation tasks.
Alternatives to subprocess workers exist, but they are not always possible to use. To list a few of them:
A few design choices
The Python standard library already contains a module to create and manage shared memory.
However, it does not allow shared memory to be managed as a single raw block, so performance drops because several system calls must be made on each block creation / deletion.
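For reference, the standard-library module in question is `multiprocessing.shared_memory`. The sketch below shows what sharing a single array with it can look like. The helper names `share_array`, `attach_array` and `demo` are hypothetical and are not part of pyarraypool. Every array gets its own named segment, and each segment has to be created, attached, closed and unlinked separately.

```python
from multiprocessing import shared_memory

import numpy as np


def share_array(arr):
    """Copy `arr` into a new, dedicated shared memory segment (hypothetical helper)."""
    shm = shared_memory.SharedMemory(create=True, size=arr.nbytes)
    view = np.ndarray(arr.shape, dtype=arr.dtype, buffer=shm.buf)
    view[...] = arr
    del view  # drop the NumPy view so the segment can be closed later
    return shm


def attach_array(name, shape, dtype):
    """Attach to an existing segment by name and wrap it as an ndarray."""
    shm = shared_memory.SharedMemory(name=name)
    return shm, np.ndarray(shape, dtype=dtype, buffer=shm.buf)


def demo():
    arr = np.arange(6, dtype=np.float64).reshape(2, 3)
    owner = share_array(arr)
    # A worker process would normally do this step; it is done in the
    # same process here to keep the sketch short.
    shm, view = attach_array(owner.name, arr.shape, arr.dtype)
    total = float(view.sum())
    del view       # views must be released before close()
    shm.close()
    owner.close()
    owner.unlink()  # one unlink per segment
    return total
```

With many small arrays, this create / attach / unlink cycle is repeated for each one, which is the overhead described above.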
In this library:
- shared memory is managed as a "pool".
- arrays can be attached and are released when their refcount reaches 0 in every process.
- a spinlock is used to synchronize processes when blocks are added / removed (this can be improved).
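To illustrate the "pool" idea only, here is a minimal sketch that carves several arrays out of one shared memory segment. It is not the library's implementation: the class `TinyArrayPool` and the function `demo_pool` are hypothetical, and the sketch has no reference counting, no block release and no inter-process locking.

```python
from multiprocessing import shared_memory

import numpy as np


class TinyArrayPool:
    """Hypothetical bump allocator over a single shared memory segment."""

    def __init__(self, size_bytes):
        # One system-level allocation for the whole pool.
        self.shm = shared_memory.SharedMemory(create=True, size=size_bytes)
        self.offset = 0

    def allocate(self, shape, dtype=np.float64):
        dtype = np.dtype(dtype)
        nbytes = int(np.prod(shape)) * dtype.itemsize
        # Round the start offset up to a multiple of 64 bytes.
        start = (self.offset + 63) // 64 * 64
        if start + nbytes > self.shm.size:
            raise MemoryError("pool exhausted")
        self.offset = start + nbytes
        # No system call here: the array is only a view into the pool.
        return np.ndarray(shape, dtype=dtype, buffer=self.shm.buf, offset=start)

    def close(self):
        self.shm.close()
        self.shm.unlink()


def demo_pool():
    pool = TinyArrayPool(1 << 20)
    a = pool.allocate((10, 10))
    b = pool.allocate((5,), dtype=np.int32)
    a[:] = 1.0
    b[:] = 7
    total = float(a.sum()) + int(b.sum())
    del a, b  # views must be released before the segment is closed
    pool.close()
    return total
```

Allocating an array from the pool is plain offset arithmetic, so the per-array system calls disappear. The price is that the pool needs its own bookkeeping for releasing and synchronizing blocks, which the design points above address with refcounts and a spinlock.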
API usage
Here is a simple example of how to use the library.

    import pyarraypool
    import multiprocessing
    import numpy as np


    def task(x, i, value):
        # Define a dummy task
        x[i, :, :] = value


    def main():
        arr = np.random.random((100, 200, 500))
        I, J, K = arr.shape

        with multiprocessing.get_context("spawn").Pool(processes=8, initializer=pyarraypool.start_pool) as pool, \
                pyarraypool.object_pool():
            # Transfer the array to shared memory
            shmarr = pyarraypool.make_transferable(arr)

            # Apply task to array
            pool.starmap(task, [
                (shmarr, i, i) for i in range(I)
            ])


    if __name__ == "__main__":
        main()

You can have a look at the notebook / example folders for more details.
Developer guide
To build:

    pip install maturin
    maturin develop --extras test

To test:

    # Run rust tests
    cargo test
    cargo clippy

    # Run python tests
    pytest -vv
    flake8
    autopep8 --diff -r python/
    mypy .

To format code:

    autopep8 -ir python/
    isort .

Project status
The project is currently a proof of concept ("POC") and is not fully ready for production.
A few benchmarks are still missing, and the API can be improved.
See TODO.md for more details.
Any help / feedback is welcome 😊 !