ACME: Asynchronous Computing Made ESI
Summary
The objective of ACME (pronounced "ak-mee") is to provide easy-to-use wrappers for calling Python functions concurrently ("embarrassingly parallel workloads"). ACME is developed at the Ernst Strüngmann Institute (ESI) gGmbH for Neuroscience in Cooperation with Max Planck Society and released free of charge under the BSD 3-Clause "New" or "Revised" License. ACME relies heavily on the concurrent processing library dask and was primarily designed to facilitate the use of SLURM on the ESI HPC cluster (although other HPC infrastructure running SLURM can be leveraged as well). Local multi-processing hardware (i.e., multi-core CPUs) is fully supported too. ACME is itself used as the parallelization engine of SyNCoPy.
Installation
ACME can be installed with pip:
pip install esi-acme
or via conda:
conda install -c conda-forge esi-acme
To get the latest development version, simply clone our GitHub repository:
git clone https://github.com/esi-neuroscience/acme.git
cd acme/
pip install -e .
Usage
Basic Examples
Simplest use case: everything is done automatically.
from acme import ParallelMap
def f(x, y, z=3):
    return (x + y) * z

with ParallelMap(f, [2, 4, 6, 8], 4) as pmap:
    pmap.compute()
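Here f is called four times, once per element of [2, 4, 6, 8], with y = 4 broadcast to every call. Ignoring that results are written to HDF5 files by default, this is conceptually equivalent to the serial loop below (illustration only):

results = [f(x, 4) for x in [2, 4, 6, 8]]  # [18, 24, 30, 36]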
See also our Quickstart Guide.
Intermediate Examples
Set the number of function calls via n_inputs
import numpy as np
from acme import ParallelMap
def f(x, y, z=3, w=np.zeros((3, 1)), **kwargs):
    return (sum(x) + y) * z * w.max()

pmap = ParallelMap(f, [2, 4, 6, 8], [2, 2], z=np.array([1, 2]), w=np.ones((8, 1)), n_inputs=2)

with pmap as p:
    p.compute()
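With n_inputs=2, f is executed twice. The sketch below illustrates the presumable argument distribution (arguments whose length matches n_inputs are split across calls, everything else is broadcast in full); see the link below for the exact rules:

# illustration only, not literal ACME internals
call_1 = f([2, 4, 6, 8], 2, z=1, w=np.ones((8, 1)))
call_2 = f([2, 4, 6, 8], 2, z=2, w=np.ones((8, 1)))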
More details in Override Automatic Input Argument Distribution
Advanced Use
Allocate a custom client object and recycle it for several computations (use slurm_cluster_setup on non-ESI HPC infrastructure or local_cluster_setup when working on your local machine; a local_cluster_setup sketch follows the example below):
import numpy as np
from acme import ParallelMap, esi_cluster_setup
def f(x, y, z=3, w=np.zeros((3, 1)), **kwargs):
    return (sum(x) + y) * z * w.max()

def g(x, y, z=3, w=np.zeros((3, 1)), **kwargs):
    return (max(x) + y) * z * w.sum()

n_workers = 200
client = esi_cluster_setup(partition="8GBXS", n_workers=n_workers)

x = [2, 4, 6, 8]
z = range(n_workers)
w = np.ones((8, 1))

pmap = ParallelMap(f, x, np.random.rand(n_workers), z=z, w=w, n_inputs=n_workers)
with pmap as p:
    p.compute()

pmap = ParallelMap(g, x, np.random.rand(n_workers), z=z, w=w, n_inputs=n_workers)
with pmap as p:
    p.compute()
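On a machine without SLURM, a client can be allocated analogously. A minimal sketch, assuming local_cluster_setup can be called with default arguments (consult the ACME documentation for its exact signature):

from acme import local_cluster_setup

# start a dask cluster on the local multi-core machine; subsequent
# ParallelMap computations are scheduled on this client just as above
client = local_cluster_setup()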
For more information see Reuse Worker Clients
Handling Results
Load Results From Files
By default, results are saved to disk in HDF5 format and can be accessed using the results_container attribute of ParallelMap:
def f(x, y, z=3):
    return (x + y) * z

with ParallelMap(f, [2, 4, 6, 8], 4) as pmap:
    filenames = pmap.compute()
Example loading code:
import h5py
import numpy as np
out = np.zeros((4,))
with h5py.File(pmap.results_container, "r") as h5f:
    for k, key in enumerate(h5f.keys()):
        out[k] = h5f[key]["result_0"][()]
See also Where Are My Results?
Collect Results in Single HDF5 Dataset
If possible, results can be slotted into a single HDF5 dataset:
def f(x, y, z=3):
    return (x + y) * z

with ParallelMap(f, [2, 4, 6, 8], 4, result_shape=(None,)) as pmap:
    pmap.compute()
Example loading code:
import h5py
with h5py.File(pmap.results_container, "r") as h5f:
    out = h5f["result_0"][()]  # returns a NumPy array of shape (4,)
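If each function call returns an array rather than a scalar, the stacking dimension is marked by None and the remaining entries of result_shape presumably describe the shape of a single per-call result. A minimal sketch under that assumption:

import numpy as np

def g(x, y):
    return np.full((3,), x + y)  # every call returns a length-3 array

# the four per-call arrays are stacked along the None dimension
with ParallelMap(g, [2, 4, 6, 8], 4, result_shape=(None, 3)) as pmap:
    pmap.compute()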
More examples can be found in Collect Results in Single Dataset
Collect Results in Local Memory
This is possible but not recommended.
def f(x, y, z=3):
    return (x + y) * z

with ParallelMap(f, [2, 4, 6, 8], 4, write_worker_results=False) as pmap:
    result = pmap.compute()  # returns a 4-element list
Alternatively, create an in-memory NumPy array:

with ParallelMap(f, [2, 4, 6, 8], 4, write_worker_results=False, result_shape=(None,)) as pmap:
    result = pmap.compute()  # returns a NumPy array of shape (4,)
Debugging
Use the debug keyword to perform all function calls in the local thread of the active Python interpreter:
def f(x, y, z=3):
    return (x + y) * z

with ParallelMap(f, [2, 4, 6, 8], 4, z=None) as pmap:
    results = pmap.compute(debug=True)
This way, tools like pdb or IPython's %debug magic can be used.
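For instance, since z=None makes f raise a TypeError, the offending call can be inspected post mortem with the standard-library debugger (a sketch assuming the exception propagates to the calling scope when debug=True):

import pdb

try:
    with ParallelMap(f, [2, 4, 6, 8], 4, z=None) as pmap:
        pmap.compute(debug=True)
except Exception:
    pdb.post_mortem()  # step into the frame where (x + y) * z failed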
More information can be found in the FAQ.
Documentation and Contact
To report bugs or ask questions, please use our GitHub issue tracker. More usage details and background information are available in our online documentation.
Resources
- ACME Presentation at deRSE23 - Conference for Research Software Engineering in Germany
- ACME Demo presented at the 4th annual Data Scientist Community Meeting
- ACME Tutorials
- ACME FAQ