Dask-on-Ray Enabled In Situ Analytics
Installation
Using containers
Doreisa can be installed using containers: a Docker image is built, which can then be used with Singularity.
On Grid5000, first, enable Docker with g5k-setup-docker -t. This is only needed to build the images, not to execute the code.
Execute the building script: $ ./build-images.sh. This will build the Docker images, save them to tar files and convert them to singularity images.
Development
Update the Python environment
To add dependencies to the Python environment, add them via poetry. Then, export them to requirements.txt via:
poetry export -f requirements.txt --output requirements.txt
This file should be copied to docker/analytics/ and docker/simulation/. Remove numpy from the copy in docker/simulation/, since another version is already installed with PDI.
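The manual removal of numpy from the simulation requirements could be scripted. The following is a minimal sketch, not part of Doreisa: the helper name `drop_requirement` is hypothetical, and it assumes the exported file has one requirement per entry, possibly continued over several lines with a trailing backslash (as happens when hashes are exported).

```python
import re


def drop_requirement(requirements_text: str, package: str) -> str:
    """Return the requirements text without the entry for `package`.

    Hypothetical helper: entries continued with a trailing backslash
    (for example `--hash=...` lines) are removed together with the
    requirement they belong to.
    """
    # Match the package name followed by a version specifier, extras,
    # an environment marker, a continuation, or the end of the line,
    # so that "numpy" does not also match "numpy-financial".
    pattern = re.compile(
        rf"^{re.escape(package)}\s*($|[=<>!~\[;@ \\])", re.IGNORECASE
    )
    kept = []
    skipping = False
    for line in requirements_text.splitlines():
        if skipping:
            # Still inside the removed entry: it ends at the first
            # line that is not continued with a backslash.
            skipping = line.rstrip().endswith("\\")
            continue
        if pattern.match(line.strip()):
            skipping = line.rstrip().endswith("\\")
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"
```

For instance, the content of docker/simulation/requirements.txt could be read, passed through `drop_requirement(text, "numpy")`, and written back after each export.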
Notes (TODEL)
mpic++ main.cpp -Wl,--copy-dt-needed-entries -lpdi -o simulation
pdirun mpirun -n 9 --oversubscribe --allow-run-as-root ./simulation
Start the head node:
ray start --head --port=4242 --include-dashboard=True
python3 head.py
mpirun -n 3 singularity exec ./doreisa.sif hostname
If needed: singularity shell
Run Podman: podman run --rm -it --shm-size=2gb --network host -v "$(pwd)":/workspace -w /workspace 'doreisa-simulation:latest' /bin/bash
Run Docker: docker run --rm -it --shm-size=2gb -v "$(pwd)":/workspace -w /workspace 'doreisa_simulation:latest' /bin/bash
poetry install --no-interaction --no-ansi --no-root
TODO
- Examples of analytics (time derivative).
- Don't block the simulation code: send the data and keep going.
- Do some analytics at certain timesteps only, in case of specific events. Example: if the temperature becomes too high, perform the analytics more often (every 10 steps instead of every 100 steps). For parflow, the simulation is performed every dt, but dt can vary across the simulation.
- Support two scenarios:
  - Simulation running on GPU -> can perform the computation in situ, on the same node
  - Simulation running on CPU -> should send the data right away, process in transfer

  Let the user choose if the chunks are stored on the same node, or on another node. Using ray placement groups? Dynamically, to avoid being out of memory?

  We should be able, from the client, to choose a function to execute on the numpy array as soon as it is available. For example, compute an integral without copying the data, and then send only the required data.
- The analytics might want to do a convolution with a small kernel. In this case, we want to avoid sending all the data. Measure this.
- Check whether Infiniband is supported in Ray.
- PDI makes a copy only when the data is on the GPU.
- Adastra (Ruche)?
- Contract: choose which pieces of data are needed. We might not want all the available arrays -> don't make a copy in that case. For example, only do the ray.put every 100 iterations.
- Would be nice to estimate the CO2 emissions (for large-scale experiments).
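The event-driven item in the list above (analytics every 10 steps instead of every 100 when the temperature is too high) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the threshold, and the idea of summarising each step by its maximum temperature are hypothetical and not taken from Doreisa, and a fixed step counter is used even though dt may vary in practice.

```python
def analytics_period(max_temperature, threshold, normal_period=100, alert_period=10):
    """Number of steps between two analytics runs for the current state.

    The default periods mirror the example from the TODO list; the
    threshold is an assumed, user-provided value.
    """
    return alert_period if max_temperature > threshold else normal_period


def analytics_steps(max_temperatures, threshold):
    """Return the timesteps at which analytics would be triggered.

    `max_temperatures[i]` is the (hypothetical) maximum temperature
    observed by the simulation at step i.
    """
    selected = []
    for step, temperature in enumerate(max_temperatures):
        if step % analytics_period(temperature, threshold) == 0:
            # This is the point where the simulation would hand its data
            # over to the analytics; on other steps nothing is sent.
            selected.append(step)
    return selected
```

With this kind of rule, steps on which no analytics run do not need to publish any data at all, which also matches the "contract" item of the list.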
!! Prepare a presentation about the work for now -> demo
Doreisa
mpirun -machinefile $OAR_NODEFILE singularity exec ./doreisa.sif hostname
mpirun -machinefile $OAR_NODEFILE singularity exec ./doreisa.sif
ZMQ to make remote copies of numpy arrays
Present the window approach as research
Simulation: Eulerian vs Lagrangian vs semi-Lagrangian
Understand the ray scheduling strategy
Same for dask on ray
- Scalability benchmark (!)
- In-situ / in-transfer API
- Feedback loop (for the simulation and analytics)
- Dask on Ray
- Scheduling: Dask, Ray, Dask-on-Ray -> understand better (!)
- Slicing, avoid full object moves (ex: convolutions)
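To illustrate why slicing avoids full object moves for small kernels, here is a minimal one-dimensional sketch in plain Python. It is not Doreisa code: the function names are hypothetical, the data is split into only two chunks, and the sliding window does not flip the kernel (strictly speaking a cross-correlation, which coincides with convolution for symmetric kernels). The point is that the left chunk only needs the first `len(kernel) - 1` values of its neighbour, not the whole neighbouring chunk.

```python
def sliding_dot(signal, kernel):
    """Sliding dot product over the positions where the kernel fits entirely."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def chunked_sliding_dot(left_chunk, right_chunk, kernel):
    """Compute the same result from two chunks, moving only a small halo.

    Returns the combined result and the number of values that had to be
    taken from the right chunk to complete the left one.
    """
    # Only the first len(kernel) - 1 values of the neighbour are needed.
    halo = right_chunk[: len(kernel) - 1]
    left_result = sliding_dot(left_chunk + halo, kernel)
    right_result = sliding_dot(right_chunk, kernel)
    return left_result + right_result, len(halo)
```

For a kernel of length 3, two values cross the chunk boundary regardless of the chunk size, which is the quantity worth measuring against a full chunk transfer.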
File details
Details for the file doreisa-0.1.0.tar.gz.
File metadata
- Download URL: doreisa-0.1.0.tar.gz
- Upload date:
- Size: 5.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 411a868fb6a61959ff49f695f31a397ff191ec1bc17eacbb6c1f0f4b1cca76be |
| MD5 | a7ffbf072d28f5acdc0968ca6cff6927 |
| BLAKE2b-256 | def529287d2f5f619d8e600051ca96b13b4b5668ea3c80cf71e7f8255679b71c |
Provenance
The following attestation bundles were made for doreisa-0.1.0.tar.gz:
Publisher: release.yml on AdrienVannson/doreisa
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: doreisa-0.1.0.tar.gz
- Subject digest: 411a868fb6a61959ff49f695f31a397ff191ec1bc17eacbb6c1f0f4b1cca76be
- Sigstore transparency entry: 199201664
- Sigstore integration time:
- Permalink: AdrienVannson/doreisa@d984b45f94bf733dbcaca66c457624caa9239cdc
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/AdrienVannson
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@d984b45f94bf733dbcaca66c457624caa9239cdc
- Trigger Event: push
File details
Details for the file doreisa-0.1.0-py3-none-any.whl.
File metadata
- Download URL: doreisa-0.1.0-py3-none-any.whl
- Upload date:
- Size: 5.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 80bcf7cc2da5887a45bda03c5f0ca0dbd8f05ac2f8ee3516144da71b09e69338 |
| MD5 | f10a744a5aa218ad1c6b1b1c62eb47ec |
| BLAKE2b-256 | 91b52167da211b5966c64209a1abff2864f7cff2d0c28b58263147b46759e186 |
Provenance
The following attestation bundles were made for doreisa-0.1.0-py3-none-any.whl:
Publisher: release.yml on AdrienVannson/doreisa
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: doreisa-0.1.0-py3-none-any.whl
- Subject digest: 80bcf7cc2da5887a45bda03c5f0ca0dbd8f05ac2f8ee3516144da71b09e69338
- Sigstore transparency entry: 199201665
- Sigstore integration time:
- Permalink: AdrienVannson/doreisa@d984b45f94bf733dbcaca66c457624caa9239cdc
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/AdrienVannson
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@d984b45f94bf733dbcaca66c457624caa9239cdc
- Trigger Event: push