
Easy creation of workflows for recursive and farming HPC jobs

Lemmings is a 1991 video game in which the player tries to herd small animals, the "lemmings", out of a 2D puzzle. Lemmings are clueless about their surroundings, walk blindly, and will eventually fall, burn, or be crushed (in short, die) unless the player personally takes care of them. The "Lemmings jobs" introduced here are the same: by nature, these unsupervised job submissions often end in dramatic failures. Human oversight is compulsory when you are dealing with chained runs.

Lemmings

Idea

Lemmings is an open-source code designed to simplify the submission of multiple inter-dependent jobs on the schedulers of HPC clusters. While originally developed within the context of Computational Fluid Dynamics (CFD) applications, it suits many kinds of recursive jobs. A farming mode helps replicate these recursive jobs for a parametric study.
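To illustrate what replicating a recursive job for a parametric study can mean, here is a minimal sketch in plain Python. It is not the lemmings farming API: the parameter names (inlet_velocity, mesh) and the helper expand_cases are hypothetical, and the sketch only shows how a parameter grid expands into one case per combination, each of which would then be its own recursive job.

```python
from itertools import product

# Hypothetical parameter grid for a parametric study (names are illustrative).
parameters = {
    "inlet_velocity": [10.0, 20.0],
    "mesh": ["coarse", "fine"],
}


def expand_cases(grid):
    """Return one dict per combination of the parameter values."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*grid.values())]


cases = expand_cases(parameters)
for index, case in enumerate(cases):
    # In a real study, each case would be launched as a separate recursive job.
    print(f"case {index}: {case}")
```

Two values for each of two parameters give four cases here; the grid grows multiplicatively with every parameter added.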

Installation

Lemmings is open-source and can be pip-installed:

pip install lemmings-hpc

End user POV

The end-user of lemmings is someone running many simulations with a repetitive pattern. This repetition (e.g. resubmit the job until the simulated time reaches 1 ms) is automated by a lemmings "workflow", a Python file gathering all the logic of the application. This "workflow" is created by a super-user using lemmings.

Here the end-user (John Doe) adds the workflow file (sandcastle) where he usually launches the run, then runs the lemmings run command:

>lemmings run --machine-file sandbox.yml --job-prefix funtask sandcastle
INFO - 
##############################
Starting Lemmings 0.8.0...
##############################

INFO -     Job name     :funtask_PAJI77
INFO -     Loop         :1
INFO -     Status       :start
INFO -     Worflow path :/Users/johndoe/productionpath/sandcastle.py
INFO -     Imput path   :/Users/johndoe/productionpath/sandcastle.yml
INFO -     Machine path :/Users/johndoe/productionpath/sandbox.yml
INFO -     Farming mode :False
INFO -     Lemmings START (1/3)
INFO -          Check on startTrue (False -> Exit)
INFO -          Prior to job
INFO -     Lemmings SPAWN (2/3)
INFO -          Prepare run
INFO -          Submit batch 74148 
INFO -          Submit batch post job 74149

This execution will be called funtask_PAJI77 and will automatically submit runs through the job scheduler. On the job scheduler, he will find something like:

+----------------+---------------+-------+----------+-------------------+---------+
|    job name    |     queue     | pid   |  state   |    last update    |  after  |
+----------------+---------------+-------+----------+-------------------+---------+
| funtask_PAJI77 |  long00:00:30 | 74148 |   done   | 06/13/22 15:22:52 |    -    |
| funtask_PAJI77 | short00:00:10 | 74149 |  running | 06/13/22 15:22:53 |  74148  |
+----------------+---------------+-------+----------+-------------------+---------+

Here jobs funtask_PAJI77_74148 and funtask_PAJI77_74149 are the first two dependent jobs of the workflow, but more will come. The decision to re-submit and the creation of the next job are handled by the post job funtask_PAJI77_74149, which runs after funtask_PAJI77_74148 completes. Therefore Lemmings does not "book" consecutive PIDs at start; only the next jobs are queued.
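The chaining described above can be sketched with a toy scheduler. This is an illustration only, not lemmings code: FakeScheduler, spawn_pair, post_job and the max_loops stopping rule are hypothetical, and the PIDs simply start at the value shown in the example table. The point is that each post job queues the next pair, so the queue never holds more than the upcoming jobs.

```python
class FakeScheduler:
    """Toy stand-in for a batch scheduler; hands out increasing PIDs."""

    def __init__(self, first_pid=74148):
        self.next_pid = first_pid
        self.jobs = []

    def submit(self, name, after=None):
        """Record a job, optionally dependent on an earlier PID."""
        pid = self.next_pid
        self.next_pid += 1
        self.jobs.append({"job name": name, "pid": pid, "after": after})
        return pid


def spawn_pair(scheduler, name):
    """Queue one run job and the post job that depends on it."""
    run_pid = scheduler.submit(name)
    post_pid = scheduler.submit(name, after=run_pid)
    return run_pid, post_pid


def post_job(scheduler, name, loop, max_loops):
    """Decision taken by the post job: resubmit a new pair, or stop."""
    if loop < max_loops:
        spawn_pair(scheduler, name)
        return True
    return False


scheduler = FakeScheduler()
spawn_pair(scheduler, "funtask_PAJI77")  # only the first pair is queued at start
loop = 1
while post_job(scheduler, "funtask_PAJI77", loop, max_loops=3):
    loop += 1

for job in scheduler.jobs:
    print(job)
```

In this sketch the while loop stands in for time passing: in practice each call to post_job would happen inside a separate batch job, once the run it depends on has finished.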

Finally, lemmings does not move or hide log files automatically. By actively limiting such "black magic", it enforces an experience similar to manual re-submission.

Creating a workflow

A super-user creates a workflow by injecting code into some parts of a baseline loop. The default, simplified lemmings job follows this algorithm:

                +-----------+                     +------------+True  
Start---------->|Prepare Run+--->Job submission--->Check on end+----------->Happy
            ^   +-----------+                     +------+-----+             End
            |                                            |
            |                                            |False
            |                                            |
            |                                            |
            +--------------------------------------------+                          

By adding code to the Prepare Run phase (updates of the input file) and to Check on end (when to stop the job), the super-user can customize it to his needs. See the HowTos for an extended explanation.
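The loop in the diagram can be mimicked in a few lines of plain Python. This is a sketch of the control flow only, under stated assumptions: ToyWorkflow, its method names, and the time_per_run value are hypothetical and do not reproduce the lemmings workflow API (see the HowTos for the real hooks). The stopping rule reuses the earlier example of resubmitting until 1 ms of simulated time is reached.

```python
from dataclasses import dataclass


@dataclass
class ToyWorkflow:
    """Hypothetical workflow with the two customizable phases."""

    target_time: float = 1.0e-3   # stop once 1 ms has been simulated
    time_per_run: float = 0.4e-3  # assumed simulated time gained per job
    simulated_time: float = 0.0
    loops: int = 0

    def prepare_run(self):
        """Prepare Run: update the inputs (here, only count the loop)."""
        self.loops += 1

    def submit_job(self):
        """Job submission: pretend one batch job ran to completion."""
        self.simulated_time += self.time_per_run

    def check_on_end(self):
        """Check on end: True means the chain can stop."""
        return self.simulated_time >= self.target_time


def run_loop(workflow, max_loops=10):
    """Prepare, submit, check; repeat while the check returns False."""
    while workflow.loops < max_loops:
        workflow.prepare_run()
        workflow.submit_job()
        if workflow.check_on_end():
            return True   # "Happy End" in the diagram
    return False          # safety stop, never reached the target


workflow = ToyWorkflow()
finished = run_loop(workflow)
print(finished, workflow.loops)
```

The max_loops guard is an addition of this sketch: it keeps an unsupervised chain from resubmitting forever when the end condition is never met.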

Resources

Lemmings documentation can be found following this link: lemmings documentation

Acknowledgements

Lemmings is a service created in the EXCELLERAT Center Of Excellence and is continued as part of the COEC Center Of Excellence. Both projects are funded by the European community.
