Skip to main content

Massively Parallel operations made easy

Project description

Rationale

  • You wrote a program a.out with some parameters

  • You need to explore the space of parameters

Minionize is a solution to spawn a legion of a.out in a massively parallel manner. By minionizing your program, its inputs can be taken from various sources (e.g filesystem, pub/sub). Also inputs can be acked or redelivered to another minions.

How does it work

A classical pattern to do the above is to apply the master/worker pattern where a master gives tasks to workers. Workers repeatedly fetch a new task from a queue , run it and report back somewhere its status.

Minionize encapsulates a.out so that it can takes its inputs from a queue.

Currently we support:

  • execo based queue: the queue is stored in a shared file system in your cluster (actually, there’s no master)

  • Google pub/sub based queue: the queue is hosted by Google

Some examples

  • Simplest use: In this case the received params are appended to the minionized program. If you need more control on the params see below.

    • with Execo engine:

      # Create the queue of params
      # You'll have to run this prior to launching your minions (adapt to
      # your need / make a regular script)
      $) python -c "from execo_engine.sweep import ParamSweeper; ParamSweeper('sweeps', sweeps=range(10), save_sweeps=True)"
      
      # start your minions
      $) MINION_ENGINE=execo minionize echo hello
      hello 0
      hello 1
      hello 2
      hello 3
      hello 4
      hello 5
      hello 6
      hello 7
      hello 8
      hello 9
  • On a OAR cluster (Igrida/Grid5000):

    • Generate the queue for example with Execo

      python -c "from execo_engine.sweep import ParamSweeper; ParamSweeper('sweeps', sweeps=range(1000), save_sweeps=True)"
      • Create your oar scan script:

      #!/usr/bin/env bash
      
      #OAR -n kpd
      #OAR -l nodes=1,walltime=1:0:0
      #OAR -t besteffort
      #OAR -t idempotent
      
      # oarsub --array 10 -S ./oar.sh
      
      set -eux
      
      pip install minionize
      
      minionize echo "hello from $OAR_JOB_ID"
      • Start your minions

      echo "MINION_ENGINE=execo" > .env
      oarsub --array 10 -S ./oar.sh
      • Example of output:

      $) cat OAR.1287856.stdout
      [...]
      hello from 1287856 135
      hello from 1287856 139
      hello from 1287856 143
      hello from 1287856 147
      hello from 1287856 151
      hello from 1287856 155
      hello from 1287856 159
      hello from 1287856 163
      hello from 1287856 167
      [...]
  • Custom parameters handling:

    The params sent to you program can be anything (e.g a python dict). In some cases (many actually), you’ll need to transform these params to something that you program can understand. So you’ll need to minionize your program by writing a custom Callback.

    examples/process.py: gives you a glimpse on writing custom callbacks.

    • use it with Execo engine:

    # generate the queue of task
    python -c "from execo_engine.sweep import ParamSweeper, sweep; ParamSweeper('sweeps', sweeps=sweep({'a': [0, 1], 'b': ['x', 't"]}), save_sweeps=True)"
    
    # start your minions
    MINION_ENGINE=execo python process.py
    • use it with GooglePubSub engine:

    # start your minions
    MINION_ENGINE=google \
    GOOGLE_PROJECT_ID=gleaming-store-288314  \
    GOOGLE_TOPIC_ID=TEST \
    GOOGLE_SUBSCRIPTION=tada \
    GOOGLE_APPLICATION_CREDENTIALS=~/.gcp/gleaming-store-288314-2444b0d20a52.json \
    python process.py

Roadmap

  • Easy integration as docker entrypoint

  • Minionize python function (e.g @minionize decorator)

  • Support new queues (Apache pulsar, Redis stream, RabbitMQ, Kakfa …)

  • Support new abstractions to run container based application (docker, singularity…)

  • Automatic encapsulation using a .minionize.yml

  • Minions statistics

  • Keep in touch (matthieu dot simonin at inria dot fr)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

minionize-0.1.6.tar.gz (11.1 kB view hashes)

Uploaded Source

Built Distribution

minionize-0.1.6-py3-none-any.whl (10.9 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page