Balsam: HPC Workflows & Edge Service

Balsam makes it easy to manage large computational campaigns on a supercomputer. Instead of writing and submitting job scripts to the batch scheduler, you send individual tasks (application runs) to Balsam. The service takes care of reserving compute resources in response to changing workloads. The launcher fetches tasks and executes the workflow on its allocated resources.

Balsam is designed to minimize user "buy-in" and cognitive overhead. You don't have to learn an API or write any glue code to achieve throughput with existing applications. On systems with Balsam installed, it's arguably faster and easier for a beginner to run an ensemble using Balsam than by writing an ensemble job script:

$ balsam app --name SayHello --executable "echo hello,"
$ for i in {1..10}
> do
>  balsam job --name hi$i --workflow test --application SayHello --args "world $i"
> done
$ balsam submit-launch -A Project -q Queue -t 5 -n 2 --job-mode=serial

Highlights

  • Applications require zero modification and run as-is with Balsam
  • Launch MPI applications or pack several non-MPI tasks per node
  • Run apps on bare metal or inside a Singularity container
  • Flexible Python API and command-line interfaces for workflow management
  • Execution is load balanced and resilient to task faults. Errors are automatically recorded in the database for quick lookup and debugging of workflows
  • Scheduled jobs can overlap in time; launchers cooperatively consume work from the same database
  • Multi-user workflow management: collaborators on the same project can add tasks and submit launcher jobs using the same database

The Balsam API enables a variety of scenarios beyond the independent bag-of-tasks:

  • Add task dependencies to form DAGs
  • Program dynamic workflows: some tasks spawn or kill other tasks at runtime
  • Remotely submit workflows, track their progress, and coordinate data movement tasks

Read the Balsam Documentation online at balsam.readthedocs.io!

Existing site-wide installations

Balsam is deployed in a public location at the following sites. On these systems, it's not necessary to install Balsam yourself:

Location   System   Command
ALCF       Theta    module load balsam

Installation

Prerequisites

Balsam requires Python 3.6 or later. Preferably, set up an isolated virtualenv or conda environment for Balsam. It's no problem if some applications in your workflow run in different Python environments. You will need setuptools 39.2 or newer:

$ pip install --upgrade pip setuptools
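Before installing, it can help to confirm that the interpreter in your environment meets the stated minimum. The following is a minimal sketch, not part of Balsam: the helper name python_at_least is hypothetical, and it assumes the interpreter is invoked as python3.

```shell
# Succeed if the given interpreter is at least version MAJOR.MINOR.
# $1: interpreter name, $2: major version, $3: minor version
python_at_least() {
    "$1" -c 'import sys; sys.exit(0 if sys.version_info[:2] >= (int(sys.argv[1]), int(sys.argv[2])) else 1)' "$2" "$3" 2>/dev/null
}

# Balsam states a minimum of Python 3.6.
if python_at_least python3 3 6; then
    echo "python3 is new enough for Balsam"
else
    echo "python3 is missing or older than 3.6" >&2
fi

# Print the installed setuptools version, if setuptools is importable.
python3 -c 'import setuptools; print("setuptools", setuptools.__version__)' 2>/dev/null \
    || echo "setuptools is not importable in this environment"
```

Run this inside the virtualenv or conda environment you intend to use for Balsam, since the check only describes whichever python3 is first on your PATH.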

Some Balsam components require mpi4py, so it is best to install Balsam in an environment with mpi4py already in place and configured for your platform. At a minimum, a working MPI implementation and the mpicc compiler wrapper should be in the search path, in which case the mpi4py dependency will build and install automatically.

Cython is also used to compile some CPU-intensive portions of the Balsam service. While the Cython dependency will also be installed if it's absent, it is preferable to have an existing version built with your platform-tuned compiler wrappers.
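To see which of these pieces are already in place, a quick check along the following lines may be useful. This is a sketch under assumptions: the helper names have_cmd and py_has_module are hypothetical, and the interpreter is assumed to be python3.

```shell
# Succeed if a command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Succeed if the interpreter can locate a module (without importing it).
# $1: interpreter name, $2: module name
py_has_module() {
    "$1" -c 'import importlib.util, sys; sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$2" 2>/dev/null
}

if have_cmd mpicc; then
    echo "mpicc: found on PATH"
else
    echo "mpicc: not found on PATH"
fi

for mod in mpi4py Cython; do
    if py_has_module python3 "$mod"; then
        echo "$mod: already installed"
    else
        echo "$mod: not installed; it will be built during installation"
    fi
done
```

If mpi4py or Cython is reported as missing, consider installing a platform-tuned build first, as recommended above, rather than relying on the automatic build.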

Finally, Balsam requires PostgreSQL version 9.6.4 or newer to be installed. You can verify that PostgreSQL is in the search PATH and the version is up-to-date with:

$ pg_ctl --version
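The version comparison can also be scripted. The sketch below assumes that pg_ctl prints a line of the form "pg_ctl (PostgreSQL) 9.6.4" with the version as the last token (some distributions append extra text, which this does not handle), and that your sort supports the -V version-sort option, as GNU coreutils does. The helper names version_ge and pg_version_from are hypothetical.

```shell
# Succeed when version $1 is greater than or equal to version $2.
# The smaller of the two sorts first, so $1 >= $2 exactly when $2 comes out on top.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Extract the trailing version token from a line such as
# "pg_ctl (PostgreSQL) 9.6.4".
pg_version_from() {
    printf '%s\n' "$1" | awk '{print $NF}'
}

required="9.6.4"
if command -v pg_ctl >/dev/null 2>&1; then
    found="$(pg_version_from "$(pg_ctl --version)")"
    if version_ge "$found" "$required"; then
        echo "PostgreSQL $found satisfies >= $required"
    else
        echo "PostgreSQL $found is older than $required" >&2
    fi
else
    echo "pg_ctl not found on PATH" >&2
fi
```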

If you don't already have the PostgreSQL binaries, they are easy to obtain. Adding the PostgreSQL bin/ directory to your search PATH should be enough to use Balsam, with no need to involve a system administrator.
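As an illustration, the PATH change might look like the following in a shell startup file. The location $HOME/pgsql is purely a placeholder for wherever you unpacked PostgreSQL, and prepend_path is a hypothetical helper that avoids adding the same directory twice.

```shell
# PG_HOME is a hypothetical install location; adjust it to your own.
PG_HOME="${PG_HOME:-$HOME/pgsql}"

# Prepend a directory to PATH only if it is not already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
}

prepend_path "$PG_HOME/bin"
export PATH
```

After sourcing this, pg_ctl --version should resolve to the binaries under that directory.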

Quick setup

$ pip install balsam-flow
$ balsam init ~/myWorkflow
$ source balsamactivate myWorkflow

Once a Balsam database is activated, you can use the command line to manage your workflows:

$ balsam app --name SayHello --executable "echo hello,"
$ balsam job --name hi --workflow test --application SayHello --args "World!"
$ balsam submit-launch -A MyProject -q DebugQueue -t 5 -n 1 --job-mode=mpi
$ watch balsam ls   #  follow status in realtime from command-line

Keep reading the Balsam Documentation online at balsam.readthedocs.io!

Citing Balsam

If you are referencing Balsam in a publication, please cite the following paper:

  • M. Salim, T. Uram, J.T. Childers, P. Balaprakash, V. Vishwanath, M. Papka. Balsam: Automated Scheduling and Execution of Dynamic, Data-Intensive HPC Workflows. In Proceedings of the 8th Workshop on Python for High-Performance and Scientific Computing. ACM Press, 2018.

BSD 3-Clause License

Copyright (c) 2019, UChicago Argonne LLC
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
  • Neither the name of UChicago Argonne LLC nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
