Execute Python functions on a remote supercomputer

FirecrestExecutor: A supercomputer at your pythonic fingertips

Execute Python functions on a remote supercomputer. Sounds difficult? It is not!

The package builds on FirecREST, a lightweight REST API for accessing HPC resources, and on the associated pyFirecREST Python library. It abstracts these resources in the form of a standard Python executor. Creating the executor requires credentials and information about the remote supercomputer. Once it exists, you can submit Python functions to be executed remotely. The executor transparently takes care of the details, such as creating a job script, submitting it to the scheduler, and waiting for the results.

A simple example

from firecrest_executor import FirecrestExecutor
import logging


# functions to be executed remotely
def square_number(x):
    from math import pow

    return pow(x, 2)


def report_hostname():
    import subprocess
    import os

    cluster = os.environ.get("CLUSTER_NAME", "unknown")
    hostname = subprocess.check_output(["hostname"], text=True).strip()
    return f"Success testing execution on {hostname} of {cluster}!"


print("Executing remotely, stay tuned...")

# Use the executor to remotely execute (asynchronously) the functions defined above.
with FirecrestExecutor(
    working_dir="/users/vjoost/FirecrestExecutorDemo/",
    sbatch_options=[
        "--job-name=FirecrestExecutor",
        "--time=00:10:00",
        "--nodes=1",
        "--partition=normal",
    ],
    srun_options=["--environment=/users/vjoost/FirecrestExecutorDemo/demo.toml"],
    sleep_interval=5,
    logger_level=logging.ERROR,
) as executor:
    # A quick test to see if the executor is working
    print(executor.submit(report_hostname).result())
    # and let's compute the square of some numbers
    numbers = range(2, 5)
    print("Let's compute squares of 2..4: ", list(executor.map(square_number, numbers)))

This results in the expected output:

$ python simple.py 
Executing remotely, stay tuned...
Success testing execution on nid005463 of clariden!
Let's compute squares of 2..4:  [4.0, 9.0, 16.0]
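
Because the executor is described as a standard Python executor, the usual concurrent.futures patterns should carry over. The sketch below uses a local ThreadPoolExecutor as a stand-in so that it runs anywhere. Swapping in FirecrestExecutor assumes that it follows the concurrent.futures.Executor interface and returns regular futures, which the example above suggests (submit, map, context manager) but which is not verified here. The helper name squares_as_completed is made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def square_number(x):
    from math import pow

    return pow(x, 2)


def squares_as_completed(executor, numbers):
    """Submit one task per number and collect results as they finish."""
    futures = {executor.submit(square_number, n): n for n in numbers}
    results = {}
    for future in as_completed(futures):
        results[futures[future]] = future.result()
    return results


# Local stand-in for the remote executor (assumption: same interface).
with ThreadPoolExecutor(max_workers=2) as executor:
    results = squares_as_completed(executor, range(2, 5))
    print(results)
```

Collecting results as they complete, rather than in submission order, can help when remote jobs spend different amounts of time in the scheduler queue.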

Getting started

Install the firecrest-executor

Directly from PyPI:

pip install firecrest-executor

or clone from GitHub:

git clone https://github.com/vondele/firecrest-executor.git
cd firecrest-executor
pip install -e ".[examples]"

Enable FirecREST

This package requires access to a supercomputer that supports the FirecREST API. Ask your HPC support team if you are unsure. At the Swiss National Supercomputing Centre (CSCS), this is the case for all clusters; see their documentation. Follow the process to obtain the clients, tokens, and credentials needed to access the system through the FirecREST API, and define the following environment variables:

        - FIRECREST_CLIENT_ID
        - FIRECREST_CLIENT_SECRET
        - AUTH_TOKEN_URL
        - FIRECREST_URL
        - FIRECREST_SYSTEM
        - FIRECREST_ACCOUNT
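
A quick check before creating the executor can catch a missing setting early. The following is a minimal sketch that only inspects the variables listed above. The helper name missing_variables is made up for illustration and is not part of the package.

```python
import os

# The variable names are taken from the list above.
REQUIRED_VARIABLES = (
    "FIRECREST_CLIENT_ID",
    "FIRECREST_CLIENT_SECRET",
    "AUTH_TOKEN_URL",
    "FIRECREST_URL",
    "FIRECREST_SYSTEM",
    "FIRECREST_ACCOUNT",
)


def missing_variables(environ=None):
    """Return the required variable names that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARIABLES if not environ.get(name)]


missing = missing_variables()
if missing:
    print("Missing FirecREST settings:", ", ".join(missing))
```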

Ensure consistent environments

This package requires the same version of Python and of its dependencies to be available locally and remotely. This applies in particular to dill, the package used to serialize Python functions and variables. Containers to the rescue (or manage your environment carefully with any other tool)! At CSCS, the container engine accepts the --environment=foo.toml flag to srun, which starts commands in a container with the specified settings. Hence, use an equivalent container locally and remotely. The container image should also contain the Python packages needed by any of the functions you want to execute remotely.
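
One way to compare the two environments is to run the same small function locally and remotely and compare the results. This is a sketch under the assumption that a mismatch shows up in the Python or dill version. The function name environment_fingerprint is hypothetical, and the imports sit inside the function so that it stays self-contained when shipped to the remote side.

```python
def environment_fingerprint(packages=("dill",)):
    """Collect the Python version and the versions of selected packages."""
    import sys
    from importlib import metadata

    fingerprint = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        try:
            fingerprint[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            fingerprint[name] = None  # package is not installed here
    return fingerprint


# Local fingerprint; the remote one would come from submitting the same function.
print(environment_fingerprint())
```

Submitting environment_fingerprint through the executor and comparing the returned dictionary with the local one should reveal version differences, provided the versions are close enough for the call itself to succeed.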

Code with remote execution in mind

Python functions that will be executed remotely should be serializable with dill and executable without extra context in a fresh Python shell. Avoid access to global variables (in particular things like locks). Explicitly import all modules and functions used in the remote function inside that function.
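
To make the advice concrete, the sketch below contrasts a function that relies on module-level state with a self-contained variant, and adds a small heuristic that lists the global names a function looks up. The helper globals_needed is hypothetical and not part of the package. It only inspects the function's own bytecode, so treat it as a rough check and not as a guarantee of serializability.

```python
import builtins
import dis
import math

SCALE = 3


def fragile(x):
    # Relies on the module-level names SCALE and math.
    return SCALE * math.sqrt(x)


def self_contained(x, scale=3):
    # Everything needed is passed in or imported inside the function.
    import math

    return scale * math.sqrt(x)


def globals_needed(fn):
    """List non-builtin names that fn loads from its module globals."""
    names = {
        instruction.argval
        for instruction in dis.get_instructions(fn)
        if instruction.opname == "LOAD_GLOBAL"
    }
    return sorted(name for name in names if not hasattr(builtins, name))


print(globals_needed(fragile))
print(globals_needed(self_contained))
```

A function for which the helper reports no names is a better candidate for remote execution than one that depends on module-level state.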

Currently, every function call creates a new job and allocates at least a full node, so the executor is not suitable for very small tasks. Starting a job adds an overhead of at least a few seconds to minutes. This approach is thus best suited to computationally intensive tasks.
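
Since each call becomes its own job, many small inputs are better grouped so that one call processes a whole batch. The following is a minimal sketch of that idea. The helpers chunked and square_chunk are made up for illustration, and a suitable chunk size depends on the workload.

```python
def chunked(items, chunk_size):
    """Split items into consecutive lists of at most chunk_size elements."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    items = list(items)
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def square_chunk(numbers):
    """Process a whole batch in one call, that is, in one remote job."""
    return [n * n for n in numbers]


batches = chunked(range(10), 4)
print(batches)
print([square_chunk(batch) for batch in batches])
```

With an executor in place, something like executor.map(square_chunk, chunked(numbers, 250)) would then submit one job per batch instead of one job per number.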

The function arguments and return values are serialized and passed between the systems. Currently, they cannot exceed a few hundred kB. The functions cannot rely on terminal input, and their output is stored remotely (typically in slurm-<jobid>.out). If you need the output, wrap the function so that it captures the output and returns it.
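
Wrapping a function to capture and return its output could look like the sketch below. The wrapper name run_and_capture is hypothetical. It only captures text written through Python's sys.stdout, not output from subprocesses, and it assumes that the wrapped function can itself be serialized as an argument.

```python
def noisy_square(x):
    print(f"squaring {x}")
    return x * x


def run_and_capture(fn, *args, **kwargs):
    """Call fn and return its result together with what it printed."""
    import contextlib
    import io

    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = fn(*args, **kwargs)
    return result, buffer.getvalue()


result, output = run_and_capture(noisy_square, 3)
print(result, repr(output))
```

Keep in mind that the captured text counts toward the size limit on return values, so very chatty functions may need their output truncated.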

Advanced usage and caveats

Containers allow for mounting filesystems, so these Python functions can load and save large persistent datasets. The FirecREST API allows for uploading and downloading data.
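
In practice this suggests passing paths instead of data: the function reads and writes files on the remote filesystem and returns only a small summary. The sketch below illustrates the pattern with a temporary directory standing in for a mounted filesystem. The function count_lines and all paths are made up for illustration.

```python
import os
import tempfile


def count_lines(input_path, output_path):
    """Read a potentially large file, store the result, return a small summary."""
    lines = 0
    with open(input_path, "r", encoding="utf-8") as source:
        for _ in source:
            lines += 1
    with open(output_path, "w", encoding="utf-8") as target:
        target.write(f"{lines}\n")
    return {"lines": lines, "output": output_path}


# Local demonstration; remotely, the paths would point into a mounted filesystem.
with tempfile.TemporaryDirectory() as workdir:
    data_path = os.path.join(workdir, "data.txt")
    with open(data_path, "w", encoding="utf-8") as handle:
        handle.write("a\nb\nc\n")
    summary = count_lines(data_path, os.path.join(workdir, "count.txt"))
    print(summary["lines"])
```

Only the two path strings and the small dictionary cross the API, which keeps the payload well below the size limit.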

The code prioritizes returning a result, even if, for example, a node failure causes a job to fail. In such cases, the job is transparently resubmitted and the result is returned when available. If the job modifies state on the remote system, this might lead to unexpected results, because the function may run more than once. Similarly, the executor handles failures of the API by retrying the call after sleeping for a few seconds.
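
Because a resubmitted job runs the function again, functions that modify remote state are safest when repeating them is harmless. The following is one possible sketch of such an idempotent write: it skips work that is already done and publishes the file with an atomic rename. The function name write_result_once and the paths are hypothetical.

```python
import os
import tempfile


def write_result_once(output_path, value):
    """Write value to output_path unless a previous run already did so."""
    import os

    if os.path.exists(output_path):
        return "skipped"
    temporary_path = f"{output_path}.tmp.{os.getpid()}"
    with open(temporary_path, "w", encoding="utf-8") as handle:
        handle.write(str(value))
    # os.replace is atomic on the same filesystem, so readers never see a partial file.
    os.replace(temporary_path, output_path)
    return "written"


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "result.txt")
    first = write_result_once(target, 42)
    second = write_result_once(target, 42)  # simulates a resubmitted job
    print(first, second)
```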

Exceptions in the remotely executed function are currently not propagated to the caller.
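
Until exceptions are propagated, one workaround is to catch them inside the remote function and return them as data. This is a minimal sketch of that idea. The wrapper call_and_report and its return format are made up for illustration and are not part of the package.

```python
def reciprocal(x):
    return 1 / x


def call_and_report(fn, *args, **kwargs):
    """Call fn and return either its value or the formatted traceback."""
    import traceback

    try:
        return {"ok": True, "value": fn(*args, **kwargs), "error": None}
    except Exception:
        return {"ok": False, "value": None, "error": traceback.format_exc()}


print(call_and_report(reciprocal, 4))
print(call_and_report(reciprocal, 0)["ok"])
```

The caller can then inspect the "ok" field and raise or log locally, using the traceback text from the remote side.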

The executor has a few configuration knobs, such as the logging level, the maximum number of concurrent jobs, and the sleep interval between status checks. Passing the environment variables explicitly to the executor bypasses the OS environment variables and makes it possible, for example, to create different executors for different clusters.
