Execute Python functions on a remote supercomputer

FirecrestExecutor: A supercomputer at your pythonic fingertips

Execute Python functions on a remote supercomputer. Sounds difficult? It is not!

The package builds on FirecREST, a lightweight REST API for accessing HPC resources, and on the associated pyFirecREST Python library. It exposes these resources as a standard Python executor. Creating the executor requires credentials and information about the remote supercomputer. Once it exists, you can submit Python functions to be executed remotely. The executor transparently takes care of the details, such as creating a job script, submitting it to the scheduler, and waiting for the results.

A simple example

from firecrest_executor import FirecrestExecutor
import logging


# functions to be executed remotely
def square_number(x):
    from math import pow

    return pow(x, 2)


def report_hostname():
    import subprocess
    import os

    cluster = os.environ.get("CLUSTER_NAME", "unknown")
    hostname = subprocess.check_output(["hostname"], text=True).strip()
    return f"Success testing execution on {hostname} of {cluster}!"


print("Executing remotely, stay tuned...")

# Use the executor to remotely execute (asynchronously) the functions defined above.
with FirecrestExecutor(
    working_dir="/users/vjoost/FirecrestExecutorDemo/",
    sbatch_options=[
        "--job-name=FirecrestExecutor",
        "--time=00:10:00",
        "--nodes=1",
        "--partition=normal",
    ],
    srun_options=["--environment=/users/vjoost/FirecrestExecutorDemo/demo.toml"],
    sleep_interval=5,
    logger_level=logging.ERROR,
) as executor:
    # A quick test to see if the executor is working
    print(executor.submit(report_hostname).result())
    # and let's compute the square of some numbers
    numbers = range(2, 5)
    print("Let's compute squares of 2..4: ", list(executor.map(square_number, numbers)))

This results in the expected output:

$ python simple.py 
Executing remotely, stay tuned...
Success testing execution on nid005463 of clariden!
Let's compute squares of 2..4:  [4.0, 9.0, 16.0]
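
Since the example above only uses the submit, map, and result calls of a standard Python executor, the function logic can be tried out locally before any job is submitted. The following is a minimal sketch that swaps in the standard library's ThreadPoolExecutor as a local stand-in. It checks the logic only and says nothing about serialization or the remote environment.

```python
from concurrent.futures import ThreadPoolExecutor


# Same function as in the example above
def square_number(x):
    from math import pow

    return pow(x, 2)


# Local stand-in for the remote executor: same submit/map calls, no jobs involved
with ThreadPoolExecutor(max_workers=2) as executor:
    first = executor.submit(square_number, 3).result()
    squares = list(executor.map(square_number, range(2, 5)))

print(first, squares)
```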

Getting started

Install the firecrest-executor

Directly from PyPI:

pip install firecrest-executor

or clone from GitHub:

git clone https://github.com/vondele/firecrest-executor.git
cd firecrest-executor
pip install -e .[examples]

Enable FirecREST

This package requires access to a supercomputer that supports the FirecREST API. Inquire with your HPC support team if you are unsure. At the Swiss National Supercomputing Centre (CSCS), this is the case for all clusters; see their documentation. Follow the process described there to obtain the clients, tokens, and credentials needed to access the system through the FirecREST API, then define the following environment variables:

        - FIRECREST_CLIENT_ID
        - FIRECREST_CLIENT_SECRET
        - AUTH_TOKEN_URL
        - FIRECREST_URL
        - FIRECREST_SYSTEM
        - FIRECREST_ACCOUNT
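
Before creating an executor, it can help to verify that all of these variables are set. The following is a minimal sketch. The helper missing_firecrest_variables is hypothetical and not part of the package; only the variable names come from the list above.

```python
import os

# Variable names taken from the list above
REQUIRED_VARIABLES = [
    "FIRECREST_CLIENT_ID",
    "FIRECREST_CLIENT_SECRET",
    "AUTH_TOKEN_URL",
    "FIRECREST_URL",
    "FIRECREST_SYSTEM",
    "FIRECREST_ACCOUNT",
]


def missing_firecrest_variables(environ=None):
    """Return the required variables that are unset or empty (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARIABLES if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_firecrest_variables()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All FirecREST environment variables are set.")
```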

Ensure consistent environments

This package requires the same version of Python and of its dependencies to be available locally and remotely, in particular dill, the package used to serialize Python functions and variables. Containers to the rescue (or manage your environment carefully with any other tool)! At CSCS, the container engine allows passing the --environment=foo.toml flag to srun to start commands in a container with the specified settings. Hence, use an equivalent container locally and remotely. The container image should also contain the Python packages needed by any of the functions you want to execute remotely.
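
One way to compare the two environments is a small function that reports the Python and dill versions of whichever interpreter runs it. This is a sketch and not part of the package; the function name is hypothetical. The remote call is shown only as a comment and assumes an already configured executor.

```python
def environment_fingerprint():
    """Report the Python and dill versions of the interpreter running this function."""
    import platform
    from importlib import metadata

    try:
        dill_version = metadata.version("dill")
    except metadata.PackageNotFoundError:
        dill_version = None  # dill is not installed in this environment
    return {"python": platform.python_version(), "dill": dill_version}


local = environment_fingerprint()
print("local environment:", local)

# With a configured executor (assumption), the remote side could be compared:
# remote = executor.submit(environment_fingerprint).result()
# assert remote == local, f"environment mismatch: {local} vs {remote}"
```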

Code with remote execution in mind

Python functions that will be executed remotely should be serializable with dill and be executable without extra context in a fresh Python shell. Avoid access to global variables (in particular things like locks). Explicitly import all modules and functions that are used in the remote function.
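
The contrast can be illustrated with two variants of the same computation. This is a sketch with hypothetical function names: the first variant leans on module-level state, which the guidance above advises against, while the second receives everything as arguments and imports what it needs inside the function body.

```python
import threading

# Module-level state of the kind the guidance above advises against relying on
_lock = threading.Lock()
_scale = 10


def scaled_sum_fragile(values):
    # Depends on the globals above, so it is a poor candidate for remote execution
    with _lock:
        return _scale * sum(values)


def scaled_sum_portable(values, scale):
    # Everything needed arrives as an argument or is imported inside the function
    from math import fsum

    return scale * fsum(values)


print(scaled_sum_portable([1.0, 2.0, 3.0], 10))
```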

Currently, every function call creates a new job that allocates at least a full node, and starting a job takes at least a few seconds to minutes. The approach is therefore unsuitable for very small tasks and best suited to computationally intensive ones.
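
One way to amortize the per-job overhead is to group many small inputs into chunks and process one chunk per function call. The following is a sketch with hypothetical helper names. The chunks are processed locally as a stand-in, and the remote call appears only as a comment that assumes a configured executor.

```python
def square_chunk(chunk):
    """Process a whole chunk of inputs in one call, hence in one job."""
    from math import pow

    return [pow(x, 2) for x in chunk]


def chunked(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    items = list(items)
    return [items[i:i + size] for i in range(0, len(items), size)]


chunks = chunked(range(10), 4)

# With a configured executor (assumption): results = list(executor.map(square_chunk, chunks))
results = [square_chunk(chunk) for chunk in chunks]  # local stand-in
flat = [value for part in results for value in part]
print(flat)
```

Chunks should stay small enough that the serialized arguments and return values remain within the size limit of the executor.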

The function arguments and return values are serialized and passed between the systems. Currently, these cannot exceed a few hundred kB. The functions cannot rely on terminal input, and their output is stored remotely (typically in slurm-<jobid>.out). If the output is needed, wrap the function so that it captures the output and returns it.
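
A wrapper of this kind can be sketched with the standard library. The helper run_and_capture is hypothetical and not part of the package. It captures only what Python writes to sys.stdout, not the output of subprocesses.

```python
def run_and_capture(func, *args, **kwargs):
    """Run func and return (result, text printed to stdout) (hypothetical helper)."""
    import contextlib
    import io

    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()


def noisy_square(x):
    print(f"squaring {x}")
    return x * x


result, output = run_and_capture(noisy_square, 3)
print(result, repr(output))
```

Submitting the wrapper remotely, e.g. executor.submit(run_and_capture, noisy_square, 3), assumes that submit forwards extra arguments as standard Python executors do and that the wrapped function itself serializes with dill.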

Advanced usage and caveats

Containers allow for mounting filesystems, and as such these Python functions can load and save large persistent datasets. The FirecREST API allows for uploading and downloading data.

The code prioritizes returning a result, even if a job fails, e.g. because of a node failure. In such cases, the job is transparently resubmitted and the result is returned when available. If the job modifies state on the remote system, this might lead to unexpected results. Similarly, the executor handles API failures by retrying the call after sleeping for a few seconds.
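
Functions that modify remote state are safer under resubmission if running them twice has the same effect as running them once. The following sketch shows one such pattern with a hypothetical helper: skip the work if the output already exists, otherwise write to a temporary file and rename it into place atomically. The temporary directory stands in for a remote filesystem.

```python
import os
import tempfile


def write_result_once(path, text):
    """Write text to path unless an earlier attempt already did (hypothetical helper)."""
    import os
    import tempfile

    if os.path.exists(path):
        return "skipped"  # an earlier, possibly resubmitted, run finished the work
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first, then rename it into place atomically
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.replace(tmp_path, path)
    return "written"


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "result.txt")
    first = write_result_once(target, "42")
    second = write_result_once(target, "43")  # simulates a resubmitted job
    with open(target) as handle:
        content = handle.read()

print(first, second, content)
```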

Exceptions in the remotely executed function are currently not propagated to the caller.
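
Until exceptions are propagated, one workaround is to catch them inside the remote function and return them as ordinary values. This is a sketch; the wrapper call_and_report is hypothetical and not part of the package.

```python
def call_and_report(func, *args, **kwargs):
    """Run func and return ("ok", result) or ("error", traceback text)."""
    import traceback

    try:
        return ("ok", func(*args, **kwargs))
    except Exception:
        # Return the traceback as text so it survives serialization
        return ("error", traceback.format_exc())


def reciprocal(x):
    return 1 / x


status, payload = call_and_report(reciprocal, 0)
print(status)
print(payload)
```

The caller can then inspect the status and raise locally if needed.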

The executor has a few knobs for configuration, such as the logging level, the maximum number of concurrent jobs, and the sleep interval between status checks. Passing an explicit environment variable allows for bypassing the OS environment variables, e.g. to create different executors for different clusters.
