Yet another computing scheduler and cloud orchestration engine

Yascheduler is a simple job scheduler designed for submitting scientific calculations and copying back the results from the computing clouds.

Currently it supports several scientific simulation codes in chemistry and solid state physics. Any other scientific simulation code can be supported via the declarative control template system (see the yascheduler.conf settings file). An example dummy C++ code is provided together with its configuration template.

Installation

Use pip and PyPI: pip install yascheduler.

The latest updates and bug fixes can be obtained by cloning the repository:

git clone https://github.com/tilde-lab/yascheduler.git
pip install yascheduler/

The installation procedure creates the configuration file at /etc/yascheduler/yascheduler.conf. The file contains the credentials for PostgreSQL database access, the directories used, the cloud providers, and the scientific simulation codes (called engines). Please check and amend this file with the correct credentials. The database and the system service should then be initialized with the yainit script.

Usage

from yascheduler import Yascheduler

yac = Yascheduler()
label = "test assignment"
engine = "pcrystal"
struct_input = str(...)  # simulation control file: crystal structure
setup_input = str(...)  # simulation control file: main setup, can include struct_input
result = yac.queue_submit_task(
    label, {"fort.34": struct_input, "INPUT": setup_input}, engine
)
print(result)

Or run it directly in the console with yascheduler (use the option -l DEBUG to change the log level).

An example Supervisor configuration:

[program:scheduler]
command=/usr/local/bin/yascheduler
user=root
autostart=true
autorestart=true
stderr_logfile=/data/yascheduler.log
stdout_logfile=/data/yascheduler.log

File paths can be set using the environment variables:

  • YASCHEDULER_CONF_PATH

    Configuration file.

    Default: /etc/yascheduler/yascheduler.conf

  • YASCHEDULER_LOG_PATH

    Log file path.

    Default: /var/log/yascheduler.log

  • YASCHEDULER_PID_PATH

    PID file.

    Default: /var/run/yascheduler.pid
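
As a rough illustration of how these variables and their defaults combine, here is a minimal Python sketch. The helper name resolve_paths is hypothetical and is not part of yascheduler; only the variable names and default values are taken from the list above.

```python
import os

# Defaults as listed in the documentation above.
DEFAULT_PATHS = {
    "YASCHEDULER_CONF_PATH": "/etc/yascheduler/yascheduler.conf",
    "YASCHEDULER_LOG_PATH": "/var/log/yascheduler.log",
    "YASCHEDULER_PID_PATH": "/var/run/yascheduler.pid",
}


def resolve_paths(environ=None):
    """Return the effective path per variable (hypothetical helper, not yascheduler API)."""
    env = os.environ if environ is None else environ
    # An environment value wins; otherwise the documented default is used.
    return {name: env.get(name, default) for name, default in DEFAULT_PATHS.items()}


if __name__ == "__main__":
    for name, path in resolve_paths().items():
        print(f"{name}={path}")
```

Running the script prints the paths that would apply in the current shell, which can help when checking a Supervisor or systemd environment.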

Configuration File Reference

Database Configuration [db]

Connection to a PostgreSQL database.

  • user

    The username to connect to the PostgreSQL server with.

  • password

    The user password to connect to the server with. This parameter is optional.

  • host

    The hostname of the PostgreSQL server to connect with.

  • port

    The TCP/IP port of the PostgreSQL server instance.

    Default: 5432

  • database

    The name of the database instance to connect with.

    Default: Same as user
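
The defaults above can be sketched with Python's standard configparser module. This is only an illustration of the documented rules (port falls back to 5432, database falls back to the user name); the function load_db_settings and the sample values are assumptions, and yascheduler's own loader may work differently.

```python
import configparser

# Illustrative [db] section; the credentials are made up.
SAMPLE = """
[db]
user = yascheduler
password = secret
host = localhost
"""


def load_db_settings(text):
    """Parse a [db] section and apply the documented defaults (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    db = parser["db"]
    user = db["user"]
    return {
        "user": user,
        "password": db.get("password"),  # optional: None when absent
        "host": db["host"],
        "port": db.getint("port", fallback=5432),
        "database": db.get("database", fallback=user),  # defaults to the user name
    }


if __name__ == "__main__":
    print(load_db_settings(SAMPLE))
```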

Local Settings [local]

  • data_dir

    Path to root directory of local data files. Can be relative to the current working directory.

    Default: ./data (but it is always a good idea to set it explicitly!)

    Example: /srv/yadata

  • tasks_dir

    Path to directory with tasks results.

    Default: tasks under data_dir

    Example: %(data_dir)s/tasks

  • keys_dir

    Path to directory with SSH keys. Make sure it only contains the private keys.

    Default: keys under data_dir

    Example: %(data_dir)s/keys

  • engines_dir

    Path to directory with engines repository.

    Default: engines under data_dir

    Example: %(data_dir)s/engines

  • webhook_reqs_limit

    Maximum number of in-flight webhook http requests.

    Default: 5

  • conn_machine_limit

    Maximum number of concurrent SSH connection requests.

    Default: 10

  • conn_machine_pending

    Maximum number of pending SSH connection requests.

    Default: 10

  • allocate_limit

    Maximum number of concurrent task or node allocation requests.

    Default: 20

  • allocate_pending

    Maximum number of pending task or node allocation requests.

    Default: 1

  • consume_limit

    Maximum number of concurrent downloads of task results.

    Default: 20

  • consume_pending

    Maximum number of pending downloads of task results.

    Default: 1

  • deallocate_limit

    Maximum number of concurrent node deallocation requests.

    Default: 5

  • deallocate_pending

    Maximum number of pending node deallocation requests.

    Default: 1
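
The %(data_dir)s placeholders in the examples above have the same form as the basic interpolation syntax of Python's configparser. Assuming the file is read that way, the directory settings could resolve as in the following sketch. The helper load_local_dirs is hypothetical, and the fallback logic simply restates the documented defaults.

```python
import configparser

# Illustrative [local] section using the example values from the list above.
SAMPLE = """
[local]
data_dir = /srv/yadata
tasks_dir = %(data_dir)s/tasks
keys_dir = %(data_dir)s/keys
"""


def load_local_dirs(text):
    """Resolve the directory settings of [local] (hypothetical helper)."""
    parser = configparser.ConfigParser()  # basic interpolation is the default
    parser.read_string(text)
    local = parser["local"]
    data_dir = local.get("data_dir", fallback="./data")
    return {
        "data_dir": data_dir,
        # Each sub-directory defaults to a folder under data_dir.
        "tasks_dir": local.get("tasks_dir", fallback=f"{data_dir}/tasks"),
        "keys_dir": local.get("keys_dir", fallback=f"{data_dir}/keys"),
        "engines_dir": local.get("engines_dir", fallback=f"{data_dir}/engines"),
    }


if __name__ == "__main__":
    for key, value in load_local_dirs(SAMPLE).items():
        print(f"{key} = {value}")
```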

Remote Settings [remote]

  • data_dir

    Path to root directory of data files on remote node. Can be relative to the remote current working directory (usually $HOME).

    Default: ./data

    Example: /src/yadata

  • tasks_dir

    Path to directory with tasks results on remote node.

    Default: tasks under data_dir

    Example: %(data_dir)s/tasks

  • engines_dir

    Path to directory with engines on remote node.

    Default: engines under data_dir

    Example: %(data_dir)s/engines

  • user

    Default ssh username.

    Default: root

  • jump_user

    Username of default SSH jump host (if used).

  • jump_host

    Host of default SSH jump host (if used).

Providers [clouds]

All cloud provider settings are set in the [clouds] group. Each provider has its own settings prefix.

These settings are common to all the providers:

  • *_max_nodes

    The maximum number of nodes for a given provider. The provider is not used if the value is less than 1.

  • *_user

    Per provider override of remote.user.

  • *_priority

    Per provider priority of node allocation. Sorted in descending order, so the cloud with the highest value is the first.

  • *_idle_tolerance

    Per provider idle tolerance (in seconds) for deallocation of nodes.

    Default: varies by provider, starting from 120 seconds.

  • *_jump_user

    Username of this cloud SSH jump host (if used).

  • *_jump_host

    Host of this cloud SSH jump host (if used).
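
To make the *_max_nodes and *_priority rules concrete, the sketch below filters out providers with fewer than one node and sorts the rest by descending priority. It is an illustration only: the function allocation_order, the fallback value of 0 for missing settings, and the sample numbers are assumptions, not yascheduler's actual implementation.

```python
import configparser

# Illustrative [clouds] section; the numbers are made up.
SAMPLE = """
[clouds]
hetzner_max_nodes = 2
hetzner_priority = 1
az_max_nodes = 0
az_priority = 5
upcloud_max_nodes = 3
upcloud_priority = 10
"""

# Settings prefixes documented for the supported providers.
PREFIXES = ("hetzner", "az", "upcloud")


def allocation_order(text):
    """Return enabled provider prefixes, highest priority first (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    clouds = parser["clouds"]
    enabled = []
    for prefix in PREFIXES:
        # Assumption: a missing *_max_nodes is treated as 0 (provider unused).
        if clouds.getint(f"{prefix}_max_nodes", fallback=0) < 1:
            continue
        # Assumption: a missing *_priority is treated as 0.
        enabled.append((prefix, clouds.getint(f"{prefix}_priority", fallback=0)))
    enabled.sort(key=lambda item: item[1], reverse=True)
    return [prefix for prefix, _ in enabled]


if __name__ == "__main__":
    print(allocation_order(SAMPLE))
```

With the sample above, az is skipped because its node limit is below 1, and upcloud comes before hetzner because of its higher priority.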

Hetzner

Settings prefix is hetzner.

  • hetzner_token

    API token with Read & Write permissions for the project.

  • hetzner_server_type

    Server type (size).

    Default: cx52

  • hetzner_image_name

    Image name for new nodes.

    Default: debian-11

Azure

Azure Cloud should be pre-configured for yascheduler. See Cloud Providers.

Settings prefix is az.

  • az_tenant_id

    Tenant ID of Azure Active Directory.

  • az_client_id

    Application ID.

  • az_client_secret

    Client Secret value from the Application Registration.

  • az_subscription_id

    Subscription ID.

  • az_resource_group

    Resource Group name.

    Default: yascheduler-rg

  • az_user

    SSH username. root is not supported.

  • az_location

    Default location for resources.

    Default: westeurope

  • az_vnet

    Virtual network name.

    Default: yascheduler-vnet

  • az_subnet

    Subnet name.

    Default: yascheduler-subnet

  • az_nsg

    Network security group name.

    Default: yascheduler-nsg

  • az_vm_image

    OS image name.

    Default: Debian

  • az_vm_size

    Machine size.

    Default: Standard_B1s

UpCloud

Settings prefix is upcloud.

  • upcloud_login

    Username.

  • upcloud_password

    Password.

Engines [engine.*]

Supported engines should be defined in the section(s) [engine.name]. The name is an alphanumeric string that represents the real engine name. Once set, it cannot be changed later.

  • platforms

    List of supported platforms, separated by space or newline.

    Default: debian-10

    Example: mY-cOoL-OS another-cool-os

  • platform_packages

    A list of required packages, separated by space or newline, which will be installed by the system package manager.

    Default: []

    Example: openmpi-bin wget

  • deploy_local_files

    A list of filenames, separated by space or newline, which will be copied from local %(engines_dir)s/%(engine_name)s to remote %(engines_dir)s/%(engine_name)s. Conflicts with deploy_local_archive and deploy_remote_archive.

    Example: dummyengine

  • deploy_local_archive

    The name of a local archive (.tar.gz) which will be copied from local %(engines_dir)s/%(engine_name)s to the remote machine and then unpacked to %(engines_dir)s/%(engine_name)s. Conflicts with deploy_local_files and deploy_remote_archive.

    Example: dummyengine.tar.gz

  • deploy_remote_archive

    The URL of the engine archive (.tar.gz) which will be downloaded to the remote machine and then unpacked to %(engines_dir)s/%(engine_name)s. Conflicts with deploy_local_files and deploy_local_archive.

    Example: https://example.org/dummyengine.tar.gz

  • spawn

    This command is used by the scheduler to initiate calculations.

    cp {task_path}/INPUT OUTPUT && mpirun -np {ncpus} --allow-run-as-root \
      -wd {task_path} {engine_path}/Pcrystal >> OUTPUT 2>&1
    

    Example: {engine_path}/gulp < INPUT > OUTPUT

  • check_pname

    Process name used to check that the task is still running. Conflicts with check_cmd.

    Example: dummyengine

  • check_cmd

    Command used to check that the task is still running. Conflicts with check_pname. See also check_cmd_code.

    Example: ps ax -ocomm= | grep -q dummyengine

  • check_cmd_code

    Expected exit code of the command from check_cmd. If the code matches, the task is considered to be running.

    Default: 0

  • sleep_interval

    Interval in seconds between the task checks. Set to a higher value if you are expecting long running jobs.

    Default: 10

  • input_files

    A list of task input file names, separated by a space or new line, that will be copied to the remote directory of the task before it is started. The first input is considered as the main input.

    Example: INPUT sibling.file

  • output_files

    A list of task output file names, separated by a space or new line, that will be copied from the remote directory of the task after it is finished.

    Example: INPUT OUTPUT
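
The engine options above can be sanity-checked with a short Python sketch. It assumes that the placeholders in spawn ({task_path}, {engine_path}, {ncpus}) behave like Python str.format fields, which the source does not confirm. The helpers load_engine, validate_engine and render_spawn, the section name engine.dummy, and the paths are all hypothetical.

```python
import configparser

# Illustrative engine section assembled from the example values above.
SAMPLE = """
[engine.dummy]
deploy_local_files = dummyengine
spawn = {engine_path}/dummyengine < INPUT > OUTPUT
check_pname = dummyengine
input_files = INPUT sibling.file
output_files = INPUT OUTPUT
"""

# Groups of options documented as mutually exclusive.
DEPLOY_KEYS = ("deploy_local_files", "deploy_local_archive", "deploy_remote_archive")
CHECK_KEYS = ("check_pname", "check_cmd")


def load_engine(text, name):
    """Return the [engine.<name>] section (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser[f"engine.{name}"]


def validate_engine(section):
    """List conflicts between mutually exclusive options."""
    errors = []
    for group in (DEPLOY_KEYS, CHECK_KEYS):
        present = [key for key in group if key in section]
        if len(present) > 1:
            errors.append("conflicting options: " + ", ".join(present))
    return errors


def render_spawn(template, task_path, engine_path, ncpus):
    """Fill the spawn placeholders, assuming str.format-style fields."""
    return template.format(task_path=task_path, engine_path=engine_path, ncpus=ncpus)


if __name__ == "__main__":
    engine = load_engine(SAMPLE, "dummy")
    print(validate_engine(engine))
    print(render_spawn(engine["spawn"], "/data/tasks/1", "/data/engines/dummy", 4))
    print(engine["input_files"].split())  # first entry is the main input
```

Splitting input_files and output_files on whitespace covers both separators mentioned above, spaces and newlines.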

AiiDA Integration

See the detailed instructions for the MPDS-AiiDA-CRYSTAL workflows as well as the ansible-mpds repository. In essence:

ssh aiidauser@localhost # important
reentry scan
verdi computer setup
verdi computer test $COMPUTER
verdi code setup

License

FOSSA Status
