Autoscaling GitHub Actions Runners Using Hetzner Cloud

The github-runners service program monitors queued jobs for GitHub Actions workflows. When a new job is queued, it creates a new Hetzner Cloud server instance that provides an ephemeral GitHub Actions runner. Each server instance is automatically powered off when its job completes, and powered off servers are then automatically deleted. Both x64 and ARM64 runners are supported.

❗Warning:

This program is provided on an “AS IS” basis without warranties or conditions of any kind. See LICENSE. Use it at your own risk. Manual monitoring is required to make sure server instances are cleaned up properly and costs are kept under control.

Costs depend on the server type, number of jobs and execution time. For each job a new server instance is created to avoid any cleanup. Server instances are not shared between any jobs.

✋ Note:

Currently, Hetzner Cloud server instances are billed on an hourly basis, so a job that takes 1 minute is billed the same as a job that takes 59 minutes. Therefore, the minimal cost for any job is the cost of the server for 1 hour plus the cost of one public IPv4 address.
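The hourly billing rule can be sketched as a small calculation. This is only an illustration: the function name is hypothetical, the prices are placeholders rather than real Hetzner Cloud rates, and it assumes the IPv4 address is charged per billed hour in the same way as the server.

```python
import math


def minimum_job_cost(job_minutes, server_price_per_hour, ipv4_price_per_hour):
    """Estimate the cost of one job when every started hour is billed in full."""
    # Every started hour counts as a full hour, and at least one hour is billed.
    billed_hours = max(1, math.ceil(job_minutes / 60))
    # Assumption: the public IPv4 address is billed per hour like the server.
    return billed_hours * (server_price_per_hour + ipv4_price_per_hour)


# Placeholder prices for illustration only; check the current Hetzner Cloud pricing.
SERVER_PRICE = 10
IPV4_PRICE = 1

for minutes in (1, 59, 61):
    print(minutes, "min ->", minimum_job_cost(minutes, SERVER_PRICE, IPV4_PRICE))
```

With these placeholder numbers, the 1 minute and 59 minute jobs cost the same, while the 61 minute job is billed for two hours.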

Features

  • cost efficient on-demand runners using Hetzner Cloud

  • supports both x64 and ARM64 runners

  • supports specifying custom runner types using job labels

  • simple configuration

  • self-contained and can deploy and manage itself on a cloud instance

Installation

pip3 install testflows.github.runners

Quick Start

Set the environment variables corresponding to your GitHub repository and Hetzner Cloud project:

export GITHUB_TOKEN=ghp_...
export GITHUB_REPOSITORY=vzakaznikov/github-runners
export HETZNER_TOKEN=GJzdc...

and then start the github-runners program:

github-runners
07/22/2023 08:20:37 PM   INFO MainThread            main 🍀 Logging in to Hetzner Cloud
07/22/2023 08:20:37 PM   INFO MainThread            main 🍀 Logging in to GitHub
07/22/2023 08:20:37 PM   INFO MainThread            main 🍀 Getting repository vzakaznikov/github-runners
07/22/2023 08:20:37 PM   INFO MainThread            main 🍀 Creating scale up service
07/22/2023 08:20:37 PM   INFO MainThread            main 🍀 Creating scale down service
07/22/2023 08:20:38 PM   INFO   worker_2   create_server 🍀 Create server
...

or you can pass the required options inline as follows:

github-runners --github-token <GITHUB_TOKEN> --github-repository <GITHUB_REPOSITORY> --hetzner-token <HETZNER_TOKEN>

Installation From Sources

For development, you can install from sources as follows:

git clone https://github.com/testflows/Github-Runners.git
cd Github-Runners
./package && ./install

Basic Configuration

By default, the program uses the following environment variables:

  • GITHUB_TOKEN

  • GITHUB_REPOSITORY

  • HETZNER_TOKEN

or you can specify these values using the following options:

  • --github-token

  • --github-repository

  • --hetzner-token
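The fallback from options to environment variables can be illustrated with a short argparse sketch. It is not the program's actual implementation: the parse_options helper and its env parameter are hypothetical, and only the three option names and variable names come from the lists above.

```python
import argparse
import os


def parse_options(argv, env=None):
    """Resolve the required settings, preferring options over environment variables."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(prog="github-runners")
    # Each option defaults to the value of its environment variable, if set.
    parser.add_argument("--github-token", default=env.get("GITHUB_TOKEN"))
    parser.add_argument("--github-repository", default=env.get("GITHUB_REPOSITORY"))
    parser.add_argument("--hetzner-token", default=env.get("HETZNER_TOKEN"))
    return parser.parse_args(argv)


# Example: the repository comes from the environment, the tokens from options.
options = parse_options(
    ["--github-token", "ghp_example", "--hetzner-token", "hz_example"],
    env={"GITHUB_REPOSITORY": "testflows/github-runners"},
)
print(options.github_repository)
```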

Specifying Runner Type

x64 Runners

The default server type is cx11 which is an Intel, 1 vCPU, 2GB RAM shared-cpu x64 instance.

✋ Note:

You can use the --default-type option to set a different default server type.

You can specify a different x64 server instance type by using the type-{name} runner label. The {name} must be a valid Hetzner Cloud server type name such as cx11, cpx21, etc.

For example, to use an AMD, 3 vCPU, 4GB RAM shared-cpu x64 instance, you can define runs-on as follows:

job-name:
   runs-on: [self-hosted, type-cpx21]

ARM64 Runners

By default, the server type is cx11, which is an Intel, 1 vCPU, 2GB RAM shared-cpu x64 instance. Therefore, in order to use ARM64 runners, you must specify an ARM64 server instance type by using the type-{name} runner label. The {name} must be a valid ARM64 Hetzner Cloud server type name such as cax11 or cax21, which correspond to the Ampere Altra 2 vCPU, 4GB RAM and 4 vCPU, 8GB RAM shared-cpu ARM64 instances, respectively.

For example, to use an Ampere Altra, 4 vCPU, 8GB RAM shared-cpu ARM64 instance, you must define runs-on as follows:

job-name:
   runs-on: [self-hosted, type-cax21]

Specifying Runner Location

By default, the location of the server where the runner will be running is not specified. You can use the --default-location option to force a specific default server location.

You can also use the in-{name} runner label to specify the server location for a specific job, where {name} must be a valid Hetzner Cloud location name such as ash for US, Ashburn, VA or fsn1 for Germany, Falkenstein.

For example,

job-name:
   runs-on: [self-hosted, type-cx11, in-ash]

Specifying Runner Image

By default, the image of the server for the runner is ubuntu-22.04. You can use the --default-image option to force a specific default server image.

You can also use the image-{name} runner label to specify the server image for a specific job, where {name} must be a valid Hetzner Cloud image such as ubuntu-22.04 or ubuntu-20.04.

For example,

job-name:
   runs-on: [self-hosted, type-cx11, in-ash, image-ubuntu-20.04]
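The type-{name}, in-{name}, and image-{name} labels can be read as overrides of the corresponding defaults. The sketch below shows one way such labels could be turned into server settings; the server_settings function is hypothetical and is not taken from the program's source.

```python
def server_settings(labels, default_type="cx11", default_location=None,
                    default_image="ubuntu-22.04"):
    """Derive the server type, location, and image from a job's runner labels."""
    settings = {
        "type": default_type,
        "location": default_location,
        "image": default_image,
    }
    # Label prefix -> setting it overrides.
    prefixes = {"type-": "type", "in-": "location", "image-": "image"}
    for label in labels:
        for prefix, key in prefixes.items():
            if label.startswith(prefix):
                # Everything after the prefix is the Hetzner Cloud name.
                settings[key] = label[len(prefix):]
    return settings


print(server_settings(["self-hosted", "type-cx11", "in-ash", "image-ubuntu-20.04"]))
print(server_settings(["self-hosted", "type-cax21"]))
```

Labels without a recognized prefix, such as self-hosted, are ignored, so a job that sets only type-cax21 still gets the default image and no fixed location.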

SSH Key

All server instances that are created are accessed via SSH using the ssh utility, and therefore you must provide a valid SSH key using the --ssh-key option. If the --ssh-key option is not specified, then the default key path ~/.ssh/id_rsa.pub will be used.

The SSH key will be automatically added to your project using the MD5 hash of the public key as the SSH key name.
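As a rough illustration of naming a key by its MD5 hash, the sketch below hashes the text of a public key. Exactly which bytes the program hashes is not stated here, so hashing the stripped file text is an assumption, and ssh_key_name is a hypothetical helper.

```python
import hashlib


def ssh_key_name(public_key_text):
    """Return an MD5 hex digest that could serve as the SSH key name."""
    # Assumption: the hash is taken over the stripped text of the public key file.
    return hashlib.md5(public_key_text.strip().encode("utf-8")).hexdigest()


# Made-up key material, for illustration only.
example_key = "ssh-rsa AAAAB3NzaC1yc2EXAMPLE user@host\n"
print(ssh_key_name(example_key))
```

Because the name is derived from the key itself, uploading the same key twice maps to the same name, while each distinct key adds a new entry to the project.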

❗Warning:

Given that each new SSH key is automatically added to your Hetzner project, you must manually delete them when no longer needed.

Most GitHub users already have an SSH key associated with their account. If you want to know how to add an SSH key, see the Adding a new SSH key to your GitHub account article.

Generating New SSH Key

If you need to generate a new SSH key, see the Generating a new SSH key and adding it to the ssh-agent article.

Cloud Deployment

If you are deploying the github-runners program as a cloud service using the github-runners <options> cloud deploy command, then after provisioning a new cloud server instance that will host the github-runners service, a new SSH key will be auto-generated to access the runners. The auto-generated key will be placed in /home/runner/.ssh/id_rsa, where runner is the user under which the github-runners service runs on the cloud instance. The auto-generated SSH key will be automatically added to your project using the MD5 hash of the public key as the SSH key name.

Running as a Service

You can run github-runners as a service.

✋ Note:

In order to install the service, the user that installed the module must have sudo privileges.

Installing and Uninstalling

After installation, you can use service install and service uninstall commands to install and uninstall the service.

✋ Note:

The options that are passed to the github-runners <options> service install command will be the same options with which the service will be executed.

export GITHUB_TOKEN=ghp_...
export GITHUB_REPOSITORY=testflows/github-runners
export HETZNER_TOKEN=GJzdc...

github-runners service install

The /etc/systemd/system/github-runners.service file is created with the following content.

✋ Note:

The service will use the User and the Group of the user executing the program.

/etc/systemd/system/github-runners.service:
[Unit]
Description=Autoscaling GitHub Actions Runners
After=multi-user.target
[Service]
User=1000
Group=1000
Type=simple
Restart=always
Environment=GITHUB_TOKEN=ghp_...
Environment=GITHUB_REPOSITORY=testflows/github-runners
Environment=HETZNER_TOKEN=GJ..
ExecStart=/home/user/.local/lib/python3.10/site-packages/testflows/github/runners/bin/github-runners --workers 10 --max-powered-off-time 20 --max-idle-runner-time 120 --max-runner-registration-time 60 --scale-up-interval 10 --scale-down-interval 10
[Install]
WantedBy=multi-user.target

Modifying Program Options

If you want to modify the service program options, you can stop the service, edit the /etc/systemd/system/github-runners.service file by hand, then reload the systemd daemon and start the service back up.

github-runners service stop
sudo vim /etc/systemd/system/github-runners.service
sudo systemctl daemon-reload
github-runners service start

To uninstall the service, use the service uninstall command:

github-runners service uninstall

Checking Status

After installation, you can check the status of the service using the service status command.

github-runners service status
service status:
● github-runners.service - Autoscaling GitHub Actions Runners
     Loaded: loaded (/etc/systemd/system/github-runners.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2023-07-24 14:38:33 EDT; 1h 31min ago
   Main PID: 66188 (python3)
      Tasks: 3 (limit: 37566)
     Memory: 28.8M
        CPU: 8.274s
     CGroup: /system.slice/github-runners.service
             └─66188 python3 /usr/local/bin/github-runners --workers 10 --max-powered-off-time 20 --max-idle-runner-time 120 --max->

Jul 24 14:38:33 user-node systemd[1]: Started Autoscaling GitHub Actions Runners.
Jul 24 14:38:33 user-node github-runners[66188]: 07/24/2023 02:38:33 PM   INFO MainThread            main 🍀 Logging in to Hetzner >
Jul 24 14:38:33 user-node github-runners[66188]: 07/24/2023 02:38:33 PM   INFO MainThread            main 🍀 Logging in to GitHub
Jul 24 14:38:33 user-node github-runners[66188]: 07/24/2023 02:38:33 PM   INFO MainThread            main 🍀 Getting repository vza>
Jul 24 14:38:33 user-node github-runners[66188]: 07/24/2023 02:38:33 PM   INFO MainThread            main 🍀 Creating scale up serv>
Jul 24 14:38:33 user-node github-runners[66188]: 07/24/2023 02:38:33 PM   INFO MainThread            main 🍀 Creating scale down se>
lines 1-16/16 (END)

Manual Start and Stop

You can start and stop the service using the service start and service stop commands as follows:

github-runners service start
github-runners service stop

or using the service system utility:

sudo service github-runners start
sudo service github-runners stop

Checking Logs

You can get the logs for the service using the service logs command.

Use the -f, --follow option to follow the logs journal.

github-runners service logs -f
followed service log:
sudo github-runners service logs
Jul 24 16:12:14 user-node systemd[1]: Stopping Autoscaling GitHub Actions Runners...
Jul 24 16:12:14 user-node systemd[1]: github-runners.service: Deactivated successfully.
Jul 24 16:12:14 user-node systemd[1]: Stopped Autoscaling GitHub Actions Runners.
Jul 24 16:12:14 user-node systemd[1]: github-runners.service: Consumed 8.454s CPU time.
Jul 24 16:12:17 user-node systemd[1]: Started Autoscaling GitHub Actions Runners.
Jul 24 16:12:18 user-node github-runners[74176]: 07/24/2023 04:12:18 PM   INFO MainThread            main 🍀 Logging in to Hetzner Cloud
Jul 24 16:12:18 user-node github-runners[74176]: 07/24/2023 04:12:18 PM   INFO MainThread            main 🍀 Logging in to GitHub
Jul 24 16:12:18 user-node github-runners[74176]: 07/24/2023 04:12:18 PM   INFO MainThread            main 🍀 Getting repository vzakaznikov/github-runners
Jul 24 16:12:18 user-node github-runners[74176]: 07/24/2023 04:12:18 PM   INFO MainThread            main 🍀 Creating scale up service
Jul 24 16:12:18 user-node github-runners[74176]: 07/24/2023 04:12:18 PM   INFO MainThread            main 🍀 Creating scale down service

which is equivalent to the following journalctl command:

journalctl -u github-runners.service -f

You can dump the full log by omitting the -f, --follow option.

github-runners service logs
full service log:
Jul 24 14:24:42 user-node systemd[1]: Started Autoscaling GitHub Actions Runners.
Jul 24 14:24:42 user-node env[62771]: LANG=en_CA.UTF-8
Jul 24 14:24:42 user-node env[62771]: LANGUAGE=en_CA:en
Jul 24 14:24:42 user-node env[62771]: PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin
Jul 24 14:24:42 user-node env[62771]: INVOCATION_ID=dc7b778f95fa4ccf95e4a4592b50d9e1
Jul 24 14:24:42 user-node env[62771]: JOURNAL_STREAM=8:328542
Jul 24 14:24:42 user-node env[62771]: SYSTEMD_EXEC_PID=62771
...

Running as a Cloud Service

Instead of running the github-runners program locally as a standalone application or as a service, you can easily deploy github-runners to run on a Hetzner Cloud instance.

See -h, --help for all the available commands.

✋ Note:

By default, the name of the server where the github-runners service will be running is github-runners. If you want to use a custom server name, then you must use the cloud --name option for any cloud commands.

github-runners cloud -h

Deployment

You can deploy github-runners as a service to a new Hetzner Cloud server instance, that will be created for you automatically, using the cloud deploy command.

✋ Note:

The options that are passed to the github-runners <options> cloud deploy command will be the same options with which the service will be executed.

export GITHUB_TOKEN=ghp_...
export GITHUB_REPOSITORY=testflows/github-runners
export HETZNER_TOKEN=GJzdc...

github-runners cloud deploy

The deploy command will use the following default values:

location:

ash

type:

cpx11

image:

ubuntu-22.04

You can customize the deployment server location, type, and image using the --location, --type, and --image options.

github-runners cloud deploy --location nbg1 --type cx11 --image ubuntu-22.04

The cloud instance that runs the github-runners service can be either an x64 or an ARM64 instance. By default, the cpx11 AMD, 2 vCPU, 2GB RAM, shared-cpu x64 instance type is used.

Using ARM64 Instance

If you want to deploy the github-runners service to an ARM64 instance, then you must specify the instance type using the --type option.

✋ Note:

Currently, Hetzner Cloud has ARM64 instances available only in the Germany, Falkenstein (fsn1) location.

For example, to use an Ampere Altra, 4 vCPU, 8GB RAM shared-cpu ARM64 instance, you must specify cax21 as the value of the --type option as follows:

github-runners cloud deploy --location fsn1 --type cax21 --image ubuntu-22.04

Using x64 Instance

By default, the cpx11 AMD, 2 vCPU, 2GB RAM, shared-cpu x64 instance type is used. If you want to use a different x64 instance, then specify the desired type using the --type option.

Cloud Service Logs

You can check logs for the github-runners service running on a cloud instance using the github-runners cloud logs command. Specify -f, --follow if you want to follow the logs journal.

For example,

dump the full log:
github-runners cloud logs
follow the logs journal:
github-runners cloud logs -f

Cloud Service Status

You can check the status of the github-runners service running on a cloud instance using the github-runners cloud status command.

For example,

github-runners cloud status

Stopping Cloud Service

You can manually stop the github-runners service running on a cloud instance using the github-runners cloud stop command.

github-runners cloud stop

Starting Cloud Service

You can manually start the github-runners service running on a cloud instance after it has been manually stopped using the github-runners cloud start command.

github-runners cloud start

Installing Cloud Service

You can manually force installation of the github-runners service running on a cloud instance using the github-runners cloud install command.

✋ Note:

Just like with the github-runners <options> service install command, the options that are passed to the github-runners <options> cloud install command will be the same options with which the service will be executed.

You can specify the -f, --force option to force service re-installation if it is already installed.

github-runners <options> cloud install -f

Uninstalling Cloud Service

You can manually force uninstallation of the github-runners service running on a cloud instance using the github-runners cloud uninstall command.

github-runners cloud uninstall

Upgrading Cloud Service

You can manually upgrade the github-runners service package running on a cloud instance using the github-runners cloud upgrade command.

If a specific --version is given, then the testflows.github.runners package is upgraded to the specified version; otherwise, it is upgraded to the latest available version.

✋ Note:

The service is not re-installed during the package upgrade process. Instead, it is stopped before the upgrade and then started back up after the package upgrade is complete.

github-runners cloud upgrade --version <version>

Deleting Cloud Service

You can delete the github-runners cloud service and the cloud instance that it is running on using the github-runners cloud delete command.

The cloud delete command deletes the cloud service by first stopping the service and then deleting the server instance.

❗Warning:

The default server name where the cloud service is deployed is github-runners. Please make sure to specify the cloud --name option if you have deployed the service to a server with a different name.

For example,

default name:
github-runners cloud delete
custom name:
github-runners cloud --name <custom_name> delete

Scaling Up Runners

The program scales up runners by looking for any jobs that have queued status. For each such job, a corresponding Hetzner Cloud server instance is created with the following name:

github-runner-{job.run_id}

The server is configured using the default setup and startup scripts. The runner name is set to be the same as the server name. This allows the server to be deleted for any idle runner that, for some reason, does not pick up the job for which it was created within the max-idle-runner-time period.

Note:

Given that the server name is fixed and specific to each job.run_id, if multiple github-runners programs are running in parallel, then only 1 server will be created for a given job, and any other attempt to create a server with the same name will be rejected by Hetzner Cloud.

Note:

There is no guarantee that a given runner will pick up the job with the exact run_id that caused it to be created. This is expected: because a unique runner is created for each queued job, the number of runners equals the number of jobs, and therefore under normal conditions all jobs will be executed as expected.

Maximum Number of Runners

By default, the maximum number of runners, and therefore the maximum number of server instances, is not set and is unlimited. You can set the maximum number of runners using the --max-runners option.
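The scale up rules described in this section (one server per queued job, a name derived from job.run_id, and an optional cap on the number of runners) can be sketched as follows. This is a simplified model rather than the program's code: plan_new_servers is a hypothetical helper, and it assumes existing_server_names contains only runner servers.

```python
def plan_new_servers(queued_run_ids, existing_server_names, max_runners=None):
    """Return the names of the servers to create for the queued jobs."""
    existing = set(existing_server_names)
    planned = []
    for run_id in queued_run_ids:
        name = f"github-runner-{run_id}"
        if name in existing:
            # A server for this job already exists, so nothing to create.
            continue
        if max_runners is not None and len(existing) >= max_runners:
            # The cap is reached; remaining jobs stay queued for a later interval.
            break
        existing.add(name)
        planned.append(name)
    return planned


print(plan_new_servers([101, 102, 103], ["github-runner-101"], max_runners=2))
```

Jobs that are skipped because of the cap keep their queued status, so they are considered again the next time the plan is computed.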

New Server

The new server is accessed using SSH. It boots up with the specified OS image and is configured using the setup and startup scripts.

Server Type:

The default server type is cx11 which is an Intel, 1 vCPU, 2GB RAM shared-cpu x64 instance.

You can specify a different x64 server instance type by using the type-{name} runner label. The {name} must be a valid Hetzner Cloud server type name such as cx11, cpx21, etc.

For example, to use AMD, 3 vCPU, 4GB RAM shared-cpu x64 instance, you can define the runs-on as follows:

job-name:
   runs-on: [self-hosted, type-cpx21]

Server Location:

The server location can be specified by using the --default-location option or the in-<name> runner label. By default, the location is not set, as some server types are not available in some locations.

Image:

The server is configured to have the image specified by the --default-image option or the image-<name> runner label.

SSH Access:

The server is configured to be accessed using the ssh utility, and the SSH public key path is specified using the --ssh-key option.

Image Configuration:

Each new server instance is configured using the setup and the startup scripts.

The Setup Script

The setup script creates and configures a runner user that has sudo privileges.

Setup:
set -x

echo "Create and configure runner user"

adduser runner --disabled-password --gecos ""
echo "%wheel   ALL=(ALL:ALL) NOPASSWD:ALL" >> /etc/sudoers
addgroup wheel
usermod -aG wheel runner
usermod -aG sudo runner

The Start-up Script

The startup script installs the GitHub Actions runner. After installation, it configures the runner to start in --ephemeral mode. The --ephemeral mode causes the runner to exit as soon as it completes a job. After the runner exits, the server is powered off.

The x64 startup script installs and configures the x64 version of the runner.

x64:
set -x
echo "Install runner"
cd /home/runner
curl -o actions-runner-linux-x64-2.306.0.tar.gz -L https://github.com/actions/runner/releases/download/v2.306.0/actions-runner-linux-x64-2.306.0.tar.gz
echo "b0a090336f0d0a439dac7505475a1fb822f61bbb36420c7b3b3fe6b1bdc4dbaa  actions-runner-linux-x64-2.306.0.tar.gz" | shasum -a 256 -c
tar xzf ./actions-runner-linux-x64-2.306.0.tar.gz

echo "Configure runner"
./config.sh --unattended --replace --url https://github.com/${GITHUB_REPOSITORY} --token ${GITHUB_RUNNER_TOKEN} --name "$(hostname)" --runnergroup "${GITHUB_RUNNER_GROUP}" --labels "${GITHUB_RUNNER_LABELS}" --work _work --ephemeral

echo "Start runner"
bash -c "screen -d -m bash -c './run.sh; sudo poweroff'"

The ARM64 startup script is similar to the x64 script but installs the ARM64 version of the runner.

ARM64:
set -x
echo "Install runner"
cd /home/runner

curl -o actions-runner-linux-arm64-2.306.0.tar.gz -L https://github.com/actions/runner/releases/download/v2.306.0/actions-runner-linux-arm64-2.306.0.tar.gz
# Optional: Validate the hash
echo "842a9046af8439aa9bcabfe096aacd998fc3af82b9afe2434ddd77b96f872a83  actions-runner-linux-arm64-2.306.0.tar.gz" | shasum -a 256 -c
# Extract the installer
tar xzf ./actions-runner-linux-arm64-2.306.0.tar.gz

echo "Configure runner"
./config.sh --unattended --replace --url https://github.com/${GITHUB_REPOSITORY} --token ${GITHUB_RUNNER_TOKEN} --name "$(hostname)" --runnergroup "${GITHUB_RUNNER_GROUP}" --labels "${GITHUB_RUNNER_LABELS}" --work _work --ephemeral

echo "Start runner"
bash -c "screen -d -m bash -c './run.sh; sudo poweroff'"
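Both startup scripts validate the downloaded archive with shasum -a 256 -c before extracting it. For readers writing a custom startup script or tooling around it, the same check can be sketched in Python; verify_sha256 is a hypothetical helper, and the file it checks here is a throwaway temporary file rather than a real runner archive.

```python
import hashlib
import os
import tempfile


def verify_sha256(path, expected_hex):
    """Return True if the SHA-256 digest of the file matches the expected hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so that large archives do not have to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()


# Demonstration with a temporary file standing in for the downloaded archive.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example archive contents")
    archive_path = tmp.name

expected = hashlib.sha256(b"example archive contents").hexdigest()
print(verify_sha256(archive_path, expected))
os.remove(archive_path)
```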

Scaling Down Runners

Powered Off Servers

The program scales down runners by first cleaning up powered off servers. The scale down service relies on the fact that the startup script starts an ephemeral runner which will pick up only 1 job and then will power itself off after the job is complete.

The powered off servers are deleted after the max-powered-off-time interval, which can be specified using the --max-powered-off-time option and is set to 20 sec by default.

Idle Runners

The scale down service also monitors all the runners that have idle status and tries to delete any servers associated with such runners if the runner is idle for more than the max-idle-runner-time period. This is needed in case a runner never gets a job assigned to it, in which case its server would otherwise stay powered on. This cycle relies on the fact that the runner’s name is the same as the server’s name. The max-idle-runner-time can be specified using the --max-idle-runner-time option and is set to 120 sec by default.

Zombie Servers

The scale down service will delete any zombie servers. A zombie server is defined as any server that fails to register its runner within the max-runner-registration-time. The max-runner-registration-time can be specified using the --max-runner-registration-time option and is set to 60 sec by default.
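The three scale down rules (powered off servers, idle runners, and zombie servers) can be summarized in one decision function. The sketch below is a simplified model under stated assumptions: the dictionary fields (status, runner, powered_off_at, created_at, idle_since) are hypothetical and do not mirror the Hetzner Cloud or GitHub APIs, and times are plain seconds.

```python
def servers_to_delete(servers, now, max_powered_off_time=20,
                      max_idle_runner_time=120, max_runner_registration_time=60):
    """Return (name, reason) pairs for the servers the scale down rules would remove."""
    doomed = []
    for server in servers:
        if server["status"] == "off":
            # Powered off server: the ephemeral runner finished and shut down.
            if now - server["powered_off_at"] > max_powered_off_time:
                doomed.append((server["name"], "powered off"))
        elif server["runner"] is None:
            # Zombie server: running, but its runner never registered.
            if now - server["created_at"] > max_runner_registration_time:
                doomed.append((server["name"], "zombie"))
        elif server["runner"] == "idle":
            # Idle runner: registered, but no job was assigned in time.
            if now - server["idle_since"] > max_idle_runner_time:
                doomed.append((server["name"], "idle runner"))
    return doomed


example_servers = [
    {"name": "github-runner-1", "status": "off", "powered_off_at": 70},
    {"name": "github-runner-2", "status": "running", "runner": None, "created_at": 10},
    {"name": "github-runner-3", "status": "running", "runner": "idle", "idle_since": 50},
    {"name": "github-runner-4", "status": "running", "runner": "busy"},
]
print(servers_to_delete(example_servers, now=100))
```

In this example only the first two servers are selected: the third has been idle for 50 sec, which is below the 120 sec limit, and the fourth is busy with a job.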

Handling Failing Conditions

The program is designed to handle the following failing conditions:

Server Never Registers a Runner:

The server will remain in the running state and should be reclaimed by the scale down service when it checks the actual runners registered for the current servers. If it finds a server that is running but has no active runner, the server will be deleted after the max-runner-registration-time period.

The ./config.sh Command Fails:

The behavior will be the same as for the Server Never Registers a Runner case above.

The ./run.sh Command Fails:

The server will be powered off by the startup script and will be deleted by the scale down service.

Creating Server For Queued Job Fails:

If creation of the server fails for some reason then the scale up service will retry the operation in the next interval as the job’s status will remain queued.

Runner Never Gets a Job Assigned:

If the runner never gets a job assigned, then the scale down service will remove the runner and delete its server after the max-idle-runner-time period.

Runner Created With Mismatched Labels:

The behavior will be the same as for the Runner Never Gets a Job Assigned case above.

Program Options

The following options are supported:

  • -h, --help show this help message and exit

  • -v, --version show program’s version number and exit

  • --license show program’s license and exit

  • --github-token GITHUB_TOKEN GitHub token, default: $GITHUB_TOKEN environment variable

  • --github-repository GITHUB_REPOSITORY GitHub repository, default: $GITHUB_REPOSITORY environment variable

  • --hetzner-token HETZNER_TOKEN Hetzner Cloud token, default: $HETZNER_TOKEN environment variable

  • --ssh-key path public SSH key file, default: ~/.ssh/id_rsa.pub

  • --default-type default runner server type name, default: cx11

  • --default-location default runner server location name, default: not specified

  • --default-image default runner server image name, default: ubuntu-22.04

  • -m count, --max-runners count maximum number of active runners, default: unlimited

  • -w count, --workers count number of concurrent workers, default: 10

  • --logger-config path custom logger configuration file

  • --setup-script path path to custom server setup script

  • --startup-x64-script path path to custom server startup script

  • --startup-arm64-script path path to custom ARM64 server startup script

  • --max-powered-off-time sec maximum time after which a powered off server is deleted, default: 20 sec

  • --max-idle-runner-time sec maximum time after which an idle runner is removed and its server deleted, default: 120 sec

  • --max-runner-registration-time maximum time after which the server will be deleted if its runner is not registered with GitHub, default: 60 sec

  • --scale-up-interval sec scale up service interval, default: 10 sec

  • --scale-down-interval sec scale down service interval, default: 10 sec

  • --debug enable debugging mode, default: False

  • commands:

    • command

      • cloud cloud service commands

        • -n server, --name server deployment server name, default: github-runners

        • deploy deploy cloud service

          • -f, --force force deployment if already exist

          • -l name, --location name deployment server location, default: ash

          • -t name, --type name deployment server type, default: cpx11

          • -i name, --image name deployment server image, default: ubuntu-22.04

        • logs get cloud service logs

          • -f, --follow follow logs journal, default: False

        • status get cloud service status

        • start start cloud service

        • stop stop cloud service

        • install install cloud service

          • -f, --force force installation if service already exists

        • uninstall uninstall cloud service

        • upgrade upgrade cloud service

          • --version version package version, default: the latest

      • service service commands

        • install install service

          • -f, --force force installation if service already exists

        • uninstall uninstall service

        • status get service status

        • logs get service logs

          • -f, --follow follow logs journal, default: False

        • start start service

        • stop stop service

        • start start service

        • stop stop service
