Autoscaling GitHub Actions Runners Using Hetzner Cloud

The github-runners service program monitors GitHub Actions workflows for queued jobs. When a new job is queued, it creates a new Hetzner Cloud server instance that provides an ephemeral GitHub Actions runner. Each server instance is automatically powered off when the job completes, and powered-off servers are then automatically deleted. Both x64 and ARM64 runners are supported.

❗Warning:

This program is provided on an “AS IS” basis without warranties or conditions of any kind. See LICENSE. Use it at your own risk. Manual monitoring is required to make sure server instances are cleaned up properly and costs are kept under control.

Costs depend on the server type, number of jobs and execution time. For each job a new server instance is created to avoid any cleanup. Server instances are not shared between any jobs.

✋ Note:

Currently, Hetzner Cloud server instances are billed on an hourly basis, so a job that takes 1 minute is billed the same as a job that takes 59 minutes. Therefore, the minimum cost for any job is the cost of the server for 1 hour plus the cost of one public IPv4 address.
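
The billing rule above can be sketched as a small shell calculation. It assumes usage is rounded up to whole hours, as the note describes. The function names and the prices passed to them are hypothetical placeholders, not actual Hetzner Cloud rates.

```shell
# Round a job's duration in seconds up to whole billed hours (minimum 1).
billed_hours() (
    seconds=$1
    hours=$(( (seconds + 3599) / 3600 ))
    if [ "$hours" -lt 1 ]; then hours=1; fi
    echo "$hours"
)

# Estimate the cost of one job in arbitrary integer price units:
# billed hours multiplied by the hourly server price plus the hourly IPv4 price.
# Both prices are assumptions supplied by the caller.
job_cost() (
    seconds=$1; server_price=$2; ipv4_price=$3
    hours=$(billed_hours "$seconds")
    echo $(( hours * (server_price + ipv4_price) ))
)

# A 1-minute job and a 59-minute job are both billed as 1 hour.
billed_hours 60
billed_hours 3540
```

Under this assumption, a job that runs one second past the hour is billed for two hours, which is worth keeping in mind for jobs with run times close to a full hour.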

Features

  • cost efficient on-demand runners using Hetzner Cloud

  • supports both x64 and ARM64 runners

  • supports specifying custom runner types using job labels

  • simple configuration

Installation

pip3 install testflows.github.runners

Quick Start

Set the environment variables corresponding to your GitHub repository and Hetzner Cloud project:

export GITHUB_TOKEN=ghp_...
export GITHUB_REPOSITORY=vzakaznikov/github-runners
export HETZNER_TOKEN=GJzdc...
export HETZNER_SSH_KEY=user@user-node

and then start the github-runners program:

github-runners
07/22/2023 08:20:37 PM   INFO MainThread            main 🍀 Logging in to Hetzner Cloud
07/22/2023 08:20:37 PM   INFO MainThread            main 🍀 Logging in to GitHub
07/22/2023 08:20:37 PM   INFO MainThread            main 🍀 Getting repository vzakaznikov/github-runners
07/22/2023 08:20:37 PM   INFO MainThread            main 🍀 Creating scale up service
07/22/2023 08:20:37 PM   INFO MainThread            main 🍀 Creating scale down service
07/22/2023 08:20:38 PM   INFO   worker_2   create_server 🍀 Create server
...

or you can pass the required options inline as follows:

github-runners --github-token <GITHUB_TOKEN> --github-repository <GITHUB_REPOSITORY> --hetzner-token <HETZNER_TOKEN> --hetzner-ssh-key <HETZNER_SSH_KEY>

Installation From Sources

For development, you can install from sources as follows:

git clone https://github.com/testflows/Github-Runners.git
cd Github-Runners
./package && ./install

Basic Configuration

By default, the program uses the following environment variables:

  • GITHUB_TOKEN

  • GITHUB_REPOSITORY

  • HETZNER_TOKEN

  • HETZNER_SSH_KEY

or you can specify these values using the following options:

  • --github-token

  • --github-repository

  • --hetzner-token

  • --hetzner-ssh-key
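
Before starting the program, the environment variables listed above can be checked with a short shell function. This is only a sketch: the `check_required_env` helper is hypothetical and is not part of github-runners.

```shell
# Report any required variables that are unset or empty.
# Prints "ok" and returns 0 when all are set; otherwise prints the
# missing names and returns 1.
check_required_env() {
    missing=""
    for name in GITHUB_TOKEN GITHUB_REPOSITORY HETZNER_TOKEN HETZNER_SSH_KEY; do
        # Indirectly read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}
```

A check like this could be run as `check_required_env && github-runners`, so that the program is only started when all four values are present.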

Running as a Service

You can run github-runners as a service.

✋ Note:

In order to install the service, the user that installed the module must have sudo privileges.

Installing and Uninstalling

After installation, you can use the service install and service uninstall commands to install and uninstall the service.

✋ Note:

The options that are passed to the github-runners <options> service install command will be the same options with which the service will be executed.

export GITHUB_TOKEN=ghp_...
export GITHUB_REPOSITORY=testflows/github-runners
export HETZNER_TOKEN=GJzdc...
export HETZNER_SSH_KEY=user@user-node

github-runners service install

The /etc/systemd/system/github-runners.service file is created with the following content.

✋ Note:

The service will use the User and the Group of the user executing the program.

/etc/systemd/system/github-runners.service:
[Unit]
Description=Autoscaling GitHub Actions Runners
After=multi-user.target
[Service]
User=1000
Group=1000
Type=simple
Restart=always
Environment=GITHUB_TOKEN=ghp_...
Environment=GITHUB_REPOSITORY=testflows/github-runners
Environment=HETZNER_TOKEN=GJ..
Environment=HETZNER_SSH_KEY=user@user-node
Environment=HETZNER_IMAGE=ubuntu-22.04
ExecStart=/home/user/.local/lib/python3.10/site-packages/testflows/github/runners/bin/github-runners --workers 10 --max-powered-off-time 20 --max-idle-runner-time 120 --max-runner-registration-time 60 --scale-up-interval 10 --scale-down-interval 10
[Install]
WantedBy=multi-user.target

Modifying Program Options

If you want to modify the service program options, stop the service, edit the /etc/systemd/system/github-runners.service file by hand, reload the systemd daemon, and then start the service again.

github-runners service stop
sudo vim /etc/systemd/system/github-runners.service
sudo systemctl daemon-reload
github-runners service start

Checking Status

After installation, you can check the status of the service using the service status command.

github-runners service status
service status:
● github-runners.service - Autoscaling GitHub Actions Runners
     Loaded: loaded (/etc/systemd/system/github-runners.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2023-07-24 14:38:33 EDT; 1h 31min ago
   Main PID: 66188 (python3)
      Tasks: 3 (limit: 37566)
     Memory: 28.8M
        CPU: 8.274s
     CGroup: /system.slice/github-runners.service
             └─66188 python3 /usr/local/bin/github-runners --workers 10 --max-powered-off-time 20 --max-idle-runner-time 120 --max->

Jul 24 14:38:33 user-node systemd[1]: Started Autoscaling GitHub Actions Runners.
Jul 24 14:38:33 user-node github-runners[66188]: 07/24/2023 02:38:33 PM   INFO MainThread            main 🍀 Logging in to Hetzner >
Jul 24 14:38:33 user-node github-runners[66188]: 07/24/2023 02:38:33 PM   INFO MainThread            main 🍀 Logging in to GitHub
Jul 24 14:38:33 user-node github-runners[66188]: 07/24/2023 02:38:33 PM   INFO MainThread            main 🍀 Getting repository vza>
Jul 24 14:38:33 user-node github-runners[66188]: 07/24/2023 02:38:33 PM   INFO MainThread            main 🍀 Creating scale up serv>
Jul 24 14:38:33 user-node github-runners[66188]: 07/24/2023 02:38:33 PM   INFO MainThread            main 🍀 Creating scale down se>
lines 1-16/16 (END)

Manual Start and Stop

You can start and stop the service using the service start and service stop commands as follows:

github-runners service start
github-runners service stop

or using the service system utility:

sudo service github-runners start
sudo service github-runners stop

Checking Logs

You can get the logs for the service using the service logs command.

Use the -f, --follow option to follow the logs journal.

github-runners service logs -f
followed service log:
sudo github-runners service logs
Jul 24 16:12:14 user-node systemd[1]: Stopping Autoscaling GitHub Actions Runners...
Jul 24 16:12:14 user-node systemd[1]: github-runners.service: Deactivated successfully.
Jul 24 16:12:14 user-node systemd[1]: Stopped Autoscaling GitHub Actions Runners.
Jul 24 16:12:14 user-node systemd[1]: github-runners.service: Consumed 8.454s CPU time.
Jul 24 16:12:17 user-node systemd[1]: Started Autoscaling GitHub Actions Runners.
Jul 24 16:12:18 user-node github-runners[74176]: 07/24/2023 04:12:18 PM   INFO MainThread            main 🍀 Logging in to Hetzner Cloud
Jul 24 16:12:18 user-node github-runners[74176]: 07/24/2023 04:12:18 PM   INFO MainThread            main 🍀 Logging in to GitHub
Jul 24 16:12:18 user-node github-runners[74176]: 07/24/2023 04:12:18 PM   INFO MainThread            main 🍀 Getting repository vzakaznikov/github-runners
Jul 24 16:12:18 user-node github-runners[74176]: 07/24/2023 04:12:18 PM   INFO MainThread            main 🍀 Creating scale up service
Jul 24 16:12:18 user-node github-runners[74176]: 07/24/2023 04:12:18 PM   INFO MainThread            main 🍀 Creating scale down service

which is equivalent to the following journalctl command:

journalctl -u github-runners.service -f

You can dump the full log by omitting the -f, --follow option.

github-runners service logs
full service log:
Jul 24 14:24:42 user-node systemd[1]: Started Autoscaling GitHub Actions Runners.
Jul 24 14:24:42 user-node env[62771]: LANG=en_CA.UTF-8
Jul 24 14:24:42 user-node env[62771]: LANGUAGE=en_CA:en
Jul 24 14:24:42 user-node env[62771]: PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin
Jul 24 14:24:42 user-node env[62771]: INVOCATION_ID=dc7b778f95fa4ccf95e4a4592b50d9e1
Jul 24 14:24:42 user-node env[62771]: JOURNAL_STREAM=8:328542
Jul 24 14:24:42 user-node env[62771]: SYSTEMD_EXEC_PID=62771
...

Deploying Application

You can deploy github-runners as a service to a new Hetzner Cloud server instance using the deploy command.

✋ Note:

The options that are passed to the github-runners <options> deploy command will be the same options with which the service will be executed.

export GITHUB_TOKEN=ghp_...
export GITHUB_REPOSITORY=testflows/github-runners
export HETZNER_TOKEN=GJzdc...
export HETZNER_SSH_KEY=user@user-node

github-runners deploy

The deploy command will use the following default values:

  • location: ash

  • type: cpx11

  • image: ubuntu-22.04

You can customize the deployment server location, type, and image using the --location, --type, and --image options.

github-runners deploy --location nbg1 --type cx11 --image ubuntu-22.04

Scaling Up Runners

The program scales up runners by looking for any jobs that have queued status. For each such job, a corresponding Hetzner Cloud server instance is created with the following name:

github-runner-{job.run_id}

The server is configured using the default setup and startup scripts. The runner name is set to be the same as the server name so that the server can be deleted for any idle runner that, for some reason, does not pick up a job within the max-idle-runner-time period.

Note:

Given that the server name is fixed and specific for each job.run_id, if multiple github-runners are running in parallel then only 1 server will be created for a given job and any other attempts to create a server with the same name will be rejected by the Hetzner Cloud.

Note:

There is also no guarantee that a given runner will pick up the job with the exact run_id that caused it to be created. This is expected. Because a unique runner is created for each queued job, the number of runners will be equal to the number of jobs and therefore, under normal conditions, all jobs will be executed as expected.
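
The naming scheme and the deduplication it provides can be illustrated with a small shell sketch. The `server_name` and `create_server_once` functions are hypothetical, and a temporary local file stands in for the Hetzner Cloud rejecting duplicate server names. No real API is called.

```shell
# Build the server (and runner) name for a job from its run id.
server_name() {
    echo "github-runner-$1"
}

# Simulated registry of existing server names, one per line.
# Assumption: this file stands in for Hetzner Cloud's unique-name check.
REGISTRY=$(mktemp)

# Create a server for the given run id unless the name is already taken.
create_server_once() {
    name=$(server_name "$1")
    # -x matches the whole line, -F treats the name as a fixed string.
    if grep -qxF "$name" "$REGISTRY"; then
        echo "rejected $name"
        return 1
    fi
    echo "$name" >> "$REGISTRY"
    echo "created $name"
}
```

Calling `create_server_once 42` twice creates github-runner-42 on the first call and rejects the second, which mirrors what happens when two github-runners programs react to the same queued job.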

Maximum Number of Runners

By default, the maximum number of runners, and therefore the maximum number of server instances, is not set and is unlimited. You can set the maximum number of runners using the --max-runners option.
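
One way to picture the limit is as a cap on how many new servers a single scale up pass may create. The following is a minimal sketch, assuming the cap counts runners that already exist. The `servers_to_create` function and its arguments are hypothetical, and an omitted limit stands for the unlimited default.

```shell
# Number of servers to create for $1 queued jobs when $2 runners already
# exist and at most $3 runners are allowed (omitted or empty = unlimited).
servers_to_create() (
    queued=$1; active=$2; max=${3:-}
    if [ -z "$max" ]; then
        echo "$queued"
        exit 0
    fi
    # Remaining capacity under the limit, never negative.
    free=$(( max - active ))
    if [ "$free" -lt 0 ]; then free=0; fi
    if [ "$queued" -lt "$free" ]; then echo "$queued"; else echo "$free"; fi
)
```

In this sketch, jobs that do not fit under the limit simply stay queued and are considered again on a later pass.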

New Server

The new server is accessed using SSH. It boots up with the specified OS image and is configured using the setup and startup scripts.

Server Type:

The default server type is cx11. However, a job's server-{hetzner-server-type} label can be used to specify a custom server type, where {hetzner-server-type} must be a valid Hetzner Cloud server type name such as cx11, cpx21, etc.

For example,

runs-on: [self-hosted, server-cpx21]
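
As an illustration of the label convention, the server type can be derived from a job's labels by looking for the server- prefix and falling back to the default. This is a sketch only: the `server_type_from_labels` function is hypothetical and the program's actual parsing may differ.

```shell
# Pick the Hetzner Cloud server type from a job's labels.
# Falls back to cx11, the default named in the text, when no
# server-* label is present. If several are given, the last one wins.
server_type_from_labels() (
    server_type="cx11"
    for label in "$@"; do
        case "$label" in
            server-?*) server_type="${label#server-}" ;;
        esac
    done
    echo "$server_type"
)

# Labels from the runs-on example above.
server_type_from_labels self-hosted server-cpx21
```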

SSH Access:

The server is configured to be accessed using the ssh utility and the SSH key specified by name using either the --hetzner-ssh-key option or the HETZNER_SSH_KEY environment variable.

OS Image:

The server is configured to have the OS image specified by the --hetzner-image option or the HETZNER_IMAGE environment variable.

Image Configuration:

Each new server instance is configured using setup and startup scripts.

The Setup Script

The setup script creates and configures a runner user that has sudo privileges.

Setup:
set -x

echo "Create and configure runner user"

adduser runner --disabled-password --gecos ""
echo "%wheel   ALL=(ALL:ALL) NOPASSWD:ALL" >> /etc/sudoers
addgroup wheel
usermod -aG wheel runner
usermod -aG sudo runner

The Start-up Script

The startup script installs the GitHub Actions runner. After installation, it configures the runner to start in --ephemeral mode, which causes the runner to exit as soon as it completes a job. After the runner exits, the server is powered off.

The x64 startup script installs and configures the x64 version of the runner.

x64:
set -x
echo "Install runner"
cd /home/runner
curl -o actions-runner-linux-x64-2.306.0.tar.gz -L https://github.com/actions/runner/releases/download/v2.306.0/actions-runner-linux-x64-2.306.0.tar.gz
echo "b0a090336f0d0a439dac7505475a1fb822f61bbb36420c7b3b3fe6b1bdc4dbaa  actions-runner-linux-x64-2.306.0.tar.gz" | shasum -a 256 -c
tar xzf ./actions-runner-linux-x64-2.306.0.tar.gz

echo "Configure runner"
./config.sh --unattended --replace --url https://github.com/${GITHUB_REPOSITORY} --token ${GITHUB_RUNNER_TOKEN} --name "$(hostname)" --runnergroup "${GITHUB_RUNNER_GROUP}" --labels "${GITHUB_RUNNER_LABELS}" --work _work --ephemeral

echo "Start runner"
bash -c "screen -d -m bash -c './run.sh; sudo poweroff'"

The ARM64 startup script is similar to the x64 script but installs the ARM64 version of the runner.

ARM64:
set -x
echo "Install runner"
cd /home/runner

curl -o actions-runner-linux-arm64-2.306.0.tar.gz -L https://github.com/actions/runner/releases/download/v2.306.0/actions-runner-linux-arm64-2.306.0.tar.gz
# Optional: Validate the hash
echo "842a9046af8439aa9bcabfe096aacd998fc3af82b9afe2434ddd77b96f872a83  actions-runner-linux-arm64-2.306.0.tar.gz" | shasum -a 256 -c
# Extract the installer
tar xzf ./actions-runner-linux-arm64-2.306.0.tar.gz

echo "Configure runner"
./config.sh --unattended --replace --url https://github.com/${GITHUB_REPOSITORY} --token ${GITHUB_RUNNER_TOKEN} --name "$(hostname)" --runnergroup "${GITHUB_RUNNER_GROUP}" --labels "${GITHUB_RUNNER_LABELS}" --work _work --ephemeral

echo "Start runner"
bash -c "screen -d -m bash -c './run.sh; sudo poweroff'"

Scaling Down Runners

Powered Off Servers

The program scales down runners by first cleaning up powered off servers. The scale down service relies on the fact that the startup script starts an ephemeral runner which will pick up only 1 job and then will power itself off after the job is complete.

Powered off servers are deleted after the max-powered-off-time interval. This interval can be specified using the --max-powered-off-time option and is set to 20 sec by default.

Idle Runners

The scale down service also monitors all the runners that have idle status and tries to delete any servers associated with such runners if the runner is idle for more than the max-idle-runner-time period. This is needed in case a runner never gets a job assigned to it, because its server would otherwise stay powered on. This cycle relies on the fact that the runner’s name is the same as the server’s name. The max-idle-runner-time can be specified using the --max-idle-runner-time option and is set to 120 sec by default.

Zombie Servers

The scale down service will delete any zombie servers. A zombie server is defined as any server that fails to register its runner within the max-runner-registration-time. The max-runner-registration-time can be specified using the --max-runner-registration-time option and is set to 60 sec by default.
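
The three cleanup rules above can be summarized in a small decision function. This is a sketch under stated assumptions: the `should_delete` function and its state names are hypothetical, ages are given in seconds, and the thresholds are the defaults quoted in this section.

```shell
# Default thresholds in seconds, as quoted in the text.
MAX_POWERED_OFF_TIME=20
MAX_IDLE_RUNNER_TIME=120
MAX_RUNNER_REGISTRATION_TIME=60

# Decide whether a server should be deleted.
#   $1: state, one of: off | idle | unregistered (anything else is kept)
#   $2: seconds the server has spent in that state
should_delete() (
    state=$1; age=$2
    case "$state" in
        off)          limit=$MAX_POWERED_OFF_TIME ;;
        idle)         limit=$MAX_IDLE_RUNNER_TIME ;;
        unregistered) limit=$MAX_RUNNER_REGISTRATION_TIME ;;
        *)            echo "keep"; exit 0 ;;
    esac
    if [ "$age" -gt "$limit" ]; then echo "delete"; else echo "keep"; fi
)
```

For example, `should_delete off 25` prints delete, while `should_delete idle 90` prints keep because the idle runner is still within its 120 sec allowance.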

Handling Failing Conditions

The program is designed to handle the following failing conditions:

Server Never Registers a Runner:

The server will remain in the running state and should be reclaimed by the scale down service when it checks the actual runners registered for current servers. If it finds a server that is running but has no active runner, the server will be deleted after the max-runner-registration-time period.

The ./config.sh Command Fails:

The behavior will be the same as for the Server Never Registers a Runner case above.

The ./run.sh Command Fails:

The server will be powered off by the startup script and will be deleted by the scale down service.

Creating Server For Queued Job Fails:

If creation of the server fails for some reason then the scale up service will retry the operation in the next interval as the job’s status will remain queued.

Runner Never Gets a Job Assigned:

If the runner never gets a job assigned, then the scale down service will remove the runner and delete its server after the max-idle-runner-time period.

Runner Created With Mismatched Labels:

The behavior will be the same as for the Runner Never Gets a Job Assigned case above.

Program Options

The following options are supported:

  • -h, --help show this help message and exit

  • -v, --version show program’s version number and exit

  • --license show program’s license and exit

  • --github-token GITHUB_TOKEN GitHub token, default: $GITHUB_TOKEN environment variable

  • --github-repository GITHUB_REPOSITORY GitHub repository, default: $GITHUB_REPOSITORY environment variable

  • --hetzner-token HETZNER_TOKEN Hetzner Cloud token, default: $HETZNER_TOKEN environment variable

  • --ssh-key HETZNER_SSH_KEY Hetzner Cloud SSH key name, default: $HETZNER_SSH_KEY environment variable

  • --image HETZNER_IMAGE Hetzner Cloud server image name, default: ubuntu-22.04

  • -m count, --max-runners count maximum number of active runners, default: unlimited

  • -w count, --workers count number of concurrent workers, default: 10

  • --logger-config path custom logger configuration file

  • --setup-script path path to custom server setup script

  • --startup-x64-script path path to custom server startup script

  • --startup-arm64-script path path to custom ARM64 server startup script

  • --max-powered-off-time sec maximum time after which a powered off server is deleted, default: 20 sec

  • --max-idle-runner-time sec maximum time after which an idle runner is removed and its server deleted, default: 120 sec

  • --max-runner-registration-time maximum time after which the server will be deleted if its runner is not registered with GitHub, default: 60 sec

  • --scale-up-interval sec scale up service interval, default: 10 sec

  • --scale-down-interval sec scale down service interval, default: 10 sec

  • --debug enable debugging mode, default: False

  • commands:

    • command

      • deploy deploy application

        • -n, --name deployment server name, default: github-runners

        • -f, --force force deployment if already exist

        • -l name, --location name deployment server location, default: ash

        • -t name, --type name deployment server type, default: cpx11

        • -i name, --image name deployment server image, default: ubuntu-22.04

      • service service commands

        • install install service

          • -f, --force force installation if service already exists

        • uninstall uninstall service

        • status get service status

        • logs get service logs

          • -f, --follow follow logs journal, default: False

        • start start service

        • stop stop service
