GCP Provider Layer Plugin for the Virtual Test Development System (vTDS) suite
vtds-provider-gcp
The GCP provider layer implementation for vTDS allowing a vTDS cluster to be built as a GCP project.
Description
This repo provides the code and a base configuration to deploy a vTDS cluster in a Google Cloud Platform (GCP) project within an existing Google organization. It is intended as the GCP Provider Layer for vTDS which is a provider and product neutral framework for building virtual clusters to test and develop software. The Provider Layer defines the configuration structure and software implementation required to establish the lowest level resources needed for a vTDS cluster on a given host provider, in this case GCP.
Each Provider Layer implementation contains provider specific code and a fully defined base configuration capable of deploying the provider resources of the cluster. The base configuration of the GCP Provider Layer implementation defines the default settings for resources needed to construct a vTDS platform consisting of Ubuntu based Linux GCP instances (Virtual Blades) connected by a GCP provided network (Blade Interconnect) within a single VPC in a single GCP region. The Blade Interconnect and Virtual Blade configurations are provided as templates or base-classes on which other configurations can be built. Each GCP instance (Virtual Blade) is configured to permit nested virtualization and has enough CPU and memory to host (at least) a single nested virtual machine. The assignment of virtual machines (Virtual Nodes) and Virtual Networks to these blade and interconnect resources, as well as the configuration of Virtual Blades at the OS level, is handled in higher layers of the vTDS stack.
NOTE: while the base configuration contains examples of every configuration setting and its default value in a given context, this config is not sufficient to deploy a Provider Layer for an actual vTDS system. Three things are needed to complete a working configuration:
- The GCP Organization configuration of the system
- A Blade Interconnect configuration that is not a 'pure_base_class'
- A Virtual Blade configuration with at least one instance specified that is not a 'pure_base_class'
The GCP Organization overlay provides information specific to your GCP Organization. There is more information on this in the Getting Started Guide section of this README.
Canned configuration overlays for all layers of vTDS that are appropriate for various different applications can be found in the vtds-configs GitHub repository. Canned configuration overlays that offer GCP Provider Layer specific configuration of Blade Interconnects and Virtual Blades (among other things) are available in the layers/provider/gcp sub-directory of that repository.
An overview of vTDS is available in the vTDS Core Repository.
Getting Started with the GCP Provider Implementation
GCP Resources, Roles and Tools
As its name suggests, the GCP Provider Layer Implementation uses Google Cloud Platform (GCP) to implement a vTDS Provider Layer. To be able to use GCP, the user must have access to the resources of a GCP Organization, must be assigned a set of roles related to those resources and must have installed the necessary GCP related tools on their local system. Much of this is administrative preparation that the user does not control. It is described here so that whoever does control it can set it up.
GCP Organization and Administrative Setup
The GCP Provider Layer requires you to have access to GCP through a
GCP organization. You will need to arrange to create one, which will
also involve setting up Google Cloud Identity or Google Workspace if
you don't already have one. As part of setting that up, a billing
account will be created and associated with your organization. The
billing account can have any name, but for this guide, we will name
it gcp-billing.
The administrator of your organization must also create a folder for
vTDS projects within your organization. They may name the folder
anything they like, but for the sake of this guide, we will use the
name vtds-systems.
Within the vtds-systems folder, your administrator must create a
'seed project' for vTDS deployments. The seed project is a GCP project
that has no compute instances and serves as a persistent well known
place to store vTDS system state using Google Cloud Storage. This
project may also be named anything, but for this guide we will use
vtds-seed.
Finally, your administrator should set up a Google Group within your
organization. This group will permit its members to obtain the
permissions needed to create, destroy and use vTDS systems. This group
can be named anything, but for this guide we will use
vtds-users, which, when fully qualified, will be
vtds-users@myorganization.net if your organization's domain name is
myorganization.net. This group needs the following access roles:
- On the gcp-billing billing account, the vtds-users group needs to be a principal with the Billing User role.
- At the GCP Organization level the vtds-users group needs the viewer role.
- On the vtds-systems folder the vtds-users group needs the following roles:
  - Project Creator
  - Project Deleter
  - Project IAM Admin
  - Project Billing Manager
- On the vtds-seed project the vtds-users group needs the storage-admin role.
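For an administrator who prefers the command line, the role assignments above can be sketched as gcloud IAM binding commands. The following is a minimal sketch, not a definitive procedure: the function name is hypothetical, every ID argument is a placeholder you must supply, and the mapping from the display names above to role identifiers (for example Project Billing Manager to roles/billing.projectManager, viewer to roles/viewer and storage-admin to roles/storage.admin) is an assumption to verify against your organization's policies. The function only prints the commands so they can be reviewed before anything is changed.

```shell
# Print (do not run) gcloud commands that would grant the roles listed above.
# Arguments (all placeholders supplied by the caller):
#   $1 fully qualified group, e.g. vtds-users@myorganization.net
#   $2 numeric GCP organization ID
#   $3 billing account ID of the gcp-billing account
#   $4 numeric folder ID of the vtds-systems folder
#   $5 project ID of the vtds-seed project
print_vtds_role_bindings() {
    vtds_member="group:$1"
    echo "gcloud billing accounts add-iam-policy-binding $3 --member=$vtds_member --role=roles/billing.user"
    echo "gcloud organizations add-iam-policy-binding $2 --member=$vtds_member --role=roles/viewer"
    # The four folder level roles, in the order listed above.
    for vtds_role in resourcemanager.projectCreator \
                     resourcemanager.projectDeleter \
                     resourcemanager.projectIamAdmin \
                     billing.projectManager; do
        echo "gcloud resource-manager folders add-iam-policy-binding $4 --member=$vtds_member --role=roles/$vtds_role"
    done
    echo "gcloud projects add-iam-policy-binding $5 --member=$vtds_member --role=roles/storage.admin"
}

# Usage (IDs here are made up for illustration):
#   print_vtds_role_bindings vtds-users@myorganization.net 123456789012 \
#       0X0X0X-0X0X0X-0X0X0X 210987654321 vtds-seed
```

Printing rather than executing keeps the sketch safe to run anywhere; once the output looks right, the commands can be pasted into a shell by someone with the necessary administrative rights.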
GCP User Requirements and SDK Installation
As a vTDS user, you will need an account within your organization that
is a member of the vtds-users group.
As a vTDS user you need to have the Google Cloud SDK installed on your local system.
As a vTDS user you will need to be logged into your GCP account both
as an SDK user and as an application user (portions of the vTDS code
have to use the gcloud command instead of GCP client libraries,
which forces vTDS to require both). To do this, run the following two
commands on your local system:
gcloud auth login
and
gcloud auth application-default login
These will (typically) pop up a browser and let you log into your
account and authorize access. The first authorizes SDK (gcloud
command) access. The second authorizes application client library (in
this case, primarily terraform) access.
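Because both logins are required, it can be handy to check them together before starting a long operation. The sketch below assumes that asking gcloud to print an access token is a reasonable stand-in for "this login is still valid"; the function name is hypothetical and is not part of vTDS.

```shell
# Report which of the two GCP logins is missing or expired.
# Returns 0 only when both credentials can produce an access token.
check_gcp_logins() {
    gcp_login_status=0
    if ! gcloud auth print-access-token >/dev/null 2>&1; then
        echo "SDK login missing: run 'gcloud auth login'"
        gcp_login_status=1
    fi
    if ! gcloud auth application-default print-access-token >/dev/null 2>&1; then
        echo "Application login missing: run 'gcloud auth application-default login'"
        gcp_login_status=1
    fi
    return "$gcp_login_status"
}

# Usage:
#   check_gcp_logins && echo "both GCP logins look usable"
```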
Terraform and Terragrunt Preparation
The vTDS GCP Provider implementation uses Terragrunt and Terraform
to construct the GCP project that will be used for a vTDS cluster. The
layer code manages the versions of Terraform and Terragrunt using the
Terraform Version Manager (tfenv) and the Terragrunt Version Manager
(tgenv). You will need to install both of these before using the GCP
Provider Implementation.
Installation of the Terraform Version Manager is explained here.
Installation of Terragrunt Version Manager is explained here.
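After installing both version managers, a quick check that they are reachable on your PATH can save a failed first deployment. This is a minimal sketch with a hypothetical helper name; it only tests that the commands can be found, not that they work.

```shell
# Print the name of each listed tool that cannot be found on PATH.
missing_tools() {
    for tool_name in "$@"; do
        command -v "$tool_name" >/dev/null 2>&1 || echo "$tool_name"
    done
}

# Usage: prints nothing when both version managers are installed.
#   missing_tools tfenv tgenv
```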
Using the GCP Provider Layer Implementation
To use the GCP Provider Layer Implementation in your vTDS stack, edit
the core configuration you are using to deploy your vTDS system and
configure the Provider Layer to pull in vtds-provider-gcp. The GCP
Provider Layer Implementation is available as a stream of stable
releases from PyPI or in source form from GitHub. When pulling from
PyPI the version can be null, in which case the latest version will be
used, or it can specify any of the published stable versions. When
pulling from GitHub the version can be null, in which case the main
branch will be used, or set to a tag, branch or digest indicating a
git version.
Pulling from PyPI
Here is the form of the configuration for pulling the GCP Provider Layer Implementation from PyPI:
provider:
  package: vtds-provider-gcp
  module: vtds_provider_gcp
  source_type: pypi
  metadata:
    version: null
Pulling from GitHub
Here is the form of the configuration for pulling the GCP Provider Layer Implementation from GitHub:
provider:
  package: vtds-provider-gcp
  module: vtds_provider_gcp
  source_type: git
  metadata:
    url: "git@github.com:Cray-HPE/vtds-provider-gcp.git"
    version: null
Generally speaking, there will be a canned core configuration for your vTDS application available in the core configurations provided by vtds-configs that will already be set up to pull in the GCP Provider Layer Implementation, so you should be able to simply copy and modify that. Instructions for setting up to deploy your vTDS system can be found in the vTDS Core Getting Started guide.
Using an Organization Config Overlay
The canned core configurations generally split the Provider Layer
configuration into two separate overlays: one that provides the
desired application specific configuration of the layer and another
that provides information about the organization hosting the vTDS
system. By decoupling organization information, these two
configuration overlays allow multiple core configurations to share the
same organization config for different applications, and multiple
organizations to share the same application specific configuration
overlay without conflict. This approach also allows an organization to
host its organization configuration separately from the canned
configurations. You will need to create an organization configuration
overlay and make it available somewhere. You have the choice of simply
adding the necessary content to your core configuration, making a
separate file and using it locally through command line options to the
vtds commands, hosting the file at a simple URL of your choosing, or
hosting the file in a GitHub or private remote Git repository. In any
case, your organization configuration should be based on this
annotated example Organization configuration overlay.
Once you have the Organization configuration overlay prepared and hosted, assuming you are not putting it in the core configuration or in a local file, modify your core configuration file to pull in the Organization configuration overlay.
Known Failures and Workarounds
GCP Credentials Expire During or Before an Operation
When deploying or removing a GCP project for vTDS manually from the command line, you are reliant on the user and application-default GCP credentials. These expire at a fixed time after you log into GCP (for example, after 24 hours). There is currently no way to preemptively re-login and extend that deadline, which means that occasionally you will try to deploy or remove a vTDS and your credentials will expire either before (ideally) or in the middle of the operation.
This problem does not occur when service accounts are used, but interactive users are discouraged from using service accounts, since they need to be protected to avoid abuse of GCP.
The workaround for this when working interactively is to re-login to GCP if you suspect the failure is credential related and then re-run the operation. For the most part these operations can be restarted safely and will then run to completion cleanly.
If you are trying to deploy a vTDS and it will not move forward after a re-login, you may need to remove it first and try again. If you are trying to remove a vTDS and can't move forward, read the next paragraph.
Occasionally, the operation will fail in a sensitive part of the operation and leave persistent data in an inconsistent state. If after you re-login, your operation continues to fail, see "Persistent Cached Provider Data Becomes Inconsistent" below.
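The re-login workaround can be sketched as a small wrapper that runs a command and, if it fails, refreshes both logins and runs it once more. The wrapper name is hypothetical and the sketch retries on any failure, not only credential failures, so it is only appropriate when you already suspect expired credentials.

```shell
# Run a command; if it fails, refresh both GCP logins and run it once more.
# The exit status is that of the final attempt (or of a failed login).
retry_after_relogin() {
    "$@" && return 0
    echo "Command failed; refreshing GCP credentials before one retry" >&2
    gcloud auth login && gcloud auth application-default login && "$@"
}

# Usage: pass the operation you were running as the arguments, for example
#   retry_after_relogin <your vtds command and its arguments>
```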
Persistent Cached Provider Data Becomes Inconsistent
If the persistent cached data about your vTDS becomes inconsistent, you will neither be able to deploy nor remove your vTDS. This can happen if your deployment or removal was interrupted or failed during a sensitive action. The symptom is that you know you are properly logged into GCP, you have given any recently removed instance of your vTDS time to be cleaned up by GCP (see "Deployment Fails When Run Too Quickly After Removal of the Same vTDS") yet you can neither deploy nor remove your vTDS.
There are three steps to correcting this situation:
- Remove the Terraform data from your local build tree
- Remove the cached Terraform data bucket from your seed project's storage
- Manually remove the GCP project containing your vTDS (if it is present)
Remove Local Terraform Data
The local terraform data is located in your vTDS build tree at
vtds-build/provider/terragrunt under your vTDS cluster
directory. While in your cluster directory run the following:
rm -rf vtds-build/provider/terragrunt
to remove the data.
Remove Cached Terraform Data
You also need to remove the Google Storage bucket containing your
cached Terraform state. You can do this using the gsutil
command. First you want to find the URL for the bucket you are looking
for. In general, the form of this URL is:
gs://<vTDS Organization Name>-<vTDS Project Base Name>-tf-state/
You can list buckets available to you by running:
gsutil ls
Look for the -tf-state bucket corresponding to your vTDS system. For
example, if your organization is hpe and your vTDS base name is
openchami, you would be looking for the bucket
hpe-openchami-tf-state. Using the URL found this way, remove the
bucket using a command of the form:
gsutil -m rm -r gs://hpe-openchami-tf-state/
The -m option here speeds up the removal considerably and the -r
just tells gsutil to recursively remove the bucket.
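To avoid mistyping the bucket name, the URL can be built from its two parts. This is a minimal sketch that follows the URL form given above; the function name is hypothetical.

```shell
# Build the Terraform state bucket URL from the vTDS organization name ($1)
# and the vTDS project base name ($2).
tf_state_bucket_url() {
    echo "gs://$1-$2-tf-state/"
}

# Usage: confirm the bucket exists before removing it.
#   gsutil ls | grep -F "$(tf_state_bucket_url hpe openchami)"
```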
Manually Remove the GCP Project
Finally, you need to remove the project. Since there is no Terraform
state remaining, you cannot do this using vTDS so you have to do it
manually. This is done by finding the project ID of your vTDS
system. Continuing with the openchami project in the hpe
organization, the project ID will be hpe-openchami-<suffix>, where
<suffix> is a short random hexadecimal string. You can find the
project with a gcloud command similar to the following (output
shown):
$ gcloud projects list | grep hpe-openchami
hpe-openchami-a608 hpe-openchami 479454303572
The first string in the output here, hpe-openchami-a608, is the
identifier you will use to remove the project. You can do that with a
command in the following form:
gcloud projects delete hpe-openchami-a608
You will be prompted to confirm the removal. Once the project is removed, you are ready to try deploying it again.
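If a plain grep matches more than you want (it also matches the name column and unrelated projects that merely contain the string), the lookup can be narrowed to the first column. The sketch below assumes the project ID is the first whitespace separated field of the listing, as in the output shown above; the function name is hypothetical.

```shell
# Read `gcloud projects list` output on stdin and print each project ID
# (first column) that starts with "<organization>-<base name>-".
vtds_project_ids() {
    awk -v prefix="$1-$2-" 'index($1, prefix) == 1 { print $1 }'
}

# Usage:
#   gcloud projects list | vtds_project_ids hpe openchami
```

Review the printed IDs before passing any of them to gcloud projects delete.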
Deployment Fails When Run Too Quickly After Removal of the Same vTDS
When vTDS removes a system, it can take a few minutes for GCP to catch up with the fact that the system is removed. During that time, pieces of the GCP project are being torn down, and the project's ID still exists. If you try to deploy the same vTDS system again too quickly, the attempt will fail. The solution to this is to wait about 5 minutes and try the deploy operation again.
Terragrunt and Terraform Initial Installs Fail
In order to be able to work with the tgenv and tfenv commands
within the vTDS code and use the configured version, at least one
version each of terragrunt and terraform needs to be
installed on the local system. There is code to ensure that this is
true in this layer implementation. It normally tries to install the
latest version of both products.
Unfortunately, because of the way releases work for both terragrunt
and terraform, occasionally the installation repositories get confused
and the latest version is temporarily (sometimes for an extended
period) unavailable. This will cause the GCP provider layer to fail
indicating that the requested version (usually latest, unless you have
changed it in your configuration) is not available.
To work around this problem when it occurs, first, identify an available version of the offending product(s), then edit your core configuration, and add as much of the following as you need to get vTDS to work again:
provider:
  terragrunt:
    terraform_dummy_version: "<available-terraform-version>"
    terragrunt_dummy_version: "<available-terragrunt-version>"
You may merge this in with any pre-existing provider configuration
you find there if you like, or let it stand by itself. The dummy
version controls the initially installed version, not the version
actually used for vTDS operations. There is a separate version setting
that tells the GCP layer what versions to use.
NOTE: while these settings are in your core configuration, you have pinned the initial version(s) of the tool(s). This is harmless for the short term, but the versions you set will, eventually, become stale and you may see failures because the version(s) you set are unavailable. It is a good idea to remove these settings once the workaround is no longer needed.
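One way to identify a candidate version is to filter the version managers' remote listings down to plain releases. The sketch below assumes that tfenv list-remote and tgenv list-remote print one version per line with the newest first; the function name is hypothetical. A version that is listed may still fail to install, so confirm the candidate with tfenv install or tgenv install before putting it in your configuration.

```shell
# Read a version list on stdin (one per line) and print the first entry
# that looks like a plain X.Y.Z release, skipping alpha, beta and rc builds.
first_stable_version() {
    grep -E '^[0-9]+\.[0-9]+\.[0-9]+$' | head -n 1
}

# Usage:
#   tfenv list-remote | first_stable_version
#   tgenv list-remote | first_stable_version
```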