Extend the Chaos Toolkit and the Chaos Engineering Platform with capabilities for Microsoft Azure
Chaos Toolkit Extension for Azure
This project is a collection of actions and probes, gathered as an extension to the Chaos Toolkit. It targets the Microsoft Azure platform.
Install
This package requires Python 3.5 or later.
To be used from your experiment, this package must be installed in the Python environment where chaostoolkit already lives.
$ pip install -U proofdock-chaos-azure
Usage
To use the probes and actions from this package, add the following to your experiment file:
{
  "type": "action",
  "name": "start-chaos",
  "provider": {
    "type": "python",
    "module": "pdchaosazure.vm.actions",
    "func": "stop_machines",
    "secrets": ["azure"],
    "config": ["azure_subscription_id"]
  }
}
That's it!
Please explore the code to see existing probes and actions.
Configuration
This extension uses the Azure SDK libraries under the hood. The Azure SDK expects a tenant identifier, a client identifier, a client secret, and a subscription that allow you to authenticate with the Azure resource management API.
Configuration values for the Chaos Toolkit Extension for Azure can come from several sources:
- Experiment file
- Azure credential file
The extension first tries to load the configuration from the experiment file. If the configuration is not provided there, it tries to load it from the Azure credential file.
Credentials
- Secrets in the experiment file
{
  "secrets": {
    "azure": {
      "client_id": "your-super-secret-client-id",
      "client_secret": "your-even-more-super-secret-client-secret",
      "tenant_id": "your-tenant-id"
    }
  }
}
You can also retrieve secrets from the environment or from HashiCorp Vault.
If you are not working with the public global Azure cloud (for example, you use the China cloud), you can set the cloud environment with the azure_cloud key:
{
  "client_id": "your-super-secret-client-id",
  "client_secret": "your-even-more-super-secret-client-secret",
  "tenant_id": "your-tenant-id",
  "azure_cloud": "AZURE_CHINA_CLOUD"
}
Available cloud names:
- AZURE_CHINA_CLOUD
- AZURE_GERMAN_CLOUD
- AZURE_PUBLIC_CLOUD
- AZURE_US_GOV_CLOUD
- Secrets in the Azure credential file
You can retrieve a credentials file with your subscription ID already in place by signing in to Azure with the az login command and then running the az ad sp create-for-rbac command:
az login
az ad sp create-for-rbac --sdk-auth > credentials.json
credentials.json:
{
  "subscriptionId": "<azure_subscription_id>",
  "tenantId": "<tenant_id>",
  "clientId": "<application_id>",
  "clientSecret": "<application_secret>",
  "activeDirectoryEndpointUrl": "https://login.microsoftonline.com",
  "resourceManagerEndpointUrl": "https://management.azure.com/",
  "activeDirectoryGraphResourceId": "https://graph.windows.net/",
  "sqlManagementEndpointUrl": "https://management.core.windows.net:8443/",
  "galleryEndpointUrl": "https://gallery.azure.com/",
  "managementEndpointUrl": "https://management.core.windows.net/"
}
Store the path to the file in an environment variable called AZURE_AUTH_LOCATION and make sure that your experiment does NOT contain a secrets section.
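The lookup order for credentials described in this section can be sketched in Python as follows. This is a minimal illustration and not the extension's actual implementation: the helper name load_azure_secrets is hypothetical, and only the AZURE_AUTH_LOCATION variable and the key names are taken from the examples above.

```python
import json
import os


def load_azure_secrets(experiment, environ=None):
    """Return client_id, client_secret and tenant_id (hypothetical helper)."""
    environ = os.environ if environ is None else environ

    # 1. Secrets declared in the experiment file are used first.
    secrets = experiment.get("secrets", {}).get("azure")
    if secrets:
        return {
            "client_id": secrets["client_id"],
            "client_secret": secrets["client_secret"],
            "tenant_id": secrets["tenant_id"],
        }

    # 2. Otherwise fall back to the file referenced by AZURE_AUTH_LOCATION.
    path = environ.get("AZURE_AUTH_LOCATION")
    if not path:
        raise ValueError(
            "no Azure secrets in the experiment and AZURE_AUTH_LOCATION is not set"
        )
    with open(path) as handle:
        credentials = json.load(handle)
    return {
        "client_id": credentials["clientId"],
        "client_secret": credentials["clientSecret"],
        "tenant_id": credentials["tenantId"],
    }
```

The sketch maps the camelCase keys of credentials.json onto the snake_case keys used in the experiment's secrets section, so the caller sees the same shape regardless of the source.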
Subscription
Additionally, you need to provide the Azure subscription ID.
- Subscription ID in the experiment file
{
  "configuration": {
    "azure_subscription_id": "your-azure-subscription-id"
  }
}
Configuration values may also be retrieved from the environment.
The following older form is deprecated. It still works, but prefer the approaches above because it is not the Chaos Toolkit way of passing structured configuration.
{
  "configuration": {
    "azure": {
      "subscription_id": "your-azure-subscription-id"
    }
  }
}
- Subscription ID in the Azure credential file
The credential file described in the previous "Credentials" section also contains the subscription ID. If AZURE_AUTH_LOCATION is set and the subscription ID is NOT set in the experiment definition, the extension will try to load it from the credential file.
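The three places a subscription ID can come from may be easier to follow as code. The sketch below is an assumption-laden illustration, not the extension's own code: the function name resolve_subscription_id is hypothetical, and checking the flat key before the deprecated nested key is an assumed order that the text above does not specify.

```python
import json


def resolve_subscription_id(experiment, credential_file=None):
    """Find the subscription ID (hypothetical helper, assumed lookup order)."""
    configuration = experiment.get("configuration", {})

    # Preferred: flat key in the experiment configuration.
    subscription_id = configuration.get("azure_subscription_id")
    if subscription_id:
        return subscription_id

    # Deprecated: nested "azure" object in the experiment configuration.
    nested = configuration.get("azure")
    if isinstance(nested, dict) and nested.get("subscription_id"):
        return nested["subscription_id"]

    # Fallback: the credential file, whose path would come from
    # the AZURE_AUTH_LOCATION environment variable.
    if credential_file:
        with open(credential_file) as handle:
            return json.load(handle).get("subscriptionId")
    return None
```

A caller would typically pass os.environ.get("AZURE_AUTH_LOCATION") as credential_file.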
Putting it all together
Here is a full example for an experiment containing secrets and configuration:
{
  "version": "1.0.0",
  "title": "...",
  "description": "...",
  "tags": ["azure", "kubernetes", "aks", "node"],
  "configuration": {
    "azure_subscription_id": "xxx"
  },
  "secrets": {
    "azure": {
      "client_id": "xxx",
      "client_secret": "xxx",
      "tenant_id": "xxx"
    }
  },
  "steady-state-hypothesis": {
    "title": "Services are all available and healthy",
    "probes": [
      {
        "type": "probe",
        "name": "consumer-service-must-still-respond",
        "tolerance": 200,
        "provider": {
          "type": "http",
          "url": "https://some-url/"
        }
      }
    ]
  },
  "method": [
    {
      "type": "action",
      "name": "restart-node-at-random",
      "provider": {
        "type": "python",
        "module": "pdchaosazure.machine.actions",
        "func": "restart_machines",
        "secrets": ["azure"],
        "config": ["azure_subscription_id"]
      }
    }
  ],
  "rollbacks": []
}
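Secrets and configuration can also be read from the environment instead of being written into the experiment. In Chaos Toolkit experiments a value can be declared as an object with "type": "env" and a "key" naming the environment variable. The Python sketch below generates such a fragment. The environment variable names (AZURE_CLIENT_ID and so on) and the helper from_env are assumptions chosen for illustration.

```python
import json


def from_env(key):
    """Chaos Toolkit declaration for a value read from an environment variable."""
    return {"type": "env", "key": key}


# Variable names below are illustrative assumptions; use your own.
experiment_fragment = {
    "configuration": {
        "azure_subscription_id": from_env("AZURE_SUBSCRIPTION_ID"),
    },
    "secrets": {
        "azure": {
            "client_id": from_env("AZURE_CLIENT_ID"),
            "client_secret": from_env("AZURE_CLIENT_SECRET"),
            "tenant_id": from_env("AZURE_TENANT_ID"),
        }
    },
}

# Print the fragment so it can be pasted into an experiment file.
print(json.dumps(experiment_fragment, indent=2))
```

Keeping the client secret out of the experiment file this way means the file can be committed to version control without exposing credentials.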
Filter arguments
This extension makes heavy use of the Kusto Query Language (KQL) to filter the Azure resources that an experiment targets.
A Kusto query is a read-only request to process data and return results. The request is stated in plain text, using a data-flow model designed to make the syntax easy to read.
Given that an Azure subscription contains the following Azure resources:
[
  {
    "name": "machine_1",
    "resourceGroup": "my_resource_group",
    "type": "Microsoft.Compute/virtualMachines"
  },
  {
    "name": "machine_2",
    "resourceGroup": "my_resource_group",
    "type": "Microsoft.Compute/virtualMachines"
  },
  {
    "name": "machine_1",
    "resourceGroup": "another_resource_group",
    "type": "Microsoft.Compute/virtualMachines"
  }
]
With a filter you can select exactly the Azure resources that shall be attacked. For example:
- where resourceGroup=='my_resource_group'
will select those machines for an attack:
[
  { "name": "machine_1", "resourceGroup": "my_resource_group", "type": "Microsoft.Compute/virtualMachines" },
  { "name": "machine_2", "resourceGroup": "my_resource_group", "type": "Microsoft.Compute/virtualMachines" }
]
- where name=='machine_1'
will select those machines for an attack:
[
  { "name": "machine_1", "resourceGroup": "my_resource_group", "type": "Microsoft.Compute/virtualMachines" },
  { "name": "machine_1", "resourceGroup": "another_resource_group", "type": "Microsoft.Compute/virtualMachines" }
]
- where name=='machine_1' and resourceGroup=='my_resource_group'
will select:
[
  { "name": "machine_1", "resourceGroup": "my_resource_group", "type": "Microsoft.Compute/virtualMachines" }
]
- If you want to randomly select one machine of your resource group, you can use the following query:
where resourceGroup=='my_resource_group' | sample 1
The sample operator adds randomness to your selection:
[
  { "name": "<one of your machines in the 'my_resource_group'>", "resourceGroup": "my_resource_group", "type": "Microsoft.Compute/virtualMachines" }
]
- If you omit the filter entirely, one machine out of your subscription (if any) is taken.
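To make the selection semantics of these filters concrete, the following Python sketch reproduces them on the example resources with plain list comprehensions. It only mirrors what the queries select. It does not call Azure, and it is not how the extension evaluates filters.

```python
import random

resources = [
    {"name": "machine_1", "resourceGroup": "my_resource_group",
     "type": "Microsoft.Compute/virtualMachines"},
    {"name": "machine_2", "resourceGroup": "my_resource_group",
     "type": "Microsoft.Compute/virtualMachines"},
    {"name": "machine_1", "resourceGroup": "another_resource_group",
     "type": "Microsoft.Compute/virtualMachines"},
]

# where resourceGroup=='my_resource_group'
in_group = [r for r in resources if r["resourceGroup"] == "my_resource_group"]

# where name=='machine_1'
named = [r for r in resources if r["name"] == "machine_1"]

# where name=='machine_1' and resourceGroup=='my_resource_group'
both = [r for r in in_group if r["name"] == "machine_1"]

# where resourceGroup=='my_resource_group' | sample 1
one_at_random = random.sample(in_group, 1)
```

Each pipe stage works on the output of the stage before it, which is why sample 1 picks from the two machines in my_resource_group rather than from the whole subscription.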
Kusto Query Language Light
Azure does not support filtering every resource type with the Kusto Query Language. A prominent example is the instances of a virtual machine set, such as a Virtual Machine Scale Set.
For these resources the extension offers a simple alternative, the Kusto Query Language Light (KQLL). KQLL defines a small subset of Azure's KQL that should nonetheless cover the everyday needs of chaos experiments.
The small subset defines:
- where-clauses with and and or expressions
- pipe | operators
- take, top, and sample commands
- Comparison operators such as ==, >=, <=, >, and <
- If you omit the KQLL filter one resource of the cluster is selected at random.
- Those queries that provide the KQLL syntax will be marked as such in the activity's documentation.
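As a rough picture of how such a reduced query language can be evaluated, here is a toy evaluator in Python. It is a sketch under stated assumptions, not the extension's KQLL parser: it handles only where-clauses with == on quoted strings joined by and, plus take and sample. It deliberately leaves out or, top, and the ordering operators, and the function name apply_filter and the instance fields are hypothetical.

```python
import random
import re

# Matches one comparison such as: name=='machine_1'
_CONDITION = re.compile(r"^\s*(\w+)\s*==\s*'([^']*)'\s*$")


def apply_filter(resources, query=None, rng=random):
    """Apply a tiny KQLL-like query to a list of dicts (illustration only)."""
    result = list(resources)
    if not query:
        # No filter: select one resource at random, if there is any.
        return rng.sample(result, min(1, len(result)))
    for stage in query.split("|"):
        keyword, _, rest = stage.strip().partition(" ")
        if keyword == "where":
            for condition in rest.split(" and "):
                match = _CONDITION.match(condition)
                if match is None:
                    raise ValueError("unsupported condition: " + condition)
                field, value = match.groups()
                result = [r for r in result if r.get(field) == value]
        elif keyword == "take":
            result = result[: int(rest)]
        elif keyword == "sample":
            result = rng.sample(result, min(int(rest), len(result)))
        else:
            raise ValueError("unsupported operator: " + keyword)
    return result


# Hypothetical scale set instances used only for this example.
instances = [
    {"name": "scale_set_000000", "instance_id": "0"},
    {"name": "scale_set_000001", "instance_id": "1"},
    {"name": "scale_set_000002", "instance_id": "2"},
]

selected = apply_filter(instances, "where instance_id=='1'")
first_two = apply_filter(instances, "take 2")
```

Splitting on the pipe character and treating each stage as a function from list to list keeps the evaluator small. A real implementation would need a proper tokenizer so that values containing | or the word and are handled correctly.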
Contribute
If you wish to contribute more functions to this package, you are more than welcome to do so. Please fork this project, make your changes following the usual PEP 8 code style, add tests, and submit a PR for review.
The Chaos Toolkit projects require all contributors to sign a Developer Certificate of Origin on each commit they would like to merge into the master branch of the repository. Please make sure you can abide by the rules of the DCO before submitting a PR.
Develop
If you wish to develop on this project, make sure to install the development dependencies. But first, create a virtual environment and then install those dependencies.
$ pip install -r requirements-dev.txt -r requirements.txt
Then, point your environment to this directory:
$ python setup.py develop
Now you can edit the files, and the changes will automatically be seen by your environment, even when running the chaos command locally.
Test
To run the tests for the project execute the following:
$ pytest