SLO generator
Project description
SLO Generator
slo-generator
is a Python package to define and compute Service Level Objectives, Error Budgets and Burn Rates on GCP.
Description
As a stand-alone CLI, the slo-generator
will:
-
Load SLO config and Error Budget Policy (see configuration)
-
Query timeseries from one of the supported metrics backend, for each query window listed in the Error Budget Policy.
-
Compute an SLO report for each query window, with the following information:
- Service Level Objective defined as
SLO (%) = GOOD_EVENTS / VALID_EVENTS
- Error Budget defined as
ERROR_BUDGET = 100 - SLO (%)
- Burn Rate defined as the speed at which we are burning the available error budget.
- Service Level Objective defined as
-
Export the SLO report to one of the supported exporters.
Configuration
SLO Configuration
The SLO configuration (JSON) defines our SLO, which metrics backend to query, what metrics to query, and the exporters config. An example configuration is available here.
Metrics backends
Metrics backends can be configured to specify where to fetch our metrics from.
The following metrics backends are currently supported:
- Stackdriver Monitoring
Support for more backends is planned for the future (TBA, feel free to send PRs !):
- Prometheus (soon)
- Grafana
- Stackdriver Logging
- Datadog
- Custom
Exporters
Exporters can be configured to send SLO Reports to a destination.
slo-generator
currently supports the following exporters:
- Cloud Pub/Sub for streaming export.
- Stackdriver Monitoring for exporting SLO metrics to Stackdriver Monitoring (e.g: Burn Rate metric, SLO/SLI metric).
- BigQuery for exporting SLO report to BigQuery for deep analytics.
Error Budget Policy
The Error Budget policy (JSON) defines the different time windows to query (steps), the alerting Burn Rate Threshold, and notification settings. This policy is written as a list, allowing us to set different burn rates based on the query window.
For instance:
- Step 1: Window is "last 1 hour", set an alert when burn rate > 9
- Step 2: Window is "last 12 hours", set an alert when burn rate > 3
- Step 3: Window is "last 7 days", set an alert when burn rate > 1.5
- Step 4: Window is "last 28 days", set an alert when burn rate > 1
An example configuration is available here.
Basic usage (local)
Requirements
- Python 3
- gcloud SDK installed
Installation
slo-generator
is published on PyPI. To install it, run:
pip install slo-generator
Write an SLO config file
See slo_linear.json
and slo_exponential.json
files in the tests/unit/fixtures/
directory to write SLO definition files.
Write an Error Budget Policy file
See error_budget_policy.json
files in the tests/unit/fixtures/
directory to write
Error Budget Policy files.
Run the slo-generator
slo-generator --slo-config=<SLO_CONFIG_FILE> --error-budget-policy=<ERROR_BUDGET_POLICY>
-
<SLO_CONFIG_FILE>
is the SLO config JSON file. -
<ERROR_BUDGET_POLICY>
is the Error Budget Policy JSON file.
Use slo-generator --help
to list all available arguments.
Usage in pipelines
Once the SLO measurement has been tested locally, it's a good idea to deploy a pipeline that will automatically compute SLO reports. This pipeline can be triggered on a schedule, or by specific events.
A few pipeline examples are given below.
Cloud Functions
slo-generator
is frequently used as part of an SLO Reporting Pipeline made of:
- A Cloud Scheduler triggering an event every X minutes.
- A PubSub topic, triggered by the Cloud Scheduler event.
- A Cloud Function, triggered by the PubSub topic, running
slo-generator
. - A PubSub topic to stream computation results.
Other components can be added to make results available to other destinations:
- A Cloud Function to archive SLO reports to BigQuery.
- A Cloud Function to export SLO and Burn Rate metrics (e.g: to Stackdriver Monitoring)
- A Stackdriver Monitoring Policy to alert on high budget Burn Rates.
Benefits:
-
Frequent SLO / Error Budget / Burn rate reporting (max 1 every minute) with Cloud Scheduler.
-
Real-time visualization by streaming results to DataStudio.
-
Historical analytics by analyzing SLO data in Bigquery.
-
Real-time alerting by setting up Stackdriver Monitoring alerts based on wanted SLOs.
The corresponding Terraform module to automate this setup can be found here.
Cloud Build
slo-generator
can also be triggered in a Cloud Build pipeline. This can be useful if we want to compute some SLOs as part of a release process (e.g: to calculate a metric on each git
commit or push)
To do so, you need to build an image for the slo-generator
and push it to Google Container Registry
in your project.
To build / push the image, run:
gcloud builds submit --config=cloudbuild.yaml . -s _PROJECT_NAME=<YOUR_PROJECT_NAME>
Once the image is built, you can call the Terraform generator using the following snippet in your cloudbuild.yaml
:
---
steps:
- name: gcr.io/${_PROJECT_NAME}/slo-generator
args: ['--slo-config', '${_SLO_CONFIG_FILE}', '--error-budget-policy', '${_ERROR_BUDGET_POLICY_FILE}']
Then, in another repo containing your SLO definitions, simply run the pipeline, substituting the needed variables:
gcloud builds submit . --config=cloudbuild.yaml --substitutions \
_SLO_CONFIG_FILE=<YOUR_SLO_CONFIG_FILE> \
_ERROR_BUDGET_POLICY_FILE=<_ERROR_BUDGET_POLICY_FILE> \
_WORKSPACE=<ENV>
If your repo is a Cloud Source Repository, you can also configure a trigger for Cloud Build, so that the pipeline is run automatically when a commit is made:
resource "google_cloudbuild_trigger" "dev-trigger" {
trigger_template {
branch_name = "dev"
repo_name = "my-repo"
}
substitutions = {
_SLO_CONFIG_FILE = "slo.json"
_ERROR_BUDGET_POLICY_FILE = "error_budget_policy.json"
_WORKSPACE = "dev"
}
filename = "cloudbuild.yaml"
}
resource "google_cloudbuild_trigger" "prod-trigger" {
trigger_template {
branch_name = "master"
repo_name = "my-repo"
}
substitutions = {
_SLO_CONFIG_FILE = "slo.json"
_ERROR_BUDGET_POLICY_FILE = "error_budget_policy.json"
}
filename = "cloudbuild.yaml"
}
Jenkins
TODO
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for slo_generator-0.1.5-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 5549ad3df54565dd1db50bfaa605f643ea5cbc7be310650bfd5a288ce7688923 |
|
MD5 | c2555d7f0b62d2642a8315fa94169759 |
|
BLAKE2b-256 | fbbb088ec365aeab07da4ea48d35a262da12edfbb3d0627c6ac9bb30028e9b08 |