Custom metrics exporter for Flux in Kubernetes
Flux Metrics API
This is an experiment to create a metrics API for Kubernetes that can be run directly from the Flux leader broker pod. We made this after creating prometheus-flux and wanting a more minimalist design. I'm not even sure it will work, but it's worth a try!
Usage
Install
You can install from PyPI or from source:
$ python -m venv env
$ source env/bin/activate
$ pip install flux-metrics-api
# or
$ git clone https://github.com/converged-computing/flux-metrics-api
$ cd flux-metrics-api
$ pip install .
# you can also do "pip install -e ."
This will install the executable to your path, which might be your local user bin:
$ which flux-metrics-api
/home/vscode/.local/bin/flux-metrics-api
Note that the provided .devcontainer includes a VSCode environment with Flux already installed, so you can install this package there and use it right away!
Start
You'll want to be running in a Flux instance, as we need to connect to the broker handle.
$ flux start --test-size=4
And then start the server. This will use a default port and host (0.0.0.0:8443) that you can customize if desired.
$ flux-metrics-api start
# customize the port or host
$ flux-metrics-api start --port 9000 --host 127.0.0.1
SSL
If you want SSL (port 443), you can provide the paths to a certificate and key file:
$ flux-metrics-api start --ssl-certfile /etc/certs/tls.crt --ssl-keyfile /etc/certs/tls.key
An example of a full command we might run from within a pod:
$ flux-metrics-api start --port 8443 --ssl-certfile /etc/certs/tls.crt --ssl-keyfile /etc/certs/tls.key --namespace flux-operator --service-name custom-metrics-apiserver
On the fly custom metrics!
If you want to provide custom metrics, you can write a function in an external file that we will read and add to the server. As a general rule:
- The name of the function will be the name of the custom metric
- You can expect the only argument to be the flux handle
- You'll need to do imports within your function to get them in scope
This likely can be improved upon, but is a start for now! We provide an example file. As an example:
$ flux-metrics-api start --custom-metric ./example/custom-metrics.py
And then test it:
$ curl -s http://localhost:8443/apis/custom.metrics.k8s.io/v1beta2/namespaces/flux-operator/metrics/my_custom_metric_name | jq
{
  "items": [
    {
      "metric": {
        "name": "my_custom_metric_name"
      },
      "value": 4,
      "timestamp": "2023-06-01T01:39:08+00:00",
      "windowSeconds": 0,
      "describedObject": {
        "kind": "Service",
        "namespace": "flux-operator",
        "name": "custom-metrics-apiserver",
        "apiVersion": "v1beta2"
      }
    }
  ],
  "apiVersion": "custom.metrics.k8s.io/v1beta2",
  "kind": "MetricValueList",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta2"
  }
}
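For reference, a custom metric file that follows the three rules above might look like the sketch below. This is not the project's shipped example file. The metric name broker_core_estimate is hypothetical, the use of handle.attr_get("size") assumes the handle behaves like flux.Flux() from the Flux Python bindings, and returning a plain number is an assumption based on the "value" field in the response above.

```python
# Hypothetical custom metric file (not the example shipped with the project).


def broker_core_estimate(handle):
    """
    The function name becomes the metric name, and the only argument
    is the Flux handle.
    """
    # Imports go inside the function so they are in scope when it is called.
    import os

    # Assumption: the handle exposes attr_get() like flux.Flux() does,
    # and the "size" attribute holds the number of brokers as a string.
    size = int(handle.attr_get("size"))

    # Rough estimate that assumes every broker runs on a host like this one.
    cores_here = os.cpu_count() or 1
    return size * cores_here
```

If the server loads this file with --custom-metric, the metric would presumably be served under the name broker_core_estimate, following the naming rule above.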
See --help for the other available options.
Endpoints
Metric
GET /apis/custom.metrics.k8s.io/v1beta2/namespaces/<namespace>/metrics/<metric_name>
Here is an example to get the "node_up_count" metric:
$ curl -s http://localhost:8443/apis/custom.metrics.k8s.io/v1beta2/namespaces/flux-operator/metrics/node_up_count | jq
{
  "items": [
    {
      "metric": {
        "name": "node_up_count"
      },
      "value": 2,
      "timestamp": "2023-05-31T04:44:57+00:00",
      "windowSeconds": 0,
      "describedObject": {
        "kind": "Service",
        "namespace": "flux-operator",
        "name": "custom-metrics-apiserver",
        "apiVersion": "v1beta2"
      }
    }
  ],
  "apiVersion": "custom.metrics.k8s.io/v1beta2",
  "kind": "MetricValueList",
  "metadata": {
    "selfLink": "/apis/custom.metrics.k8s.io/v1beta2"
  }
}
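If you would rather read the metric from Python than from curl, the response shown above can be reduced to a simple name-to-value mapping. The following is a minimal sketch using only the standard library. The helper names metric_values and fetch_metric are hypothetical, and the default host, port, and namespace are simply the ones used in the examples in this document.

```python
import json
from urllib.request import urlopen


def metric_values(payload):
    """Return {metric name: value} from a MetricValueList dictionary."""
    if payload.get("kind") != "MetricValueList":
        raise ValueError(f"unexpected kind: {payload.get('kind')!r}")
    return {item["metric"]["name"]: item["value"] for item in payload["items"]}


def fetch_metric(name, namespace="flux-operator", base="http://localhost:8443"):
    """Query a running server (assumed reachable at `base`) for one metric."""
    url = (
        f"{base}/apis/custom.metrics.k8s.io/v1beta2"
        f"/namespaces/{namespace}/metrics/{name}"
    )
    with urlopen(url, timeout=5) as response:
        return metric_values(json.load(response))
```

With the server from the Start section running, fetch_metric("node_up_count") should return a dictionary with a single node_up_count entry.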
The following metrics are supported:
- node_up_count: number of nodes up in the MiniCluster
- node_free_count: number of nodes free in the MiniCluster
- node_cores_free_count: number of node cores free in the MiniCluster
- node_cores_up_count: number of node cores up in the MiniCluster
- job_queue_state_new_count: number of new jobs in the queue
- job_queue_state_depend_count: number of jobs in the queue in state "depend"
- job_queue_state_priority_count: number of jobs in the queue in state "priority"
- job_queue_state_sched_count: number of jobs in the queue in state "sched"
- job_queue_state_run_count: number of jobs in the queue in state "run"
- job_queue_state_cleanup_count: number of jobs in the queue in state "cleanup"
- job_queue_state_inactive_count: number of jobs in the queue in state "inactive"
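The job queue metrics in the list above share one naming pattern, so a client can generate the names instead of hard-coding them. The sketch below does that and builds the matching endpoint path. The constant and function names are hypothetical, the lists are copied from this document rather than read from the server, and custom metrics added with --custom-metric are deliberately not covered.

```python
# Names copied from the list of supported metrics in this document.
NODE_METRICS = (
    "node_up_count",
    "node_free_count",
    "node_cores_free_count",
    "node_cores_up_count",
)
JOB_STATES = ("new", "depend", "priority", "sched", "run", "cleanup", "inactive")


def supported_metrics():
    """Return the built-in metric names, node metrics first."""
    return list(NODE_METRICS) + [f"job_queue_state_{s}_count" for s in JOB_STATES]


def metric_path(name, namespace="flux-operator"):
    """Build the endpoint path for a built-in metric (custom metrics excluded)."""
    if name not in supported_metrics():
        raise ValueError(f"not a built-in metric: {name!r}")
    return f"/apis/custom.metrics.k8s.io/v1beta2/namespaces/{namespace}/metrics/{name}"
```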
Docker
We have a Docker container that you can customize for your use case, although it is intended mainly as a demo. You can either build it yourself or use our build.
$ docker build -t flux_metrics_api .
$ docker run -it -p 8443:8443 flux_metrics_api
or
$ docker run -it -p 8443:8443 ghcr.io/converged-computing/flux-metrics-api
Development
Note that this is implemented in Python, but (I found this after) we could also use Go. Specifically, I found this repository useful to see the spec format.
You can then open up the browser at http://localhost:8443/metrics/ to see the metrics!
😁️ Contributors 😁️
We use the all-contributors tool to generate a contributors graphic below.
Vanessasaurus 💻
License
HPCIC DevTools is distributed under the terms of the MIT license. All new contributions must be made under this license.
See LICENSE, COPYRIGHT, and NOTICE for details.
SPDX-License-Identifier: (MIT)
LLNL-CODE-842614