A health check and RCA tool for Kubernetes
unctl
About The Project
unctl is a versatile command-line tool that runs a wide range of checks and inspections on the components of your infrastructure. It provides a unified interface for assessing the health and performance of different services and platforms and, in the licensed version, goes beyond diagnosis by explaining failures and creating remediation plans.
Health Checks
This tool runs the health checks and provides a report. To get access to automated AI-based diagnostics and remediation for the reported problems, go to https://unskript.com/.
List of Checks
Provider | Checks |
---|---|
k8s | 31 |
mysql | 1 |
k8s checks
Check | Service | Category | Severity | Description |
---|---|---|---|---|
Check if a k8s PVC is in Pending state. | pvc | Health | Critical | Alerts on pending PVCs, highlighting potential delays in provisioning persistent volume claims for all the namespaces |
Check if the k8s node is in Ready state. | node | Health | Critical | Ensure node health by examining readiness conditions, signaling failures if any issues are detected in the node's status |
Deployment has insufficient replicas. | deployment | Health | Critical | Validate Deployments for the correct number of available replicas, highlighting any discrepancies between desired and available counts |
Pod has a high restart count. | pod | Health | Critical | Identify pods for all the namespaces where certain containers have restarted more than 10 times, indicating potential instability concerns |
Pod is in CrashLoopBackOff state. | pod | Health | Critical | Identify pods with containers stuck in a CrashLoopBackOff state, highlighting potential issues impacting pod stability for all the namespaces |
Service has endpoints that are NotReady. | service | Health | Severe | Highlights when services have NotReady endpoints, indicating potential disruptions to service reliability for all the namespaces |
Service has no endpoints. | service | Health | Severe | Identify services with no associated endpoints, highlighting potential misconfigurations impacting service connectivity |
Analyzing HPAs, checking if scale targets exist and have resources | pod | HPA | High | Analyze optimal Horizontal Pod Autoscaler (HPA) configurations by ensuring associated resources (Deployments, ReplicationControllers, ReplicaSets, StatefulSets) have defined resource limits for effective auto-scaling |
Check for the existence of Ingress class, service and secrets for all the namespaces | ingress | Ingress | High | Ensure proper Ingress configurations by validating associated services, secrets, and ingress classes, flagging issues if there are missing elements or misconfigured settings for all the namespaces |
Check the existence of secret in Daemonset | daemonset | Daemonset, Secret | High | Ensure the presence of referenced Secrets in Daemonset volumes, reporting failures for any missing Secret within all the namespaces |
Check the existence of secret in Deployment | secret | Deployment | High | Ensure the presence of referenced Secrets in Deployment volumes, reporting failures for any missing Secret for all the namespaces |
Excessive Pods on Node | node | Resource Limits | High | Assesses nodes for excessive pod counts, flagging potential issues if pods near capacity thresholds based on CPU and memory resources |
Find Deployments with missing configmap | configmap | Deployment | High | Ensure the presence of referenced ConfigMaps in Deployment volumes, reporting failures for any missing ConfigMap for all the namespaces |
Find Pending Pods | pod | Health | High | Ensure that Pods are not in a Pending state due to scheduling issues or container creation failures, and report relevant details for diagnostics |
Find Pods with missing configmap | pod | Pod, ConfigMap | High | Ensure the presence of referenced ConfigMaps in Pod containers and volumes, reporting failures for any missing ConfigMap for all the namespaces |
Find Pods with missing secrets | pod | Pod, Secret | High | Ensure the presence of referenced Secrets in Pod containers, reporting failures for any missing Secret for all the namespaces |
Insufficient PIDs on Node | node | Performance | High | Check if the nodes have remaining PIDs less than a set threshold |
Kubernetes Node Out-of-Memory Check | node | Performance | High | Checks if any Kubernetes node is using more than 85% of its memory capacity. |
Validate configmap existence in Statefulset | statefulset | StatefulSet | High | Ensure the existence of referenced ConfigMaps in StatefulSet volume claims and template volumes, reporting failures for any missing ConfigMap for all the namespaces |
Validate cronjob starting deadline | cronjob | CronJob | High | Ensure CronJobs have a non-negative starting deadline, reporting failures for negative values for all the namespaces |
Validate existence of configmaps in daemonsets | daemonset | DaemonSet, ConfigMap | High | Ensure the presence of referenced ConfigMaps in Daemonset volumes, reporting failures for any missing ConfigMap for all the namespaces |
Verify StatefulSet has valid service | statefulset | StatefulSet | High | Verify StatefulSet's service reference, ensuring it points to an existing service in all the namespaces, reporting failures for non-existent services |
Verify StatefulSet has valid storageClass | statefulset | StatefulSet | High | Validate StatefulSet's storage class, ensuring it references existing storage classes in the namespace, reporting failures for non-existent ones |
Zero Scale Deployment Check | deployment | Availability | High | Verify that Deployments have a non-zero replica count, preventing unintentional scaling down to zero |
Check if Kubernetes services have matching pod labels | service | Configuration | Medium | This check validates if Kubernetes service selectors match pod labels. This ensures proper routing & discovery of pods. |
Pod template validation in DaemonSet | daemonset | Resource Management | Medium | Checks that the Pod template within a DaemonSet is configured correctly according to certain threshold values. |
Services Target Port Match | service | Diagnostic | Medium | This check identifies service ports that do not match their target ports |
Validate that network policies are in place and configured correctly | networkpolicy | Network Security | Medium | Verify Network Policy configurations, highlighting issues if policies allow traffic to all pods or if not applied to any specific pods |
Zero scale detected in statefulset | statefulset | Availability | Medium | Check to ensure that no StatefulSets are scaled to zero as it might hamper availability. |
Find unused DaemonSet | daemonset | DaemonSet, Cost, Resource Optimization | Low | Any DaemonSet that has been created but has no associated pods and remained unused for over 30 days. |
Validate cronjobs schedule and state | cronjob | CronJob | Low | Ensure CronJobs have valid schedules and are not suspended, reporting failures for any invalid schedules or suspended jobs for all the namespaces |
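As an illustration of what a check of this kind inspects, the sketch below applies the "Pod has a high restart count" rule to pod objects that have already been decoded into plain dictionaries (the shape produced by `kubectl get pods -A -o json`). The function name and the `threshold` parameter are hypothetical and are not part of unctl; only the "more than 10 restarts" limit is taken from the table above.

```python
from typing import Iterable

# Taken from the check description: "restarted more than 10 times".
RESTART_THRESHOLD = 10


def find_high_restart_containers(
    pods: Iterable[dict], threshold: int = RESTART_THRESHOLD
) -> list[tuple[str, str, str, int]]:
    """Return (namespace, pod, container, restarts) for containers over the threshold.

    Hypothetical helper for illustration only; it is not unctl's implementation.
    """
    findings = []
    for pod in pods:
        meta = pod.get("metadata", {})
        # containerStatuses is absent until the containers have been created.
        statuses = pod.get("status", {}).get("containerStatuses") or []
        for status in statuses:
            restarts = status.get("restartCount", 0)
            if restarts > threshold:
                findings.append(
                    (
                        meta.get("namespace", "default"),
                        meta.get("name", ""),
                        status.get("name", ""),
                        restarts,
                    )
                )
    return findings
```

A container with exactly 10 restarts is not reported, because the description says "more than 10".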
mysql checks
Check | Service | Category | Severity | Description |
---|---|---|---|---|
Checks max used connections | global | Connection, Thread | High | Checks max used connections reaching max count |
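The comparison behind this check can be sketched as follows, assuming it compares the MySQL status variable `Max_used_connections` with the system variable `max_connections`. The function name is hypothetical, and treating "reaching max count" as "greater than or equal to" is an assumption; the threshold unctl actually applies is not stated here.

```python
def max_connections_reached(max_used_connections: int, max_connections: int) -> bool:
    """Return True when the connection high-water mark has hit the configured limit.

    Hypothetical helper for illustration only. The inputs correspond to
    SHOW GLOBAL STATUS LIKE 'Max_used_connections' and
    SHOW GLOBAL VARIABLES LIKE 'max_connections'.
    """
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    # Assumption: the check fails once the peak usage equals or exceeds the limit.
    return max_used_connections >= max_connections
```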
Getting Started
Prerequisites
- Python >= 3.10
Installation
- Get the distribution on your machine:
  - Run the pip command to install unctl from PyPI:
    pip install unctl
- Run unctl for your provider:
Kubernetes
- (optional) Set the KUBECONFIG variable to a location other than the default:
  export KUBECONFIG=<path to kube config file>
- Run the unctl command to see the list of options:
  unctl k8s -h
MySQL
- unctl uses ~/.my.cnf as its config path.
- Run the unctl command to see the list of options:
  unctl mysql -h
Development
- Install poetry:
pip install poetry
- Enter virtual env:
poetry shell
- Install dependencies:
poetry install
- Run tool:
python unctl.py -h
- Format all files before committing changes:
black .
Testing
See the testing documentation here.
Release
Releases in this repo are automated with Semantic Release. To generate changelogs, commits must follow the Conventional Commits practice. When a PR is merged to master, it is squashed and merged with the PR title as the commit message, so the PR title must be conventional:
feat(EN-4444): Add Button component
^    ^         ^
|    |         |__ Subject
|    |_______ Scope
|____________ Type
When the release job runs, it automatically bumps the version depending on the changes:
- BREAKING CHANGE: <message> - creates a new major version
- feat: <message> - creates a new minor version
- fix or perf: <message> - creates a new patch version
- All other tags will not create a new release
Usage
unctl
% unctl -h
usage: unctl [-h] [-v] {k8s,mysql} ...
Welcome to unSkript CLI Interface
options:
-h, --help show this help message and exit
-v, --version show program's version number and exit
unctl available providers:
{k8s,mysql}
To see the different available options on a specific provider, run:
unctl {provider} -h|--help
Provider
% unctl {provider} -h
usage: unctl {provider} [-h] [-f] [-c CHECKS [CHECKS ...]] [--sort-by {object,check}] [--categories CATEGORIES [CATEGORIES ...]]
[--services SERVICES [SERVICES ...]] [-l] [--list-categories] [--list-services] [-e | --explain | --no-explain]
[-r | --remediate | --no-remediate]
options:
-h, --help show this help message and exit
-f, --failing-only Show only failing checks
-c CHECKS [CHECKS ...], --checks CHECKS [CHECKS ...]
Filter checks by IDs
--sort-by {object,check}
Sort results by 'object' (default) or 'check'
--categories CATEGORIES [CATEGORIES ...]
Filter checks by category
--services SERVICES [SERVICES ...]
Filter checks by services
-l, --list-checks List available checks
--list-categories List available categories
--list-services List available services
Licensed features:
These features available only in a licensed version.
-e, --explain, --no-explain
Explain failures
-r, --remediate, --no-remediate
Create remediation plan
Roadmap
- K8s checks - in progress
- MySQL checks - in progress
- ElasticSearch checks
- AWS checks
- GCP checks
Contact
Abhishek Saxena: abhishek@unskript.com
Official website: https://unskript.com/