Verify Kubernetes deployments match a version manifest with deep stability auditing. Checks convergence, revision consistency, and pod health.
kubernify
Features
- Manifest-driven verification - Provide a JSON manifest of expected versions; kubernify verifies the cluster matches
- Deep stability auditing - Goes beyond version checks: convergence, revision consistency, pod health, DaemonSet scheduling, Job completion
- Retry-until-converged loop - Waits for rollouts to complete rather than just snapshot-checking
- Repository-relative image parsing - Flexible component name extraction from any image registry format
- Comprehensive workload support - Deployments, StatefulSets, DaemonSets, Jobs, and CronJobs
- Zero-replica awareness - Verifies version from PodSpec even when HPA/KEDA has scaled to zero
- Structured JSON reports - Machine-readable output for CI/CD pipeline integration
Installation
pip install kubernify
Or with pipx for isolated CLI usage:
pipx install kubernify
Or with uv:
uv add kubernify
Quick Start
# Verify backend and frontend match expected versions in the "production" namespace
kubernify \
--context my-cluster-context \
--anchor my-app \
--namespace production \
--manifest '{"backend": "v1.2.3", "frontend": "v1.2.4"}'
kubernify will connect to the cluster, discover all matching workloads, verify their image versions against the manifest, run stability audits, and exit with code 0 (pass) or 1 (fail).
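For pipelines that drive checks from Python, the same invocation can be wrapped with the standard library. The following is a minimal sketch, not part of kubernify itself: `build_command` and `verify` are hypothetical helper names, and the sketch relies only on the flags shown above and on the documented exit codes (0 for pass, 1 for fail).

```python
import json
import subprocess


def build_command(context: str, anchor: str, namespace: str,
                  manifest: dict[str, str]) -> list[str]:
    """Build the kubernify argv from the Quick Start flags (hypothetical helper)."""
    return [
        "kubernify",
        "--context", context,
        "--anchor", anchor,
        "--namespace", namespace,
        # The manifest is passed as a single JSON string argument.
        "--manifest", json.dumps(manifest),
    ]


def verify(context: str, anchor: str, namespace: str,
           manifest: dict[str, str]) -> bool:
    """Run kubernify and return True when it exits with code 0 (pass)."""
    command = build_command(context, anchor, namespace, manifest)
    # check=False: a failing verification is an expected outcome, not an exception.
    completed = subprocess.run(command, check=False)
    return completed.returncode == 0
```

Passing the arguments as a list avoids shell quoting issues with the JSON manifest.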
CLI Reference
kubernify [OPTIONS]
| Argument | Description | Default |
|---|---|---|
| --context | Kubeconfig context name to use for cluster connection. Mutually exclusive with --gke-project. When omitted, the active kubeconfig context is used automatically. | From kubeconfig |
| --gke-project | GCP project ID. Resolves the kube context from GKE-style context names (e.g., gke_my-project_us-central1_cluster-name). Mutually exclusive with --context. | |
| --anchor | (required) The image path segment used as the anchor point for component name extraction. For example, given image registry.example.com/my-org/my-app/backend:v1.0, using --anchor my-app extracts the component name backend. See How Image Anchor Works. | |
| --manifest | (required) JSON string containing the version manifest mapping component names to their expected versions, e.g. '{"backend": "v1.2.3", "frontend": "v2.0.0"}'. | |
| --component-aliases | JSON string mapping manifest component names to their actual image names when they differ. Example: '{"foo": "bar-baz"}' means the manifest key foo corresponds to the container image named bar-baz. Multiple manifest keys can alias to the same image name; disambiguation is performed by matching the manifest key against the Kubernetes workload name (substring match). See Component Aliases. | |
| --namespace | Kubernetes namespace to verify. Resolved automatically from kubeconfig context, in-cluster service account, or falls back to default. | From kubeconfig context |
| --required-workloads | Comma-separated substring patterns for workloads that must exist in the namespace, independent of the manifest. Useful for ensuring critical workloads (e.g., infrastructure sidecars, operators) are present even if they aren't version-verified. Each pattern is matched against discovered workload names using substring containment (e.g., frontend matches my-app-frontend). Verification fails if any pattern has no match. | |
| --skip-containers | Comma-separated substring patterns to skip during verification. Each pattern is matched against both container names and workload names using substring containment (e.g., backend matches my-app-backend). Skipped workloads are excluded from both version verification and stability audits. | |
| --min-uptime | Minimum pod uptime in seconds for stability checks. Pods running for less than this duration are flagged as unstable. | 0 |
| --restart-threshold | Maximum acceptable container restart count. Containers exceeding this threshold are flagged as unstable. Use 0 to forbid any restarts, or -1 to skip the restart check entirely. | 3 |
| --timeout | Global timeout in seconds for the verification loop. The tool retries discovery and verification until all checks pass or this timeout is reached. Returns exit code 1 (FAIL) on timeout. | 300 |
| --allow-zero-replicas | Allow all workloads with zero running replicas to pass verification (version is still checked via the pod spec template). Mutually exclusive with --allow-zero-replicas-for. | false |
| --allow-zero-replicas-for | Comma-separated list of workload name patterns allowed to have 0 running replicas (e.g., my-cronjob-worker,batch-processor). Uses substring matching: my-worker matches ns-123-my-worker. Mutually exclusive with --allow-zero-replicas. | |
| --dry-run | Perform a single snapshot check against the current cluster state without waiting for convergence. Exits immediately with pass/fail result. | false |
| --include-statefulsets | Include StatefulSets in workload discovery. By default, only Deployments are inspected. | false |
| --include-daemonsets | Include DaemonSets in workload discovery. By default, only Deployments are inspected. | false |
| --include-jobs | Include Jobs and CronJobs in workload discovery. By default, only Deployments are inspected. | false |
| --ignore-tombstone-pods | When set, pods in phase Failed or Succeeded (OOMKilled, Evicted, Completed scripts) are excluded from per-pod health checks, revision consistency checks, and container image extraction for version verification. These "gray" pods do not cause health check failures, false revision inconsistencies, or stale version reports. The deployment availability check (available_replicas >= spec.replicas) always runs regardless of this flag. | false |
| --output-file | Path to save the JSON verification report to a file. The report is always printed to stdout regardless of this flag. Parent directories are created automatically if they don't exist. | |
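Several flags above (--required-workloads, --skip-containers, --allow-zero-replicas-for) take comma-separated patterns that are matched by substring containment. The sketch below illustrates that matching rule in plain Python. It is not kubernify's code: the function names are hypothetical, and trimming whitespace around each pattern is an assumption.

```python
def parse_patterns(raw: str) -> list[str]:
    """Split a comma-separated flag value into non-empty patterns.

    Assumption: whitespace around each pattern is trimmed.
    """
    return [part.strip() for part in raw.split(",") if part.strip()]


def unmatched_patterns(raw: str, workload_names: list[str]) -> list[str]:
    """Return the patterns that are not a substring of any workload name."""
    return [
        pattern
        for pattern in parse_patterns(raw)
        if not any(pattern in name for name in workload_names)
    ]


names = ["my-app-frontend", "my-app-backend"]
print(unmatched_patterns("frontend, worker", names))  # ['worker']
```

For --required-workloads, a non-empty result of this kind corresponds to a verification failure, since every pattern must match at least one workload.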
Usage Examples
Basic Usage - Direct Kubeconfig Context
kubernify \
--context my-cluster-context \
--anchor my-app \
--namespace production \
--manifest '{"backend": "v1.2.3", "frontend": "v1.2.4"}'
GKE Shorthand - Resolve Context from GCP Project
kubernify \
--gke-project my-gke-project-123456 \
--anchor my-app \
--namespace production \
--manifest '{"backend": "v1.2.3", "frontend": "v1.2.4"}'
In-Cluster - Running Inside a Kubernetes Pod
# No --context needed; auto-detects in-cluster config and namespace
kubernify \
--anchor my-app \
--manifest '{"backend": "v1.2.3", "frontend": "v1.2.4"}'
Full-Featured - All Options
kubernify \
--context my-cluster-context \
--anchor my-app \
--namespace production \
--manifest '{"backend": "v1.2.3", "frontend": "v1.2.4", "worker": "v1.2.3"}' \
--required-workloads "backend, frontend, worker" \
--skip-containers "istio-proxy, envoy, fluent-bit" \
--include-statefulsets \
--include-daemonsets \
--include-jobs \
--min-uptime 120 \
--restart-threshold 5 \
--ignore-tombstone-pods \
--timeout 600 \
--allow-zero-replicas \
--output-file report.json
# OR selectively:
# --allow-zero-replicas-for "worker, cron-handler"
Dry Run - Snapshot Check Without Waiting
kubernify \
--context my-cluster-context \
--anchor my-app \
--manifest '{"backend": "v1.2.3"}' \
--dry-run
Save Report to File
kubernify \
--context my-cluster-context \
--anchor my-app \
--manifest '{"backend": "v1.2.3"}' \
--output-file /tmp/kubernify-report.json
The report is always printed to stdout. When --output-file is provided, it is additionally saved to the specified path. Parent directories are created automatically.
CI/CD Integration - GitHub Actions
jobs:
  verify-deployment:
    runs-on: ubuntu-latest
    steps:
      - name: Set up kubeconfig
        run: |
          echo "${{ secrets.KUBECONFIG }}" > /tmp/kubeconfig
          # Persist KUBECONFIG for later steps; a plain `export` only lasts for this step
          echo "KUBECONFIG=/tmp/kubeconfig" >> "$GITHUB_ENV"
      - name: Install kubernify
        run: pip install kubernify
      - name: Verify deployment
        # Assumes an earlier step with id `build` exposes a `manifest` output
        run: |
          kubernify \
            --context "${{ secrets.KUBE_CONTEXT }}" \
            --anchor my-app \
            --manifest '${{ steps.build.outputs.manifest }}' \
            --timeout 600 \
            --min-uptime 60
Programmatic Usage
kubernify can be used as a Python library for custom verification workflows:
from kubernify import __version__, VerificationStatus
from kubernify.kubernetes_controller import KubernetesController
from kubernify.workload_discovery import WorkloadDiscovery
from kubernify.cli import construct_component_map, verify_versions

# Connect to the cluster and discover the workloads in the namespace
controller = KubernetesController(context="my-cluster")
discovery = WorkloadDiscovery(k8s_controller=controller)
workloads, _ = discovery.discover_cluster_state(namespace="production")

# Map discovered workloads to manifest components via the image anchor
component_map = construct_component_map(
    workloads=workloads,
    manifest={"backend": "v1.2.3"},
    repository_anchor="my-app",
)

# Compare deployed versions against the manifest
results = verify_versions(manifest={"backend": "v1.2.3"}, component_map=component_map)
if results.errors:
    print(f"Verification failed: {results.errors}")
How Image Anchor Works
kubernify uses a repository-relative anchor to extract component names from container image paths. The --anchor argument specifies the path segment after which the component name is derived.
Image: registry.example.com/my-org-foo/my-app-bar/backend:v1.2.3-x
       └──── registry ─────┘ └─ org ─┘ └ anchor ┘└ comp.┘└─ tag ─┘
More examples:
| Image | --anchor | Extracted Component |
|---|---|---|
| registry.example.com/my-org/my-app/backend:v1.2.3 | my-app | backend |
| registry.example.com/my-org/my-app/api/server:v2.0.0 | my-app | api/server |
| gcr.io/my-project/my-app/worker:v1.0.0 | my-app | worker |
The extracted component name is then matched against the keys in your --manifest JSON to verify the correct version is deployed.
Component Aliases
Use --component-aliases when a manifest component name differs from the container image name extracted by the anchor.
Basic Alias (One-to-One)
If your manifest uses the key foo but the container image is named bar-baz:
kubernify \
--anchor my-app \
--manifest '{"foo": "v1.0.0", "backend": "v2.0.0"}' \
--component-aliases '{"foo": "bar-baz"}'
This tells kubernify: when you see image bar-baz, map it to the manifest key foo.
Shared Image Alias (Many-to-One)
Multiple manifest components can share the same container image name. kubernify disambiguates by matching each manifest key against the Kubernetes workload name (substring match).
For example, if both ingest and process use the same shared-svc image but run as separate workloads:
kubernify \
--anchor my-app \
--manifest '{"ingest": "v1.0.0", "process": "v1.0.0"}' \
--component-aliases '{"ingest": "shared-svc", "process": "shared-svc"}' \
--include-statefulsets
Given these workloads in the cluster:
- Deployment my-app-123-ingest → image shared-svc:v1.0.0 → mapped to manifest key ingest (because "ingest" is a substring of "my-app-123-ingest")
- StatefulSet my-app-123-process-node → image shared-svc:v1.0.0 → mapped to manifest key process (because "process" is a substring of "my-app-123-process-node")
Resolution priority when multiple candidates exist for the same image:
- If only one candidate → use it directly
- If multiple candidates → pick the one whose manifest key is a substring of the workload name
- If no candidate matches the workload name → fall back to the raw image component name (if it's in the manifest)
- If nothing matches → the workload is skipped (not mapped to any manifest key)
Exit Codes
| Code | Meaning | Description |
|---|---|---|
| 0 | PASS | All workloads match the manifest and pass stability audits |
| 1 | FAIL | One or more workloads have version mismatches, stability issues, or the verification timed out |
Report Output
kubernify outputs a structured JSON report to stdout. Use --output-file to additionally save the report to a file. The report contains:
- timestamp - ISO 8601 UTC timestamp of report generation
- context - Kubeconfig context name of the verified cluster
- namespace - Kubernetes namespace that was inspected
- status - Overall verification status (PASS or FAIL)
- summary - Aggregated counts (see below)
- details - Per-component verification details
Summary Fields
| Field | Description |
|---|---|
| total_components | Total number of components in the manifest |
| passing_components | Components in PASS state (version match and stable workloads) |
| failed_components | Total components in FAIL state (version mismatch or stability failure) |
| missing_components | Components in the manifest not found in the cluster |
| missing_workloads | Expected workloads not found during discovery |
| version_mismatched_components | Components where at least one workload has a version mismatch |
| unstable_workloads | Individual workloads with stability audit errors (pods not ready, convergence issues, etc.) |
| skipped_containers | Containers excluded from verification by skip patterns |
Component Details
Each component in details contains:
- status - PASS or FAIL. A component is FAIL if it has version mismatches OR stability errors.
- errors - List of version-level error messages
- workloads - List of workloads with failures (only workloads with issues are included)
Each workload entry contains:
- name - Kubernetes workload name
- type - Workload type (Deployment, StatefulSet, DaemonSet, Job)
- container - Container name
- version_error - Version mismatch error (null if version matches)
- stability - Stability audit result with boolean checks and error list
Stability Flags
Each workload's stability object contains the following fields:
| Flag | Description |
|---|---|
| converged | Whether the controller has processed the latest spec changes (observedGeneration >= generation). Applies to Deployments, StatefulSets, and DaemonSets. Always true for Jobs and CronJobs. |
| revision_consistent | Whether all pods have the expected revision hash (pod-template-hash for Deployments, controller-revision-hash for StatefulSets/DaemonSets). Detects stale pods from previous rollouts. Always true for Jobs and CronJobs. |
| pods_healthy | Whether all pods are Ready, not terminating, within restart thresholds, and not in error states (CrashLoopBackOff, ImagePullBackOff). Also checks minimum uptime if configured via --min-uptime. |
| scheduling_complete | Whether DaemonSet scheduling is satisfied (available and updated pods >= desired count). Always true for non-DaemonSet workloads. |
| job_complete | Whether a Job has succeeded without exceeding its backoff limit. Always true for non-Job workloads. |
| errors | List of specific error messages explaining why any of the above checks failed. Empty when all checks pass. |
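Two of these checks are simple enough to show in isolation. The sketch below works on plain dictionaries shaped like Kubernetes API objects (`metadata.generation` and `status.observedGeneration` are standard Kubernetes fields) and mirrors the documented --restart-threshold semantics. The function names are hypothetical, and this is not kubernify's code.

```python
def is_converged(workload: dict) -> bool:
    """True when status.observedGeneration has caught up with metadata.generation."""
    generation = workload["metadata"]["generation"]
    # Assumption: a missing observedGeneration is treated as "not yet observed".
    observed = workload.get("status", {}).get("observedGeneration", 0)
    return observed >= generation


def exceeds_restart_threshold(restart_count: int, threshold: int) -> bool:
    """Apply the documented threshold rules: -1 skips the check, 0 forbids restarts."""
    if threshold < 0:
        return False  # check disabled
    return restart_count > threshold


deployment = {"metadata": {"generation": 4}, "status": {"observedGeneration": 3}}
print(is_converged(deployment))          # False
print(exceeds_restart_threshold(4, 3))   # True
print(exceeds_restart_threshold(4, -1))  # False
```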
Example Output
{
  "timestamp": "2025-01-15T10:30:00.000000+00:00",
  "context": "my-cluster-context",
  "namespace": "production",
  "status": "FAIL",
  "summary": {
    "total_components": 2,
    "passing_components": 1,
    "failed_components": 1,
    "missing_components": 0,
    "missing_workloads": 0,
    "version_mismatched_components": 0,
    "unstable_workloads": 1,
    "skipped_containers": 0
  },
  "details": {
    "frontend": {
      "status": "PASS",
      "errors": [],
      "workloads": []
    },
    "backend": {
      "status": "FAIL",
      "errors": [],
      "workloads": [
        {
          "name": "my-app-backend",
          "type": "Deployment",
          "container": "backend",
          "version_error": null,
          "stability": {
            "converged": true,
            "revision_consistent": true,
            "pods_healthy": false,
            "scheduling_complete": true,
            "job_complete": true,
            "errors": [
              "Pod my-app-backend-7f8b9c6d4-x2k9m is not Ready",
              "Deployment availability insufficient: 0/1 pods available (0 ready; tombstone pods excluded by Kubernetes controller)"
            ]
          }
        }
      ]
    }
  }
}
Note: version_mismatched_components counts only components with version verification failures. failed_components counts all components in FAIL state, including those that passed version verification but have unstable workloads. A component's status is FAIL if either its version verification failed or any of its workloads have stability errors.
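A report saved with --output-file can be post-processed in a pipeline. The sketch below flattens the failures of a report into one line per error, using only the fields described in this section. `failure_lines` is a hypothetical helper name, and the report.json path in the usage note is a placeholder.

```python
import json
from pathlib import Path


def failure_lines(report: dict) -> list[str]:
    """Collect one readable line per error from a kubernify JSON report."""
    lines = []
    for component, detail in report.get("details", {}).items():
        if detail.get("status") != "FAIL":
            continue
        # Version-level errors reported for the component as a whole.
        for error in detail.get("errors", []):
            lines.append(f"{component}: {error}")
        for workload in detail.get("workloads", []):
            prefix = f"{component}/{workload['name']}"
            # version_error is null (None) when the version matches.
            if workload.get("version_error"):
                lines.append(f"{prefix}: {workload['version_error']}")
            # Stability audit errors for this workload.
            for error in workload.get("stability", {}).get("errors", []):
                lines.append(f"{prefix}: {error}")
    return lines


def load_report(path: str) -> dict:
    """Read a report previously written via --output-file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

For example, `failure_lines(load_report("report.json"))` applied to the example output above would yield two lines, both prefixed with `backend/my-app-backend`.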
Prerequisites
Python
- Python >= 3.10
For GKE Users
If using --gke-project for automatic GKE context resolution:
- Install the Google Cloud SDK
- Install the GKE auth plugin:
gcloud components install gke-gcloud-auth-plugin
- Authenticate:
gcloud auth login
gcloud container clusters get-credentials CLUSTER_NAME --project PROJECT_ID
RBAC Permissions
kubernify requires read-only access to workloads and pods. Apply the following RBAC configuration:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kubernify-reader
  namespace: <namespace>
rules:
  - apiGroups: ["apps"]
    resources: ["deployments", "statefulsets", "daemonsets", "replicasets"]
    verbs: ["get", "list"]
  - apiGroups: ["batch"]
    resources: ["jobs", "cronjobs"]
    verbs: ["get", "list"]
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubernify-reader-binding
  namespace: <namespace>
subjects:
  - kind: ServiceAccount
    name: kubernify
    namespace: <namespace>
roleRef:
  kind: Role
  name: kubernify-reader
  apiGroup: rbac.authorization.k8s.io
Contributing
Contributions are welcome! Please see CONTRIBUTING.md for development setup, coding standards, and the PR process.
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.