
In-Cluster Checks


A generic framework for running health validation rules directly on OpenShift cluster nodes using oc debug.

Overview

This framework provides infrastructure for:

  • Running validation rules on cluster nodes via oc debug
  • Parallel execution of rules across multiple nodes
  • Secret filtering and output formatting
  • Prerequisite checking and domain orchestration
  • Insights-compatible JSON output
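
Under the hood, running a rule on a node amounts to wrapping a host command in `oc debug`. The sketch below shows the general shape of that wrapping; `build_debug_cmd` and the node name are illustrative, not part of the framework's API:

```python
def build_debug_cmd(node: str, host_cmd: list[str]) -> list[str]:
    """Wrap a host command in `oc debug`, chrooting into the node's root filesystem."""
    return ["oc", "debug", f"node/{node}", "--", "chroot", "/host", *host_cmd]

# The resulting argv would be handed to subprocess.run() against a logged-in cluster.
cmd = build_debug_cmd("worker-0", ["df", "-h", "/"])
print(" ".join(cmd))
```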

Originally developed as part of Red Hat's Pendrive project, this framework was extracted and released as open source to benefit the broader OpenShift community.

Features

  • Generic validation framework: Base classes for creating custom health check rules
  • OpenShift integration: Direct node access via oc debug with persistent connections
  • Domain organization: Group related rules into domains (hardware, network, linux, storage)
  • Parallel execution: Run rules concurrently across multiple nodes
  • Secret filtering: Automatic redaction of sensitive data from outputs
  • Extensible: Easy to add new rules and domains
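
To illustrate the secret-filtering idea, here is a minimal regex-based redaction sketch. The patterns and the `redact` helper are illustrative only; the framework's actual redaction rules may differ:

```python
import re

# Illustrative patterns only -- the real filter likely covers many more cases.
SECRET_PATTERNS = [
    re.compile(r"(?i)(password|token|secret)\s*[:=]\s*\S+"),
]

def redact(text: str) -> str:
    """Replace anything matching a secret pattern with a redaction marker."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(r"\1=[REDACTED]", text)
    return text

print(redact("password: hunter2"))
```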

Installation

pip install in-cluster-checks

Quick Start

First, ensure you're logged into your OpenShift cluster:

oc login https://api.your-cluster.com:6443

Then run the checks:

# Run all checks (output saved to ./cluster-checks.json)
openshift-checks --output ./cluster-checks.json

# Run with debug logging
openshift-checks --log-level DEBUG

# Debug a specific rule (disables secret filtering)
openshift-checks --debug-rule "check_disk_usage"

# List available domains
openshift-checks --list-domains

# List all available rules
openshift-checks --list-rules
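
The JSON report can then be post-processed like any other file. The field names below are hypothetical, for illustration only; consult the actual output for the real Insights-compatible schema:

```python
import json

# Hypothetical report shape -- the real schema may differ.
report = json.loads("""
{
  "results": [
    {"rule": "check_disk_usage", "node": "worker-0", "status": "passed"},
    {"rule": "check_clock_sync", "node": "worker-1", "status": "failed"}
  ]
}
""")

failed = [r for r in report["results"] if r["status"] == "failed"]
print(f"{len(failed)} rule(s) failed")
```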

Programmatic Usage

You can also use the framework programmatically in your Python code:

from in_cluster_checks.runner import InClusterCheckRunner
from pathlib import Path

# Create runner with default configuration
runner = InClusterCheckRunner()

# Or customize configuration
runner = InClusterCheckRunner(
    max_workers=75,           # Maximum concurrent workers (default: 50)
    filter_secrets=True,      # Filter sensitive data from output (default: True)
    debug_rule_flag=False,    # Enable debug mode (default: False)
    debug_rule_name="",       # Specific rule to debug (default: "")
)

# Run checks and save results
output_path = Path("./results/cluster-checks.json")
runner.run(output_path=output_path)

Configuration Options

  • max_workers (int, default: 50): Maximum number of concurrent workers for parallel execution
  • filter_secrets (bool, default: True): Whether to filter sensitive data from output. Automatically disabled in debug mode
  • debug_rule_flag (bool, default: False): Enable debug mode with verbose output
  • debug_rule_name (str, default: ""): Name of specific rule to run in debug mode

Note: When debug mode is enabled:

  • Only the specified rule runs
  • Secret filtering is automatically disabled
  • JSON output is disabled (real-time console output only)
  • Detailed command execution logs are printed

Architecture

Core Components

  • Rule: Base class for validation rules
  • RuleDomain: Orchestrator for groups of related rules
  • Operator: Command execution abstraction
  • NodeExecutor: Execute commands on cluster nodes via oc debug
  • LoggerInterface: Pluggable logging abstraction
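
The parallel-execution idea can be sketched with the standard library alone. The function below is a stand-in for the real `Operator`/`NodeExecutor` interfaces, which differ in detail:

```python
from concurrent.futures import ThreadPoolExecutor

def run_rule_on_node(rule_name: str, node: str) -> tuple[str, str, str]:
    """Stand-in for executing one rule on one node via oc debug."""
    return (rule_name, node, "passed")

nodes = ["worker-0", "worker-1", "worker-2"]
rules = ["check_disk_usage", "check_clock_sync"]

# Fan every (rule, node) pair out across a worker pool, then gather results.
with ThreadPoolExecutor(max_workers=50) as pool:
    futures = [pool.submit(run_rule_on_node, r, n) for r in rules for n in nodes]
    results = [f.result() for f in futures]

print(len(results))  # one result per (rule, node) pair
```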

Built-in Domains

  • Hardware: Disk usage, memory, CPU, temperature validation
  • Network: OVS, DNS, bonding checks
  • Linux: systemd, SELinux, clock synchronization
  • Storage: Storage validation rules
  • Hardware/Firmware Details: Informational collectors for hardware inventory

Creating Custom Rules

from in_cluster_checks.core.rule import Rule
from in_cluster_checks.core.rule_result import RuleResult
from in_cluster_checks.utils.enums import Objectives

class MyCustomRule(Rule):
    """Example custom validation rule."""

    objective_hosts = [Objectives.ALL_NODES]

    def set_document(self):
        self.unique_name = "my_custom_rule"
        self.title = "My Custom Validation Rule"

    def run_rule(self):
        # Run validation logic
        return_code, stdout, stderr = self.run_cmd("my-command")

        if return_code == 0:
            return RuleResult.passed("Validation passed")
        else:
            return RuleResult.failed(f"Validation failed: {stderr}")

Requirements

  • Python 3.9+
  • OpenShift CLI (oc) installed and configured
  • Access to OpenShift cluster

Development

# Clone repository
git clone https://github.com/sprizend-rh/in-cluster-checks.git
cd in-cluster-checks

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run pre-commit checks
pre-commit run --all-files

License

GNU General Public License v3.0 or later

See LICENSE for full text.

Contributing

Contributions are welcome! Please feel free to submit pull requests or open issues.

Related Projects

  • Pendrive: Red Hat's on-premise Insights validation tool (internal)
  • OpenShift: Container orchestration platform

Acknowledgments

This framework was extracted from Red Hat's Pendrive project. The core validation infrastructure is generic and contains no confidential logic, making it suitable for open-source release to benefit the wider OpenShift community.
