RHOAI tool kit for managing and upgrading RHOAI
RHOAI Tool Kit
A comprehensive toolkit for managing and upgrading Red Hat OpenShift AI (RHOAI) installations with parallel installation support.
Table of Contents
- Features
- Project Structure
- Installation
- Usage
- Logging
- Configuration
- Development
- Troubleshooting
- Contributing
Features
- Install single or multiple OpenShift operators
- Parallel installation for faster deployments
- Configurable timeouts and retries
- Comprehensive logging system
- Supports:
  - Serverless Operator
  - Service Mesh Operator
  - Authorino Operator
  - cert-manager Operator (Kueue dependency)
  - RHOAI Operator
  - Kueue Operator with DSC Integration
  - KEDA (Custom Metrics Autoscaler) Operator
- Automatic Dependency Resolution: Installs required operators in correct order
- Smart Validation: Pre-installation compatibility and conflict detection
- Kueue DSC Integration: Automatically updates the RHOAI DataScienceCluster with the Kueue management state
Project Structure
rhoshift/
├── rhoshift/                  # Main package directory
│   ├── __init__.py
│   ├── main.py                # CLI entry point
│   ├── cli/                   # Command-line interface
│   │   ├── __init__.py
│   │   ├── args.py            # Argument parsing
│   │   └── commands.py        # Command implementations
│   ├── logger/                # Logging utilities
│   │   ├── __init__.py
│   │   └── logger.py          # Logging configuration
│   └── utils/                 # Core utilities
│       ├── __init__.py
│       ├── constants.py       # Constants and configurations
│       ├── operator.py        # Operator management
│       └── utils.py           # Utility functions
├── run_upgrade_matrix.sh      # Upgrade matrix execution script
├── upgrade_matrix_usage.md    # Upgrade matrix documentation
├── pyproject.toml             # Project dependencies and configuration
└── README.md                  # This document
Components
Core Components
- CLI: Command-line interface for operator management
- Logger: Logging configuration and utilities (logs to /tmp/rhoshift.log)
- Utils: Core utilities and operator management logic
RHOAI Components
- RHOAI Upgrade Matrix: Utilities for testing RHOAI upgrades
- Upgrade Matrix Scripts: Execution and documentation for upgrade testing
Maintenance Scripts
- Cleanup Scripts: Utilities for cleaning up operator installations
- Worker Node Scripts: Utilities for managing worker node configurations
Installation
- Clone the repository:
git clone https://github.com/mwaykole/O.git
cd O
- Install dependencies:
pip install -e .
- Verify installation:
rhoshift --help
New CLI Options
rhoshift --help
usage: rhoshift [-h] [--serverless] [--servicemesh] [--authorino] [--cert-manager]
[--rhoai] [--kueue [{Managed,Unmanaged}]] [--keda] [--all] [--cleanup]
[--deploy-rhoai-resources] [--summary] [--oc-binary OC_BINARY]
[--retries RETRIES] [--retry-delay RETRY_DELAY] [--timeout TIMEOUT]
[--rhoai-channel RHOAI_CHANNEL] [--raw RAW] [--rhoai-image RHOAI_IMAGE]
Operator Selection:
--serverless Install OpenShift Serverless Operator
--servicemesh Install Service Mesh Operator
--authorino Install Authorino Operator
--cert-manager Install cert-manager Operator (latest v1.16.1)
--rhoai Install RHOAI Operator
--kueue [{Managed,Unmanaged}] Install Kueue Operator with DSC managementState (default: Unmanaged)
--keda Install KEDA (Custom Metrics Autoscaler) Operator
--all Install all operators
--cleanup Clean up all RHOAI, serverless, servicemesh, Authorino operators
--deploy-rhoai-resources Create DSC and DSCI with RHOAI installation
--summary Show detailed summary of all supported operators and versions
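The `[--kueue [{Managed,Unmanaged}]]` entry in the usage string has the shape that Python's argparse prints for a flag with an optional value. The sketch below shows one way such a flag could be declared. It is an illustration only, not the actual contents of `cli/args.py`; `build_parser` is a hypothetical name, and only three of the flags listed above are reproduced.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering a few of the flags listed above."""
    parser = argparse.ArgumentParser(prog="rhoshift")
    parser.add_argument(
        "--keda",
        action="store_true",
        help="Install KEDA (Custom Metrics Autoscaler) Operator",
    )
    # nargs="?" makes the value optional: a bare --kueue yields `const`,
    # while omitting the flag entirely yields `default` (None here).
    parser.add_argument(
        "--kueue",
        nargs="?",
        const="Unmanaged",
        default=None,
        choices=["Managed", "Unmanaged"],
        help="Install Kueue Operator with DSC managementState",
    )
    parser.add_argument(
        "--timeout",
        type=int,
        default=300,
        help="Command timeout in seconds",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--kueue", "--keda"])
    print(args.kueue, args.keda, args.timeout)
```

With this declaration, `None` means Kueue was not requested at all, which keeps "not selected" distinct from "selected with the default state".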
Usage
Basic Commands
# Install single operator
rhoshift --serverless
# Install multiple operators
rhoshift --serverless --servicemesh
# Install cert-manager operator
rhoshift --cert-manager
# Install Kueue operator with default managementState (Unmanaged)
# Automatically installs cert-manager dependency
rhoshift --kueue
# Install Kueue operator with specific managementState in DSC
rhoshift --kueue Managed # Sets Kueue as Managed in DSC
rhoshift --kueue Unmanaged # Sets Kueue as Unmanaged in DSC
# Install KEDA (Custom Metrics Autoscaler) operator
rhoshift --keda
# Install RHOAI with raw configuration
rhoshift --rhoai --rhoai-channel=<channel> --rhoai-image=<image> --raw=True
# Install RHOAI with Serverless configuration
rhoshift --rhoai --rhoai-channel=<channel> --rhoai-image=<image> --raw=False --all
# Install all operators (Kueue will be set to Unmanaged in DSC)
rhoshift --all
# Create DSC and DSCI with RHOAI operator installation
rhoshift --rhoai --deploy-rhoai-resources
# Clean up all operators
rhoshift --cleanup
Operator Dependencies & Validation
The tool automatically handles operator dependencies and provides smart validation:
Automatic Dependency Resolution
- Kueue requires cert-manager: Installing Kueue automatically includes cert-manager
- Dependencies are installed in the correct order to prevent failures
- Missing dependencies are automatically detected and added
# This command will install BOTH cert-manager AND Kueue (in correct order)
# Kueue will be set to Unmanaged in DSC (if DSC exists)
rhoshift --kueue
# You'll see output like:
# Auto-adding dependency: cert-manager
# Installing 2 operators in order: cert-manager → kueue
# Updating DSC with Kueue managementState: Unmanaged
# ✅ Successfully updated DSC with Kueue managementState: Unmanaged
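The ordering step described above can be sketched as a small depth-first walk over a dependency map. This is a minimal illustration, not the toolkit's own code: `DEPENDENCIES` and `resolve_install_order` are hypothetical names, and the map contains only the Kueue to cert-manager relationship that this README documents.

```python
from typing import Dict, List, Set

# Hypothetical dependency map; the single entry mirrors this README.
DEPENDENCIES: Dict[str, List[str]] = {"kueue": ["cert-manager"]}


def resolve_install_order(selected: List[str]) -> List[str]:
    """Return the selected operators with their dependencies placed first."""
    ordered: List[str] = []
    in_progress: Set[str] = set()

    def visit(name: str) -> None:
        if name in ordered:
            return  # already scheduled, nothing to do
        if name in in_progress:
            raise ValueError(f"dependency cycle detected at {name!r}")
        in_progress.add(name)
        for dependency in DEPENDENCIES.get(name, []):
            visit(dependency)  # dependencies are scheduled before the operator
        in_progress.discard(name)
        ordered.append(name)

    for operator in selected:
        visit(operator)
    return ordered


if __name__ == "__main__":
    print(resolve_install_order(["kueue", "keda"]))
```

Operators without dependencies keep the order in which they were requested, and a dependency that was also requested explicitly is scheduled only once.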
Smart Validation
- Compatibility Checking: Warns about potential operator conflicts
- Namespace Validation: Detects if operators conflict in shared namespaces
- Pre-Installation Validation: Catches issues before installation starts
# Example validation warnings:
# ⚠️ Note: Kueue and KEDA may have resource conflicts. Monitor for admission webhook issues.
# ⚠️ Installation order will be adjusted for dependencies: cert-manager → kueue
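A pre-installation compatibility check of this kind can be as simple as a lookup against a table of known conflicting pairs. The sketch below assumes such a table exists; `KNOWN_CONFLICTS` and `validate_selection` are hypothetical names, and the only entry reuses the Kueue/KEDA warning text quoted above.

```python
from typing import Dict, FrozenSet, List

# Hypothetical conflict table; the single entry echoes the warning in this README.
KNOWN_CONFLICTS: Dict[FrozenSet[str], str] = {
    frozenset({"kueue", "keda"}): (
        "Kueue and KEDA may have resource conflicts. "
        "Monitor for admission webhook issues."
    ),
}


def validate_selection(selected: List[str]) -> List[str]:
    """Return one warning for every known conflicting pair in the selection."""
    chosen = set(selected)
    return [
        message
        for pair, message in KNOWN_CONFLICTS.items()
        if pair <= chosen  # both operators of the pair were requested
    ]


if __name__ == "__main__":
    for warning in validate_selection(["kueue", "keda"]):
        print(f"Note: {warning}")
```

Returning warnings instead of raising lets the caller decide whether a conflict should block the installation or only be reported.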
Supported Dependencies
| Primary Operator | Required Dependencies |
|---|---|
| Kueue | cert-manager |
Note: When installing Kueue individually (--kueue), you will see dependency warnings. For automatic dependency installation, use batch mode (--cert-manager --kueue) or install dependencies manually first.
Kueue DSC Integration
New Feature: Kueue operator installation now automatically updates the RHOAI DataScienceCluster (DSC) when a management state is specified.
Kueue Management States
- Managed: RHOAI controls Kueue configuration and lifecycle
- Unmanaged: Kueue runs independently, not managed by RHOAI
Usage Examples
# Install Kueue as Managed (RHOAI controls it)
rhoshift --kueue Managed
# Install Kueue as Unmanaged (independent operation) - DEFAULT
rhoshift --kueue Unmanaged
rhoshift --kueue # Same as above
# Switch between states (updates existing DSC)
rhoshift --kueue Managed # Change to Managed
rhoshift --kueue Unmanaged # Change back to Unmanaged
Behavior
- DSC Exists: Automatically updates Kueue managementState in existing DSC
- No DSC: Shows info message that state will be applied when DSC is created
- Error Handling: Graceful warnings if DSC update fails
Output Examples
# When DSC exists and gets updated:
Updating DSC with Kueue managementState: Unmanaged
✅ Successfully updated DSC with Kueue managementState: Unmanaged
# When no DSC exists:
ℹ️ No existing DSC found. Kueue managementState will be applied when DSC is created.
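For readers who want to see what such a DSC update might look like at the `oc` level, the sketch below builds a merge-patch command without running it. It rests on two assumptions that should be checked against your cluster: that the DataScienceCluster is named `default-dsc`, and that the Kueue state lives at `spec.components.kueue.managementState`. `build_kueue_patch_command` is a hypothetical helper, not part of rhoshift.

```python
import json
from typing import List


def build_kueue_patch_command(
    state: str,
    dsc_name: str = "default-dsc",  # assumed DSC name, verify on your cluster
    oc_binary: str = "oc",
) -> List[str]:
    """Build (but do not run) an `oc patch` command that sets the Kueue state."""
    if state not in ("Managed", "Unmanaged"):
        raise ValueError(f"unsupported managementState: {state!r}")
    # Assumed field path inside the DataScienceCluster spec.
    patch = {"spec": {"components": {"kueue": {"managementState": state}}}}
    return [
        oc_binary, "patch", "datasciencecluster", dsc_name,
        "--type", "merge",
        "-p", json.dumps(patch),
    ]


if __name__ == "__main__":
    print(" ".join(build_kueue_patch_command("Managed")))
```

Building the argument list separately from executing it makes the command easy to log or inspect before it touches the cluster.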
Advanced Options
# Custom oc binary path
rhoshift --serverless --oc-binary /path/to/oc
# Custom timeout (seconds)
rhoshift --all --timeout 900
# Install queue management and auto-scaling operators together
# (cert-manager will be automatically installed as Kueue dependency)
rhoshift --kueue Managed --keda
# Install complete ML/AI stack with queue management
rhoshift --rhoai --kueue Managed --keda --rhoai-channel=stable --rhoai-image=quay.io/rhoai/rhoai-fbc-fragment:rhoai-2.25-nightly
# Show summary of all supported operators and their versions
rhoshift --summary
# Install only cert-manager for other uses
rhoshift --cert-manager
# Verbose output
rhoshift --all --verbose
Upgrade Matrix Testing
To run the upgrade matrix tests, you can use either method:
- Using the shell script:
./run_upgrade_matrix.sh [options] <current_version> <current_channel> <new_version> <new_channel>
- Using the Python command:
run-upgrade-matrix [options] <current_version> <current_channel> <new_version> <new_channel>
Options:
- -s, --scenario: Run specific scenario(s) (serverless, rawdeployment, or serverless,rawdeployment)
- --skip-cleanup: Skip cleanup before each scenario
- --from-image: Custom source image path
- --to-image: Custom target image path
Example:
# Using shell script
./run_upgrade_matrix.sh -s serverless -s rawdeployment 2.10 stable 2.12 stable
# Using Python command
run-upgrade-matrix -s serverless -s rawdeployment 2.10 stable 2.12 stable
Logging
The toolkit uses a comprehensive logging system:
- Logs are stored in /tmp/rhoshift.log
- Console output shows INFO level and above
- File logging captures DEBUG level and above
- Automatic log rotation (10MB max size, 5 backup files)
- Colored output in supported terminals
To view logs:
tail -f /tmp/rhoshift.log
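The behaviour listed above maps closely onto Python's standard `logging` module. The following sketch shows one way to get a rotating DEBUG file log alongside an INFO console log, with both levels overridable through the `LOG_FILE_LEVEL` and `LOG_CONSOLE_LEVEL` environment variables. It is an approximation under stated assumptions, not the contents of `logger/logger.py`: `build_logger` and the message format are hypothetical, and colored output is left out.

```python
import logging
import os
from logging.handlers import RotatingFileHandler


def build_logger(log_path: str = "/tmp/rhoshift.log",
                 name: str = "rhoshift") -> logging.Logger:
    """Configure a logger with a rotating file handler and a console handler."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # accept everything; handlers do the filtering
    logger.handlers.clear()         # avoid duplicate handlers on re-configuration

    # Rotate at 10 MB and keep 5 backups, as described in this README.
    file_handler = RotatingFileHandler(
        log_path, maxBytes=10 * 1024 * 1024, backupCount=5
    )
    file_handler.setLevel(os.environ.get("LOG_FILE_LEVEL", "DEBUG").upper())

    console_handler = logging.StreamHandler()
    console_handler.setLevel(os.environ.get("LOG_CONSOLE_LEVEL", "INFO").upper())

    # Hypothetical format string; the real one may differ.
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    for handler in (file_handler, console_handler):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.debug("written to the file only")
    log.info("written to the file and the console")
```

Setting the logger itself to DEBUG and filtering on the handlers is what allows the file and the console to show different levels.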
Configuration
Environment Variables
- LOG_FILE_LEVEL: Set file logging level (default: DEBUG)
- LOG_CONSOLE_LEVEL: Set console logging level (default: INFO)
Command Options
- --oc-binary: Path to oc CLI (default: oc)
- --retries: Max retry attempts (default: 3)
- --retry-delay: Delay between retries in seconds (default: 10)
- --timeout: Command timeout in seconds (default: 300)
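How the retry, delay, and timeout options interact can be illustrated with a small wrapper around `subprocess.run`. This is a sketch rather than the toolkit's implementation: `run_with_retries` is a hypothetical helper, and it assumes that `--retries` counts total attempts and that `--timeout` applies to each attempt separately.

```python
import subprocess
import time
from typing import List, Optional


def run_with_retries(
    cmd: List[str],
    retries: int = 3,         # assumed to mean total attempts
    retry_delay: float = 10,  # seconds to wait between attempts
    timeout: float = 300,     # seconds allowed per attempt
) -> subprocess.CompletedProcess:
    """Run a command until it exits with 0 or the attempts are exhausted."""
    last_error: Optional[Exception] = None
    for attempt in range(1, retries + 1):
        try:
            result = subprocess.run(
                cmd, capture_output=True, text=True, timeout=timeout
            )
            if result.returncode == 0:
                return result
            last_error = RuntimeError(
                f"exit code {result.returncode}: {result.stderr.strip()}"
            )
        except subprocess.TimeoutExpired as exc:
            last_error = exc
        if attempt < retries:
            time.sleep(retry_delay)  # no sleep after the final attempt
    raise RuntimeError(f"command failed after {retries} attempts: {last_error}")


if __name__ == "__main__":
    # Requires the oc CLI on PATH; shown only as a usage example.
    print(run_with_retries(["oc", "whoami"], retries=3, retry_delay=10).stdout)
```

Under these assumptions the worst-case wall time for one command is roughly `retries * timeout + (retries - 1) * retry_delay`, which is worth keeping in mind when raising `--timeout` on a slow cluster.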
Development
Prerequisites
- Python 3.8 or higher
- OpenShift CLI (oc)
- Access to an OpenShift cluster
Running Tests
pytest tests/
Troubleshooting
Common Issues
- Operator Installation Fails
  - Check cluster access: oc whoami
  - Verify operator catalog: oc get catalogsource
  - Check logs: tail -f /tmp/rhoshift.log
- Permission Issues
  - Ensure you have cluster-admin privileges
  - Check namespace permissions
- Timeout Errors
  - Increase timeout: --timeout 900
  - Check cluster resources
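The first troubleshooting check can be scripted so that it runs before any installation is attempted. The sketch below wraps `oc whoami` and reports why access failed. `check_cluster_access` is a hypothetical helper, not something rhoshift is known to ship.

```python
import shutil
import subprocess
from typing import Tuple


def check_cluster_access(oc_binary: str = "oc",
                         timeout: float = 30) -> Tuple[bool, str]:
    """Return (ok, detail): the logged-in user on success, else the reason."""
    if shutil.which(oc_binary) is None:
        return False, f"{oc_binary!r} not found on PATH"
    try:
        result = subprocess.run(
            [oc_binary, "whoami"],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, f"{oc_binary} whoami timed out after {timeout} seconds"
    if result.returncode != 0:
        return False, result.stderr.strip() or f"{oc_binary} whoami failed"
    return True, result.stdout.strip()


if __name__ == "__main__":
    ok, detail = check_cluster_access()
    print("logged in as" if ok else "cluster access problem:", detail)
```

A missing binary, an expired login, and an unreachable API server produce different messages here, which narrows down the cause faster than a generic installation failure.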
Contributing
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.