RHOAI toolkit for managing and upgrading RHOAI
RHOShift - OpenShift Operator Installation Toolkit
A comprehensive, enterprise-grade toolkit for managing OpenShift operators with enhanced stability features, automatic dependency resolution, and Red Hat OpenShift AI (RHOAI) integration.
Table of Contents
- Features
- Enhanced Stability Features
- Supported Operators
- Installation
- Usage
- Advanced Usage
- Dependency Management
- RHOAI Integration
- Configuration
- Troubleshooting
- Contributing
Features
Core Functionality
- 7 Enterprise Operators: Complete operator stack for modern OpenShift deployments
- Enhanced Stability System: 3-tier stability levels with comprehensive error handling
- Automatic Dependency Resolution: Smart installation order with dependency detection
- Pre-flight Validation: Cluster readiness and permission verification
- Health Monitoring: Real-time operator status tracking and reporting
- Auto-recovery: Intelligent error classification and automatic retry logic
Enterprise-Grade Reliability
- Comprehensive Error Handling: 59+ exception handlers throughout codebase
- Webhook Certificate Resilience: Automatic timing issue resolution for RHOAI
- Resource Conflict Detection: Prevention of operator namespace conflicts
- Smart Retry Logic: Exponential backoff with contextual error recovery
- Parallel Installation: Optimized performance for multiple operators
Advanced Integration
- RHOAI DSC/DSCI Management: Complete DataScienceCluster lifecycle control
- Kueue Management States: Dynamic DSC integration with Managed/Unmanaged modes
- KedaController Automation: Automatic KEDA controller creation and validation
- Configurable Timeouts: Flexible timing control for enterprise environments
Enhanced Stability Features
RHOShift includes a comprehensive stability system designed for enterprise deployments:
Stability Levels
- Enhanced (Default): Pre-flight checks + health monitoring + auto-recovery
- Comprehensive: Maximum resilience with advanced error classification
- Basic: Standard installation with basic error handling
Pre-flight Validation
- Cluster connectivity and authentication
- Required permissions verification
- Resource quota validation
- Operator catalog accessibility
- Namespace conflict detection
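The first checks in this list can be pictured as a few `oc` calls. The following is a minimal sketch, not RHOShift's actual implementation: the names `run_oc` and `preflight` are hypothetical, and the commands are the same ones shown in the Troubleshooting section. The runner is injectable so the logic can be exercised without a cluster.

```python
import subprocess
from typing import Callable, List, Tuple

# A runner takes an argv list and returns (exit code, stdout).
Runner = Callable[[List[str]], Tuple[int, str]]


def run_oc(argv: List[str]) -> Tuple[int, str]:
    """Run a command and return (exit code, stripped stdout)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return proc.returncode, proc.stdout.strip()


def preflight(runner: Runner = run_oc, oc: str = "oc",
              namespace: str = "openshift-operators") -> List[str]:
    """Return a list of problems; an empty list means the cluster looks ready."""
    problems: List[str] = []

    # Connectivity and authentication.
    code, _ = runner([oc, "whoami"])
    if code != 0:
        problems.append("not logged in to a cluster")
        return problems  # the remaining checks would fail anyway

    # Permission to create Subscriptions in the target namespace.
    code, out = runner([oc, "auth", "can-i", "create", "subscriptions",
                        "-n", namespace])
    if code != 0 or out != "yes":
        problems.append(f"cannot create subscriptions in {namespace}")

    # Operator catalog accessibility.
    code, _ = runner([oc, "get", "catalogsource", "-n", "openshift-marketplace"])
    if code != 0:
        problems.append("operator catalogs are not accessible")

    return problems
```

Calling `preflight()` with no arguments runs the real `oc` binary; passing a fake runner makes the checks testable offline.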
Health Monitoring
- Real-time operator status tracking
- Multi-resource health validation
- Installation progress reporting
- Performance metrics and timing
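As an illustration of status tracking, OLM records each operator's install state in a ClusterServiceVersion whose `status.phase` reaches `Succeeded`. The sketch below assumes the input is the output of `oc get csv -n <namespace> -o json`; the function names `csv_phases` and `not_ready` are hypothetical and do not come from RHOShift's `health_monitor.py`.

```python
import json
from typing import Dict, List


def csv_phases(csv_list_json: str) -> Dict[str, str]:
    """Map each ClusterServiceVersion name to its reported phase."""
    items = json.loads(csv_list_json).get("items", [])
    return {
        item["metadata"]["name"]: item.get("status", {}).get("phase", "Unknown")
        for item in items
    }


def not_ready(phases: Dict[str, str]) -> List[str]:
    """Return the sorted names of CSVs that have not reached Succeeded."""
    return sorted(name for name, phase in phases.items() if phase != "Succeeded")
```

A monitor could poll this until `not_ready` returns an empty list or the configured timeout expires.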
Auto-recovery Features
- Intelligent retry mechanisms
- Error classification (transient vs. permanent)
- Exponential backoff strategies
- Automatic resource cleanup and recreation
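The combination of error classification and exponential backoff can be sketched as follows. This is an illustration under assumptions, not the code in `resilience.py`: the exception classes and the `retry` helper are hypothetical, and doubling the delay on each attempt is one common backoff choice. The defaults mirror the documented `--retries` (3) and `--retry-delay` (10s) values.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Worth retrying, e.g. a webhook that is not ready yet."""


class PermanentError(Exception):
    """Retrying will not help, e.g. missing permissions."""


def retry(action: Callable[[], T], retries: int = 3, base_delay: float = 10.0,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Run action, retrying only TransientError with a doubling delay.

    PermanentError (and any other exception) propagates immediately.
    """
    for attempt in range(retries + 1):
        try:
            return action()
        except TransientError:
            if attempt == retries:
                raise  # retry budget exhausted
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.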
Supported Operators
| Operator | Package | Namespace | Channel | Dependencies |
|---|---|---|---|---|
| OpenShift Serverless | serverless-operator | openshift-serverless | stable | None |
| Service Mesh | servicemeshoperator | openshift-operators | stable | None |
| Authorino | authorino-operator | openshift-operators | stable | None |
| cert-manager | openshift-cert-manager-operator | cert-manager-operator | stable-v1 | None |
| Kueue | kueue-operator | openshift-kueue-operator | stable-v1.0 | cert-manager |
| KEDA | openshift-custom-metrics-autoscaler-operator | openshift-keda | stable | None |
| RHOAI/ODH | opendatahub-operator | openshift-operators | stable | None |
Installation
Quick Install
git clone https://github.com/mwaykole/O.git
cd O
pip install -e .
Verify Installation
rhoshift --help
rhoshift --summary
Usage
Basic Commands
# Install single operator with enhanced stability
rhoshift --serverless
# Install multiple operators with batch optimization
rhoshift --serverless --servicemesh --authorino
# Install with dependency resolution (Kueue + cert-manager)
rhoshift --kueue
# Install all operators
rhoshift --all
# Show detailed operator summary
rhoshift --summary
# Clean up all operators
rhoshift --cleanup
RHOAI with DSC/DSCI
# Install RHOAI with complete setup
rhoshift --rhoai \
--rhoai-channel=odh-nightlies \
--rhoai-image=brew.registry.redhat.io/rh-osbs/iib:1049242 \
--deploy-rhoai-resources
# Install RHOAI with Kueue integration
rhoshift --rhoai --kueue Managed \
--rhoai-channel=stable \
--rhoai-image=quay.io/rhoai/rhoai-fbc-fragment:rhoai-2.25-nightly \
--deploy-rhoai-resources
Kueue Management States
# Install Kueue as Managed (RHOAI controls it)
rhoshift --kueue Managed
# Install Kueue as Unmanaged (independent) - Default
rhoshift --kueue Unmanaged
rhoshift --kueue # Same as above
# Switch management states (updates existing DSC)
rhoshift --kueue Managed # Switch to Managed
rhoshift --kueue Unmanaged # Switch to Unmanaged
Advanced Usage
Enterprise Deployment
# Complete ML/AI stack with queue management
rhoshift --all --kueue Managed \
--rhoai-channel=stable \
--rhoai-image=brew.registry.redhat.io/rh-osbs/iib:1049242 \
--deploy-rhoai-resources \
--timeout=900
# High-availability setup with service mesh
rhoshift --serverless --servicemesh --keda --authorino
# Development environment setup
rhoshift --cert-manager --kueue Unmanaged --keda
Custom Configuration
# Custom timeouts and retries for enterprise clusters
rhoshift --all \
--timeout=1200 \
--retries=5 \
--retry-delay=15
# Custom oc binary path
rhoshift --serverless --oc-binary=/usr/local/bin/oc
# Verbose output for debugging
rhoshift --kueue Managed --verbose
Dependency Management
RHOShift automatically handles operator dependencies:
Automatic Resolution
- Kueue → cert-manager: Installing Kueue automatically includes cert-manager
- Installation Order: Dependencies installed first, primary operators second
- Conflict Detection: Prevents namespace and resource conflicts
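The ordering rule above (dependencies first, requested operators second) amounts to a depth-first topological sort. A minimal sketch, assuming a dependency map transcribed from the Supported Operators table; the names `DEPENDENCIES` and `install_order` are hypothetical, not RHOShift identifiers.

```python
from typing import Dict, List, Optional

# Dependencies as listed in the Supported Operators table.
DEPENDENCIES: Dict[str, List[str]] = {
    "serverless": [],
    "servicemesh": [],
    "authorino": [],
    "cert-manager": [],
    "kueue": ["cert-manager"],
    "keda": [],
    "rhoai": [],
}


def install_order(requested: List[str],
                  deps: Optional[Dict[str, List[str]]] = None) -> List[str]:
    """Return operators in install order, each dependency before its dependent."""
    deps = DEPENDENCIES if deps is None else deps
    order: List[str] = []
    visiting = set()

    def visit(name: str) -> None:
        if name in order:
            return  # already scheduled
        if name in visiting:
            raise ValueError(f"dependency cycle at {name}")
        visiting.add(name)
        for dep in deps.get(name, []):
            visit(dep)
        visiting.discard(name)
        order.append(name)

    for name in requested:
        visit(name)
    return order
```

With this map, `install_order(["kueue"])` yields cert-manager followed by kueue, matching the behaviour described for `rhoshift --kueue`.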
Smart Validation
# This command installs BOTH cert-manager AND Kueue in correct order:
rhoshift --kueue
# Output:
# Pre-flight checks passed. Cluster is ready for installation.
# Missing dependency: kueue-operator requires openshift-cert-manager-operator
# Installing 2 operators with enhanced stability...
# cert-manager installed successfully
# kueue installed successfully
RHOAI Integration
DataScienceCluster Management
RHOShift provides complete DSC/DSCI lifecycle management:
# Create RHOAI with DSC/DSCI
rhoshift --rhoai --deploy-rhoai-resources
# RHOAI with Kueue integration
rhoshift --rhoai --kueue Managed --deploy-rhoai-resources
DSC Behavior
- Existing DSC: Automatically updates Kueue managementState
- No DSC: State applied when DSC is created via --deploy-rhoai-resources
- Webhook Resilience: Automatic handling of certificate timing issues
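Updating an existing DSC can be expressed as a JSON merge patch. The sketch below is an assumption about how such an update could be built, not RHOShift's own code: the helper names are hypothetical, the field path `spec.components.kueue.managementState` follows the DataScienceCluster layout as commonly documented, and the DSC name passed in is whatever your cluster uses.

```python
import json
from typing import List

VALID_STATES = ("Managed", "Unmanaged")


def kueue_patch(state: str) -> str:
    """Build a merge patch that sets the Kueue component's managementState."""
    if state not in VALID_STATES:
        raise ValueError(f"state must be one of {VALID_STATES}, got {state!r}")
    patch = {"spec": {"components": {"kueue": {"managementState": state}}}}
    return json.dumps(patch, separators=(",", ":"))


def patch_command(dsc_name: str, state: str, oc: str = "oc") -> List[str]:
    """Return the argv for an `oc patch` call against the named DSC."""
    return [oc, "patch", "datasciencecluster", dsc_name,
            "--type=merge", "-p", kueue_patch(state)]
```

The returned list can be passed to `subprocess.run`; building it separately makes the patch easy to inspect before it is applied.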
Output Examples
# When DSC exists and gets updated:
Updating DSC with Kueue managementState: Managed
Successfully updated DSC with Kueue managementState: Managed
# When no DSC exists:
No existing DSC found. Kueue managementState will be applied when DSC is created.
Configuration
CLI Options
Operator Selection:
--serverless Install OpenShift Serverless Operator
--servicemesh Install Service Mesh Operator
--authorino Install Authorino Operator
--cert-manager Install cert-manager Operator
--rhoai Install RHOAI Operator
--kueue [{Managed,Unmanaged}] Install Kueue with DSC integration
--keda Install KEDA (Custom Metrics Autoscaler)
--all Install all operators
--cleanup Clean up all operators
--summary Show operator summary
Configuration:
--oc-binary OC_BINARY Path to oc CLI (default: oc)
--retries RETRIES Max retry attempts (default: 3)
--retry-delay RETRY_DELAY Delay between retries (default: 10s)
--timeout TIMEOUT Command timeout (default: 300s)
RHOAI Options:
--rhoai-channel CHANNEL RHOAI channel (stable/odh-nightlies)
--rhoai-image IMAGE RHOAI container image
--raw RAW Enable raw serving (True/False)
--deploy-rhoai-resources Create DSC and DSCI
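The `--kueue [{Managed,Unmanaged}]` signature reads like an argparse option with an optional value. A minimal sketch of how such options could be declared, assuming argparse and using the documented defaults; `build_parser` is a hypothetical name and only a subset of the options is shown.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare a subset of the documented options with their stated defaults."""
    parser = argparse.ArgumentParser(prog="rhoshift")
    # nargs="?" makes the value optional: a bare --kueue falls back to const.
    parser.add_argument("--kueue", nargs="?", const="Unmanaged", default=None,
                        choices=["Managed", "Unmanaged"],
                        help="Install Kueue with DSC integration")
    parser.add_argument("--oc-binary", default="oc", help="Path to oc CLI")
    parser.add_argument("--retries", type=int, default=3,
                        help="Max retry attempts")
    parser.add_argument("--retry-delay", type=int, default=10,
                        help="Delay between retries in seconds")
    parser.add_argument("--timeout", type=int, default=300,
                        help="Command timeout in seconds")
    return parser
```

With this declaration, a bare `--kueue` parses to `Unmanaged`, which matches the "Same as above" note in the Kueue Management States examples, and omitting the flag leaves the value as `None`.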
Environment Variables
export LOG_FILE_LEVEL=DEBUG # File logging level
export LOG_CONSOLE_LEVEL=INFO # Console logging level
Logging
- Location: /tmp/rhoshift.log
- Rotation: 10MB max size, 5 backup files
- Levels: DEBUG (file) / INFO (console)
- Colors: Supported in compatible terminals
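The settings above map naturally onto Python's standard `logging` module. The following sketch shows one way to wire them together; it is not the project's logger package, and `build_logger` and the format string are hypothetical. The environment variable names and defaults are the ones documented above.

```python
import logging
import os
from logging.handlers import RotatingFileHandler


def build_logger(path: str = "/tmp/rhoshift.log",
                 name: str = "rhoshift") -> logging.Logger:
    """Create a logger with a rotating file handler and a console handler."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # handlers do the per-destination filtering
    logger.handlers.clear()         # avoid duplicate handlers on repeated calls

    file_level = os.environ.get("LOG_FILE_LEVEL", "DEBUG").upper()
    console_level = os.environ.get("LOG_CONSOLE_LEVEL", "INFO").upper()
    fmt = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")

    # 10MB per file, 5 rotated backups.
    file_handler = RotatingFileHandler(path, maxBytes=10 * 1024 * 1024,
                                       backupCount=5)
    file_handler.setLevel(file_level)
    file_handler.setFormatter(fmt)

    console_handler = logging.StreamHandler()
    console_handler.setLevel(console_level)
    console_handler.setFormatter(fmt)

    logger.addHandler(file_handler)
    logger.addHandler(console_handler)
    return logger
```

Setting `LOG_CONSOLE_LEVEL=WARNING` before calling `build_logger()` would quiet the terminal while the file keeps full DEBUG detail.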
Troubleshooting
Common Issues
Permission Errors
# Verify cluster access
oc whoami
oc auth can-i create subscriptions -n openshift-operators
Installation Failures
# Check logs
tail -f /tmp/rhoshift.log
# Verify operator catalogs
oc get catalogsource -n openshift-marketplace
# Retry with a longer timeout and more retries
rhoshift --kueue --timeout=900 --retries=5
Dependency Issues
# Verify dependencies are resolved
rhoshift --summary
# Manual dependency installation
rhoshift --cert-manager
rhoshift --kueue
RHOAI/DSC Issues
# Check DSC status
oc get dsc,dsci -A
# Verify webhook certificates
oc get pods -n opendatahub-operators
# Manual DSC creation
rhoshift --rhoai --deploy-rhoai-resources --timeout=900
Debug Mode
# Enable verbose output
rhoshift --all --verbose
# Check stability report
rhoshift --summary
Development
Prerequisites
- Python 3.8+
- OpenShift CLI (oc)
- OpenShift cluster access
- cluster-admin privileges
Project Structure
rhoshift/
├── rhoshift/
│   ├── cli/                          # Command-line interface
│   ├── logger/                       # Logging system
│   ├── utils/
│   │   ├── operator/                 # Operator management
│   │   ├── resilience.py             # Error handling & recovery
│   │   ├── health_monitor.py         # Health monitoring
│   │   ├── stability_coordinator.py  # Stability management
│   │   └── constants.py              # Operator configurations
│   └── main.py                       # Entry point
├── scripts/
│   ├── cleanup/                      # Cleanup utilities
│   └── run_upgrade_matrix.sh         # Upgrade testing
└── tests/                            # Test suite
Running Tests
pytest tests/
Contributing
- Fork the repository
- Create a feature branch: git checkout -b feature-name
- Commit changes: git commit -am 'Add feature'
- Push to branch: git push origin feature-name
- Create Pull Request
Development Guidelines
- Follow Python PEP 8 standards
- Add tests for new features
- Update documentation
- Ensure backward compatibility
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Support
- Issues: GitHub Issues
- Documentation: This README and --help output
- Logs: /tmp/rhoshift.log for detailed debugging
RHOShift - Enterprise-grade OpenShift operator management with enhanced stability and reliability features.