Rancher plugin for Waldur Site Agent

Waldur Site Agent - Rancher Plugin

This plugin enables integration between Waldur Site Agent and Rancher for Kubernetes project management with optional Keycloak user group integration.

Features

  • Rancher Project Management: Creates and manages Rancher projects with resource-specific naming
  • OIDC Group Integration: Creates hierarchical Keycloak groups that map to Rancher project roles via OIDC
  • Automatic User Management: Adds/removes users from Keycloak groups based on Waldur project membership
  • Resource Quotas: Sets CPU and memory limits as Rancher project quotas
  • Usage Reporting: Reports actual allocated resources (CPU, memory, storage) from Kubernetes
  • Complete Lifecycle: Creates groups, binds to projects, manages users, cleans up empty groups
  • Enhanced Descriptions: Project descriptions include customer and project names for clarity

Architecture

The plugin follows the Waldur Site Agent plugin architecture and consists of:

  • RancherBackend: Main backend implementation that orchestrates project and user management
  • RancherClient: Handles Rancher API operations for project management
  • KeycloakClient: Manages Keycloak groups and user memberships

Key Architecture Features

  • Resource-Specific Naming: Rancher projects named after resource slugs for better identification
  • OIDC-Based Access: No direct user-to-Rancher assignments; all access via Keycloak groups
  • Enhanced Backend Interface: Full WaldurResource context available to all backend methods
  • Automatic Cleanup: Empty groups and role bindings automatically removed
  • Real-World Validated: Tested with actual Rancher and Keycloak instances

Installation

  1. Install the plugin using uv:
uv sync --all-packages
  2. The plugin will be automatically discovered via Python entry points.

Setup Requirements

Rancher Server Setup

Required Rancher Credentials

  1. Rancher Server: Accessible Rancher instance
  2. API Access: Unscoped API token with cluster access
  3. Cluster ID: Target cluster ID (format: c-xxxxx, not c-xxxxx:p-xxxxx)

Creating Rancher API Tokens

  1. Log in to the Rancher UI
  2. Navigate to: User Profile → API & Keys
  3. Create Token:
    • Name: waldur-site-agent
    • Scope: No Scope (unscoped for full access)
    • Expires: Set appropriate expiration
  4. Save: Access Key and Secret Key
  5. Find Cluster ID: In Rancher UI, cluster URL shows cluster ID (e.g., c-j8276)

Keycloak Setup (Optional)

Required for OIDC Group Integration

  1. Keycloak Server: Accessible Keycloak instance
  2. Target Realm: Where user accounts and groups will be managed
  3. Service User: User with group management permissions

Creating Keycloak Service User

  1. Log in to the Keycloak Admin Console
  2. Select Target Realm: (e.g., your-realm)
  3. Create User:
    • Username: waldur-site-agent-rancher
    • Email Verified: Yes
    • Enabled: Yes
  4. Set Password: In Credentials tab (temporary: No)
  5. Assign Roles: In Role Mappings tab
    • Client Roles → realm-management
    • Add: manage-users (sufficient for group operations)

Waldur Marketplace Setup

Required Waldur Configuration

  1. Marketplace Offering: Created in Waldur
  2. Components: Configured via waldur_site_load_components
  3. Offering State: Must be Active for order processing

Setting Up Offering Components

  1. Create configuration file with component definitions

  2. Run component loader:

    uv run waldur_site_load_components -c your-config.yaml
    
  3. Activate offering in Waldur Admin UI (change from Draft to Active)

Complete Setup Example

Step 1: Create Configuration File

# rancher-offering-config.yaml
offerings:
  - name: "your-rancher-offering"

    # Waldur API configuration
    waldur_api_url: "https://your-waldur.com/"
    waldur_api_token: "your-waldur-api-token"
    waldur_offering_uuid: "your-offering-uuid"

    # Backend configuration
    backend_type: "rancher"
    order_processing_backend: "rancher"
    membership_sync_backend: "rancher"
    reporting_backend: "rancher"

    backend_settings:
      # Rancher configuration
      backend_url: "https://your-rancher.com"
      username: "token-xxxxx"  # Rancher access key
      password: "your-secret-key"  # Rancher secret key
      cluster_id: "c-xxxxx"  # Cluster ID only
      verify_cert: true
      project_prefix: "waldur-"
      default_role: "workloads-manage"

      # Keycloak integration (optional)
      keycloak_enabled: true
      keycloak_use_user_id: true  # Use Waldur username as Keycloak user ID
      keycloak:
        keycloak_url: "https://your-keycloak.com/"
        keycloak_realm: "your-realm"
        keycloak_user_realm: "your-realm"
        keycloak_username: "waldur-site-agent-rancher"
        keycloak_password: "your-keycloak-password"
        keycloak_ssl_verify: true

    # Component definitions
    backend_components:
      cpu:
        type: "cpu"
        measured_unit: "cores"
        accounting_type: "limit"
        label: "CPU Cores"
        unit_factor: 1
      memory:
        type: "ram"
        measured_unit: "GB"
        accounting_type: "limit"
        label: "Memory (GB)"
        unit_factor: 1
      storage:
        type: "storage"
        measured_unit: "GB"
        accounting_type: "limit"
        label: "Storage (GB)"
        unit_factor: 1

Step 2: Load Components

uv run waldur_site_load_components -c rancher-offering-config.yaml

Step 3: Activate Offering

  1. Log in to the Waldur Admin UI
  2. Navigate to: Marketplace → Provider Offerings
  3. Find your offering and change state from Draft to Active

Step 4: Start Order Processing

uv run waldur_site_agent -c rancher-offering-config.yaml -m order_process

Step 5: Verify Setup

uv run waldur_site_diagnostics -c rancher-offering-config.yaml

Configuration

Basic Configuration (Rancher only)

waldur:
  api_url: "https://waldur.example.com/api/"
  token: "your-waldur-api-token-here"

offerings:
  - name: "rancher-projects"
    uuid: "12345678-1234-5678-9abc-123456789012"
    backend_type: "rancher"

    backend:
      backend_url: "https://rancher.example.com"
      username: "your-rancher-access-key"
      password: "your-rancher-secret-key"
      cluster_id: "c-m-1234abcd"
      verify_cert: true
      project_prefix: "waldur-"
      default_role: "workloads-manage"
      keycloak_enabled: false

    components:
      cpu:
        type: "cpu"
        name: "CPU"
        measured_unit: "cores"
        billing_type: "fixed"

Full Configuration (with Keycloak)

waldur:
  api_url: "https://waldur.example.com/api/"
  token: "your-waldur-api-token-here"

offerings:
  - name: "rancher-kubernetes"
    uuid: "12345678-1234-5678-9abc-123456789012"
    backend_type: "rancher"

    backend:
      backend_url: "https://rancher.example.com"
      username: "your-rancher-access-key"
      password: "your-rancher-secret-key"
      cluster_id: "c-m-1234abcd"
      verify_cert: true
      project_prefix: "waldur-"
      default_role: "project-member"

      keycloak_enabled: true
      keycloak:
        keycloak_url: "https://keycloak.example.com/auth/"
        keycloak_realm: "waldur"
        keycloak_user_realm: "master"
        keycloak_username: "keycloak-admin"
        keycloak_password: "your-keycloak-admin-password"
        keycloak_ssl_verify: true
        keycloak_sync_frequency: 15

    components:
      cpu:
        type: "cpu"
        name: "CPU"
        measured_unit: "cores"
        billing_type: "fixed"
      memory:
        type: "ram"
        name: "RAM"
        measured_unit: "GB"
        billing_type: "fixed"
      storage:
        type: "storage"
        name: "Storage"
        measured_unit: "GB"
        billing_type: "fixed"
      pods:
        type: "pods"
        name: "Pods"
        measured_unit: "pods"
        billing_type: "fixed"

Configuration Reference

Rancher Settings (matching waldur-mastermind format)

Parameter | Type | Required | Default | Description
backend_url | string | Yes | - | Rancher server URL (e.g., https://rancher.example.com)
username | string | Yes | - | Rancher access key (called username in waldur-mastermind)
password | string | Yes | - | Rancher secret key
cluster_id | string | Yes | - | Rancher cluster ID (e.g., c-m-1234abcd, not c-m-1234abcd:p-xxxxx)
verify_cert | boolean | No | true | Whether to verify SSL certificates
project_prefix | string | No | "waldur-" | Prefix for created Rancher project names
default_role | string | No | "workloads-manage" | Default role assigned to users in Rancher
keycloak_use_user_id | boolean | No | true | Use Keycloak user ID for lookup (false = use username)
namespace_labels | map | No | {} | Extra labels applied to namespaces (e.g., gpu-pool: h100-2x)

Keycloak Settings (optional, matching waldur-mastermind format)

Parameter | Type | Required | Default | Description
keycloak_enabled | boolean | No | false | Enable Keycloak integration
keycloak.keycloak_url | string | Conditional | - | Keycloak server URL
keycloak.keycloak_realm | string | Conditional | "waldur" | Keycloak realm name
keycloak.keycloak_user_realm | string | Conditional | "master" | Keycloak user realm for auth
keycloak.keycloak_username | string | Conditional | - | Keycloak admin username
keycloak.keycloak_password | string | Conditional | - | Keycloak admin password
keycloak.keycloak_ssl_verify | boolean | No | true | Whether to verify SSL certificates

Usage

Running the Agent

Start the agent with your configuration file:

uv run waldur_site_agent -c rancher-config.yaml -m order_process

Diagnostics

Run diagnostics to check connectivity:

uv run waldur_site_diagnostics -c rancher-config.yaml

Supported Agent Modes

  • order_process: Creates and manages Rancher projects based on Waldur resource orders
  • membership_sync: Synchronizes user memberships between Waldur and Rancher/Keycloak
  • report: Reports resource usage from Rancher projects to Waldur

Project Management

Project Creation

When a Waldur resource (representing project access) is created:

  1. A Rancher project is created with the name {project_prefix}{waldur_resource_slug}
  2. If Keycloak is enabled, hierarchical groups are created:
    • Parent Group: c_{cluster_uuid_hex} (cluster-level access)
    • Child Group: project_{project_uuid_hex}_{role_name} (project + role access)
  3. Resource quotas are applied to the Rancher project
  4. OIDC binds the Keycloak groups to Rancher project roles

User Management

When users are added to a Waldur resource:

  1. User is added to the Rancher project with the configured role
  2. If Keycloak is enabled, user is added to the child group (project_{project_uuid_hex}_{role_name})
  3. OIDC automatically grants the user access to the Rancher project based on group membership

When users are removed:

  1. User is removed from the Rancher project
  2. If Keycloak is enabled, user is removed from the project role group

Naming Convention

The plugin follows the waldur-mastermind Rancher plugin naming patterns:

  • Rancher Project Name: {project_prefix}{waldur_resource_slug} (configurable prefix)
  • Keycloak Parent Group: c_{cluster_uuid_hex} (cluster access)
  • Keycloak Child Group: project_{project_uuid_hex}_{role_name} (project + role access)

Where:

  • {project_prefix} is configurable (default: waldur-)
  • {waldur_resource_slug} is the Waldur resource slug (more specific than project slug)
  • {cluster_uuid_hex} is the cluster UUID in hex format
  • {project_uuid_hex} is the Waldur project UUID in hex format (for permissions)
  • {role_name} is configurable (default: workloads-manage)
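The naming patterns above can be sketched in a few lines of Python. This is only an illustration of the patterns as documented here, not the plugin's own code; the function names are hypothetical, and the UUIDs are assumed to arrive as standard UUID strings.

```python
import uuid


def rancher_project_name(resource_slug: str, project_prefix: str = "waldur-") -> str:
    """Build {project_prefix}{waldur_resource_slug}."""
    return f"{project_prefix}{resource_slug}"


def keycloak_parent_group(cluster_uuid: str) -> str:
    """Build c_{cluster_uuid_hex}; .hex drops the dashes from the UUID."""
    return f"c_{uuid.UUID(cluster_uuid).hex}"


def keycloak_child_group(project_uuid: str, role_name: str = "workloads-manage") -> str:
    """Build project_{project_uuid_hex}_{role_name}."""
    return f"project_{uuid.UUID(project_uuid).hex}_{role_name}"
```

For example, a resource slug of gpu-lab with the default prefix yields waldur-gpu-lab.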

Supported Components and Accounting Model

The plugin supports the following resource components (all with accounting_type: "limit"):

  • CPU: Measured in cores
  • Memory: Measured in GB
  • Storage: Measured in GB
  • GPU: Measured in units (optional, see GPU Scheduling)

Accounting Model

Project Limits (Quotas):

  • Only CPU and memory limits are set as Rancher project quotas
  • Storage is not enforced as quotas (reported only)

Usage Reporting (for all components):

All components report actual allocated resources:

  • CPU: Sum of all container CPU requests in the project
  • Memory: Sum of all container memory requests in the project
  • Storage: Sum of all persistent volume claims in the project

Accounting Flow

  1. Project Creation: CPU and memory limits → Rancher project quotas
  2. Usage Reporting: All components → actual allocated resources from Kubernetes
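As a rough illustration of the usage-reporting side, the sketch below sums Kubernetes resource quantities such as "500m" or "1Gi" into cores and gigabytes. It is not the plugin's implementation: the function names are hypothetical, only the common quantity suffixes are handled, and treating one GB as 2**30 bytes is an assumption that can be changed through bytes_per_gb.

```python
from decimal import Decimal

# Common Kubernetes quantity suffixes (binary first, so "Mi" wins over "M").
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
_DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def cpu_cores(quantity: str) -> Decimal:
    """Convert a CPU quantity ("500m" or "2") to cores."""
    if quantity.endswith("m"):
        return Decimal(quantity[:-1]) / 1000
    return Decimal(quantity)


def to_bytes(quantity: str) -> Decimal:
    """Convert a memory or storage quantity ("512Mi", "1G", "1024") to bytes."""
    for suffix, factor in _BINARY.items():
        if quantity.endswith(suffix):
            return Decimal(quantity[: -len(suffix)]) * factor
    for suffix, factor in _DECIMAL.items():
        if quantity.endswith(suffix):
            return Decimal(quantity[: -len(suffix)]) * factor
    return Decimal(quantity)


def summarize_usage(container_requests, pvc_sizes, bytes_per_gb=2**30):
    """Sum container requests and PVC sizes into per-component totals."""
    cpu = sum((cpu_cores(r["cpu"]) for r in container_requests if "cpu" in r), Decimal(0))
    memory = sum((to_bytes(r["memory"]) for r in container_requests if "memory" in r), Decimal(0))
    storage = sum((to_bytes(size) for size in pvc_sizes), Decimal(0))
    return {
        "cpu": float(cpu),
        "memory": float(memory / bytes_per_gb),
        "storage": float(storage / bytes_per_gb),
    }
```

Here container_requests is assumed to be a list of per-container request mappings, and pvc_sizes a list of persistent volume claim sizes, both already collected from the project's namespaces.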

GPU Scheduling

The plugin supports GPU quota management via Kubernetes ResourceQuotas combined with namespace-label-based scheduling. This approach uses a single generic nvidia.com/gpu resource for quotas, and a Kyverno policy on the cluster to steer GPU workloads to the correct node pool based on a namespace label.

How It Works

  1. Site agent creates a namespace with a gpu-pool label (configured via namespace_labels)
  2. Site agent applies a ResourceQuota with requests.nvidia.com/gpu: N (generic GPU count)
  3. Kyverno policy on the cluster reads the namespace's gpu-pool label and injects a matching nodeSelector into every pod, ensuring GPU workloads land on the correct nodes
  4. GPU nodes are labeled with gpu-pool=<type> (e.g., gpu-pool=h100-2x)
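To make steps 1 and 2 concrete, the following sketch builds the two Kubernetes objects as plain Python dictionaries. The helper names and the quota name waldur-gpu-quota are hypothetical; the plugin may name and structure these objects differently.

```python
def namespace_manifest(name: str, namespace_labels: dict) -> dict:
    """Namespace carrying the labels from namespace_labels (e.g. gpu-pool)."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": name, "labels": dict(namespace_labels)},
    }


def gpu_quota_manifest(namespace: str, gpu_count: int,
                       quota_name: str = "waldur-gpu-quota") -> dict:
    """ResourceQuota limiting the generic nvidia.com/gpu resource."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": quota_name, "namespace": namespace},
        # Quota values are strings in Kubernetes manifests.
        "spec": {"hard": {"requests.nvidia.com/gpu": str(gpu_count)}},
    }
```

The quota stays generic (a GPU count only); the pool type lives solely in the namespace label, which is what lets one policy serve every pool.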

Offering Setup

Create a separate Waldur offering per GPU pool type. Each offering's backend config specifies the GPU pool label:

offerings:
  # Offering for H100 GPU pool
  - name: "k8s-gpu-h100"
    uuid: "..."
    backend_type: "rancher"
    backend:
      backend_url: "https://rancher.example.com"
      username: "..."
      password: "..."
      cluster_id: "c-m-1234abcd"
      namespace_labels:
        gpu-pool: "h100-2x"
    components:
      cpu:
        type: "cpu"
        name: "CPU"
        measured_unit: "cores"
      memory:
        type: "ram"
        name: "RAM"
        measured_unit: "GB"
      gpu:
        type: "gpu"
        name: "GPU"
        measured_unit: "units"

  # Offering for H200 GPU pool
  - name: "k8s-gpu-h200"
    uuid: "..."
    backend_type: "rancher"
    backend:
      backend_url: "https://rancher.example.com"
      username: "..."
      password: "..."
      cluster_id: "c-m-1234abcd"
      namespace_labels:
        gpu-pool: "h200-8x"
    components:
      cpu:
        type: "cpu"
        name: "CPU"
        measured_unit: "cores"
      memory:
        type: "ram"
        name: "RAM"
        measured_unit: "GB"
      gpu:
        type: "gpu"
        name: "GPU"
        measured_unit: "units"

Required Kubernetes Setup

1. Label GPU Nodes

Each GPU node must be labeled with its pool type matching the namespace_labels values:

kubectl label node gpu-h100-node-01 gpu-pool=h100-2x
kubectl label node gpu-h100-node-02 gpu-pool=h100-2x
kubectl label node gpu-h200-node-01 gpu-pool=h200-8x

2. Deploy Kyverno Policy

A Kyverno ClusterPolicy must be deployed on the cluster to inject nodeSelector into pods based on the namespace's gpu-pool label. This is the minimal policy:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: enforce-gpu-pool-node-selector
spec:
  rules:
    - name: set-node-selector-from-namespace-label
      match:
        any:
          - resources:
              kinds:
                - Pod
      context:
        - name: gpupool
          apiCall:
            urlPath: /api/v1/namespaces/{{ request.namespace }}
            jmesPath: metadata.labels."gpu-pool" || ''
            method: GET
      preconditions:
        all:
          - key: "{{ gpupool }}"
            operator: NotEquals
            value: ""
      mutate:
        patchStrategicMerge:
          spec:
            nodeSelector:
              gpu-pool: "{{ gpupool }}"

This policy:

  • Matches all Pods in every namespace
  • Fetches the namespace's gpu-pool label via Kubernetes API
  • Precondition: only fires if the label is non-empty (CPU-only namespaces are unaffected)
  • Mutates: injects nodeSelector: {gpu-pool: "<value>"} into the Pod spec
  • Kyverno auto-generates equivalent rules for Deployments, StatefulSets, Jobs, CronJobs, etc.
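The effect of the policy on a single Pod can be mimicked in Python for reasoning or unit-testing purposes. This is only a model of the mutation described above (Kyverno performs the real work at admission time), and the function name is hypothetical.

```python
import copy


def apply_gpu_pool_selector(pod: dict, namespace_labels: dict) -> dict:
    """Return the Pod as it would look after the policy's mutate rule."""
    gpupool = namespace_labels.get("gpu-pool", "")
    if gpupool == "":
        # Precondition not met: CPU-only namespaces are left untouched.
        return pod
    mutated = copy.deepcopy(pod)
    # Merge into any existing nodeSelector, overriding only the gpu-pool key.
    mutated.setdefault("spec", {}).setdefault("nodeSelector", {})["gpu-pool"] = gpupool
    return mutated
```

A Pod in a namespace labeled gpu-pool=h100-2x gains nodeSelector {gpu-pool: "h100-2x"}, while its other selector keys are preserved.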

3. Optional: Default No-GPU Quota Policy

To prevent accidental GPU usage in namespaces without a GPU pool label, deploy a generate policy that creates a zero-GPU ResourceQuota:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: default-no-gpu-quota
spec:
  rules:
    - name: generate-no-gpu-quota
      match:
        any:
          - resources:
              kinds:
                - Namespace
              selector:
                matchExpressions:
                  - key: gpu-pool
                    operator: DoesNotExist
      exclude:
        any:
          - resources:
              namespaces:
                - kube-system
                - kube-public
                - kube-node-lease
                - kyverno
      generate:
        apiVersion: v1
        kind: ResourceQuota
        name: no-gpu-quota
        namespace: "{{request.object.metadata.name}}"
        synchronize: true
        data:
          spec:
            hard:
              requests.nvidia.com/gpu: "0"
              limits.nvidia.com/gpu: "0"

Complete Workflow

The plugin provides end-to-end automation for Rancher project and user management:

Order Processing

  1. Order Detection: Monitors Waldur for new resource orders
  2. Project Creation: Creates Rancher project named {prefix}{resource_slug}
  3. Enhanced Descriptions: Includes customer and project context
  4. Quota Management: Sets CPU and memory limits if specified
  5. OIDC Setup: Creates and binds Keycloak groups to project roles

Membership Sync

  1. User Detection: Monitors Waldur for user membership changes
  2. Group Management: Creates missing Keycloak groups if needed
  3. User Addition: Adds users to appropriate Keycloak groups
  4. User Removal: Removes users when removed from Waldur projects
  5. Cleanup: Removes empty groups and their Rancher role bindings
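The core of the membership sync is a comparison between the users Waldur expects and the users currently in the Keycloak group. A minimal sketch of that comparison, with a hypothetical function name and plain usernames as input, might look like this:

```python
def plan_membership_changes(waldur_usernames, keycloak_usernames) -> dict:
    """Work out which users to add or remove, and whether the group is now empty."""
    wanted = set(waldur_usernames)
    current = set(keycloak_usernames)
    return {
        "add": sorted(wanted - current),
        "remove": sorted(current - wanted),
        # An empty target membership means the group can be cleaned up.
        "delete_group": not wanted,
    }
```

The plan is computed first and applied afterwards, which keeps the comparison free of API calls and easy to test.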

OIDC Integration Flow

  1. Keycloak Groups: c_{cluster_uuid_hex} (parent) → project_{project_uuid_hex}_{role_name} (child)
  2. Group Binding: keycloakoidc_group://{group_name} bound to Rancher project role
  3. User Management: Users added to Keycloak groups only (not directly to Rancher)
  4. Automatic Access: OIDC grants Rancher project access based on group membership
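The group binding in step 2 can be pictured as a small payload that ties a group principal to a project role. The sketch below is an assumption modeled on Rancher's v3 project role template binding resource; the function names are hypothetical and the exact fields the plugin sends are not confirmed by this document.

```python
def group_principal_id(group_name: str) -> str:
    """Principal ID format from the flow above: keycloakoidc_group://{group_name}."""
    return f"keycloakoidc_group://{group_name}"


def role_binding_payload(project_id: str, group_name: str,
                         role: str = "workloads-manage") -> dict:
    """Assumed request body binding a Keycloak group to a Rancher project role."""
    return {
        "projectId": project_id,  # Rancher project reference, e.g. c-xxxxx:p-xxxxx
        "roleTemplateId": role,
        "groupPrincipalId": group_principal_id(group_name),
    }
```

Note that the project reference used here includes the project part, unlike the cluster_id setting, which must be the cluster ID alone.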

Error Handling

  • Rancher connectivity issues will be logged and retried
  • Keycloak failures will be logged but won't stop Rancher operations
  • Invalid configurations will be detected during diagnostics
  • Missing users in Keycloak will be logged as warnings
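The first two points describe a common pattern: retry the primary operation, and treat the secondary one as best-effort. A sketch of that pattern follows; the function names, the retry count, and the use of ConnectionError as the retryable error are all assumptions, not the plugin's actual behavior.

```python
import logging
import time

logger = logging.getLogger("rancher_plugin_sketch")


def with_retries(operation, attempts: int = 3, delay_seconds: float = 0.0):
    """Call operation, logging and retrying on connection errors."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            logger.warning("Rancher call failed (attempt %d of %d)", attempt, attempts)
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)


def create_project_with_groups(create_rancher_project, create_keycloak_groups):
    """Create the Rancher project; Keycloak problems are logged, not fatal."""
    project_id = with_retries(create_rancher_project)
    try:
        create_keycloak_groups(project_id)
    except Exception:
        # Broad on purpose: any Keycloak failure must not undo the Rancher work.
        logger.exception("Keycloak group setup failed for %s", project_id)
    return project_id
```

Both arguments are caller-supplied callables, so the same wrapper works with real clients or with test doubles.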

Development

Running Tests

uv run pytest plugins/rancher/tests/

Code Quality

pre-commit run --all-files

Troubleshooting

Common Issues

1. Order Processing Disabled

Order processing is disabled for offering X, skipping it

Solution: Add backend configuration to your offering:

order_processing_backend: "rancher"
membership_sync_backend: "rancher"
reporting_backend: "rancher"

2. Rancher Authentication Fails (401 Unauthorized)

401 Client Error: Unauthorized for url: https://rancher.../v3

Solutions:

  • Verify access key and secret key are correct
  • Ensure token is unscoped (not cluster-specific)
  • Check token hasn't expired
  • Verify API URL format: https://your-rancher.com (without /v3)

3. Keycloak Connection Fails (404)

404: "Unable to find matching target resource method"

Solutions:

  • Verify Keycloak URL (try with/without /auth/ suffix)
  • Check realm name is correct
  • Ensure user exists in the specified realm

4. Keycloak Group Creation Fails (403 Forbidden)

403: "HTTP 403 Forbidden"

Solution: Grant user manage-users role:

  • Realm: Select target realm
  • Users → Your service user
  • Role Mappings → Client Roles → realm-management
  • Add: manage-users

5. Cluster ID Format Error

Cluster not found or invalid cluster ID

Solution: Use correct format:

  • Correct: c-j8276 (cluster ID only)
  • Incorrect: c-j8276:p-xxxxx (project reference)
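A configuration check for this mistake is easy to write. The helper below is a hypothetical sketch that only catches the project-reference case described above; it deliberately does not enforce a full cluster ID pattern, since this document does not specify one.

```python
def check_cluster_id(cluster_id: str):
    """Return an error message for a project reference, or None if it looks fine."""
    if ":" in cluster_id:
        cluster_part = cluster_id.split(":", 1)[0]
        return (
            f"'{cluster_id}' looks like a project reference; "
            f"use the cluster ID only, e.g. '{cluster_part}'"
        )
    return None
```

Running it against c-j8276:p-xxxxx suggests c-j8276 as the value to use instead.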

6. Component Loading Fails

KeyError: 'accounting_type'

Solution: Use correct component format:

backend_components:
  cpu:
    type: "cpu"
    measured_unit: "cores"
    accounting_type: "limit"  # Not billing_type
    label: "CPU Cores"
    unit_factor: 1
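To catch this before running the component loader, the parsed configuration can be checked with a short script. The sketch below assumes the five keys shown in the example above are the ones required; that list, and the function name, are assumptions rather than the loader's documented contract.

```python
# Keys taken from the component example above (assumed to be required).
REQUIRED_KEYS = ("type", "measured_unit", "accounting_type", "label", "unit_factor")


def find_component_problems(backend_components: dict) -> list:
    """List human-readable problems found in a backend_components mapping."""
    problems = []
    for name, component in backend_components.items():
        for key in REQUIRED_KEYS:
            if key not in component:
                problems.append(f"{name}: missing '{key}'")
        if "billing_type" in component:
            problems.append(f"{name}: uses 'billing_type'; expected 'accounting_type'")
    return problems
```

An empty list means every component carries the expected keys.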

Logging

Enable debug logging to see detailed operation logs:

logging:
  level: DEBUG

Diagnostic Commands

Run comprehensive diagnostics:

uv run waldur_site_diagnostics -c your-config.yaml

This will test:

  • Rancher API connectivity and authentication
  • Keycloak connectivity and permissions (if enabled)
  • Project listing capabilities
  • Backend discovery and initialization
  • Component configuration validity
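The Rancher connectivity and authentication check can also be approximated by hand from Python. The sketch below only builds the request, using HTTP Basic authentication with the access key and secret key; the function name is hypothetical and this is not how the diagnostics tool is implemented.

```python
import base64
import urllib.request


def rancher_request(backend_url: str, access_key: str, secret_key: str,
                    path: str = "/v3") -> urllib.request.Request:
    """Build an authenticated GET request against the Rancher API."""
    credentials = base64.b64encode(f"{access_key}:{secret_key}".encode()).decode()
    # backend_url is expected without the /v3 suffix; it is appended here.
    request = urllib.request.Request(backend_url.rstrip("/") + path)
    request.add_header("Authorization", f"Basic {credentials}")
    return request
```

Passing the result to urllib.request.urlopen(request, timeout=10) sends it; a 401 response points to the authentication issues covered under Common Issues.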

Verification Commands

Test individual components:

# Test Rancher connection
curl -u "token-xxxxx:secret-key" "https://your-rancher.com/v3"

# Test Keycloak realm access
# (get-keycloak-token is a placeholder for a command that prints an admin access token)
curl "https://your-keycloak.com/auth/admin/realms/your-realm" \
  -H "Authorization: Bearer $(get-keycloak-token)"

# List Rancher projects in cluster
curl -u "token-xxxxx:secret-key" \
  "https://your-rancher.com/v3/projects?clusterId=c-xxxxx"
