A Pulumi provider for managing DevZero infrastructure resources.
pulumi-provider-devzero
The official Pulumi provider for DevZero, enabling you to manage DevZero infrastructure — Clusters, Workload Policies, and Node Policies — using your preferred programming language.
Resources
| Resource | Description |
|---|---|
| `Cluster` | Provision and manage a DevZero cluster |
| `WorkloadPolicy` | Configure vertical/horizontal scaling policies for workloads |
| `WorkloadPolicyTarget` | Apply a workload policy to one or more clusters with filters |
| `WorkloadRule` | Pin explicit resource rules to a specific workload (MPA v3) |
| `NodePolicy` | Configure node provisioning and pooling (AWS / Azure) |
| `NodePolicyTarget` | Apply a node policy to one or more clusters |

Prerequisites
- Pulumi CLI v3+
- A DevZero account and API token
- The provider binary in `bin/` (see Building from source)
Installation
TypeScript / JavaScript
npm install @devzero/pulumi-devzero
Python
pip install pulumi-devzero
Go
go get github.com/devzero-inc/pulumi-provider-devzero/sdk/go/devzero
Configuration
Set your DevZero API endpoint and credentials via Pulumi config or environment variables:
pulumi config set --secret devzero:token <YOUR_PAT_TOKEN>
pulumi config set devzero:teamId <TEAM_ID>
pulumi config set devzero:url https://dakr.devzero.io # optional, this is the default
Quick Start
Pick your language below. Each example creates a Cluster, a WorkloadPolicy with CPU/memory vertical scaling, a WorkloadPolicyTarget, a NodePolicy, and a NodePolicyTarget.
TypeScript
Setup
mkdir my-devzero-ts && cd my-devzero-ts
pulumi new typescript
npm install @devzero/pulumi-devzero
index.ts
import * as pulumi from "@pulumi/pulumi";
import { resources } from "@devzero/pulumi-devzero";
// 1. Create a cluster
const cluster = new resources.Cluster("prod-cluster", {
name: "prod-cluster",
});
// 2. Create a workload policy with CPU and memory vertical scaling
const policy = new resources.WorkloadPolicy("cpu-scaling-policy", {
name: "cpu-scaling-policy",
description: "Policy with CPU and memory vertical scaling enabled",
actionTriggers: ["on_detection", "on_schedule"], // apply on pod events AND on schedule
cronSchedule: "0 2 * * *", // daily at 2 am UTC (required for on_schedule)
detectionTriggers: ["pod_creation", "pod_reschedule"],
cpuVerticalScaling: {
enabled: true,
targetPercentile: 0.75, // P75 of observed usage
minRequest: 25, // millicores; hard floor
maxScaleUpPercent: 1000, // % per step
maxScaleDownPercent: 1, // % per step
minDataPoints: 20, // min CPU samples
adjustReqEvenIfNotSet: true, // set requests even if workload has none
limitsRemovalEnabled: true, // strip CPU limits (cycles compress safely)
},
memoryVerticalScaling: {
enabled: true,
targetPercentile: 1, // P100 — guard against OOMKills
minRequest: 134217728, // 128 MiB in bytes; hard floor
maxScaleUpPercent: 1000, // % per step
maxScaleDownPercent: 1, // % per step
overheadMultiplier: 0.3, // extra headroom over the recommendation
limitsAdjustmentEnabled: true, // adjust limits alongside requests
limitMultiplier: 1, // limits = request × this
minDataPoints: 20, // min memory samples
adjustReqEvenIfNotSet: true, // set requests even if workload has none
},
enablePmaxProtection: true, // guard against spike-induced OOMKills
pmaxRatioThreshold: 3, // trigger when the peak/recommendation ratio exceeds 3
minChangePercent: 0.2, // apply only if change > 20%
});
// 3. Apply the workload policy to the cluster for all Deployments
const workloadTarget = new resources.WorkloadPolicyTarget("prod-cluster-deployments-target", {
name: "prod-cluster-deployments-target",
description: "Apply cpu-scaling-policy to all Deployments in prod-cluster",
policyId: policy.id,
clusterIds: [cluster.id],
kindFilter: ["Deployment"],
enabled: true,
});
// 4. Create a node policy for dzkarp-based node provisioning
const nodePolicy = new resources.NodePolicy("prod-node-policy", {
name: "prod-node-policy",
description: "Cost-efficient node provisioning for production workloads",
// Higher weight wins when multiple policies match the same node request.
weight: 10,
// Instance categories: c (compute), m (general), r (memory), t (burstable).
// Kept broad to maximise the instance pool and minimise cost.
instanceCategories: {
matchExpressions: [{
operator: "In",
values: ["c", "m", "r", "t"],
}],
},
// Instance generation: prefer modern hardware (gen 3+) for better performance/cost ratio.
instanceGenerations: {
matchExpressions: [{
operator: "In",
values: ["3", "4", "5", "6"],
}],
},
// CPU architecture: amd64 (x86_64) — derived from active nodes in the cluster.
architectures: {
matchExpressions: [{ operator: "In", values: ["amd64"] }],
},
// Capacity types: prefer spot for savings, fall back to on-demand for availability.
capacityTypes: {
matchExpressions: [{ operator: "In", values: ["spot", "on-demand"] }],
},
// Operating system: linux only.
operatingSystems: {
matchExpressions: [{ operator: "In", values: ["linux"] }],
},
// Disruption: how dzkarp consolidates and rotates nodes.
disruption: {
consolidationPolicy: "WhenEmptyOrUnderutilized", // reclaim empty and underused nodes
consolidateAfter: "2h0m0s", // wait 2 h before consolidating
expireAfter: "168h", // rotate nodes after 7 days
budgets: [
{
// Disrupt up to 10% of nodes at once for these reasons.
reasons: ["Empty", "Drifted", "Underutilized"],
nodes: "10%",
},
{
// Allow at most 1 node to be disrupted at a time, for any reason.
nodes: "1",
},
],
},
// Override the generated dzkarp CRD names (helps avoid collisions in shared clusters).
nodePoolName: "prod-nodepool", // name of the dzkarp NodePool CR
nodeClassName: "prod-nodeclass", // name of the dzkarp NodeClass CR
// AWS-specific EC2 configuration.
aws: {
// Subnets where nodes will launch — discovered via the cluster tag.
subnetSelectorTerms: [{
tags: { "karpenter.sh/discovery": "my-prod-cluster" },
}],
// Security groups for node instances — same discovery tag pattern.
securityGroupSelectorTerms: [{
tags: { "karpenter.sh/discovery": "my-prod-cluster" },
}],
// AMI: latest Amazon Linux 2023 managed alias (dzkarp keeps it up to date).
amiSelectorTerms: [{ alias: "al2023@latest" }],
// IAM role dzkarp uses to launch and manage nodes (must already exist in AWS).
role: "KarpenterNodeRole-my-prod-cluster",
},
});
// 5. Attach the node policy to the cluster
const nodePolicyTarget = new resources.NodePolicyTarget("prod-node-policy-target", {
name: "prod-node-policy-target",
description: "Apply prod-node-policy to prod-cluster",
policyId: nodePolicy.id,
clusterIds: [cluster.id],
enabled: true,
});
export const clusterId = cluster.id;
export const clusterToken = pulumi.secret(cluster.token);
export const policyId = policy.id;
export const targetId = workloadTarget.id;
export const nodePolicyId = nodePolicy.id;
Deploy
npm run build
pulumi up
Python
Setup
mkdir my-devzero-py && cd my-devzero-py
pulumi new python
pip install pulumi-devzero
__main__.py
import pulumi
from pulumi_devzero.resources import (
Cluster, ClusterArgs,
WorkloadPolicy, WorkloadPolicyArgs,
WorkloadPolicyTarget, WorkloadPolicyTargetArgs,
NodePolicy, NodePolicyArgs,
NodePolicyTarget, NodePolicyTargetArgs,
)
from pulumi_devzero.resources.types import (
VerticalScalingArgs,
LabelSelectorArgs,
MatchExpressionArgs,
DisruptionPolicyArgs,
DisruptionBudgetArgs,
AWSNodeClassSpecArgs,
AMISelectorTermArgs,
SubnetSelectorTermArgs,
SecurityGroupSelectorTermArgs,
)
# 1. Create a cluster
cluster = Cluster(
"prod-cluster",
args=ClusterArgs(name="prod-cluster"),
)
# 2. Create a workload policy with CPU and memory vertical scaling
policy = WorkloadPolicy(
"cpu-scaling-policy",
args=WorkloadPolicyArgs(
name="cpu-scaling-policy",
description="Policy with CPU and memory vertical scaling enabled",
action_triggers=["on_detection", "on_schedule"], # apply on pod events AND on schedule
cron_schedule="0 2 * * *", # daily at 2 am UTC (required for on_schedule)
detection_triggers=["pod_creation", "pod_reschedule"],
cpu_vertical_scaling=VerticalScalingArgs(
enabled=True,
target_percentile=0.75, # P75 of observed usage
min_request=25, # millicores; hard floor
max_scale_up_percent=1000, # % per step
max_scale_down_percent=1, # % per step
min_data_points=20, # min CPU samples
adjust_req_even_if_not_set=True, # set requests even if workload has none
limits_removal_enabled=True, # strip CPU limits (cycles compress safely)
),
memory_vertical_scaling=VerticalScalingArgs(
enabled=True,
target_percentile=1, # P100 — guard against OOMKills
min_request=134217728, # 128 MiB in bytes; hard floor
max_scale_up_percent=1000, # % per step
max_scale_down_percent=1, # % per step
overhead_multiplier=0.3, # extra headroom over the recommendation
limits_adjustment_enabled=True, # adjust limits alongside requests
limit_multiplier=1, # limits = request × this
min_data_points=20, # min memory samples
adjust_req_even_if_not_set=True, # set requests even if workload has none
),
enable_pmax_protection=True, # guard against spike-induced OOMKills
pmax_ratio_threshold=3, # trigger when the peak/recommendation ratio exceeds 3
min_change_percent=0.2, # apply only if change > 20%
),
)
# 3. Apply the workload policy to the cluster for all Deployments
workload_target = WorkloadPolicyTarget(
"prod-cluster-deployments-target",
args=WorkloadPolicyTargetArgs(
name="prod-cluster-deployments-target",
policy_id=policy.id,
cluster_ids=[cluster.id],
kind_filter=["Deployment"],
enabled=True,
),
)
# 4. Create a node policy for dzkarp-based node provisioning
node_policy = NodePolicy("prod-node-policy", args=NodePolicyArgs(
name="prod-node-policy",
description="Cost-efficient node provisioning for production workloads",
# Higher weight wins when multiple policies match the same node request.
weight=10,
# Instance categories: c (compute), m (general), r (memory), t (burstable).
# Kept broad to maximise the instance pool and minimise cost.
instance_categories=LabelSelectorArgs(
match_expressions=[MatchExpressionArgs(
operator="In",
values=["c", "m", "r", "t"],
)],
),
# Instance generation: prefer modern hardware (gen 3+) for better performance/cost ratio.
instance_generations=LabelSelectorArgs(
match_expressions=[MatchExpressionArgs(
operator="In",
values=["3", "4", "5", "6"],
)],
),
# CPU architecture: amd64 (x86_64) — derived from active nodes in the cluster.
architectures=LabelSelectorArgs(
match_expressions=[MatchExpressionArgs(operator="In", values=["amd64"])],
),
# Capacity types: prefer spot for savings, fall back to on-demand for availability.
capacity_types=LabelSelectorArgs(
match_expressions=[MatchExpressionArgs(operator="In", values=["spot", "on-demand"])],
),
# Operating system: linux only.
operating_systems=LabelSelectorArgs(
match_expressions=[MatchExpressionArgs(operator="In", values=["linux"])],
),
# Disruption: how dzkarp consolidates and rotates nodes.
disruption=DisruptionPolicyArgs(
consolidation_policy="WhenEmptyOrUnderutilized", # reclaim empty and underused nodes
consolidate_after="2h0m0s", # wait 2 h before consolidating
expire_after="168h", # rotate nodes after 7 days
budgets=[
DisruptionBudgetArgs(
# Disrupt up to 10% of nodes at once for these reasons.
reasons=["Empty", "Drifted", "Underutilized"],
nodes="10%",
),
DisruptionBudgetArgs(nodes="1"), # at most 1 node disrupted at a time, for any reason
],
),
# Override the generated dzkarp CRD names (helps avoid collisions in shared clusters).
node_pool_name="prod-nodepool", # name of the dzkarp NodePool CR
node_class_name="prod-nodeclass", # name of the dzkarp NodeClass CR
# AWS-specific EC2 configuration.
aws=AWSNodeClassSpecArgs(
# Subnets where nodes will launch — discovered via the cluster tag.
subnet_selector_terms=[SubnetSelectorTermArgs(
tags={"karpenter.sh/discovery": "my-prod-cluster"},
)],
# Security groups for node instances — same discovery tag pattern.
security_group_selector_terms=[SecurityGroupSelectorTermArgs(
tags={"karpenter.sh/discovery": "my-prod-cluster"},
)],
# AMI: latest Amazon Linux 2023 managed alias (dzkarp keeps it up to date).
ami_selector_terms=[AMISelectorTermArgs(alias="al2023@latest")],
# IAM role dzkarp uses to launch and manage nodes (must already exist in AWS).
role="KarpenterNodeRole-my-prod-cluster",
),
))
# 5. Attach the node policy to the cluster
node_policy_target = NodePolicyTarget("prod-node-policy-target", args=NodePolicyTargetArgs(
name="prod-node-policy-target",
description="Apply prod-node-policy to prod-cluster",
policy_id=node_policy.id,
cluster_ids=[cluster.id],
enabled=True,
))
pulumi.export("cluster_id", cluster.id)
pulumi.export("cluster_token", pulumi.Output.secret(cluster.token))
pulumi.export("policy_id", policy.id)
pulumi.export("target_id", workload_target.id)
pulumi.export("node_policy_id", node_policy.id)
Deploy
pulumi up
Go
Setup
mkdir my-devzero-go && cd my-devzero-go
pulumi new go
go get github.com/devzero-inc/pulumi-provider-devzero/sdk/go/devzero
main.go
package main
import (
"github.com/devzero-inc/pulumi-provider-devzero/sdk/go/devzero/resources"
"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)
func main() {
pulumi.Run(func(ctx *pulumi.Context) error {
// 1. Create a cluster
cluster, err := resources.NewCluster(ctx, "prod-cluster", &resources.ClusterArgs{
Name: pulumi.String("prod-cluster"),
})
if err != nil {
return err
}
// 2. Create a workload policy with CPU and memory vertical scaling
policy, err := resources.NewWorkloadPolicy(ctx, "cpu-scaling-policy", &resources.WorkloadPolicyArgs{
Name: pulumi.String("cpu-scaling-policy"),
Description: pulumi.StringPtr("Policy with CPU and memory vertical scaling enabled"),
ActionTriggers: pulumi.StringArray{pulumi.String("on_detection"), pulumi.String("on_schedule")},
CronSchedule: pulumi.StringPtr("0 2 * * *"), // daily at 2 am UTC (required for on_schedule)
DetectionTriggers: pulumi.StringArray{pulumi.String("pod_creation"), pulumi.String("pod_reschedule")},
CpuVerticalScaling: resources.VerticalScalingArgsArgs{
Enabled: pulumi.BoolPtr(true),
TargetPercentile: pulumi.Float64Ptr(0.75), // P75 of observed usage
MinRequest: pulumi.IntPtr(25), // millicores; hard floor
MaxScaleUpPercent: pulumi.Float64Ptr(1000), // % per step
MaxScaleDownPercent: pulumi.Float64Ptr(1), // % per step
MinDataPoints: pulumi.IntPtr(20), // min CPU samples
AdjustReqEvenIfNotSet: pulumi.BoolPtr(true), // set requests even if workload has none
LimitsRemovalEnabled: pulumi.BoolPtr(true), // strip CPU limits (cycles compress safely)
}.ToVerticalScalingArgsPtrOutput(),
MemoryVerticalScaling: resources.VerticalScalingArgsArgs{
Enabled: pulumi.BoolPtr(true),
TargetPercentile: pulumi.Float64Ptr(1.0), // P100 — guard against OOMKills
MinRequest: pulumi.IntPtr(134217728), // 128 MiB in bytes; hard floor
MaxScaleUpPercent: pulumi.Float64Ptr(1000), // % per step
MaxScaleDownPercent: pulumi.Float64Ptr(1), // % per step
OverheadMultiplier: pulumi.Float64Ptr(0.3), // extra headroom over the recommendation
LimitsAdjustmentEnabled: pulumi.BoolPtr(true), // adjust limits alongside requests
LimitMultiplier: pulumi.Float64Ptr(1), // limits = request × this
MinDataPoints: pulumi.IntPtr(20), // min memory samples
AdjustReqEvenIfNotSet: pulumi.BoolPtr(true), // set requests even if workload has none
}.ToVerticalScalingArgsPtrOutput(),
EnablePmaxProtection: pulumi.BoolPtr(true), // guard against spike-induced OOMKills
PmaxRatioThreshold: pulumi.Float64Ptr(3), // trigger when the peak/recommendation ratio exceeds 3
MinChangePercent: pulumi.Float64Ptr(0.2), // apply only if change > 20%
})
if err != nil {
return err
}
// 3. Apply the workload policy to the cluster for all Deployments
workloadTarget, err := resources.NewWorkloadPolicyTarget(ctx, "prod-cluster-deployments-target", &resources.WorkloadPolicyTargetArgs{
Name: pulumi.String("prod-cluster-deployments-target"),
PolicyId: policy.ID(),
ClusterIds: pulumi.StringArray{cluster.ID()},
KindFilter: pulumi.StringArray{pulumi.String("Deployment")},
Enabled: pulumi.BoolPtr(true),
})
if err != nil {
return err
}
// 4. Create a node policy for dzkarp-based node provisioning
nodePolicy, err := resources.NewNodePolicy(ctx, "prod-node-policy", &resources.NodePolicyArgs{
Name: pulumi.String("prod-node-policy"),
Description: pulumi.StringPtr("Cost-efficient node provisioning for production workloads"),
// Higher weight wins when multiple policies match the same node request.
Weight: pulumi.IntPtr(10),
// Instance categories: c (compute), m (general), r (memory), t (burstable).
// Kept broad to maximise the instance pool and minimise cost.
InstanceCategories: &resources.LabelSelectorArgs{
MatchExpressions: resources.MatchExpressionArray{
{Operator: pulumi.String("In"),
Values: pulumi.StringArray{pulumi.String("c"), pulumi.String("m"), pulumi.String("r"), pulumi.String("t")}},
},
},
// Instance generation: prefer modern hardware (gen 3+) for better performance/cost ratio.
InstanceGenerations: &resources.LabelSelectorArgs{
MatchExpressions: resources.MatchExpressionArray{
{Operator: pulumi.String("In"),
Values: pulumi.StringArray{pulumi.String("3"), pulumi.String("4"), pulumi.String("5"), pulumi.String("6")}},
},
},
// CPU architecture: amd64 (x86_64) — derived from active nodes in the cluster.
Architectures: &resources.LabelSelectorArgs{
MatchExpressions: resources.MatchExpressionArray{
{Operator: pulumi.String("In"), Values: pulumi.StringArray{pulumi.String("amd64")}},
},
},
// Capacity types: prefer spot for savings, fall back to on-demand for availability.
CapacityTypes: &resources.LabelSelectorArgs{
MatchExpressions: resources.MatchExpressionArray{
{Operator: pulumi.String("In"), Values: pulumi.StringArray{pulumi.String("spot"), pulumi.String("on-demand")}},
},
},
// Operating system: linux only.
OperatingSystems: &resources.LabelSelectorArgs{
MatchExpressions: resources.MatchExpressionArray{
{Operator: pulumi.String("In"), Values: pulumi.StringArray{pulumi.String("linux")}},
},
},
// Disruption: how dzkarp consolidates and rotates nodes.
Disruption: &resources.DisruptionPolicyArgs{
ConsolidationPolicy: pulumi.StringPtr("WhenEmptyOrUnderutilized"), // reclaim empty and underused nodes
ConsolidateAfter: pulumi.StringPtr("2h0m0s"), // wait 2 h before consolidating
ExpireAfter: pulumi.StringPtr("168h"), // rotate nodes after 7 days
Budgets: resources.DisruptionBudgetArray{
{
// Disrupt up to 10% of nodes at once for these reasons.
Reasons: pulumi.StringArray{pulumi.String("Empty"), pulumi.String("Drifted"), pulumi.String("Underutilized")},
Nodes: pulumi.StringPtr("10%"),
},
{
Nodes: pulumi.StringPtr("1"), // at most 1 node disrupted at a time, for any reason
},
},
},
// Override the generated dzkarp CRD names (helps avoid collisions in shared clusters).
NodePoolName: pulumi.StringPtr("prod-nodepool"), // name of the dzkarp NodePool CR
NodeClassName: pulumi.StringPtr("prod-nodeclass"), // name of the dzkarp NodeClass CR
// AWS-specific EC2 configuration.
Aws: &resources.AWSNodeClassSpecArgs{
// Subnets where nodes will launch — discovered via the cluster tag.
SubnetSelectorTerms: resources.SubnetSelectorTermArray{
{Tags: pulumi.StringMap{"karpenter.sh/discovery": pulumi.String("my-prod-cluster")}},
},
// Security groups for node instances — same discovery tag pattern.
SecurityGroupSelectorTerms: resources.SecurityGroupSelectorTermArray{
{Tags: pulumi.StringMap{"karpenter.sh/discovery": pulumi.String("my-prod-cluster")}},
},
// AMI: latest Amazon Linux 2023 managed alias (dzkarp keeps it up to date).
AmiSelectorTerms: resources.AMISelectorTermArray{
{Alias: pulumi.StringPtr("al2023@latest")},
},
// IAM role dzkarp uses to launch and manage nodes (must already exist in AWS).
Role: pulumi.StringPtr("KarpenterNodeRole-my-prod-cluster"),
},
})
if err != nil {
return err
}
// 5. Attach the node policy to the cluster
_, err = resources.NewNodePolicyTarget(ctx, "prod-node-policy-target", &resources.NodePolicyTargetArgs{
Name: pulumi.String("prod-node-policy-target"),
Description: pulumi.StringPtr("Apply prod-node-policy to prod-cluster"),
PolicyId: nodePolicy.ID(),
ClusterIds: pulumi.StringArray{cluster.ID()},
Enabled: pulumi.BoolPtr(true),
})
if err != nil {
return err
}
ctx.Export("clusterId", cluster.ID())
ctx.Export("clusterToken", pulumi.ToSecret(cluster.Token)) // mark as secret, matching the TypeScript and Python examples
ctx.Export("policyId", policy.ID())
ctx.Export("targetId", workloadTarget.ID())
ctx.Export("nodePolicyId", nodePolicy.ID())
return nil
})
}
Deploy
go build -o devzero-example .
pulumi up
Data Sources
getClusterIdByName
Look up an existing cluster by name and return its ID. Use this when a cluster was registered manually (not created by Pulumi) and you need its ID to attach policies, inject into values.yaml, or pass to a Kubernetes secret.
Note: If multiple clusters share the same name, the newest one (by `created_at`) is returned by default. Use the `liveness` field to prefer or require a cluster whose zxporter agent has reported a heartbeat within the last 60 minutes.
TypeScript
import { resources } from "@devzero/pulumi-devzero";
const existing = await resources.getClusterIdByName({
name: "my-existing-cluster",
// teamId: "my-team-id", // optional — defaults to devzero:teamId from provider config
// region: "us-east-1", // optional: filter by region
// cloudProvider: "AWS", // optional: filter by cloud provider (AWS | GCP | AKS | OCI)
// liveness: "PREFER_LIVE", // optional: IGNORE | PREFER_LIVE | REQUIRE_LIVE
});
const target = new resources.WorkloadPolicyTarget("my-target", {
name: "my-target",
policyId: policy.id,
clusterIds: [existing.clusterId],
kindFilter: ["Deployment"],
enabled: true,
});
export const existingClusterId = existing.clusterId;
Python
import pulumi
import pulumi_devzero as devzero
existing = devzero.resources.get_cluster_id_by_name(
name="my-existing-cluster",
# team_id="my-team-id", # optional — defaults to devzero:teamId from provider config
# region="us-east-1", # optional: filter by region
# cloud_provider="AWS", # optional: filter by cloud provider (AWS | GCP | AKS | OCI)
# liveness="PREFER_LIVE", # optional: IGNORE | PREFER_LIVE | REQUIRE_LIVE
)
target = devzero.resources.WorkloadPolicyTarget("my-target",
name="my-target",
policy_id=policy.id,
cluster_ids=[existing.cluster_id],
kind_filter=["Deployment"],
enabled=True,
)
pulumi.export("existing_cluster_id", existing.cluster_id)
Go
existing, err := resources.GetClusterIdByName(ctx, &resources.GetClusterIdByNameArgs{
Name: "my-existing-cluster",
// TeamId: pulumi.StringRef("my-team-id"), // optional — defaults to devzero:teamId from provider config
// Region: pulumi.StringRef("us-east-1"), // optional: filter by region
// CloudProvider: pulumi.StringRef("AWS"), // optional: filter by cloud provider (AWS | GCP | AKS | OCI)
// Liveness: pulumi.StringRef("PREFER_LIVE"), // optional: IGNORE | PREFER_LIVE | REQUIRE_LIVE
})
if err != nil {
return err
}
_, err = resources.NewWorkloadPolicyTarget(ctx, "my-target", &resources.WorkloadPolicyTargetArgs{
Name: pulumi.String("my-target"),
PolicyId: policy.ID(),
ClusterIds: pulumi.StringArray{pulumi.String(existing.ClusterId)},
KindFilter: pulumi.StringArray{pulumi.String("Deployment")},
Enabled: pulumi.BoolPtr(true),
})
if err != nil {
return err
}
ctx.Export("existingClusterId", pulumi.String(existing.ClusterId))
Inputs:
| Field | Type | Required | Description |
|---|---|---|---|
| `name` | string | yes | Cluster name to look up |
| `teamId` | string | no | Team to search within. Defaults to devzero:teamId from provider config |
| `region` | string | no | Filter by region name (e.g. us-east-1) |
| `cloudProvider` | string | no | Filter by cloud provider: AWS, GCP, AKS, OCI |
| `liveness` | string | no | Heartbeat filter: IGNORE (default), PREFER_LIVE, REQUIRE_LIVE |

Outputs:
| Field | Type | Description |
|---|---|---|
| `clusterId` | string | UUID of the matching cluster |

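The note on duplicate names and the `liveness` input can be read as a selection rule. The sketch below illustrates that rule in plain Python; it is not the provider's implementation. `ClusterRecord`, `last_heartbeat`, and `pick_cluster` are hypothetical names, and the fallback to the newest cluster under PREFER_LIVE when no cluster is live is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional

LIVE_WINDOW = timedelta(minutes=60)  # heartbeat window stated in the note


@dataclass
class ClusterRecord:  # hypothetical shape, for illustration only
    cluster_id: str
    created_at: datetime
    last_heartbeat: Optional[datetime] = None


def pick_cluster(candidates: List[ClusterRecord], liveness: str = "IGNORE",
                 now: Optional[datetime] = None) -> Optional[str]:
    """Return the chosen cluster_id, or None if nothing qualifies."""
    now = now or datetime.now(timezone.utc)
    newest_first = sorted(candidates, key=lambda c: c.created_at, reverse=True)
    live = [
        c for c in newest_first
        if c.last_heartbeat is not None and now - c.last_heartbeat <= LIVE_WINDOW
    ]
    if liveness == "REQUIRE_LIVE":
        pool = live
    elif liveness == "PREFER_LIVE":
        pool = live or newest_first  # assumed fallback when no cluster is live
    elif liveness == "IGNORE":
        pool = newest_first
    else:
        raise ValueError(f"unknown liveness mode: {liveness}")
    return pool[0].cluster_id if pool else None
```

With an older cluster that is live and a newer one that is not, IGNORE picks the newer one, PREFER_LIVE picks the live one, and REQUIRE_LIVE returns nothing if no cluster has a recent heartbeat.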
WorkloadPolicy — Key Fields
| Field | Type | Description |
|---|---|---|
| `name` | string | Unique policy name (per team) |
| `description` | string | Human-readable description |
| `cpuVerticalScaling` | `VerticalScalingArgs` | CPU vertical scaling configuration |
| `memoryVerticalScaling` | `VerticalScalingArgs` | Memory vertical scaling configuration |
| `gpuVerticalScaling` | `VerticalScalingArgs` | GPU core vertical scaling configuration (units: GPU millicores) |
| `gpuVramVerticalScaling` | `VerticalScalingArgs` | GPU VRAM vertical scaling configuration (units: bytes) |
| `horizontalScaling` | `HorizontalScalingArgs` | Horizontal (replica) scaling configuration |
| `actionTriggers` | string[] | When to apply recommendations: on_detection \| on_schedule. Both can be used together. |
| `cronSchedule` | string | 5-field UTC cron expression for scheduled application. Required when actionTriggers includes on_schedule. Example: 0 2 * * * |
| `detectionTriggers` | string[] | Events that trigger a recommendation: pod_creation \| pod_update \| pod_reschedule |
| `loopbackPeriodSeconds` | int | Seconds of historical usage data to consider. Default: 86400 (24 h) |
| `startupPeriodSeconds` | int | Seconds after workload start to exclude from usage data (avoids cold-start spikes). Example: 300 |
| `liveMigrationEnabled` | bool | Allow live pod migration when applying recommendations without restart. Default: false |
| `schedulerPlugins` | string[] | Kubernetes scheduler plugins to activate. Example: ["binpacking"] |
| `defragmentationSchedule` | string | Cron expression for background node defragmentation. Example: 0 3 * * 0 |
| `enablePmaxProtection` | bool | Raise requests to cover peak usage when max/recommendation ratio exceeds pmaxRatioThreshold. Default: false |
| `pmaxRatioThreshold` | float | Peak-to-recommendation ratio that triggers pmax protection. Default: 3.0 |
| `minDataPoints` | int | Global minimum data points required before a recommendation is emitted. Default: 15 |
| `minChangePercent` | float | Global minimum relative change (0–1) required before applying a recommendation. Default: 0.2 (20%) |
| `stabilityCvMax` | float | Maximum coefficient of variation (stddev/mean) for a workload to be considered stable enough for VPA. Example: 0.3 |
| `hysteresisVsTarget` | float | Dead-band ratio around the HPA target to suppress VPA/HPA oscillation. Example: 0.1 |
| `driftDeltaPercent` | float | Percentage change from baseline recommendation that triggers a VPA refresh. Example: 20.0 |
| `minVpaWindowDataPoints` | int | Minimum data points in VPA analysis window. Default: 30 |
| `cooldownMinutes` | int | Minutes to wait between applying recommendations. Default: 300 (5 h) |

Python uses snake_case for all fields (e.g. cpu_vertical_scaling, action_triggers, cron_schedule, detection_triggers, enable_pmax_protection, loopback_period_seconds, min_data_points, min_change_percent, cooldown_minutes). Go uses PascalCase equivalents.
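Three of the policy-level fields describe simple rules: `cronSchedule` is required with `on_schedule`, `minChangePercent` gates small changes, and `pmaxRatioThreshold` triggers peak protection. The following is a minimal sketch of those rules as local helper functions, assuming the semantics given in the table. The function names are hypothetical, the boundary handling (at least versus strictly more than) is an assumption, and raising the request to the observed peak is one reading of "cover peak usage".

```python
from typing import Optional, Sequence


def validate_triggers(action_triggers: Sequence[str], cron_schedule: Optional[str]) -> None:
    """Raise ValueError if on_schedule is requested without a 5-field cron expression."""
    allowed = {"on_detection", "on_schedule"}
    unknown = set(action_triggers) - allowed
    if unknown:
        raise ValueError(f"unknown action triggers: {sorted(unknown)}")
    if "on_schedule" in action_triggers:
        if not cron_schedule or len(cron_schedule.split()) != 5:
            raise ValueError("on_schedule requires a 5-field cron expression")


def should_apply(current: float, recommended: float, min_change_percent: float = 0.2) -> bool:
    """Apply only if the relative change reaches min_change_percent (a 0-1 fraction)."""
    if current <= 0:
        return True  # assumption: nothing is set yet, so any recommendation is a change
    return abs(recommended - current) / current >= min_change_percent


def with_pmax_protection(recommended: float, peak: float, ratio_threshold: float = 3.0) -> float:
    """Raise the request to the observed peak when peak/recommended exceeds the threshold."""
    if recommended > 0 and peak / recommended > ratio_threshold:
        return peak
    return recommended
```

Running `validate_triggers` on your own arguments before `pulumi up` catches a missing cron expression locally instead of at deploy time.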
VerticalScalingArgs
| Field | Type | Description |
|---|---|---|
| `enabled` | bool | Enable this scaling axis |
| `targetPercentile` | float | Percentile of observed usage to target (e.g. 0.95) |
| `minRequest` | int | Minimum resource request (millicores for CPU, bytes for memory) |
| `maxRequest` | int | Maximum resource request |
| `maxScaleUpPercent` | float | Maximum percentage to scale up in one step. Default: 1000 |
| `maxScaleDownPercent` | float | Maximum percentage to scale down in one step. Default: 1.0 |
| `overheadMultiplier` | float | Safety margin multiplier applied on top of the recommendation |
| `limitsAdjustmentEnabled` | bool | Whether to also adjust resource limits alongside requests |
| `limitMultiplier` | float | Limits = request × limitMultiplier |
| `minDataPoints` | int | Minimum data points required before a recommendation is emitted. Default: 20 |
| `adjustReqEvenIfNotSet` | bool | Recommend requests even when the workload has no existing requests set. Default: false |
| `limitsRemovalEnabled` | bool | Actively remove limits from workloads (CPU axis only; memory limits removal is not supported). Takes precedence over limitsAdjustmentEnabled. Default: false |

Python: target_percentile, min_request, max_request, max_scale_up_percent, max_scale_down_percent, overhead_multiplier, limits_adjustment_enabled, limit_multiplier, min_data_points, adjust_req_even_if_not_set, limits_removal_enabled. Go uses PascalCase equivalents.
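To make the interaction between `targetPercentile`, `minRequest`, `maxRequest`, `overheadMultiplier`, and `limitMultiplier` concrete, here is a rough sketch of how a request and limit could be derived from usage samples. It is an illustration under stated assumptions, not DevZero's algorithm: the nearest-rank percentile, the reading of the overhead as an added fraction, and the `recommend` helper itself are all assumptions.

```python
import math
from typing import Optional, Sequence, Tuple


def percentile(samples: Sequence[float], q: float) -> float:
    """Nearest-rank percentile for q in (0, 1]; q=1 returns the maximum."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def recommend(
    samples: Sequence[float],
    target_percentile: float,
    min_request: int,
    max_request: Optional[int] = None,
    overhead_multiplier: float = 0.0,
    limit_multiplier: float = 1.0,
    min_data_points: int = 20,
) -> Optional[Tuple[int, int]]:
    """Return (request, limit), or None when there are too few samples."""
    if len(samples) < min_data_points:
        return None
    base = percentile(samples, target_percentile)
    # Assumption: overhead is added as a fraction on top of the recommendation.
    request = math.ceil(base * (1.0 + overhead_multiplier))
    request = max(request, min_request)            # hard floor
    if max_request is not None:
        request = min(request, max_request)        # hard ceiling
    limit = math.ceil(request * limit_multiplier)  # limits = request x limitMultiplier
    return request, limit
```

Units follow the table: millicores for CPU and bytes for memory, so the 128 MiB floor used in the Quick Start is `128 * 1024 * 1024`, which equals `134217728`.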
HorizontalScalingArgs
| Field | Type | Description |
|---|---|---|
| `enabled` | bool | Enable horizontal (replica) scaling |
| `minReplicas` | int | Minimum number of replicas to maintain |
| `maxReplicas` | int | Maximum number of replicas to scale to |
| `targetUtilization` | float | Target utilization ratio (0–1) for the primary metric. Example: 0.8 |
| `primaryMetric` | string | Metric driving HPA: cpu \| memory \| gpu \| network_ingress \| network_egress. Example: "memory" |
| `minDataPoints` | int | Minimum data points before a recommendation is emitted |
| `maxReplicaChangePercent` | float | Maximum fraction of current replicas that can change per cycle (0–1). 0.25 = at most 25% added/removed at once. Example: 0.25 |

Python: min_replicas, max_replicas, target_utilization, primary_metric, min_data_points, max_replica_change_percent. Go uses PascalCase equivalents.
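The effect of `maxReplicaChangePercent` together with `minReplicas` and `maxReplicas` can be sketched as a clamp on each scaling cycle. This is a minimal illustration, not the provider's logic: `next_replicas` is a hypothetical helper, and rounding the step up with a minimum of one replica is an assumption.

```python
import math


def next_replicas(current: int, desired: int, min_replicas: int, max_replicas: int,
                  max_replica_change_percent: float = 0.25) -> int:
    """Move toward `desired`, changing at most a fraction of `current` per cycle."""
    # Assumption: the step is rounded up and is never smaller than one replica.
    max_step = max(1, math.ceil(current * max_replica_change_percent))
    step = max(-max_step, min(max_step, desired - current))
    return max(min_replicas, min(max_replicas, current + step))
```

With 8 replicas and a 0.25 cap, a jump to 20 desired replicas is taken two replicas at a time, so it needs several cycles to complete.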
WorkloadPolicyTarget — Key Fields
| Field | Type | Description |
|---|---|---|
| `name` | string | Unique target name |
| `policyId` | string | ID of the WorkloadPolicy to apply |
| `clusterIds` | string[] | IDs of clusters to target |
| `description` | string | Human-readable description (optional) |
| `priority` | int | Evaluation priority; higher value wins when targets overlap |
| `kindFilter` | string[] | Workload kinds: Pod \| Deployment \| StatefulSet \| DaemonSet \| Job \| CronJob \| ReplicaSet \| ReplicationController \| Rollout |
| `workloadNames` | string[] | Explicit list of workload names to include |
| `nodeGroupNames` | string[] | Restrict matching to specific node groups by name |
| `namePattern` | `NamePatternArgs` | Regex pattern to match workload names |
| `namespaceSelector` | `LabelSelectorArgs` | Select namespaces by labels (matchLabels / matchExpressions) |
| `workloadSelector` | `LabelSelectorArgs` | Select workloads by labels |
| `enabled` | bool | Activate the target. Default: true |

Python: policy_id, cluster_ids, kind_filter, workload_names, node_group_names, name_pattern, namespace_selector, workload_selector. Go uses PascalCase equivalents.
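When several targets overlap, the table says the higher `priority` wins. The sketch below shows one way to reason about which target would apply to a given workload. It is a simplified local model, not the service's matching logic: the `Target` class is hypothetical, `name_pattern` is reduced to a plain regex string rather than `NamePatternArgs`, and treating empty filters as "match everything" with set filters ANDed together is an assumption.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Target:  # hypothetical mirror of a few WorkloadPolicyTarget fields
    name: str
    priority: int = 0
    kind_filter: List[str] = field(default_factory=list)
    workload_names: List[str] = field(default_factory=list)
    name_pattern: Optional[str] = None
    enabled: bool = True


def matches(target: Target, kind: str, workload: str) -> bool:
    """Assumption: an empty filter matches everything; set filters are ANDed."""
    if not target.enabled:
        return False
    if target.kind_filter and kind not in target.kind_filter:
        return False
    if target.workload_names and workload not in target.workload_names:
        return False
    if target.name_pattern and not re.search(target.name_pattern, workload):
        return False
    return True


def winning_target(targets: List[Target], kind: str, workload: str) -> Optional[Target]:
    """Among matching targets, the highest priority wins."""
    hits = [t for t in targets if matches(t, kind, workload)]
    return max(hits, key=lambda t: t.priority) if hits else None
```

A broad low-priority target for all Deployments can therefore coexist with a narrow high-priority target for a few named workloads.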
NodePolicy — Key Fields
NodePolicy configures dzkarp-based node provisioning rules. Ensure dzkarp is installed on your target clusters before attaching node policies.
| Field | Type | Description |
|---|---|---|
| `name` | string | Unique policy name |
| `description` | string | Human-readable description |
| `weight` | int | Priority when multiple policies match (higher = preferred) |
| `capacityTypes` | `LabelSelectorArgs` | Capacity types: on-demand \| spot \| reserved |
| `instanceCategories` | `LabelSelectorArgs` | Filter by instance category letter: e.g. m, c, r (AWS) or D, E (Azure) |
| `instanceFamilies` | `LabelSelectorArgs` | Filter instance families (e.g. c5, m5) |
| `instanceCpus` | `LabelSelectorArgs` | Filter by vCPU count |
| `instanceSizes` | `LabelSelectorArgs` | Filter instance sizes (e.g. large, xlarge) |
| `instanceTypes` | `LabelSelectorArgs` | Explicit instance types (e.g. m5.xlarge) |
| `instanceGenerations` | `LabelSelectorArgs` | Filter by instance generation number (e.g. 2, 3) |
| `instanceHypervisors` | `LabelSelectorArgs` | Filter by hypervisor type (e.g. nitro) |
| `zones` | `LabelSelectorArgs` | Availability zones to provision into |
| `architectures` | `LabelSelectorArgs` | CPU architectures (e.g. amd64, arm64) |
| `operatingSystems` | `LabelSelectorArgs` | OS filter (e.g. linux, windows) |
| `labels` | map[string]string | Labels applied to provisioned nodes |
| `taints` | `TaintArgs[]` | Taints applied to provisioned nodes |
| `disruption` | `DisruptionPolicyArgs` | Node disruption and consolidation settings |
| `limits` | `ResourceLimitsArgs` | Max total CPU/memory this policy may provision |
| `nodePoolName` | string | Override name for the generated dzkarp NodePool CR |
| `nodeClassName` | string | Override name for the generated dzkarp NodeClass CR |
| `aws` | `AWSNodeClassSpecArgs` | AWS-specific configuration (AMI, subnets, IAM role, EBS, etc.) |
| `azure` | `AzureNodeClassSpecArgs` | Azure-specific configuration (subnet, image family, disk, etc.) |
| `raw` | `RawKarpenterSpecArgs[]` | Raw Karpenter NodePool/NodeClass YAML (escape hatch) |

Python uses snake_case (e.g. capacity_types, instance_categories, instance_families, instance_cpus, instance_sizes, instance_types, operating_systems, node_pool_name, node_class_name). Go uses PascalCase equivalents.
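The `LabelSelectorArgs` fields all use `matchExpressions` with an `In` operator in the examples. As a rough mental model, an instance qualifies only if every selector in the policy accepts it. The sketch below evaluates that rule over plain dictionaries. It covers only `In`, since that is the only operator shown in this document, and the dictionary layout (policy field names reused as instance attribute keys) and both helper names are illustrative assumptions.

```python
from typing import Dict, List, Mapping


def selector_matches(selector: Mapping[str, List[dict]], value: str) -> bool:
    """Evaluate a selector dict against one value; every expression must hold."""
    for expr in selector.get("matchExpressions", []):
        operator, values = expr["operator"], expr.get("values", [])
        if operator == "In":
            if value not in values:
                return False
        else:
            # Other operators are out of scope for this sketch.
            raise ValueError(f"unsupported operator: {operator}")
    return True


def instance_allowed(policy: Dict[str, dict], instance: Dict[str, str]) -> bool:
    """An instance qualifies only if every selector present in the policy accepts it."""
    return all(
        selector_matches(selector, instance[field])
        for field, selector in policy.items()
    )
```

Keeping the value lists broad, as the Quick Start does for instance categories, widens the set of instances that pass every selector.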
DisruptionPolicyArgs
| Field | Type | Description |
|---|---|---|
| consolidationPolicy | string | WhenEmpty or WhenEmptyOrUnderutilized |
| consolidateAfter | string | Wait time after a node is empty before consolidating (e.g. 30s) |
| expireAfter | string | Force-replace nodes after this duration (e.g. 720h) |
| ttlSecondsAfterEmpty | int | Seconds before an empty node is terminated. Deprecated; prefer consolidateAfter |
| terminationGracePeriodSeconds | int | Grace period before forcefully terminating a draining node |
| budgets | DisruptionBudgetArgs[] | Limits on how many nodes may be disrupted at once |
DisruptionBudgetArgs
| Field | Type | Description |
|---|---|---|
| nodes | string | Max nodes that can be disrupted at once. Absolute (e.g. "1") or percentage (e.g. "10%") |
| reasons | string[] | Disruption reasons this budget applies to: Empty, Drifted, Underutilized. Omit to apply to all. |
| schedule | string | Cron expression restricting when this budget is active |
| duration | string | Duration the budget is active per schedule cycle (e.g. "1h") |
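To make the two forms of the nodes field concrete, here is a minimal sketch that resolves a budget value against a pool size. The function name is hypothetical, and rounding percentages up is an assumption modelled on upstream Karpenter; dzkarp may behave differently.

```python
def max_disruptions(nodes: str, total_nodes: int) -> int:
    """Resolve a DisruptionBudget `nodes` value ("1" or "10%") to a node count.

    Assumption: percentage budgets round up, as upstream Karpenter documents.
    """
    value = nodes.strip()
    if value.endswith("%"):
        percent = int(value[:-1])
        # Integer ceiling division avoids floating-point rounding surprises.
        return (total_nodes * percent + 99) // 100
    return int(value)


print(max_disruptions("1", 25))    # absolute budget
print(max_disruptions("10%", 25))  # percentage budget
```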
AWSNodeClassSpecArgs
| Field | Type | Description |
|---|---|---|
| amiFamily | string | AMI family: AL2, AL2023, Bottlerocket, Windows2019, Windows2022 |
| role | string | IAM role name for nodes (dzkarp creates the instance profile) |
| instanceProfile | string | IAM instance profile name (alternative to role) |
| subnetSelectorTerms | SubnetSelectorTermArgs[] | Subnet selectors (by tag or ID) |
| securityGroupSelectorTerms | SecurityGroupSelectorTermArgs[] | Security group selectors |
| capacityReservationSelectorTerms | CapacityReservationSelectorTermArgs[] | EC2 capacity reservation selectors |
| amiSelectorTerms | AMISelectorTermArgs[] | AMI selectors (by alias, tag, or ID) |
| blockDeviceMappings | BlockDeviceMappingArgs[] | EBS volume configuration |
| instanceStorePolicy | string | NVMe instance store policy. Value: INSTANCE_STORE_POLICY_RAID0 |
| tags | map[string]string | AWS tags applied to all provisioned resources |
| associatePublicIpAddress | bool | Assign a public IP to nodes |
| detailedMonitoring | bool | Enable CloudWatch detailed monitoring |
| metadataOptions | MetadataOptionsArgs | EC2 IMDS options (IMDSv2, hop limit, etc.) |
| kubelet | KubeletConfigurationArgs | Kubelet overrides (maxPods, eviction thresholds, etc.) |
| userData | string | Custom launch template user data |
| context | string | Additional EC2 launch template context ARN for advanced customization |
Python: ami_family, instance_profile, subnet_selector_terms, security_group_selector_terms, capacity_reservation_selector_terms, ami_selector_terms, block_device_mappings, instance_store_policy, associate_public_ip_address, detailed_monitoring, metadata_options. Go uses PascalCase equivalents.
AzureNodeClassSpecArgs
| Field | Type | Description |
|---|---|---|
| vnetSubnetId | string | Azure VNet subnet resource ID |
| imageFamily | string | Image family: AzureLinux, Ubuntu2204, etc. |
| osDiskSizeGb | int | OS disk size in GB |
| fipsMode | string | Enabled or Disabled |
| maxPods | int | Max pods per node |
| tags | map[string]string | Azure tags on provisioned resources |
| kubelet | AzureKubeletConfigurationArgs | Kubelet overrides for Azure nodes |
Python: vnet_subnet_id, image_family, os_disk_size_gb, fips_mode, max_pods. Go uses PascalCase equivalents.
RawKarpenterSpecArgs
Use this as an escape hatch when the structured fields don't cover your use case and you need full control over the Karpenter NodePool/NodeClass resources.
| Field | Type | Description |
|---|---|---|
| nodepoolYaml | string | Raw YAML for a complete dzkarp NodePool resource |
| nodeclassYaml | string | Raw YAML for a complete dzkarp NodeClass resource |
Python: nodepool_yaml, nodeclass_yaml. Go: NodepoolYaml, NodeclassYaml.
WorkloadRule
A WorkloadRule pins explicit resource rules directly to a single workload (a specific kind/namespace/name on a cluster). Unlike WorkloadPolicy, which applies a shared policy to many workloads via a WorkloadPolicyTarget, a WorkloadRule targets one workload and lets you override CPU, memory, GPU, and HPA settings with precise values.
Set autoGenerate: true to have the engine automatically compute all rule fields from observed usage. Omit it (or set it to false) to provide your own values via cpuRule, memoryRule, hpaRule, etc.
TypeScript
import * as pulumi from "@pulumi/pulumi";
import { resources } from "@devzero/pulumi-devzero";

// Pin explicit CPU + memory rules to a single Deployment
const rule = new resources.WorkloadRule("my-app-rule", {
  clusterId: "cluster-abc123",
  namespace: "production",
  kind: "Deployment",
  name: "my-api",
  cpuRule: {
    enabled: true,
    minRequest: 10, // 10m CPU
    maxRequest: 4000, // 4 cores
    targetPercentile: 0.95,
    limitsAdjustmentEnabled: true,
    limitMultiplier: 1.5,
  },
  memoryRule: {
    enabled: true,
    minRequest: 67108864, // 64 MiB
    maxRequest: 536870912, // 512 MiB
  },
  emergencyResponse: {
    oomEnabled: true,
    oomMemoryMultiplier: 1.5,
    oomMaxReactions: 3,
    oomCooldownSeconds: 60,
    cpuThrottlingEnabled: true,
    cpuThrottlingThreshold: 0.1,
    cpuThrottlingMultiplier: 1.25,
  },
  actionTriggers: ["on_detection"],
  detectionTriggers: ["pod_creation", "pod_reschedule"],
  cooldownMinutes: 60,
});

export const ruleId = rule.id;
Auto-generate: Replace the rule body with autoGenerate: true to let the engine fill in all fields from observed usage data.
const rule = new resources.WorkloadRule("my-app-rule", {
  clusterId: "cluster-abc123",
  namespace: "production",
  kind: "Deployment",
  name: "my-api",
  autoGenerate: true,
});
Python
import pulumi
from pulumi_devzero.resources import (
    WorkloadRule, WorkloadRuleArgs,
    ResourceRuleConfigArgsArgs,
    EmergencyResponseConfigArgsArgs,
)

# Pin explicit CPU + memory rules to a single Deployment
rule = WorkloadRule(
    "my-app-rule",
    args=WorkloadRuleArgs(
        cluster_id="cluster-abc123",
        namespace="production",
        kind="Deployment",
        name="my-api",
        cpu_rule=ResourceRuleConfigArgsArgs(
            enabled=True,
            min_request=10,  # 10m CPU
            max_request=4000,  # 4 cores
            target_percentile=0.95,
            limits_adjustment_enabled=True,
            limit_multiplier=1.5,
        ),
        memory_rule=ResourceRuleConfigArgsArgs(
            enabled=True,
            min_request=67108864,  # 64 MiB
            max_request=536870912,  # 512 MiB
        ),
        emergency_response=EmergencyResponseConfigArgsArgs(
            oom_enabled=True,
            oom_memory_multiplier=1.5,
            oom_max_reactions=3,
            oom_cooldown_seconds=60,
            cpu_throttling_enabled=True,
            cpu_throttling_threshold=0.1,
            cpu_throttling_multiplier=1.25,
        ),
        action_triggers=["on_detection"],
        detection_triggers=["pod_creation", "pod_reschedule"],
        cooldown_minutes=60,
    ),
)

pulumi.export("rule_id", rule.id)
Auto-generate:
rule = WorkloadRule("my-app-rule", args=WorkloadRuleArgs(
    cluster_id="cluster-abc123",
    namespace="production",
    kind="Deployment",
    name="my-api",
    auto_generate=True,
))
Go
rule, err := resources.NewWorkloadRule(ctx, "my-app-rule", &resources.WorkloadRuleArgs{
	ClusterId: pulumi.String("cluster-abc123"),
	Namespace: pulumi.String("production"),
	Kind: pulumi.String("Deployment"),
	Name: pulumi.String("my-api"),
	CpuRule: resources.ResourceRuleConfigArgsArgs{
		Enabled: pulumi.BoolPtr(true),
		MinRequest: pulumi.IntPtr(10), // 10m CPU
		MaxRequest: pulumi.IntPtr(4000), // 4 cores
		TargetPercentile: pulumi.Float64Ptr(0.95),
		LimitsAdjustmentEnabled: pulumi.BoolPtr(true),
		LimitMultiplier: pulumi.Float64Ptr(1.5),
	}.ToResourceRuleConfigArgsPtrOutput(),
	MemoryRule: resources.ResourceRuleConfigArgsArgs{
		Enabled: pulumi.BoolPtr(true),
		MinRequest: pulumi.IntPtr(67108864), // 64 MiB
		MaxRequest: pulumi.IntPtr(536870912), // 512 MiB
	}.ToResourceRuleConfigArgsPtrOutput(),
	EmergencyResponse: resources.EmergencyResponseConfigArgsArgs{
		OomEnabled: pulumi.BoolPtr(true),
		OomMemoryMultiplier: pulumi.Float64Ptr(1.5),
		OomMaxReactions: pulumi.IntPtr(3),
		OomCooldownSeconds: pulumi.IntPtr(60),
		CpuThrottlingEnabled: pulumi.BoolPtr(true),
		CpuThrottlingThreshold: pulumi.Float64Ptr(0.1),
		CpuThrottlingMultiplier: pulumi.Float64Ptr(1.25),
	}.ToEmergencyResponseConfigArgsPtrOutput(),
	ActionTriggers: pulumi.StringArray{pulumi.String("on_detection")},
	DetectionTriggers: pulumi.StringArray{pulumi.String("pod_creation"), pulumi.String("pod_reschedule")},
	CooldownMinutes: pulumi.IntPtr(60),
})
if err != nil {
	return err
}
ctx.Export("ruleId", rule.ID())
Auto-generate:
rule, err := resources.NewWorkloadRule(ctx, "my-app-rule", &resources.WorkloadRuleArgs{
	ClusterId: pulumi.String("cluster-abc123"),
	Namespace: pulumi.String("production"),
	Kind: pulumi.String("Deployment"),
	Name: pulumi.String("my-api"),
	AutoGenerate: pulumi.BoolPtr(true),
})
WorkloadRule — Key Fields
| Field | Type | Description |
|---|---|---|
| clusterId | string | ID of the cluster the workload lives in |
| namespace | string | Kubernetes namespace of the workload |
| kind | string | Workload kind: Deployment, StatefulSet, DaemonSet, CronJob, or Job |
| name | string | Name of the Kubernetes workload |
| autoGenerate | bool | When true, the engine fills all rule fields from observed usage; manual fields are ignored |
| cpuRule | ResourceRuleConfigArgs | CPU vertical scaling rule |
| memoryRule | ResourceRuleConfigArgs | Memory vertical scaling rule |
| gpuRule | ResourceRuleConfigArgs | GPU vertical scaling rule (units: GPU millicores) |
| hpaRule | HPARuleConfigArgs | Horizontal (replica) scaling rule |
| emergencyResponse | EmergencyResponseConfigArgs | OOM and CPU-throttle emergency reactions |
| actionTriggers | string[] | When to apply: on_detection, on_schedule |
| cronSchedule | string | Cron expression for scheduled application (5-field UTC). Required when actionTriggers includes on_schedule |
| detectionTriggers | string[] | Events that trigger a recommendation: pod_creation, pod_update, pod_reschedule |
| startupPeriodSeconds | int | Seconds after workload start to exclude from usage data |
| cooldownMinutes | int | Minimum minutes between consecutive recommendation applications |
| schedulerPlugins | string[] | Kubernetes scheduler plugins to activate. Example: ["binpacking"] |
| defragmentationSchedule | string | Cron expression for node defragmentation |
| liveMigrationEnabled | bool | Allow live pod migration when applying recommendations without restart |
| useInPlaceVerticalScaling | bool | Use in-place pod vertical scaling instead of pod restarts |
| containers | ContainerResourceRuleConfigArgs[] | Per-container resource overrides. When empty, workload-level rules apply to all containers |
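Several constraints in this table can be checked before running pulumi up. The following is a minimal sketch of such a pre-flight check over a plain dict of arguments; validate_workload_rule is a hypothetical helper, not an SDK function, and the provider or backend may enforce additional rules.

```python
ALLOWED_KINDS = {"Deployment", "StatefulSet", "DaemonSet", "CronJob", "Job"}
ALLOWED_ACTION_TRIGGERS = {"on_detection", "on_schedule"}
ALLOWED_DETECTION_TRIGGERS = {"pod_creation", "pod_update", "pod_reschedule"}


def validate_workload_rule(args: dict) -> list:
    """Return a list of problems found in WorkloadRule arguments (hypothetical helper)."""
    problems = []
    if args.get("kind") not in ALLOWED_KINDS:
        problems.append(f"unsupported kind: {args.get('kind')!r}")
    action_triggers = args.get("actionTriggers", [])
    for trigger in action_triggers:
        if trigger not in ALLOWED_ACTION_TRIGGERS:
            problems.append(f"unknown action trigger: {trigger!r}")
    for trigger in args.get("detectionTriggers", []):
        if trigger not in ALLOWED_DETECTION_TRIGGERS:
            problems.append(f"unknown detection trigger: {trigger!r}")
    cron = args.get("cronSchedule")
    # The table states cronSchedule is required with on_schedule and uses 5 fields.
    if "on_schedule" in action_triggers and not cron:
        problems.append("cronSchedule is required when actionTriggers includes on_schedule")
    if cron and len(cron.split()) != 5:
        problems.append("cronSchedule must be a 5-field cron expression")
    return problems


print(validate_workload_rule({"kind": "Deployment", "actionTriggers": ["on_schedule"]}))
```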
ResourceRuleConfigArgs
Used for cpuRule, memoryRule, and gpuRule at both the workload and per-container level.
Note: maxScaleUpPercent and maxScaleDownPercent are not supported on per-container rules; set them on the workload-level fields instead.
| Field | Type | Description |
|---|---|---|
| enabled | bool | Enable this resource axis rule |
| minRequest | int | Minimum resource request (millicores for CPU, bytes for memory/GPU) |
| maxRequest | int | Maximum resource request |
| targetPercentile | float | Percentile of observed usage to target (0–1). Example: 0.95 |
| maxScaleUpPercent | float | Maximum percentage to scale up in one step (workload-level only) |
| maxScaleDownPercent | float | Maximum percentage to scale down in one step (workload-level only) |
| limitsAdjustmentEnabled | bool | Whether to also adjust resource limits |
| limitMultiplier | float | Limits = request × limitMultiplier |
| limitsRemovalEnabled | bool | Actively remove limits from workloads (CPU only) |
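The sketch below shows how targetPercentile, minRequest, maxRequest, and limitMultiplier could interact. It is an illustration of the field semantics only: the recommend function is hypothetical, and the nearest-rank percentile is an assumption, since the engine's actual method is not described here.

```python
import math

MIB = 1024 * 1024


def recommend(samples, min_request, max_request, target_percentile=0.95, limit_multiplier=None):
    """Pick a request from usage samples, clamp it, and derive a limit.

    Assumption: nearest-rank percentile; the real engine may compute this differently.
    """
    if not samples:
        raise ValueError("at least one usage sample is required")
    ordered = sorted(samples)
    rank = max(1, math.ceil(target_percentile * len(ordered)))
    observed = ordered[rank - 1]
    # Clamp the observed value into [min_request, max_request].
    request = min(max(observed, min_request), max_request)
    # Limits = request x limitMultiplier, when limits adjustment is wanted.
    limit = None if limit_multiplier is None else int(request * limit_multiplier)
    return request, limit


# CPU in millicores: the p95 sample is kept and the limit is 1.5x the request.
print(recommend([120, 180, 200, 250, 900], 10, 4000, 0.95, 1.5))
# Memory in bytes: usage above maxRequest is clamped to 512 MiB.
print(recommend([700 * MIB], 64 * MIB, 512 * MIB))
```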
HPARuleConfigArgs
| Field | Type | Description |
|---|---|---|
| enabled | bool | Enable horizontal (replica) scaling |
| minReplicas | int | Minimum number of replicas |
| maxReplicas | int | Maximum number of replicas |
| targetUtilization | float | Target CPU utilization ratio (0–1). Example: 0.8 |
| targetMemoryUtilization | float | Target memory utilization ratio (0–1), tuned independently of CPU. Example: 0.65 |
| primaryMetric | string | Primary metric driving HPA (used when metrics is empty): cpu, memory, gpu, network_ingress, or network_egress. Example: "memory" |
| maxReplicaChangePercent | float | Maximum fraction of current replicas that can change in one scale event (0–1). 0.25 means at most 25% added or removed at once. Example: 0.25 |
| scaleDownCooldownSeconds | int | Seconds to wait between scale-down events. Example: 300 |
| metrics | HPAMetricTriggerArgs[] | Additional metric triggers (e.g. Prometheus). CPU/Memory/Network are auto-generated by the engine from primaryMetric; do not redeclare them here |
| compositeFormula | string | Expression combining multiple metric ratios into one scaling signal. Example: "0.6*cpu + 0.4*memory" |
| behavior | HPABehaviorArgs | Fine-grained scale-up and scale-down behavior policies |
| fallback | HPAFallbackArgs | Replica fallback when metrics become unavailable |
Python uses snake_case (e.g. target_memory_utilization, scale_down_cooldown_seconds, composite_formula). Go uses PascalCase equivalents.
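As a rough illustration of maxReplicaChangePercent together with minReplicas and maxReplicas, the sketch below limits a single scale event. The function name is hypothetical, and rounding the allowed step up (with a floor of one replica) is an assumption; the engine may round differently.

```python
import math


def clamp_replicas(current, desired, min_replicas, max_replicas, max_change_percent):
    """Limit one scale event to a fraction of the current replica count.

    Assumption: the allowed step rounds up and is never smaller than 1.
    """
    max_step = max(1, math.ceil(current * max_change_percent))
    # Restrict the requested change to [-max_step, +max_step].
    step = max(-max_step, min(max_step, desired - current))
    # Apply the absolute bounds last.
    return max(min_replicas, min(max_replicas, current + step))


print(clamp_replicas(4, 8, 1, 8, 0.25))  # scale-up limited to 25% of 4 replicas
print(clamp_replicas(8, 2, 1, 8, 0.25))  # scale-down limited to 25% of 8 replicas
```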
HPAMetricTriggerArgs
Note: CPU, Memory, and Network triggers are auto-generated by the engine from primaryMetric + targetUtilization. Only add entries here for external metrics (e.g. Prometheus). Redeclaring built-in metrics here will result in duplicate triggers.
| Field | Type | Description |
|---|---|---|
| type | string | Metric source type. Built-in: CPU, Memory, NetworkIngress, NetworkEgress. External: prometheus |
| targetUtilization | string | Target utilization as a decimal string (resource metrics). Example: "0.70" |
| targetValue | string | Absolute target value as a string (external/object metrics). Example: "50000000" |
| weight | string | Weight for composite formula scaling (decimal string). Example: "0.5" |
| metadata | map[string]string | Free-form key-value pairs passed to the external scaler |
| serverAddress | string | Prometheus server URL; packed into metadata by the service layer. Example: "http://prometheus:9090" |
| query | string | PromQL query string; packed into metadata by the service layer. Example: "sum(rate(http_requests_total[2m]))" |
Python: target_utilization, target_value, server_address. Go uses PascalCase equivalents.
Prometheus-driven HPA example
// TypeScript
const rule = new resources.WorkloadRule("my-app-rule", {
  clusterId: "cluster-abc123",
  namespace: "production",
  kind: "Deployment",
  name: "my-api",
  hpaRule: {
    enabled: true,
    minReplicas: 1,
    maxReplicas: 8,
    primaryMetric: "memory", // primary scaling signal
    targetUtilization: 0.8, // CPU target: 80%
    targetMemoryUtilization: 0.65, // Memory target: 65%
    maxReplicaChangePercent: 0.25, // at most 25% of replicas changed per event
    scaleDownCooldownSeconds: 300, // 5 min between scale-downs
    // Only add external metrics here; CPU/Memory are auto-generated above
    metrics: [
      {
        type: "prometheus",
        targetValue: "100",
        serverAddress: "http://prometheus.monitoring:9090",
        query: "sum(rate(http_requests_total{app='my-api'}[2m]))",
      },
    ],
    compositeFormula: "0.6*cpu + 0.4*prometheus",
    fallback: {
      replicas: 1,
      behavior: "currentReplicas", // keep whatever is running when metrics fail
      failureThreshold: 3,
    },
    behavior: {
      scaleDown: {
        stabilizationWindowSeconds: 300, // wait 5 min before scaling down
        selectPolicy: "Min",
        policies: [
          { type: "Pods", value: 1, periodSeconds: 60 }, // remove at most 1 pod/min
        ],
      },
      scaleUp: {
        stabilizationWindowSeconds: 0, // scale up immediately
        selectPolicy: "Max",
        policies: [
          { type: "Percent", value: 100, periodSeconds: 60 }, // double replicas/min
          { type: "Pods", value: 4, periodSeconds: 60 }, // or add 4 pods/min
        ],
      },
    },
  },
});
# Python
from pulumi_devzero.resources import (
    WorkloadRule, WorkloadRuleArgs,
    HPARuleConfigArgsArgs,
    HPAMetricTriggerArgsArgs,
    HPAFallbackArgsArgs,
    HPABehaviorArgsArgs,
    HPAScalingRulesArgsArgs,
    HPAScalingPolicyArgsArgs,
)

rule = WorkloadRule("my-app-rule", args=WorkloadRuleArgs(
    cluster_id="cluster-abc123",
    namespace="production",
    kind="Deployment",
    name="my-api",
    hpa_rule=HPARuleConfigArgsArgs(
        enabled=True,
        min_replicas=1,
        max_replicas=8,
        primary_metric="memory",  # primary scaling signal
        target_utilization=0.8,  # CPU target: 80%
        target_memory_utilization=0.65,  # Memory target: 65%
        max_replica_change_percent=0.25,  # at most 25% of replicas changed per event
        scale_down_cooldown_seconds=300,  # 5 min between scale-downs
        # Only add external metrics here; CPU/Memory are auto-generated above
        metrics=[HPAMetricTriggerArgsArgs(
            type="prometheus",
            target_value="100",
            server_address="http://prometheus.monitoring:9090",
            query="sum(rate(http_requests_total{app='my-api'}[2m]))",
        )],
        composite_formula="0.6*cpu + 0.4*prometheus",
        fallback=HPAFallbackArgsArgs(
            replicas=1,
            behavior="currentReplicas",  # keep whatever is running when metrics fail
            failure_threshold=3,
        ),
        behavior=HPABehaviorArgsArgs(
            scale_down=HPAScalingRulesArgsArgs(
                stabilization_window_seconds=300,  # wait 5 min before scaling down
                select_policy="Min",
                policies=[HPAScalingPolicyArgsArgs(type="Pods", value=1, period_seconds=60)],  # remove at most 1 pod/min
            ),
            scale_up=HPAScalingRulesArgsArgs(
                stabilization_window_seconds=0,  # scale up immediately
                select_policy="Max",
                policies=[
                    HPAScalingPolicyArgsArgs(type="Percent", value=100, period_seconds=60),  # double replicas/min
                    HPAScalingPolicyArgsArgs(type="Pods", value=4, period_seconds=60),  # or add 4 pods/min
                ],
            ),
        ),
    ),
))
// Go
rule, err := resources.NewWorkloadRule(ctx, "my-app-rule", &resources.WorkloadRuleArgs{
	ClusterId: pulumi.String("cluster-abc123"),
	Namespace: pulumi.String("production"),
	Kind: pulumi.String("Deployment"),
	Name: pulumi.String("my-api"),
	HpaRule: resources.HPARuleConfigArgsArgs{
		Enabled: pulumi.BoolPtr(true),
		MinReplicas: pulumi.IntPtr(1),
		MaxReplicas: pulumi.IntPtr(8),
		PrimaryMetric: pulumi.StringPtr("memory"), // primary scaling signal
		TargetUtilization: pulumi.Float64Ptr(0.8), // CPU target: 80%
		TargetMemoryUtilization: pulumi.Float64Ptr(0.65), // Memory target: 65%
		MaxReplicaChangePercent: pulumi.Float64Ptr(0.25), // at most 25% of replicas changed per event
		ScaleDownCooldownSeconds: pulumi.IntPtr(300), // 5 min between scale-downs
		CompositeFormula: pulumi.StringPtr("0.6*cpu + 0.4*prometheus"),
		// Only add external metrics here; CPU/Memory are auto-generated above
		Metrics: resources.HPAMetricTriggerArgsArray{
			resources.HPAMetricTriggerArgsArgs{
				Type: pulumi.String("prometheus"),
				TargetValue: pulumi.StringPtr("100"),
				ServerAddress: pulumi.StringPtr("http://prometheus.monitoring:9090"),
				Query: pulumi.StringPtr("sum(rate(http_requests_total{app='my-api'}[2m]))"),
			},
		},
		Fallback: resources.HPAFallbackArgsArgs{
			Replicas: pulumi.Int(1),
			Behavior: pulumi.String("currentReplicas"), // keep whatever is running when metrics fail
			FailureThreshold: pulumi.Int(3),
		}.ToHPAFallbackArgsPtrOutput(),
		Behavior: resources.HPABehaviorArgsArgs{
			ScaleDown: resources.HPAScalingRulesArgsArgs{
				StabilizationWindowSeconds: pulumi.Int(300), // wait 5 min before scaling down
				SelectPolicy: pulumi.String("Min"),
				Policies: resources.HPAScalingPolicyArgsArray{
					// remove at most 1 pod per minute
					resources.HPAScalingPolicyArgsArgs{Type: pulumi.String("Pods"), Value: pulumi.Int(1), PeriodSeconds: pulumi.Int(60)},
				},
			}.ToHPAScalingRulesArgsPtrOutput(),
			ScaleUp: resources.HPAScalingRulesArgsArgs{
				StabilizationWindowSeconds: pulumi.Int(0), // scale up immediately
				SelectPolicy: pulumi.String("Max"),
				Policies: resources.HPAScalingPolicyArgsArray{
					// double replicas per minute
					resources.HPAScalingPolicyArgsArgs{Type: pulumi.String("Percent"), Value: pulumi.Int(100), PeriodSeconds: pulumi.Int(60)},
					// or add up to 4 pods per minute
					resources.HPAScalingPolicyArgsArgs{Type: pulumi.String("Pods"), Value: pulumi.Int(4), PeriodSeconds: pulumi.Int(60)},
				},
			}.ToHPAScalingRulesArgsPtrOutput(),
		}.ToHPABehaviorArgsPtrOutput(),
	}.ToHPARuleConfigArgsPtrOutput(),
})
HPAFallbackArgs
| Field | Type | Description |
|---|---|---|
| replicas | int | Replica count to use when metrics are unavailable. Example: 1 |
| behavior | string | How to apply fallback replicas. One of: static (always use replicas), currentReplicas (keep whatever is running), currentReplicasIfHigher (use current only if higher), currentReplicasIfLower (use current only if lower). Example: "currentReplicas" |
| failureThreshold | int | Consecutive metric failures before fallback activates. Example: 3 |
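The four behavior values can be read as a small decision function. The sketch below is one interpretation of the descriptions in the table; fallback_replicas is a hypothetical name and the code is not taken from the engine.

```python
def fallback_replicas(behavior: str, replicas: int, current: int) -> int:
    """Replica count to hold once failureThreshold is reached (illustrative reading of the table)."""
    if behavior == "static":
        return replicas  # always use the configured fallback count
    if behavior == "currentReplicas":
        return current  # keep whatever is running
    if behavior == "currentReplicasIfHigher":
        return max(current, replicas)  # use current only if it is higher
    if behavior == "currentReplicasIfLower":
        return min(current, replicas)  # use current only if it is lower
    raise ValueError(f"unknown fallback behavior: {behavior!r}")


for mode in ["static", "currentReplicas", "currentReplicasIfHigher", "currentReplicasIfLower"]:
    print(mode, fallback_replicas(mode, replicas=3, current=5))
```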
HPABehaviorArgs
| Field | Type | Description |
|---|---|---|
| scaleUp | HPAScalingRulesArgs | Scale-up rate limiting and stabilization |
| scaleDown | HPAScalingRulesArgs | Scale-down rate limiting and stabilization |
HPAScalingRulesArgs
| Field | Type | Description |
|---|---|---|
| stabilizationWindowSeconds | int | Seconds to look back when selecting replica count to avoid flapping. Default: 0 for scale-up, 300 for scale-down |
| selectPolicy | string | Which policy wins when multiple match: Max, Min, or Disabled. Example: "Max" |
| policies | HPAScalingPolicyArgs[] | List of rate-limiting step policies |
HPAScalingPolicyArgs
| Field | Type | Description |
|---|---|---|
| type | string | Policy type: Pods (absolute count) or Percent (% of current replicas). Example: "Percent" |
| value | int | Maximum change allowed per period. Example: 100 |
| periodSeconds | int | Time window for this policy in seconds. Example: 60 |
Python: stabilization_window_seconds, select_policy, period_seconds. Go uses PascalCase equivalents.
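To show how selectPolicy chooses between Pods and Percent policies, here is a minimal sketch that computes the largest replica change allowed in one period. It is modelled on the Kubernetes HPA behavior semantics, which is an assumption about this provider; allowed_change is a hypothetical helper and periodSeconds is ignored for brevity.

```python
def allowed_change(current: int, policies: list, select_policy: str = "Max") -> int:
    """Largest replica change permitted in one period under the given policies.

    Assumption: Percent policies round up, as Kubernetes does for scale-up.
    """
    if select_policy == "Disabled" or not policies:
        return 0
    changes = []
    for policy in policies:
        if policy["type"] == "Pods":
            changes.append(policy["value"])  # absolute pod count
        elif policy["type"] == "Percent":
            # Integer ceiling of current * value / 100.
            changes.append((current * policy["value"] + 99) // 100)
        else:
            raise ValueError(f"unknown policy type: {policy['type']!r}")
    return max(changes) if select_policy == "Max" else min(changes)


scale_up = [{"type": "Percent", "value": 100}, {"type": "Pods", "value": 4}]
print(allowed_change(3, scale_up, "Max"))   # Pods policy wins for small deployments
print(allowed_change(10, scale_up, "Max"))  # Percent policy wins for larger ones
```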
EmergencyResponseConfigArgs
| Field | Type | Description |
|---|---|---|
| oomEnabled | bool | React to OOM kills by increasing memory requests |
| oomMemoryMultiplier | float | Multiplier applied to memory on OOM. Example: 1.5 |
| oomMaxReactions | int | Maximum OOM reactions before giving up |
| oomCooldownSeconds | int | Seconds to wait between OOM reactions |
| cpuThrottlingEnabled | bool | React to CPU throttling by increasing CPU requests |
| cpuThrottlingThreshold | float | Throttle ratio (0–1) that triggers a reaction. Example: 0.1 |
| cpuThrottlingMultiplier | float | Multiplier applied to CPU request on throttle reaction. Example: 1.25 |
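For a sense of scale, the sketch below lists the memory requests that successive OOM reactions would produce, bounded by oomMaxReactions. It assumes each reaction multiplies the previous request (compounding), which this document does not confirm; oom_reaction_plan is a hypothetical helper.

```python
MIB = 1024 * 1024


def oom_reaction_plan(memory_request: int, multiplier: float, max_reactions: int) -> list:
    """Memory requests (bytes) after each OOM reaction.

    Assumption: the multiplier compounds on the previous request.
    """
    plan = []
    current = memory_request
    for _ in range(max_reactions):
        current = int(current * multiplier)
        plan.append(current)
    return plan


# 512 MiB request, oomMemoryMultiplier 1.5, oomMaxReactions 3
for step, size in enumerate(oom_reaction_plan(512 * MIB, 1.5, 3), start=1):
    print(f"reaction {step}: {size // MIB} MiB")
```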
ContainerResourceRuleConfigArgs
| Field | Type | Description |
|---|---|---|
| containerName | string | Name of the container this config applies to |
| cpuRule | ResourceRuleConfigArgs | CPU rule for this container |
| memoryRule | ResourceRuleConfigArgs | Memory rule for this container |
| gpuRule | ResourceRuleConfigArgs | GPU rule for this container |
NodePolicyTarget — Key Fields
| Field | Type | Description |
|---|---|---|
| name | string | Unique target name |
| policyId | string | ID of the NodePolicy to apply |
| clusterIds | string[] | Cluster IDs to target. At most 1 entry; the backend rejects more than one. |
| description | string | Human-readable description (optional) |
| enabled | bool | Activate the target. Default: true |
Python: policy_id, cluster_ids. Go: PolicyId, ClusterIds.
Note: pulumi destroy removes this resource from Pulumi state but does not delete it on the DevZero backend. You must remove it manually via the dashboard or API if needed.
Destroying Resources
To tear down all resources managed by your stack:
pulumi destroy
To also remove the stack itself:
pulumi stack rm <stack-name>
Building from Source
# Build the provider binary
make build
# Run tests
make test
# Regenerate schema and all SDKs (requires Pulumi CLI)
make sdk
# Install binary to $GOPATH/bin
make install
See CONTRIBUTING.md for full development instructions.
Examples
Ready-to-run examples live in examples/:
| Language | Path |
|---|---|
| TypeScript | examples/typescript/ |
| Python | examples/python/ |
| Go | examples/go/ |
License
MIT — Copyright (c) 2026 DevZero Inc.