Pulumi EKS ML Infrastructure
An opinionated library for multi-tenant, multi-region Machine Learning platforms on AWS.
This repository provides a modular set of Pulumi components (pulumi_eks_ml) to spin up multi-tenant, multi-region ML infrastructure with minimal pain.
💡 Philosophy
This project treats infrastructure as a composable library. Instead of one giant deployment, you get modular building blocks (VPC, EKS, GPU Node Pools) that you can assemble into your own topology.
Whether it's a single cluster for testing or a global mesh for distributed workloads, you can define your architecture once in Python, then deploy identical copies across different environments thanks to Pulumi stacks.
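To make the "define once, deploy per stack" idea concrete, here is a minimal sketch in plain Python (no Pulumi import, so it runs anywhere). The `STACK_SETTINGS` table, the `DeploymentPlan` class, the `plan_deployment` helper, and the prod region/CIDR values are illustrative assumptions and not part of `pulumi_eks_ml`; only the `{project}-{stack}` naming mirrors the quickstart.

```python
from dataclasses import dataclass

# Hypothetical per-stack settings. In a real project these would typically
# live in Pulumi.<stack>.yaml and be read through pulumi.Config().
STACK_SETTINGS = {
    "dev": {"region": "us-west-2", "cidr_block": "10.0.0.0/16"},
    "prod": {"region": "us-east-1", "cidr_block": "10.1.0.0/16"},
}


@dataclass(frozen=True)
class DeploymentPlan:
    """What one stack would deploy: same topology, different parameters."""
    name: str
    region: str
    cidr_block: str


def plan_deployment(project: str, stack: str) -> DeploymentPlan:
    """Combine the shared topology definition with per-stack settings."""
    settings = STACK_SETTINGS[stack]
    return DeploymentPlan(
        name=f"{project}-{stack}",  # same naming scheme as the quickstart
        region=settings["region"],
        cidr_block=settings["cidr_block"],
    )


if __name__ == "__main__":
    for stack in STACK_SETTINGS:
        print(plan_deployment("starter", stack))
```

The topology code stays identical across environments; only the looked-up settings change from one stack to the next.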
Architectural examples with pulumi_eks_ml
| Project | Description | Architecture |
|---|---|---|
| Starter | Single VPC, single EKS cluster with recommended addons. | diagram |
| EKS Multi-Region | Full-mesh VPC peering across regions, each with an EKS cluster. | diagram |
| SkyPilot Multi-Tenant | Hub-and-Spoke multi-region network with multi-tenant SkyPilot API server, Cognito auth, Tailscale VPN, and isolated data planes. | diagram |
⚡ Quickstart
Use the starter project as the fastest path to a working EKS cluster.
```python
# __main__.py
import pulumi
from pulumi_eks_ml import eks, eks_addons, vpc

# Read the target region and the project-level configuration.
main_region = pulumi.Config("aws").require("region")
cfg = pulumi.Config()
deployment_name = f"{pulumi.get_project()}-{pulumi.get_stack()}"

# Node pools are declared in stack config and parsed into typed objects.
node_pools_config = cfg.require_object("node_pools")
node_pools = [eks.NodePoolConfig.from_dict(pool) for pool in node_pools_config]

# Network: a single VPC with internet egress.
vpc_resource = vpc.VPC(
    name=f"{deployment_name}-vpc",
    cidr_block="10.0.0.0/16",
    setup_internet_egress=True,
)

# Compute: an EKS cluster placed in the VPC's private subnets.
cluster = eks.EKSCluster(
    f"{deployment_name}-cls",
    vpc_id=vpc_resource.vpc_id,
    subnet_ids=vpc_resource.private_subnet_ids,
    node_pools=node_pools,
)

# Install the recommended add-ons on the cluster.
eks.cluster.EKSClusterAddonInstaller(
    f"{deployment_name}-addons",
    cluster=cluster,
    addon_types=eks_addons.recommended_addons(),
)

pulumi.export("vpc_id", vpc_resource.vpc_id)
pulumi.export("cluster_name", cluster.cluster_name)
```
```shell
uv sync --dev
cd projects/starter
pulumi stack init dev
pulumi config set aws:region us-west-2
uv run pulumi up
```
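The quickstart reads a `node_pools` list from stack config and turns each entry into a `NodePoolConfig` via `from_dict`. The real fields of `eks.NodePoolConfig` are not shown in this README, so the sketch below uses a stand-in dataclass with made-up fields (`name`, `instance_types`, `capacity_type`) purely to illustrate the dict-to-dataclass pattern. Check the library source for the actual schema.

```python
from dataclasses import dataclass, field, fields
from typing import Any


@dataclass
class ExampleNodePoolConfig:
    """Stand-in for eks.NodePoolConfig; every field here is hypothetical."""
    name: str
    instance_types: list[str] = field(default_factory=list)
    capacity_type: str = "on-demand"

    @classmethod
    def from_dict(cls, raw: dict[str, Any]) -> "ExampleNodePoolConfig":
        # Reject unknown keys early so typos in stack config fail loudly.
        known = {f.name for f in fields(cls)}
        unknown = set(raw) - known
        if unknown:
            raise ValueError(f"unknown node pool keys: {sorted(unknown)}")
        return cls(**raw)


# Assumed shape of what cfg.require_object("node_pools") might return.
node_pools_config = [
    {"name": "cpu", "instance_types": ["m5.xlarge"]},
    {"name": "gpu", "instance_types": ["g5.xlarge"], "capacity_type": "spot"},
]
node_pools = [ExampleNodePoolConfig.from_dict(pool) for pool in node_pools_config]
```

Validating keys in `from_dict` is a design choice of this sketch; the real class may behave differently.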
🚀 Key Features
- ML-Optimized Compute: Pre-configured EKS clusters with Karpenter for autoscaling (Spot/On-Demand) and NVIDIA GPU drivers ready to go.
- Global Networking: Easy Multi-Region connectivity with Hub-and-Spoke or Full Mesh VPC peering topologies.
- Opinionated Add-ons for ML: Built-in support for ALB Controller, EBS/EFS CSI drivers, FluentBit, Metrics Server, and more.
- Secure Networking with Tailscale: VPN access through Tailscale, in addition to public/private subnet isolation.
- SkyPilot Multi-Tenant Platform: Opinionated deployment of SkyPilot for multi-tenant, multi-region AI workloads.
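The two networking topologies named above differ mainly in how many peering connections they need. The following sketch, in plain Python, enumerates the region pairs for each. The helper names and the region list are illustrative assumptions; this is not how `pulumi_eks_ml` builds its peerings internally.

```python
from itertools import combinations


def full_mesh_peerings(regions: list[str]) -> list[tuple[str, str]]:
    """Every region peers with every other region: n * (n - 1) / 2 links."""
    return list(combinations(regions, 2))


def hub_and_spoke_peerings(hub: str, spokes: list[str]) -> list[tuple[str, str]]:
    """Each spoke peers only with the hub: one link per spoke."""
    return [(hub, spoke) for spoke in spokes if spoke != hub]


regions = ["us-west-2", "us-east-1", "eu-west-1", "ap-southeast-1"]
mesh = full_mesh_peerings(regions)
star = hub_and_spoke_peerings(regions[0], regions[1:])

print(len(mesh))  # 6 peering connections for 4 regions
print(len(star))  # 3 peering connections for 1 hub and 3 spokes
```

Full mesh gives direct paths between all regions at the cost of quadratic growth in connections, while hub-and-spoke grows linearly but routes spoke-to-spoke traffic through the hub, if it is allowed at all.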
📂 Repository Structure
- pulumi_eks_ml/: The core Python library containing reusable infrastructure components.
- projects/: Reference implementations and live infrastructure code.
  - starter/: A simple single-region EKS cluster.
  - multi-region/: A full-mesh global network connecting clusters across regions.
  - skypilot-multi-tenant/: A SkyPilot platform with isolated data planes for multiple teams.
🛠 Getting Started
Prerequisites
- Pulumi CLI
- Python 3.12+
- uv (recommended)
1. Install & Setup
```shell
# Clone the repo
git clone https://github.com/Roulbac/pulumi-eks-ml.git
cd pulumi-eks-ml

# Install dependencies
uv sync --dev
```
2. Deploy a Project
Navigate to one of the reference projects to see it in action.
```shell
cd projects/starter

# Initialize your stack (e.g., dev)
pulumi stack init dev

# Deploy
uv run pulumi up
```
For custom infrastructure, create a new folder in projects/, import pulumi_eks_ml, and define your topology (see projects/starter/__main__.py for a template).
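When a custom topology spans several VPCs that will be peered, their CIDR blocks must not overlap, since AWS VPC peering does not support overlapping ranges. A small planning helper can carve one supernet into per-region blocks before you pass each `cidr_block` to a VPC component. The helper name `allocate_vpc_cidrs`, the `10.0.0.0/8` supernet, and the region list are assumptions made for this sketch; only the standard-library `ipaddress` module is used.

```python
import ipaddress


def allocate_vpc_cidrs(
    supernet: str, regions: list[str], new_prefix: int = 16
) -> dict[str, str]:
    """Assign each region the next free block of size /new_prefix."""
    blocks = ipaddress.ip_network(supernet).subnets(new_prefix=new_prefix)
    allocation: dict[str, str] = {}
    for region in regions:
        try:
            allocation[region] = str(next(blocks))
        except StopIteration:
            raise ValueError(
                f"{supernet} cannot hold {len(regions)} /{new_prefix} blocks"
            ) from None
    return allocation


cidrs = allocate_vpc_cidrs("10.0.0.0/8", ["us-west-2", "us-east-1", "eu-west-1"])
print(cidrs)
# {'us-west-2': '10.0.0.0/16', 'us-east-1': '10.1.0.0/16', 'eu-west-1': '10.2.0.0/16'}
```

Each resulting value could then be supplied as the `cidr_block` argument of a per-region VPC, in the same way the starter project passes `"10.0.0.0/16"`.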
🧪 Testing
We include both unit and integration tests (using LocalStack).
```shell
# Run Unit Tests
uv run pytest -vv tests/unit

# Run Integration Tests
uv run pytest -vv tests/integration
```
📄 License