Kubeflow Python SDK to manage ML workloads and to interact with Kubeflow APIs.

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

andreyvelich

These details have not been verified by PyPI

Project links

Documentation

Project description

Kubeflow SDK

Overview

The Kubeflow SDK is a set of unified Pythonic APIs that let you run any AI workload at any scale – without the need to learn Kubernetes. It provides simple and consistent APIs across the Kubeflow ecosystem, enabling users to focus on building AI applications rather than managing complex infrastructure.

Kubeflow SDK Benefits

Unified Experience: A single SDK to interact with multiple Kubeflow projects through consistent Python APIs
Simplified AI Workloads: Abstract away Kubernetes complexity and work across all Kubeflow projects using familiar Python APIs
Built for Scale: Scale any AI workload from a local laptop to a large production cluster with thousands of GPUs using the same APIs
Rapid Iteration: Reduced friction between development and production environments
Local Development: First-class support for local development without a Kubernetes cluster, requiring only a pip installation

Kubeflow SDK Introduction

A KubeCon + CloudNativeCon 2025 talk provides an overview of Kubeflow SDK, and additional demos dive deeper into its capabilities.

Get Started

Install Kubeflow SDK

pip install -U kubeflow

Run your first PyTorch distributed job

from kubeflow.trainer import TrainerClient, CustomTrainer, TrainJobTemplate

def get_torch_dist(learning_rate: str, num_epochs: str):
    import os
    import torch
    import torch.distributed as dist

    dist.init_process_group(backend="gloo")
    print("PyTorch Distributed Environment")
    print(f"WORLD_SIZE: {dist.get_world_size()}")
    print(f"RANK: {dist.get_rank()}")
    print(f"LOCAL_RANK: {os.environ['LOCAL_RANK']}")

    lr = float(learning_rate)
    epochs = int(num_epochs)
    loss = 1.0 - (lr * 2) - (epochs * 0.01)

    if dist.get_rank() == 0:
        print(f"loss={loss}")

# Create the TrainJob template
template = TrainJobTemplate(
    runtime="torch-distributed",
    trainer=CustomTrainer(
        func=get_torch_dist,
        func_args={"learning_rate": "0.01", "num_epochs": "5"},
        num_nodes=3,
        resources_per_node={"cpu": 2},
    ),
)

# Create the TrainJob
job_id = TrainerClient().train(**template)

# Wait for TrainJob to complete
TrainerClient().wait_for_job_status(job_id)

# Print TrainJob logs
print("\n".join(TrainerClient().get_job_logs(name=job_id)))
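
The scalar part of `get_torch_dist` can be checked on its own before submitting a TrainJob. The sketch below repeats only the loss arithmetic from the function above. `toy_loss` is a hypothetical helper name, not part of the SDK, and it takes string arguments because the template passes `func_args` as strings.

```python
import math


def toy_loss(learning_rate: str, num_epochs: str) -> float:
    """Repeat the loss arithmetic from get_torch_dist (hypothetical helper)."""
    lr = float(learning_rate)
    epochs = int(num_epochs)
    return 1.0 - (lr * 2) - (epochs * 0.01)


# Same arguments as func_args in the template above: 1.0 - 0.02 - 0.05.
loss = toy_loss("0.01", "5")
print(f"loss={loss:.2f}")  # prints loss=0.93
```

Running this needs neither PyTorch nor a cluster, so it is a cheap way to confirm the argument conversion before the job is created.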

Optimize hyperparameters for your training

from kubeflow.optimizer import OptimizerClient, Search, TrialConfig

# Create OptimizationJob with the same template
optimization_id = OptimizerClient().optimize(
    trial_template=template,
    trial_config=TrialConfig(num_trials=10, parallel_trials=2),
    search_space={
        "learning_rate": Search.loguniform(0.001, 0.1),
        "num_epochs": Search.choice([5, 10, 15]),
    },
)

print(f"OptimizationJob created: {optimization_id}")
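
To make the search space concrete: a log-uniform distribution is uniform in log space, so values between 0.001 and 0.01 are as likely as values between 0.01 and 0.1. The sketch below illustrates that idea in plain Python. It is not how the SDK or Katib draws trials, and `sample_loguniform` and `sample_trial` are hypothetical names.

```python
import math
import random


def sample_loguniform(low: float, high: float, rng: random.Random) -> float:
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_trial(rng: random.Random) -> dict:
    """Draw one set of hyperparameters from the search space shown above."""
    return {
        "learning_rate": sample_loguniform(0.001, 0.1, rng),
        "num_epochs": rng.choice([5, 10, 15]),
    }


rng = random.Random(0)  # fixed seed so the illustration is repeatable
trials = [sample_trial(rng) for _ in range(10)]  # mirrors num_trials=10
for trial in trials:
    print(trial)
```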

Run data processing with Spark Connect

Install Kubeflow Spark support:

pip install 'kubeflow[spark]'

To install the Spark Operator, see the installation guide.

from kubeflow.spark import KubernetesBackendConfig, SparkClient

client = SparkClient(KubernetesBackendConfig(namespace="spark-test"))
spark = client.connect()

df = spark.range(5)
df.show()

You should see the DataFrame:

+---+
| id|
+---+
|  0|
|  1|
|  2|
|  3|
|  4|
+---+

You can also configure the number of executors and the resources for each executor:

spark = client.connect(
    num_executors=5,
    resources_per_executor={"cpu": "5", "memory": "1Gi"},
)

df = spark.range(5)
df.show()
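
When sizing a session it can help to total what the executors request. This is a small sketch, assuming the values follow Kubernetes quantity notation and that requests simply multiply across executors (any driver resources are ignored). `total_request` is a hypothetical helper, and it only handles whole CPU counts and the `Mi` and `Gi` memory suffixes.

```python
# Binary suffixes used by Kubernetes memory quantities (subset only).
_MEMORY_UNITS = {"Mi": 2**20, "Gi": 2**30}


def total_request(num_executors: int, resources_per_executor: dict) -> dict:
    """Sum CPU and memory requests across all executors (hypothetical helper)."""
    cpu = int(resources_per_executor["cpu"]) * num_executors
    memory = resources_per_executor["memory"]
    amount, suffix = int(memory[:-2]), memory[-2:]
    memory_bytes = amount * _MEMORY_UNITS[suffix] * num_executors
    return {"cpu": cpu, "memory_bytes": memory_bytes}


# Same values as the client.connect() call above.
total = total_request(5, {"cpu": "5", "memory": "1Gi"})
print(total["cpu"], "CPUs,", total["memory_bytes"] // 2**30, "GiB")  # 25 CPUs, 5 GiB
```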

Manage models with Model Registry

Install Model Registry support:

pip install 'kubeflow[hub]'

To install the Model Registry server, see the installation guide.

from kubeflow.hub import ModelRegistryClient

client = ModelRegistryClient("https://model-registry.kubeflow.svc.cluster.local", author="Your Name")

# Register a model
model = client.register_model(
    name="my-model",
    uri="s3://bucket/path/to/model",
    version="v1.0.0",
    model_format_name="pytorch",
    model_format_version="2.0",
    version_description="My trained model"
)

# Get a registered model
model = client.get_model("my-model")

# List all models
for model in client.list_models():
    print(f"Model: {model.name}")

# List model versions
for version in client.list_model_versions("my-model"):
    print(f"Version: {version.name}")

You can also initialize the client using different port configurations:

ModelRegistryClient("https://example.org", port=456)  # Explicit port argument
ModelRegistryClient("https://example.org:456")        # Port parsed from base_url
ModelRegistryClient("https://example.org")            # Default port (443 for https, 8080 for http)
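
The three cases above can be sketched with the standard library. This is an illustration of the described behavior, not the client's actual code: `resolve_port` is a hypothetical helper, and giving an explicit `port` argument precedence over a port in the URL is an assumption.

```python
from typing import Optional
from urllib.parse import urlsplit

# Defaults taken from the comment above: 443 for https, 8080 for http.
_DEFAULT_PORTS = {"https": 443, "http": 8080}


def resolve_port(base_url: str, port: Optional[int] = None) -> int:
    """Pick a port: explicit argument, then URL, then scheme default."""
    if port is not None:
        return port  # assumed to win over a port embedded in the URL
    parts = urlsplit(base_url)
    if parts.port is not None:
        return parts.port
    return _DEFAULT_PORTS[parts.scheme]


print(resolve_port("https://example.org", port=456))  # 456
print(resolve_port("https://example.org:456"))        # 456
print(resolve_port("https://example.org"))            # 443
```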

Local Development

The Kubeflow Trainer client supports local development without a Kubernetes cluster.

Available Backends

KubernetesBackend (default) - Production training on Kubernetes
ContainerBackend - Local development with Docker/Podman isolation
LocalProcessBackend - Quick prototyping with Python subprocesses

Quick Start: Install container support with pip install 'kubeflow[docker]' or pip install 'kubeflow[podman]'

from kubeflow.trainer import TrainerClient, ContainerBackendConfig, CustomTrainer

# Placeholder training function; replace the body with your own training code
def train_fn():
    print("Training locally")

# Switch to local container execution
client = TrainerClient(backend_config=ContainerBackendConfig())

# Your training runs locally in isolated containers
job_id = client.train(trainer=CustomTrainer(func=train_fn))

Supported Kubeflow Projects

Project	Status	Version Support	Description
Kubeflow Trainer	✅ Available	v2.0.0+	Train and fine-tune AI models with various frameworks
Kubeflow Katib	✅ Available	v0.19.0+	Hyperparameter optimization
Kubeflow Model Registry	✅ Available	v0.3.0+	Manage model artifacts, versions and ML artifacts metadata
Kubeflow Spark Operator	✅ Available	v2.5.0+	Manage Spark applications for data processing and feature engineering
Kubeflow Pipelines	🚧 Planned	TBD	Build, run, and track AI workflows
Feast	🚧 Planned	TBD	Feature store for machine learning
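
The minimum versions in the table can be compared against an installed component version. A minimal sketch, assuming plain `vMAJOR.MINOR.PATCH` strings with no pre-release suffix; `parse_version` and `is_supported` are hypothetical helpers and the SDK may perform no such check itself.

```python
# Minimum versions copied from the table above.
MINIMUM_VERSIONS = {
    "Kubeflow Trainer": "v2.0.0+",
    "Kubeflow Katib": "v0.19.0+",
    "Kubeflow Model Registry": "v0.3.0+",
    "Kubeflow Spark Operator": "v2.5.0+",
}


def parse_version(text: str) -> tuple:
    """Turn 'v2.0.0' or 'v2.0.0+' into (2, 0, 0); pre-release tags are not handled."""
    cleaned = text.strip().lstrip("v").rstrip("+")
    return tuple(int(part) for part in cleaned.split("."))


def is_supported(project: str, installed: str) -> bool:
    """Compare numerically, so that 0.9.0 sorts below 0.19.0."""
    return parse_version(installed) >= parse_version(MINIMUM_VERSIONS[project])


print(is_supported("Kubeflow Katib", "v0.19.0"))  # True
print(is_supported("Kubeflow Katib", "v0.9.0"))   # False
```

Comparing integer tuples rather than strings matters here: as text, "0.9.0" sorts after "0.19.0".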

Community

Getting Involved

Slack: Join our #kubeflow-ml-experience Slack channel
Meetings: Attend the Kubeflow SDK and ML Experience bi-weekly meetings
GitHub: Discussions, issues and contributions at kubeflow/sdk

Contributing

Kubeflow SDK is a community project and is still under active development. We welcome contributions! Please see our CONTRIBUTING Guide for details.

Documentation

Documentation: Kubeflow SDK Official Documentation
Blog Post Announcement: Introducing the Kubeflow SDK: A Pythonic API to Run AI Workloads at Scale
Design Document: Kubeflow SDK design proposal
Component Guides: Individual component documentation
DeepWiki: AI-powered repository documentation

✨ Contributors

We couldn't have done it without these incredible people:

Release history

Version	Released
0.4.1 (this version)	Jun 19, 2026
0.4.0	Mar 20, 2026
0.4.0rc0 (pre-release)	Mar 18, 2026
0.3.1	Apr 10, 2026
0.3.0	Jan 19, 2026
0.2.1	Nov 25, 2025
0.2.0	Nov 6, 2025
0.1.0	Sep 24, 2025
0.1.0rc1 (pre-release)	Sep 17, 2025
0.0.1rc0 (pre-release)	Nov 9, 2023

Download files

Source Distribution

kubeflow-0.4.1.tar.gz (6.3 MB)

Uploaded Jun 19, 2026 Source

Built Distribution

kubeflow-0.4.1-py3-none-any.whl (186.7 kB)

Uploaded Jun 19, 2026 Python 3

File details

Details for the file kubeflow-0.4.1.tar.gz.

File metadata

Download URL: kubeflow-0.4.1.tar.gz
Upload date: Jun 19, 2026
Size: 6.3 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for kubeflow-0.4.1.tar.gz
Algorithm	Hash digest
SHA256	`5c42f622e42589e4ed9622236165448840619e205a95939f8d4237afcaac90bf`
MD5	`d8550e2e180a8434534557feca4590c6`
BLAKE2b-256	`44d9c77e820b34e2e47fcffcacacaf0dca38032a14a7872cc27e6933fe9f57e7`
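
A downloaded archive can be checked against the SHA256 digest listed above with the standard library. This is a sketch: the file is assumed to sit in the current directory under its published name, and `sha256_of` and `matches_expected` are hypothetical helper names.

```python
import hashlib
import hmac
from pathlib import Path

# SHA256 digest for kubeflow-0.4.1.tar.gz, copied from the table above.
EXPECTED_SHA256 = "5c42f622e42589e4ed9622236165448840619e205a95939f8d4237afcaac90bf"


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large archives are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected: str) -> bool:
    """Compare the computed digest with the expected hex string, ignoring case."""
    return hmac.compare_digest(sha256_of(path), expected.lower())


archive = Path("kubeflow-0.4.1.tar.gz")  # assumed download location
if archive.exists():
    print("OK" if matches_expected(archive, EXPECTED_SHA256) else "MISMATCH")
else:
    print(f"{archive} not found; download it first")
```

pip can enforce the same check at install time when a requirements file pins hashes and `--require-hashes` is used.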

Provenance

The following attestation bundles were made for kubeflow-0.4.1.tar.gz:

Publisher: release.yaml on kubeflow/sdk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: kubeflow-0.4.1.tar.gz
- Subject digest: 5c42f622e42589e4ed9622236165448840619e205a95939f8d4237afcaac90bf
- Sigstore transparency entry: 1871018544
- Sigstore integration time: Jun 19, 2026
Source repository:
- Permalink: kubeflow/sdk@bbb678b857ec0cefe5f20b1c531af47c6891e4eb
- Branch / Tag: refs/heads/main
- Owner: https://github.com/kubeflow
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yaml@bbb678b857ec0cefe5f20b1c531af47c6891e4eb
- Trigger Event: push

File details

Details for the file kubeflow-0.4.1-py3-none-any.whl.

File metadata

Download URL: kubeflow-0.4.1-py3-none-any.whl
Upload date: Jun 19, 2026
Size: 186.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for kubeflow-0.4.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9f176bba37b56b69b21696c809610dacffb6693926983f8462e8efc68216e22d`
MD5	`3695e91a28219a143ddb90c2e34d4dc6`
BLAKE2b-256	`035833f7ef3686d04e547f67d4ca8e7b7be9f52514f201ab8799555b47fb67d4`

Provenance

The following attestation bundles were made for kubeflow-0.4.1-py3-none-any.whl:

Publisher: release.yaml on kubeflow/sdk

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: kubeflow-0.4.1-py3-none-any.whl
- Subject digest: 9f176bba37b56b69b21696c809610dacffb6693926983f8462e8efc68216e22d
- Sigstore transparency entry: 1871018580
- Sigstore integration time: Jun 19, 2026
Source repository:
- Permalink: kubeflow/sdk@bbb678b857ec0cefe5f20b1c531af47c6891e4eb
- Branch / Tag: refs/heads/main
- Owner: https://github.com/kubeflow
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yaml@bbb678b857ec0cefe5f20b1c531af47c6891e4eb
- Trigger Event: push
