Kubetorch: A Kubernetes-native framework for distributed PyTorch workloads

📦Kubetorch🔥

A Python interface for running ML workloads on Kubernetes

Kubetorch enables you to run any Python code on Kubernetes at any scale by specifying required resources, distribution, and scaling directly in code. It provides caching and hot redeployment for 1-2 second iteration cycles, handles hardware faults and preemptions programmatically, and orchestrates complex, heterogeneous workloads with built-in observability and fault tolerance.

Hello World

import kubetorch as kt

def hello_world():
    return "Hello from Kubetorch!"

if __name__ == "__main__":
    # Define your compute
    compute = kt.Compute(cpus=".1")

    # Send local function to freshly launched remote compute
    remote_hello = kt.fn(hello_world).to(compute)

    # Call the remote handle (not the local function) so this runs on your Kubernetes cluster
    result = remote_hello()
    print(result)  # "Hello from Kubetorch!"

What Kubetorch Enables

  • 100x faster iteration from 10+ minutes to 1-3 seconds for complex ML applications like RL and distributed training
  • 50%+ compute cost savings through intelligent resource allocation, bin-packing, and dynamic scaling
  • 95% fewer production faults through built-in fault handling, with programmatic error recovery and resource adjustment
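
To make "programmatic error recovery and resource adjustment" concrete, here is a minimal sketch of the general pattern in plain Python. It does not use Kubetorch's API: the `OutOfMemory` exception, the `run_with_resource_bumps` helper, and the `fake_launch` workload are all hypothetical names invented for illustration, and the exception types Kubetorch actually raises are not shown in this description.

```python
from typing import Callable


class OutOfMemory(RuntimeError):
    """Hypothetical stand-in for a remote out-of-memory failure."""


def run_with_resource_bumps(
    launch: Callable[[int], str],
    memory_gb: int = 4,
    max_memory_gb: int = 32,
) -> str:
    """Call launch(memory_gb); on OutOfMemory, double the request and retry."""
    while True:
        try:
            return launch(memory_gb)
        except OutOfMemory:
            if memory_gb * 2 > max_memory_gb:
                raise  # give up once doubling would exceed the cap
            memory_gb *= 2


def fake_launch(memory_gb: int) -> str:
    # Stand-in workload (assumption): it needs at least 16 GB to succeed.
    if memory_gb < 16:
        raise OutOfMemory(f"{memory_gb} GB was not enough")
    return f"ok with {memory_gb} GB"


if __name__ == "__main__":
    print(run_with_resource_bumps(fake_launch))
```

In a real workload, `launch` would presumably re-create the compute with a larger request and call the remote function again; the point of the sketch is only that the recovery logic lives in ordinary Python control flow.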

Installation

1. Python Client

pip install "kubetorch[client]"
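
After installing, one way to confirm that the client is visible to your interpreter is to query the installed distribution metadata. This is a small sketch using only the standard library; the `installed_version` helper is a name introduced here for illustration, not part of Kubetorch.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("kubetorch")
    if version is None:
        print("kubetorch is not installed in this environment")
    else:
        print(f"kubetorch {version} is installed")
```

If the reported version differs from the Helm chart version you deploy, it may be worth checking the Installation Guide for compatibility notes.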

2. Kubernetes Deployment (Helm)

# Option 1: Install directly from OCI registry
helm upgrade --install kubetorch oci://ghcr.io/run-house/charts/kubetorch \
  --version 0.2.0 -n kubetorch --create-namespace

# Option 2: Download chart locally first
helm pull oci://ghcr.io/run-house/charts/kubetorch --version 0.2.0 --untar
helm upgrade --install kubetorch ./kubetorch -n kubetorch --create-namespace

For detailed setup instructions, see our Installation Guide.

Learn More


Apache 2.0 License

🏃‍♀️ Built by Runhouse 🏠
