Krayne

CLI and SDK for creating, managing, and scaling Ray clusters on Kubernetes.

Krayne wraps the KubeRay operator behind a clean, opinionated interface so ML practitioners can get distributed compute without touching Kubernetes manifests.

A fast, keyboard-driven TUI (terminal user interface) is also available.

Navigate clusters, create with prefilled forms, scale, delete, and toggle tunnels — all with keyboard shortcuts. See the Interactive TUI guide for details.

Quickstart

pip install krayne

1. Connect Krayne to your Kubernetes cluster

krayne init picks a kubeconfig and context and saves them to ~/.krayne/config.yaml. Run it once after installing — every other command reads from that file.

krayne init

By default this prompts you to choose between ~/.kube/config, the local sandbox kubeconfig, and a custom path. To skip the prompts (e.g. in CI):

krayne init --kubeconfig ~/.kube/config --context my-context
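Because every other command reads ~/.krayne/config.yaml, a script or CI job may want to confirm that the file exists before calling Krayne. The following is a minimal sketch using only the standard library. The helper name krayne_is_initialized is hypothetical and not part of Krayne, and it deliberately does not parse the file, since the config schema is not described here.

```python
from pathlib import Path

# Location documented above as the file `krayne init` writes.
DEFAULT_CONFIG = Path.home() / ".krayne" / "config.yaml"


def krayne_is_initialized(config_path: Path = DEFAULT_CONFIG) -> bool:
    """Return True if the Krayne config file exists and is non-empty.

    Hypothetical helper, not part of Krayne. It only checks that the file
    is present; it does not validate the contents.
    """
    return config_path.is_file() and config_path.stat().st_size > 0
```

A CI step could call krayne_is_initialized() and, if it returns False, run the non-interactive krayne init command shown above.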

Don't have a Kubernetes cluster handy? Run krayne sandbox setup first to spin up a local k3s cluster with KubeRay pre-installed (Docker required).

2. Create a cluster

Pick whichever entrypoint suits you — they all produce the same Ray cluster. Pass -n/--namespace (or the namespace= field in ClusterConfig) to target a specific Kubernetes namespace; it defaults to default.

CLI:

krayne create my-cluster -n ml-team --gpus-per-worker 1 --workers 2

TUI — press c in the explorer to open the create form:

krayne tui

Python SDK:

from krayne.api import create_cluster
from krayne.config import ClusterConfig, WorkerGroupConfig

config = ClusterConfig(
    name="my-cluster",
    namespace="ml-team",
    worker_groups=[WorkerGroupConfig(replicas=2, gpus=1)],
)
create_cluster(config, wait=True)

3. Run a Ray job against it

Recommended: krayne submit. It opens a tunnel if one isn't already up, then wraps ray job submit so your script's driver runs inside the cluster — no Python version match required, no ray.init glue:

krayne submit demo.py --cluster my-cluster -n ml-team

Add --no-wait to skip log tailing, or pass -- arg1 arg2 … to forward arguments to the script. See docs/reference/cli.md for the full reference.
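If you drive submissions from Python rather than a shell, it can help to assemble the command as an argument list so that script arguments are not mangled by quoting. This is a sketch only: build_submit_command is a hypothetical helper, it uses just the flags mentioned above, and placing --no-wait before the -- separator is an assumption.

```python
import shlex
from typing import Sequence


def build_submit_command(
    script: str,
    cluster: str,
    namespace: str = "default",
    wait: bool = True,
    script_args: Sequence[str] = (),
) -> list[str]:
    """Assemble a `krayne submit` argv from the flags documented above.

    Hypothetical helper, not part of Krayne.
    """
    cmd = ["krayne", "submit", script, "--cluster", cluster, "-n", namespace]
    if not wait:
        # Skip log tailing.
        cmd.append("--no-wait")
    if script_args:
        # Everything after `--` is forwarded to the script.
        cmd += ["--", *script_args]
    return cmd


cmd = build_submit_command(
    "demo.py", "my-cluster", "ml-team", script_args=["arg1", "arg2"]
)
print(shlex.join(cmd))
# prints: krayne submit demo.py --cluster my-cluster -n ml-team -- arg1 arg2
```

Passing the resulting list to subprocess.run(cmd, check=True) runs the submission without going through a shell.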

Advanced: Ray Client (ray.init("ray://…")) — strict version match required

Warning: Ray Client requires the exact same Python major.minor.patch and Ray version on your laptop as in the cluster image. A single patch difference (e.g. 3.12.6 vs 3.12.9) is rejected at handshake. This is a known Ray pain point, not specific to krayne. Only use this path if you've pinned your local interpreter to match rayproject/ray:<ver>-pyXY.
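A quick pre-flight check can catch the most obvious mismatch before you attempt a connection. The sketch below parses the -pyXY suffix of an image tag and compares it with the local interpreter. Both function names are hypothetical, the example tag in the comment is illustrative, and the tag only encodes major.minor, so the patch version and the Ray version still have to be checked separately.

```python
import re
import sys


def image_python_version(image: str) -> tuple[int, int]:
    """Extract (major, minor) from a tag suffix such as `-py310`.

    Hypothetical helper. Assumes a single-digit major version followed
    by the minor version, as in `rayproject/ray:2.9.0-py310`.
    """
    match = re.search(r"-py(\d)(\d+)", image)
    if match is None:
        raise ValueError(f"no -pyXY suffix in image tag: {image!r}")
    return int(match.group(1)), int(match.group(2))


def local_matches_image(image: str) -> bool:
    """Compare the running interpreter's major.minor with the image tag."""
    return sys.version_info[:2] == image_python_version(image)
```

A True result here is necessary but not sufficient: it does not prove that the patch versions agree.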

open_tunnel opens port-forward tunnels to the cluster's services so ray.init can reach the head node from your laptop, and closes them on exit:

import ray
from krayne.api import open_tunnel

with open_tunnel("my-cluster", "ml-team") as session:
    ray.init(session.client_url)   # ray://localhost:...

    @ray.remote
    def hello(i: int) -> str:
        return f"Hello from worker {i}"

    print(ray.get([hello.remote(i) for i in range(4)]))
    ray.shutdown()
# tunnels closed when the block exits

When you're done, krayne delete my-cluster -n ml-team (or delete_cluster("my-cluster", "ml-team") from the SDK) tears the cluster down.
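For short-lived clusters, wrapping create and delete in a context manager makes it harder to forget the teardown. The sketch below shells out to the CLI commands shown above. ephemeral_cluster is a hypothetical helper, and two things are assumptions: that krayne create returns once the cluster is usable, and that krayne delete does not prompt for confirmation. The run parameter exists only so the subprocess call can be swapped out in tests.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def ephemeral_cluster(
    name: str,
    namespace: str = "default",
    run: Callable[..., object] = subprocess.run,
) -> Iterator[str]:
    """Create a cluster on entry and delete it on exit, even after an error.

    Hypothetical helper, not part of Krayne.
    """
    run(["krayne", "create", name, "-n", namespace], check=True)
    try:
        yield name
    finally:
        # Runs whether the body succeeded or raised.
        run(["krayne", "delete", name, "-n", namespace], check=True)
```

Usage would look like `with ephemeral_cluster("my-cluster", "ml-team"): ...`, with job submission inside the block.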

Interactive TUI

Krayne includes a k9s-style interactive terminal UI:

krayne tui

Or run it directly without installing: uvx krayne tui

Features

  • Zero-config defaults — every command works with no flags. Sensible defaults get you a working cluster instantly.
  • CLI and SDK — the CLI is a thin shell over the Python SDK. Anything you do from the terminal, you can do from code.
  • Interactive TUI — k9s-style terminal UI for keyboard-driven cluster management.
  • Functional API — stateless free functions, not class hierarchies. Easy to test, easy to compose.
  • Pydantic config — validated configuration with YAML override support. No silent failures.
  • Rich output — beautiful terminal tables via Rich, with --output json for scripting.

CLI Overview

krayne init               Select kubeconfig + context (run once after install)
krayne create <name>      Create a new Ray cluster
krayne get                List clusters in a namespace
krayne describe <name>    Show detailed cluster info
krayne scale <name>       Scale a worker group
krayne delete <name>      Delete a cluster
krayne tui                Launch interactive TUI

All commands support -n/--namespace, --output json, and --debug flags.
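The --output json flag makes the CLI usable from scripts. As a rough sketch, the snippet below runs krayne get and decodes its standard output. The function names are hypothetical, and the shape of the JSON (list, object, field names) is not documented here, so the code returns whatever json.loads produces without assuming a schema.

```python
import json
import subprocess
from typing import Any


def get_command(namespace: str = "default") -> list[str]:
    """Build the argv for listing clusters as JSON."""
    return ["krayne", "get", "-n", namespace, "--output", "json"]


def parse_json_output(stdout: str) -> Any:
    """Decode CLI output, raising a clear error if it is not valid JSON."""
    try:
        return json.loads(stdout)
    except json.JSONDecodeError as exc:
        raise RuntimeError(
            f"expected JSON from krayne, got: {stdout[:80]!r}"
        ) from exc


def list_clusters(namespace: str = "default") -> Any:
    """Run `krayne get` and return the decoded JSON (schema not assumed)."""
    result = subprocess.run(
        get_command(namespace), check=True, capture_output=True, text=True
    )
    return parse_json_output(result.stdout)
```

Because the CLI is described as a thin shell over the SDK, importing the SDK directly is likely the better choice inside Python programs; shelling out is mainly useful from other tooling.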

Documentation

Full documentation is available at the Krayne docs site.

Requirements

  • Python 3.10+
  • A Kubernetes cluster with the KubeRay operator installed
  • A valid kubeconfig (or running inside the cluster)

Development

# Clone and install
git clone https://github.com/roulbac/krayne.git
cd krayne
uv sync

# Run tests
uv run pytest

# Run integration tests (sandbox is provisioned automatically by test fixtures)
uv run pytest -m integration

Acknowledgements

Krayne is inspired by Spotify-Ray (sp-ray), Spotify's internal platform for running Ray on Kubernetes. The sp-ray team demonstrated that a CLI and SDK with sensible defaults, progressive disclosure of complexity, and managed KubeRay infrastructure can let ML practitioners focus on business logic instead of Kubernetes manifests. Krayne follows this philosophy as an open-source tool for the broader community.

License

Apache 2.0
