CLI for running GPU workloads, managing remote workspaces, and evaluating/optimizing kernels

Wafer CLI

Run GPU workloads, optimize kernels, and query GPU documentation.

Getting Started

# Install
cd apps/wafer-cli && uv sync

# Use staging (workspaces and other features require staging)
wafer config set api.environment staging

# Login
wafer login

# Run a command on a remote GPU
wafer remote-run -- nvidia-smi

Commands

`wafer login` / `wafer logout` / `wafer whoami`

Authenticate with GitHub OAuth.

wafer login          # Opens browser for GitHub OAuth
wafer whoami         # Show current user
wafer logout         # Remove credentials

`wafer remote-run`

Run any command on a remote GPU.

wafer remote-run -- nvidia-smi
wafer remote-run --upload-dir ./my_code -- python3 train.py

`wafer workspaces`

Create and manage persistent GPU environments. As the examples below show, the workspace subcommands are invoked as `wafer target workspace`.

Available GPUs:

MI300X - AMD Instinct MI300X (192GB HBM3, ROCm)
B200 - NVIDIA Blackwell B200 (180GB HBM3e, CUDA) - default

wafer target workspace list
wafer target workspace create my-workspace --gpu B200 --wait   # NVIDIA B200
wafer target workspace create amd-dev --gpu MI300X             # AMD MI300X
wafer target workspace ssh <workspace-id>
wafer target workspace delete <workspace-id>

`wafer agent`

AI assistant for GPU kernel development. Helps with CUDA/Triton optimization, documentation queries, and performance analysis.

wafer agent "What is TMEM in CuTeDSL?"
wafer agent -s "optimize this kernel" < kernel.py

`wafer tool eval`

Evaluate kernel correctness and performance against a reference implementation.

Functional format (default):

# Generate template files
wafer tool eval make-template ./my-kernel

# Run evaluation
wafer tool eval gpumode --impl kernel.py --reference ref.py --test-cases tests.json --benchmark

The implementation must define `custom_kernel(inputs)`; the reference must define `ref_kernel(inputs)` and `generate_input(**params)`.
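The contract above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not wafer's own template: the parameter names `size` and `seed` are hypothetical, the two sides would normally live in separate files (`kernel.py` and `ref.py`), and real kernels would most likely operate on GPU tensors rather than lists. The layout of `tests.json` is not described here, so it is left out.

```python
import operator
import random


def generate_input(size, seed):
    """Reference side: build one input from test-case parameters.

    `size` and `seed` are hypothetical parameter names chosen for this sketch.
    """
    rng = random.Random(seed)
    a = [rng.random() for _ in range(size)]
    b = [rng.random() for _ in range(size)]
    return a, b


def ref_kernel(inputs):
    """Reference side: straightforward element-wise add."""
    a, b = inputs
    return [x + y for x, y in zip(a, b)]


def custom_kernel(inputs):
    """Implementation side: same result, computed a different way."""
    a, b = inputs
    return list(map(operator.add, a, b))


# Local sanity check before handing the files to the evaluator.
inputs = generate_input(size=4, seed=0)
print(custom_kernel(inputs) == ref_kernel(inputs))
```

Keeping `generate_input` deterministic for a given seed makes a failing case easy to reproduce.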

KernelBench format (ModelNew class):

# Extract a KernelBench problem as template
wafer tool eval kernelbench make-template level1/1

# Run evaluation
wafer tool eval kernelbench --impl my_kernel.py --reference problem.py --benchmark

The implementation must define `class ModelNew(nn.Module)`; the reference must define `class Model`, `get_inputs()`, and `get_init_inputs()`.

`wafer agent -t ask-docs`

Query GPU documentation using the docs template. It uses the `ask_docs` tool to search Wafer's documentation corpus via the API.

wafer agent -t ask-docs -s "What causes bank conflicts in shared memory?"

Customization

`wafer tool eval` options

# --target:    specific GPU target
# --benchmark: measure performance
# --profile:   enable torch.profiler + NCU
wafer tool eval gpumode --impl k.py --reference r.py --test-cases t.json \
    --target vultr-b200 \
    --benchmark \
    --profile
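If you drive evaluations from a script, the same options can be assembled programmatically. The sketch below only builds the argument list; `build_eval_command` is a hypothetical helper, not part of wafer, and it uses only the flags shown above.

```python
import shlex


def build_eval_command(impl, reference, test_cases, target=None,
                       benchmark=False, profile=False):
    """Assemble argv for `wafer tool eval gpumode` (hypothetical helper)."""
    argv = ["wafer", "tool", "eval", "gpumode",
            "--impl", impl,
            "--reference", reference,
            "--test-cases", test_cases]
    if target is not None:
        argv += ["--target", target]
    if benchmark:
        argv.append("--benchmark")
    if profile:
        argv.append("--profile")
    return argv


argv = build_eval_command("k.py", "r.py", "t.json",
                          target="vultr-b200", benchmark=True)
# Print a copy-pasteable command line; to run it instead, pass argv to
# subprocess.run(argv, check=True).
print(shlex.join(argv))
```

Passing a list rather than a single string avoids shell quoting issues when paths contain spaces.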

Profile analysis

wafer tool ncu analyze profile.ncu-rep
wafer tool nsys analyze profile.nsys-rep

Advanced

Local targets

Bypass the API and SSH directly to your own GPUs:

wafer target config list
wafer target config add ./my-gpu.toml
wafer target config default my-gpu

Defensive evaluation

Detect evaluation hacking (stream injection, lazy evaluation, etc.):

wafer tool eval gpumode --impl k.py --reference r.py --test-cases t.json --benchmark --defensive
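To make "lazy evaluation" concrete, here is a pure-Python analogue of the problem. It does not show how `--defensive` is implemented; `LazyResult`, `lazy_kernel`, and `time_call` are hypothetical names for this sketch. A kernel that returns a deferred result looks nearly free to a naive timer, because the work only happens when the value is read.

```python
import time


class LazyResult:
    """Defers the real work until the value is read (hypothetical cheat)."""

    def __init__(self, thunk):
        self._thunk = thunk
        self._value = None
        self._done = False

    def materialize(self):
        if not self._done:
            self._value = self._thunk()
            self._done = True
        return self._value


def slow_sum(n):
    return sum(i * i for i in range(n))


def lazy_kernel(n):
    # Returns immediately; no arithmetic has happened yet.
    return LazyResult(lambda: slow_sum(n))


def time_call(fn, *args, force=False):
    """Time fn(*args); with force=True, also time materializing the result."""
    start = time.perf_counter()
    out = fn(*args)
    if force and isinstance(out, LazyResult):
        out = out.materialize()
    return time.perf_counter() - start, out


naive_s, pending = time_call(lazy_kernel, 200_000)
forced_s, value = time_call(lazy_kernel, 200_000, force=True)
print(f"naive: {naive_s:.6f}s  forced: {forced_s:.6f}s")
```

The naive measurement stops the clock before any computation has run, which is why a defensive harness has to force the result inside the timed region.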

Other tools

wafer tool perfetto <trace.json> --query "SELECT * FROM slice"   # Perfetto SQL queries
wafer tool capture ./script.py                                    # Capture execution snapshot
wafer compiler-analyze kernel.ptx                                 # Analyze PTX/SASS
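`SELECT * FROM slice` returns every slice, which is rarely what you want on a large trace. The sketch below shows an aggregate query that could be passed to `--query` instead, and tries it against a toy in-memory SQLite table. The toy rows are made up, and the assumption is that the trace exposes Perfetto's standard `slice` table with `name` and `dur` columns (durations are typically in nanoseconds).

```python
import sqlite3

# Total time per slice name, longest first.
QUERY = """
SELECT name, COUNT(*) AS calls, SUM(dur) AS total_ns
FROM slice
GROUP BY name
ORDER BY total_ns DESC
LIMIT 10
"""

# Toy stand-in for a trace so the query can be tried locally.
# A real `slice` table has more columns than these three.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE slice (ts INTEGER, dur INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO slice VALUES (?, ?, ?)",
    [(0, 500, "gemm"), (600, 700, "gemm"), (1400, 300, "softmax")],
)

rows = conn.execute(QUERY).fetchall()
for name, calls, total_ns in rows:
    print(f"{name}: {calls} calls, {total_ns} ns")
```

When passing the query on the command line, collapse it onto one line and keep it inside double quotes as in the example above.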

ROCm profiling (AMD GPUs)

wafer tool rocprof-sdk ...
wafer tool rocprof-systems ...
wafer tool rocprof-compute ...

Shell Completion

Enable tab completion for commands, options, and target names:

# Install completion (zsh/bash/fish)
wafer --install-completion

# Then restart your terminal, or source your shell config:
source ~/.zshrc  # or ~/.bashrc

Now you can tab-complete:

Commands: wafer tool ev<TAB> → wafer tool eval
Options: wafer tool eval --<TAB>
Target names: wafer tool eval --target v<TAB> → wafer tool eval --target vultr-b200
File paths: wafer tool eval gpumode --impl ./<TAB>

AI Assistant Skills

Install the Wafer CLI skill to make wafer commands discoverable by your AI coding assistant:

# Install for all supported tools (Claude Code, Codex CLI, Cursor)
wafer skill install

# Install for a specific tool
wafer skill install -t cursor    # Cursor
wafer skill install -t claude    # Claude Code
wafer skill install -t codex     # Codex CLI

# Check installation status
wafer skill status

# Uninstall
wafer skill uninstall

Installing from GitHub (Cursor)

You can also install the skill directly from GitHub in Cursor:

1. Open Cursor Settings (Cmd+Shift+J / Ctrl+Shift+J)
2. Navigate to Rules → Add Rule → Remote Rule (Github)
3. Enter: https://github.com/wafer-ai/skills
4. Cursor will automatically discover skills in .cursor/skills/

The skill provides comprehensive guidance for GPU kernel development, including documentation lookup, trace analysis, kernel evaluation, and optimization workflows.

Requirements

Python 3.10+
GitHub account (for authentication)

Project details

This version

0.2.58

Feb 13, 2026

Download files

Source Distribution

wafer_cli-0.2.58.tar.gz (324.3 kB)

Uploaded Feb 13, 2026 Source

Built Distribution

wafer_cli-0.2.58-py3-none-any.whl (279.8 kB)

Uploaded Feb 13, 2026 Python 3

File details

Details for the file wafer_cli-0.2.58.tar.gz.

File metadata

Download URL: wafer_cli-0.2.58.tar.gz
Upload date: Feb 13, 2026
Size: 324.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for wafer_cli-0.2.58.tar.gz
Algorithm	Hash digest
SHA256	`aaee77dbe488c95846c59847ee212165423844d9ce1bdbd8f02029dbb8027ddd`
MD5	`0adf859f80342a510dc8c3cbbe0d4a00`
BLAKE2b-256	`dd0c0da1fe8315925ddd7c7314290e8556c8b0a7e84b648a370c02b135a52126`

File details

Details for the file wafer_cli-0.2.58-py3-none-any.whl.

File metadata

Download URL: wafer_cli-0.2.58-py3-none-any.whl
Upload date: Feb 13, 2026
Size: 279.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for wafer_cli-0.2.58-py3-none-any.whl
Algorithm	Hash digest
SHA256	`73fe612100a74678f7dc7fd88d309432d4e6ab24d0b5d42d9e1facebaa7a8778`
MD5	`e2ceebf04191974941f7aa12cf014d55`
BLAKE2b-256	`f25545da189fd58e9f225fef8eda490f5850b97c4143b4b41404186c9d34acd4`
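The digests above can be used to check a downloaded file before installing it. A minimal sketch using Python's standard `hashlib`; the helper names `sha256_of` and `verify` are hypothetical, and the expected value is the wheel's SHA256 copied from the table above.

```python
import hashlib

# SHA256 digest listed above for wafer_cli-0.2.58-py3-none-any.whl.
EXPECTED_WHEEL_SHA256 = (
    "73fe612100a74678f7dc7fd88d309432d4e6ab24d0b5d42d9e1facebaa7a8778"
)


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """True if the file's SHA256 matches the expected hex digest."""
    return sha256_of(path) == expected_hex.lower()


# Example (assumes the wheel is in the current directory):
# verify("wafer_cli-0.2.58-py3-none-any.whl", EXPECTED_WHEEL_SHA256)
```

The same approach works for the source distribution by swapping in its SHA256 from the first table.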
