Config-driven CLI to launch, monitor, and ship VLA fine-tunes across ephemeral GPU boxes.
vlakit
vlakit is a config-driven command-line tool for launching VLA fine-tunes across ephemeral GPU boxes, keeping them alive through crashes, and shipping verified weights to durable storage. You describe a run in YAML and drive everything with a single vla command from your laptop; the actual work runs on a GPU box over ssh, and the durable artifacts land on Weights & Biases and a storage box — never on the GPU box, which you throw away.
It packages a set of battle-tested shell and Python scripts (the ones that encode the hard-won operational lessons) behind a friendly CLI. The scripts ship with the install as read-only package data, while your environment — the boxes, datasets, baselines, and runs — lives in a configs/ directory you own and edit.
Install
vlakit is best installed as an isolated CLI with pipx:
pipx install vlakit
Or with pip. The core install is dependency-light because the laptop-side commands need almost nothing; the heavier pieces are opt-in extras:
pip install vlakit # laptop-side: config / remote / launch
pip install "vlakit[stats]" # adds numpy + pyarrow for `vla stats`
pip install "vlakit[wandb]" # adds wandb for publish / pull / eval logging
pip install "vlakit[all]" # everything
Quickstart
vla init # scaffold an editable ./configs from templates
# edit configs/boxes.yaml, datasets.yaml, baselines.yaml, and a runs/<name>.yaml
# then copy configs/secrets.example.env -> configs/secrets.local.env and fill it
vla config <run> # resolve + print the run config (local, no box)
vla remote <box> deploy # rsync the toolkit + your configs onto the box
vla remote <box> push-secrets # install ~/.secrets.env on the box (mode 600)
vla remote <box> ensure-swap # provision swap (absorbs the checkpoint-save spike)
vla launch <run> # launch detached + auto-resume (box read from the run cfg)
vla remote <box> monitor # step / rate / ETA + liveness
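The "launch detached + auto-resume" step can be pictured as a small supervisor loop. The sketch below is only an illustration of that idea in Python, not vlakit's bundled script: the function name, the restart budget, and the backoff are hypothetical, and it assumes the training command resumes from its latest checkpoint when re-run.

```python
import subprocess
import time
from typing import Callable, Sequence


def _run_once(cmd: Sequence[str]) -> int:
    """Run the command to completion and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_with_autoresume(
    cmd: Sequence[str],
    max_restarts: int = 5,      # hypothetical restart budget
    backoff_s: float = 30.0,    # hypothetical pause between attempts
    run: Callable[[Sequence[str]], int] = _run_once,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Re-run `cmd` until it exits 0 or the restart budget is spent.

    Returns the number of restarts that were needed. Raises RuntimeError
    if the command still fails after `max_restarts` restarts.
    """
    restarts = 0
    while True:
        code = run(cmd)
        if code == 0:
            return restarts
        if restarts >= max_restarts:
            raise RuntimeError(
                f"gave up after {restarts} restarts (last exit code {code})"
            )
        restarts += 1
        sleep(backoff_s)
```

The `run` and `sleep` parameters exist so the loop can be exercised without a real training process; in normal use the defaults apply.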
Where each command runs
vlakit keeps a clean split between your laptop and the GPU box. Commands that resolve or inspect configuration — init, config, stats, split, eval, doctor — run entirely on your laptop and need no box. Commands that operate on a machine — everything under vla remote ..., and vla launch — open an ssh connection from your laptop and run the work on the box defined in your boxes.yaml.
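As a rough mental model of this split, the sketch below classifies a top-level command and builds an `ssh` argument list for the remote case. It is an illustration under assumptions, not vlakit's dispatch code: the function names and the `gpu-box` host are hypothetical, and how a `boxes.yaml` entry maps to an ssh host is not specified here.

```python
import shlex
from typing import List, Sequence

# Command names taken from the paragraph above.
LOCAL_COMMANDS = {"init", "config", "stats", "split", "eval", "doctor"}
REMOTE_COMMANDS = {"remote", "launch"}


def where_it_runs(command: str) -> str:
    """Return "laptop" or "box" for a top-level vla command."""
    if command in LOCAL_COMMANDS:
        return "laptop"
    if command in REMOTE_COMMANDS:
        return "box"
    raise ValueError(f"unknown command: {command}")


def ssh_argv(host: str, remote_cmd: Sequence[str]) -> List[str]:
    """Build an argv that runs `remote_cmd` on `host` over ssh.

    The remote command is shell-quoted into a single string because
    ssh hands it to the remote shell.
    """
    return ["ssh", host, shlex.join(remote_cmd)]
```

For example, `ssh_argv("gpu-box", ["nvidia-smi", "-L"])` yields `["ssh", "gpu-box", "nvidia-smi -L"]`, which could be passed to `subprocess.run`.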
Run vla doctor to see exactly which scripts directory and config directory were resolved, and which optional dependencies are installed.
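If you want a similar optional-dependency check in your own tooling, one way to do it is to probe for importable modules without importing them. This is a sketch, not the code behind `vla doctor`: the `EXTRAS` mapping mirrors the extras listed under Install, and the function name is hypothetical.

```python
from importlib.util import find_spec
from typing import Dict, List

# Extra name -> modules it pulls in, per the Install section.
EXTRAS: Dict[str, List[str]] = {
    "stats": ["numpy", "pyarrow"],
    "wandb": ["wandb"],
}


def optional_dependency_status(
    extras: Dict[str, List[str]] = EXTRAS,
) -> Dict[str, Dict[str, bool]]:
    """Report, per extra, whether each top-level module is importable.

    find_spec() locates a module without executing it, so the check
    stays cheap even for heavy packages.
    """
    return {
        extra: {module: find_spec(module) is not None for module in modules}
        for extra, modules in extras.items()
    }
```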
Commands
| Command | Runs | What it does |
|---|---|---|
| `vla init [dir]` | laptop | Scaffolds an editable configs/ directory from the bundled templates. |
| `vla config <run>` | laptop | Resolves a run (defaults merged under the run) and prints the config plus the exact command, running nothing. |
| `vla remote <box> <subcmd> [args]` | box | Runs an operational subcommand on the box: deploy, push-secrets, ensure-swap, launch, autoresume, monitor, kill, rescale, pull, gpus, exec, shell. |
| `vla launch <run>` | box | Launches the run detached and auto-resuming; the box is read from the run's box: field. |
| `vla stats [args]` | laptop/box | Computes the full dataset statistics (quantiles + image stats) that lerobot and molmo need. Requires the [stats] extra. |
| `vla split [args]` | laptop | Produces a deterministic held-out episode split for validation/eval. |
| `vla eval [args]` | laptop/box | Ranks a checkpoint by held-out error or rollout, not loss. Try `vla eval --self-test` to verify the harness with no box. |
| `vla doctor` | laptop | Prints the resolved scripts/config directories and optional-dependency status. |
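To make "deterministic held-out episode split" concrete, here is one common way to get a split that does not depend on episode order or a random seed: hash each episode ID and hold out the IDs whose hash falls below a threshold. This is a sketch of that general technique, not necessarily how `vla split` works; the function names, the salt, and the ID format are assumptions.

```python
import hashlib
from typing import Iterable, List, Tuple


def is_held_out(episode_id: str, fraction: float = 0.1, salt: str = "vla-split") -> bool:
    """Decide membership from a hash of the ID alone.

    The first 8 bytes of SHA-256 are read as an integer in [0, 2**64)
    and compared against fraction * 2**64, so roughly `fraction` of
    all IDs land in the held-out set.
    """
    digest = hashlib.sha256(f"{salt}:{episode_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")
    return bucket < int(fraction * 2**64)


def split_episodes(
    episode_ids: Iterable[str], fraction: float = 0.1, salt: str = "vla-split"
) -> Tuple[List[str], List[str]]:
    """Return (train_ids, held_out_ids), preserving input order within each."""
    train: List[str] = []
    held_out: List[str] = []
    for episode_id in episode_ids:
        target = held_out if is_held_out(episode_id, fraction, salt) else train
        target.append(episode_id)
    return train, held_out
```

Because membership depends only on the ID and the salt, adding new episodes later never moves an existing episode between the two sets.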
Configuration
Your configs/ directory holds everything dynamic, and no secrets ever live in it: keys resolve on the box via ~/.secrets.env. The directory is resolved from --config-dir, then the VLA_CONFIG_DIR environment variable, then ./configs. A run file under runs/ is a thin recipe that names a box, a dataset, and a baseline — each a pointer into the corresponding registry — plus a few hyperparameters; everything else is inherited from _defaults.yaml.
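The resolution order and the "defaults under the run" inheritance can be sketched as follows. Treat it as an illustration of the rules stated above rather than vlakit's implementation: the function names are hypothetical, and whether nested mappings are merged recursively (as here) or replaced wholesale is an assumption.

```python
import os
from pathlib import Path
from typing import Any, Dict, Mapping, Optional


def resolve_config_dir(
    cli_value: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Path:
    """Apply the precedence: --config-dir, then VLA_CONFIG_DIR, then ./configs."""
    if cli_value:
        return Path(cli_value)
    if env.get("VLA_CONFIG_DIR"):
        return Path(env["VLA_CONFIG_DIR"])
    return Path("configs")


def merge_under(defaults: Dict[str, Any], run: Dict[str, Any]) -> Dict[str, Any]:
    """Return defaults overlaid by the run; values from the run win.

    Nested dicts are merged key by key (an assumption); anything else
    in the run replaces the default outright. Inputs are not mutated.
    """
    merged = dict(defaults)
    for key, value in run.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_under(merged[key], value)
        else:
            merged[key] = value
    return merged
```

For instance, merging a hypothetical run `{"box": "a100", "optim": {"lr": 1e-4}}` over defaults `{"optim": {"lr": 3e-4, "warmup": 500}}` keeps `warmup` from the defaults and takes `lr` and `box` from the run.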
Status
The local commands (init, config, stats, split, eval, doctor) are implemented and tested. The remote commands shell out to the bundled, proven ops scripts; vla remote <box> deploy now ships both the toolkit and your configs/ to the box. The eval offline comparator is implemented and self-tested (vla eval --self-test); its sim and robot rollout modes are still stubs.
Publishing (maintainers)
Releases publish to PyPI automatically through Trusted Publishing (OIDC), so no API token is stored anywhere. The workflow is .github/workflows/release.yml.
One-time setup:
- On PyPI, add a pending Trusted Publisher (Account → Publishing) with these exact values:
  - PyPI Project Name: `vlakit`
  - Owner: `kkipngenokoech`
  - Repository name: `vlakit`
  - Workflow name: `release.yml`
  - Environment name: `pypi`
- In the GitHub repo, create an Environment named `pypi` (Settings → Environments).
To cut a release, tag a version that matches pyproject.toml and push it:
git tag v0.1.0
git push origin v0.1.0
The workflow builds the sdist + wheel, verifies the bundled scripts/templates are inside the wheel, checks the tag matches the package version, and publishes. After the first successful run the pending publisher becomes a normal one, and pipx install vlakit works for everyone.
License
MIT — see LICENSE.
File details
Details for the file vlakit-0.1.0.tar.gz.
File metadata
- Download URL: vlakit-0.1.0.tar.gz
- Upload date:
- Size: 60.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1e4046089974e48de7dc3163d21b7c3cc4f0b4a9c20e8c475f65c6b6ee8d17ca |
| MD5 | 2f42c937a189264ecb698b370b791a67 |
| BLAKE2b-256 | fc35ce8b433097960399b33df3e0a9b0bb292a5d87ceae6f4ab3135dde480872 |
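A downloaded file can be checked against the SHA256 digest listed for it with a few lines of standard-library Python. This is a generic sketch; the function names are hypothetical, and the path you pass is wherever you saved the download.

```python
import hashlib
from pathlib import Path
from typing import Union


def sha256_of(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large archives need little memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_sha256(path: Union[str, Path], expected_hex: str) -> bool:
    """Compare the file's digest with the published one, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, that would be `verify_sha256("vlakit-0.1.0.tar.gz", "<SHA256 digest listed for that file>")`.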
Provenance
The following attestation bundles were made for vlakit-0.1.0.tar.gz:
Publisher: release.yml on kkipngenokoech/vlakit
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: vlakit-0.1.0.tar.gz
  - Subject digest: 1e4046089974e48de7dc3163d21b7c3cc4f0b4a9c20e8c475f65c6b6ee8d17ca
- Sigstore transparency entry: 1928677908
- Sigstore integration time:
- Permalink: kkipngenokoech/vlakit@b9fe7b62ec9ee1269c491174aeda704d5e1f4596
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/kkipngenokoech
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@b9fe7b62ec9ee1269c491174aeda704d5e1f4596
- Trigger Event: push
File details
Details for the file vlakit-0.1.0-py3-none-any.whl.
File metadata
- Download URL: vlakit-0.1.0-py3-none-any.whl
- Upload date:
- Size: 87.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 97858e11830273e689705a8e81d48160fc426489b820c0638aa9058ccaac330a |
| MD5 | 1028eef044807eca1191569d940a149c |
| BLAKE2b-256 | 5c4d73467f8e0015a817736dd678687d45b8dac3fa182be41b1a97d1b8342d4f |
Provenance
The following attestation bundles were made for vlakit-0.1.0-py3-none-any.whl:
Publisher: release.yml on kkipngenokoech/vlakit
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: vlakit-0.1.0-py3-none-any.whl
  - Subject digest: 97858e11830273e689705a8e81d48160fc426489b820c0638aa9058ccaac330a
- Sigstore transparency entry: 1928678024
- Sigstore integration time:
- Permalink: kkipngenokoech/vlakit@b9fe7b62ec9ee1269c491174aeda704d5e1f4596
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/kkipngenokoech
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@b9fe7b62ec9ee1269c491174aeda704d5e1f4596
- Trigger Event: push