
ML experiment launcher for local, SLURM, and SSH environments


Chester

Chester (chester-ml on PyPI) is a Python experiment launcher for ML workflows. Define your training function and parameter sweep — Chester handles dispatching jobs to local subprocesses, SSH servers, or SLURM clusters, with Singularity container support, code syncing, and reproducibility snapshots baked in.

Installation

pip install chester-ml
# or
uv add chester-ml

Quick Start

1. Create .chester/config.yaml in your project root:

log_dir: data
package_manager: uv

backends:
  local:
    type: local
    prepare: .chester/backends/local/prepare.sh

  myserver:
    type: ssh
    host: myserver                       # SSH alias from ~/.ssh/config
    remote_dir: /home/user/myproject
    prepare: .chester/backends/myserver/prepare.sh

  mycluster:
    type: slurm
    host: mycluster
    remote_dir: /home/user/myproject
    prepare: .chester/backends/mycluster/prepare.sh
    slurm:
      partition: gpu
      time: "24:00:00"
      gpus: 1
      cpus_per_gpu: 8
      mem_per_gpu: 32G
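
The slurm block above sets backend-wide defaults, and individual run_experiment_lite() calls can override them per experiment. Conceptually this is a shallow merge where the per-call values win; the sketch below illustrates the idea and is not Chester's implementation:

```python
def merge_slurm(defaults: dict, overrides: dict) -> dict:
    """Per-experiment SLURM values win over the backend's defaults."""
    return {**defaults, **overrides}

base = {'partition': 'gpu', 'time': '24:00:00', 'gpus': 1}
# A long run keeps the partition and GPU count but extends the time limit.
print(merge_slurm(base, {'time': '48:00:00'}))
```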

2. Write a launcher:

from chester.run_exp import run_experiment_lite, VariantGenerator, detect_local_gpus, flush_backend

def run_task(variant, log_dir, exp_name):
    print(f"lr={variant['lr']}, batch={variant['batch_size']}")
    # ... your training code ...

vg = VariantGenerator()
vg.add('lr', [1e-3, 1e-4])
vg.add('batch_size', [32, 64])

for v in vg.variants():
    run_experiment_lite(
        stub_method_call=run_task,
        variant=v,
        mode='local',        # or 'myserver', 'mycluster'
        exp_prefix='sweep',
        max_num_processes=max(1, len(detect_local_gpus())),
    )

flush_backend('local')       # no-op for local; required after loop for batch SSH mode
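
In batch SSH mode, submissions accumulate across the loop and are only dispatched, one per GPU, when flush_backend() runs. The queue-then-assign behaviour can be pictured with this sketch (assumed round-robin semantics; not Chester's implementation):

```python
class BatchBackend:
    """Accumulate submitted jobs, then assign them to GPUs on flush."""

    def __init__(self, gpu_ids):
        self.gpu_ids = list(gpu_ids)
        self.pending = []

    def submit(self, job):
        # Nothing is launched yet; the job just joins the queue.
        self.pending.append(job)

    def flush(self):
        # Assign queued jobs round-robin over the available GPUs.
        assignments = [(self.gpu_ids[i % len(self.gpu_ids)], job)
                       for i, job in enumerate(self.pending)]
        self.pending.clear()
        return assignments
```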

3. Run:

python launcher.py           # local
python launcher.py myserver  # SSH
python launcher.py mycluster # SLURM

Features

  • Three backend types: local subprocess, SSH (nohup), SLURM (sbatch)
  • Singularity on all backends: GPU passthrough, persistent overlays, per-container prepare.sh
  • VariantGenerator: cartesian product sweeps, dependent parameters, order="serial" (multi-step single job) and order="dependent" (chained SLURM jobs)
  • Hydra integration: pass parameters as key=value overrides with OmegaConf interpolation support
  • Git snapshot: saves git_info.json + git_diff.patch per run for full reproducibility
  • Submodule commit pinning: pin specific submodule commits per job via remote git worktrees
  • SSH batch-GPU mode: accumulate jobs across variants, fire one per GPU on flush_backend()
  • Extra sync dirs: rsync additional paths (datasets, checkpoints) to remote before submission
  • Per-experiment SLURM overrides: tune time, gpus, mem_per_gpu, etc. per run_experiment_lite() call
  • Graceful Ctrl+C: local kills subprocesses and stops the queue; remote detaches and lets jobs keep running
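
For intuition, the cartesian-product behaviour of VariantGenerator can be approximated in a few lines of plain Python (a sketch of the semantics, not Chester's code):

```python
from itertools import product

def variants(params):
    """Yield one dict per point in the cartesian product of the value lists."""
    keys = list(params)
    for combo in product(*(params[k] for k in keys)):
        yield dict(zip(keys, combo))

sweep = {'lr': [1e-3, 1e-4], 'batch_size': [32, 64]}
print(list(variants(sweep)))  # 4 variants: 2 lrs x 2 batch sizes
```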

Documentation

Full reference in docs/:

  • Configuration: .chester/config.yaml (all fields, global singularity block, YAML anchors)
  • Backends: local, SSH, and SLURM options, batch-GPU mode, extra sync dirs
  • Singularity: mounts, overlays, PID namespace, fakeroot, runtime override
  • Parameter Sweeps: VariantGenerator, serial/dependent ordering, derive, flush_backend
  • Hydra: hydra_enabled, flags, OmegaConf interpolations
  • Git Snapshot: git_info.json, git_diff.patch, submodule tracking, recovery
  • Submodule Pinning: per-job submodule commit pinning via worktrees
  • Examples: annotated real-world config patterns

Example Configs

See docs/examples/ for annotated real-world configs.

Project Layout

myproject/
├── .chester/
│   ├── config.yaml                    # Main config
│   └── backends/
│       ├── local/
│       │   └── prepare.sh             # Local env setup
│       ├── mycluster/
│       │   └── prepare.sh             # Cluster setup (modules, paths)
│       └── myserver/
│           └── prepare.sh             # SSH server setup
├── launchers/
│   └── launch_sweep.py
└── src/

Chester searches for .chester/config.yaml upward from the current directory, stopping at the .git root. Override with $CHESTER_CONFIG_PATH.
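
The upward search amounts to something like the following (a pure-Python sketch of the described behaviour, not Chester's actual code; it also ignores the $CHESTER_CONFIG_PATH override):

```python
from pathlib import Path
from typing import Optional

def find_config(start: Path) -> Optional[Path]:
    """Walk upward from `start` looking for .chester/config.yaml.

    Stops after the directory containing .git (the repo root), so the
    search never escapes the current repository.
    """
    for d in [start, *start.parents]:
        candidate = d / '.chester' / 'config.yaml'
        if candidate.is_file():
            return candidate
        if (d / '.git').exists():
            return None  # reached the repo root without finding a config
    return None
```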

License

MIT

