# 🍋 lmn - A minimal launcher
A lightweight tool to run your local project on a remote machine, with a single command.
`lmn` can set up a container environment (Docker or Singularity) and works with job schedulers (Slurm or PBS).
## 🧑‍💻 `lmn` in action
In your project directory, you can simply run:

```console
$ lmn run <remote-machine> -- python generate_fancy_images.py
```
`lmn` then:
- rsyncs your local project directory to the remote machine
- sshes into the remote machine and (optionally) sets up the specified environment (Docker or Singularity)
- inside that environment, runs `python generate_fancy_images.py`
- if new files were created (such as images), rsyncs them back to the local project directory
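The round trip above boils down to a sync-up, a remote command, and a sync-back. Here is an illustrative sketch of roughly what those steps look like as plain `rsync`/`ssh` invocations; this is not `lmn`'s actual implementation, and the host, user, and paths are placeholders taken from the example config below:

```python
import shlex

def build_commands(host: str, user: str, remote_root: str,
                   project: str, cmd: str) -> list[list[str]]:
    """Roughly what one run involves: sync up, execute remotely, sync back."""
    remote = f"{user}@{host}"
    dest = f"{remote_root}/{project}/"
    return [
        # 1. rsync the local project directory to the remote machine
        ["rsync", "-az", "--exclude", ".git", "./", f"{remote}:{dest}"],
        # 2. ssh in and run the command inside the synced directory
        ["ssh", remote, f"cd {shlex.quote(dest)} && {cmd}"],
        # 3. rsync generated files back to the local project directory
        ["rsync", "-az", f"{remote}:{dest}", "./"],
    ]

cmds = build_commands("elm.ttic.edu", "takuma", "/scratch/takuma/lmn",
                      "my_project", "python generate_fancy_images.py")
```

In practice `lmn` also applies the `exclude` patterns and environment variables from the config file, which this sketch omits.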
## 🚀 Quickstart
### ▶️ Installation
You can install `lmn` via pip:

```console
$ pip install lmn
```
### ▶️ Configuration
All you need is to place a single configuration file at `project_dir/.lmn.json5` (on your local machine).
An example configuration:

```json5
{
  "project": {
    "name": "my_project",
    // What not to rsync with the remote machine:
    "exclude": [".git", ".venv", "wandb", "__pycache__"],
    // Project-specific environment variables:
    "environment": {
      "MUJOCO_GL": "egl"
    }
  },
  "machines": {
    "elm": {
      // Host information
      "host": "elm.ttic.edu",
      "user": "takuma",
      // Rsync target directory (on the remote host)
      "root_dir": "/scratch/takuma/lmn",
      // Mode: ["ssh", "docker", "slurm", "pbs", "slurm-sing", "pbs-sing"]
      "mode": "docker",
      // Docker configurations
      "docker": {
        "image": "ripl/my_transformer:latest",
        "network": "host",
        // Mount configurations (host -> container)
        "mount_from_host": {
          "/ripl/user/takuma/project/": "/project",
          "/dev/shm": "/dev/shm",
        },
      },
      // Host-specific environment variables
      "environment": {
        "PROJECT_DIR": "/project",
      },
    },
    "tticslurm": {
      "host": "slurm.ttic.edu",
      "user": "takuma",
      "mode": "slurm-sing",  // Run a Singularity container on a cluster with the Slurm job scheduler
      "root_dir": "/share/data/ripl-takuma/lmn",
      // Slurm job configurations
      "slurm": {
        "partition": "contrib-gpu",
        "cpus_per_task": 1,
        "time": "04:00:00",
        "output": "slurm-%j.out.log",
        "error": "slurm-%j.error.log",
        "exclude": "gpu0,gpu18",
      },
      // Singularity configurations
      "singularity": {
        "sif_file": "/share/data/ripl-takuma/singularity/my_transformer.sif",
        "writable_tmpfs": true,
        "startup": "ldconfig /.singularity.d/libs",  // Command to run after starting up the container
        "mount_from_host": {
          "/share/data/ripl-takuma/project/": "/project",
        },
      },
      "environment": {
        "PROJECT_DIR": "/project",
      }
    }
  }
}
```
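A quick sanity check of the parsed config can catch typos before a run reaches the remote machine. Below is a minimal sketch, assuming the file has already been parsed into a dict (e.g. with a JSON5 parser); the required keys and valid modes follow the example above, while the `validate_config` helper itself is hypothetical, not part of `lmn`:

```python
VALID_MODES = {"ssh", "docker", "slurm", "pbs", "slurm-sing", "pbs-sing"}

def validate_config(cfg: dict) -> list[str]:
    """Return a list of problems found in a parsed .lmn.json5 dict."""
    problems = []
    if "name" not in cfg.get("project", {}):
        problems.append("project.name is missing")
    for name, machine in cfg.get("machines", {}).items():
        # Every machine needs host information and an rsync target.
        for key in ("host", "user", "root_dir"):
            if key not in machine:
                problems.append(f"machines.{name}.{key} is missing")
        mode = machine.get("mode", "ssh")  # assume plain ssh when unset
        if mode not in VALID_MODES:
            problems.append(f"machines.{name}.mode {mode!r} is invalid")
    return problems

cfg = {
    "project": {"name": "my_project"},
    "machines": {"elm": {"host": "elm.ttic.edu", "user": "takuma",
                         "root_dir": "/scratch/takuma/lmn", "mode": "docker"}},
}
```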
More example configurations can be found in the example directory.
### ▶️ Command examples
Make sure that you're in the project directory first.
```console
# Launch an interactive shell in the Docker container (on elm):
$ lmn run elm -- bash

# Run a job in the Docker container (on elm):
$ lmn run elm -- python train.py

# Run a script on the host (on elm):
$ lmn run elm --mode ssh -- python hello.py

# Run a command quickly on the host without syncing any files ("bare" run; on elm):
$ lmn brun elm -- hostname

# Check GPU usage on elm (equivalent to `lmn brun elm -- nvidia-smi`):
$ lmn nv elm

# Launch an interactive shell in the Singularity container via the Slurm scheduler (on tticslurm):
$ lmn run tticslurm -- bash

# Run a job in the Singularity container via the Slurm scheduler (on tticslurm):
$ lmn run tticslurm -- python train.py

# Submit a batch job that runs in the Singularity container via the Slurm scheduler (on tticslurm):
$ lmn run tticslurm -d -- python train.py

# Launch a sweep (batch jobs) in the Singularity container via the Slurm scheduler (on tticslurm).
# This submits 10 batch jobs with `$LMN_RUN_SWEEP_IDX` set from 0 to 9:
$ lmn run tticslurm --sweep 0-9 -d -- python train.py -l '$LMN_RUN_SWEEP_IDX'

# Run a script on the login node (on tticslurm):
$ lmn run tticslurm --mode ssh -- squeue -u takuma

# Get help:
$ lmn --help

# Get help on `lmn run`:
$ lmn run --help
```
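On the job side, the sweep index arrives as an ordinary environment variable. Here is a sketch of how a script like `train.py` might use `$LMN_RUN_SWEEP_IDX`; the variable name comes from the docs above, but the hyperparameter grid and the fallback default are purely illustrative:

```python
import os

# Set by `lmn run --sweep ...`; fall back to 0 for local, non-sweep runs.
sweep_idx = int(os.environ.get("LMN_RUN_SWEEP_IDX", "0"))

# Illustrative: use the sweep index to pick a learning rate from a grid.
learning_rates = [1e-4, 3e-4, 1e-3, 3e-3]
lr = learning_rates[sweep_idx % len(learning_rates)]
print(f"job {sweep_idx}: training with lr={lr}")
```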
#### More about the `--sweep` format

- `--sweep 0-9`: ten jobs, with `LMN_RUN_SWEEP_IDX` set to `0` through `9` (internally `lmn` simply runs `range(0, 9 + 1)`)
- `--sweep 7`: a single job with `LMN_RUN_SWEEP_IDX=7`
- `--sweep 3,5,8`: three jobs, with `LMN_RUN_SWEEP_IDX` set to `3`, `5`, and `8`
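The three forms above can be expanded with a few lines of Python. This sketch mirrors the documented behaviour (range, single index, comma list), not `lmn`'s actual code:

```python
def parse_sweep(spec: str) -> list[int]:
    """Expand a --sweep spec into the list of LMN_RUN_SWEEP_IDX values."""
    if "-" in spec:                       # "0-9" -> [0, 1, ..., 9]
        lo, hi = spec.split("-")
        return list(range(int(lo), int(hi) + 1))
    if "," in spec:                       # "3,5,8" -> [3, 5, 8]
        return [int(s) for s in spec.split(",")]
    return [int(spec)]                    # "7" -> [7]
```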
## Comparison with other packages

`lmn` is inspired by the following great packages: geyang/jaynes, justinjfu/doodad, and afdaniele/cpk.

`jaynes` and `doodad` focus on launching many jobs in non-interactive mode:
- ✅ support ssh, Docker, Slurm, and AWS / GCP (but not the PBS scheduler)
- 😢 do not support Singularity
- 😢 cannot launch interactive jobs
- 😢 only work with Python projects, and require (although small) modifications to the project codebase

`cpk` focuses on (though is not limited to) ROS applications and running programs in Docker containers:
- ✅ supports X forwarding and other features that help run ROS applications in a container
- ✅ provides extra functionality, such as creating and deploying ssh keys on remote machines
- 😢 does not support clusters with schedulers (Slurm or PBS), nor Singularity
## Tasks

- [ ] Use Pydantic for configurations
- [ ] Add an `ssh-release` subcommand (to drop the persistent ssh connection)
- [ ] Remove `project.name` from the global config, and derive the project name from the project root directory name