Slurm Experiment Management Library

SEML: Slurm Experiment Management Library

SEML is the missing link between the open-source workload scheduling system Slurm, the experiment management tool sacred, and a MongoDB experiment database. It is lightweight, hackable, written in pure Python, and scales to thousands of experiments.

Keeping track of computational experiments can be annoying, and failing to do so can lead to lost results, duplicated runs of the same experiments, and lots of headaches. While workload scheduling systems such as Slurm make it easy to run many experiments in parallel on a cluster, it can be hard to keep track of which parameter configurations are running, failed, or completed. sacred is a great tool to collect and manage experiments and their results, especially when used with MongoDB. However, it lacks integration with workload schedulers.

SEML enables you to

  • define hyperparameter search spaces in YAML files,
  • run these hyperparameter configurations on a compute cluster using Slurm,
  • and track the experimental results using sacred and MongoDB.
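
To make the idea of a hyperparameter search space concrete, the following minimal sketch expands a set of fixed parameters and a grid of candidate values into individual configurations. It is not SEML's implementation and does not reflect its YAML schema; the helper `expand_grid` and the parameter names are hypothetical and chosen only for illustration.

```python
from itertools import product
from typing import Any, Dict, Iterator, List


def expand_grid(
    fixed: Dict[str, Any], grid: Dict[str, List[Any]]
) -> Iterator[Dict[str, Any]]:
    """Yield one configuration per combination of the grid values."""
    names = sorted(grid)  # fixed iteration order keeps the output deterministic
    for values in product(*(grid[name] for name in names)):
        config = dict(fixed)  # copy so the fixed parameters are never mutated
        config.update(zip(names, values))
        yield config


# Hypothetical search space; the parameter names are illustrative only.
configs = list(
    expand_grid(
        fixed={"max_epochs": 100},
        grid={"learning_rate": [0.01, 0.001], "hidden_size": [32, 64, 128]},
    )
)
print(len(configs))  # 2 learning rates x 3 hidden sizes
```

Each resulting dictionary corresponds to one experiment that would be submitted to the cluster and tracked in the database.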

In addition, SEML offers many more features to make your life easier, such as

  • automatically saving and loading your source code for reproducibility,
  • easy debugging on Slurm or locally,
  • automatically checking your experiment configurations,
  • extending Slurm with local workers,
  • and keeping track of resource usage (experiment runtime, RAM, etc.).

Get started

New projects

The fastest way to get started with SEML is via uv:

  1. Install uv:
    curl -LsSf https://astral.sh/uv/install.sh | sh
    
  2. Set up a new project:
    # uvx runs `seml` in a temporary virtual environment
    # to set up your new project.
    uvx seml project init my_new_project
    
  3. Set up a virtual environment:
    cd my_new_project
    uv sync
    
  4. Activate your virtual environment:
    source .venv/bin/activate
    
  5. Configure SEML:
    seml configure
    

When executing SEML, always use the seml command from your project's virtual environment. Use uvx seml only for high-level commands that do not affect experiments (like setting up new projects).

Existing projects

If you want to add SEML to an existing project, you can install it via:

pip install seml

Then configure your MongoDB via:

seml configure

SSH Port Forwarding

If your MongoDB is only accessible via an SSH port forward, SEML can be configured to use it directly. First install the ssh_forward dependencies (the quotes keep shells such as zsh from interpreting the brackets):

pip install "seml[ssh_forward]"

Then configure the SSH settings:

seml configure --ssh_forward

Development

For development, we recommend uv, which you can install via:

curl -LsSf https://astral.sh/uv/install.sh | sh

Set up the environment and activate it:

uv sync --locked
source .venv/bin/activate

Alternatively, you can install the repository in any Python environment via:

pip install -e ".[dev]"

Pre-commit hooks

Make sure to install the pre-commit hooks via:

pre-commit install

Documentation

Documentation is available in our docs.md or via the CLI:

seml --help

Example

See our simple example to get familiar with how SEML works.

CLI completion

SEML supports command line completion. To install this feature run:

seml --install-completion {shell}

If you are using the zsh shell, you might have to append compinit -D to the ~/.zshrc file (see this issue).

Slurm version

SEML should work with Slurm 18.08 and above out of the box. Versions 17.11 and earlier do not have a SIGNALING job state, so you have to remove it from the SLURM_STATES defined in SEML's settings (seml/settings.py). Earlier versions have not been tested and might have other issues.
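
As a rough illustration of what removing the SIGNALING state amounts to, the sketch below filters one state name out of a mapping from state groups to lists of Slurm state names. The shape of the mapping is an assumption made for this example, and `drop_state` is a hypothetical helper; check seml/settings.py for the actual definition of SLURM_STATES before editing it.

```python
from typing import Dict, List


def drop_state(states: Dict[str, List[str]], name: str) -> Dict[str, List[str]]:
    """Return a copy of `states` with `name` removed from every group."""
    return {
        group: [state for state in members if state != name]
        for group, members in states.items()
    }


# Assumed shape, for illustration only; not copied from seml/settings.py.
example_states = {
    "PENDING": ["PENDING", "CONFIGURING"],
    "RUNNING": ["RUNNING", "SIGNALING"],
}

patched = drop_state(example_states, "SIGNALING")
print(patched["RUNNING"])
```

The helper returns a new mapping and leaves its input untouched, so the original definition stays available for comparison.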

Contact

Contact us at zuegnerd@in.tum.de, johannes.gasteiger@tum.de, or n.gao@tum.de for any questions.

Cite

When you use SEML in your own work, please cite the software along the lines of the following BibTeX entry:

@software{seml_2023,
  author = {Z{\"u}gner, Daniel and Gasteiger, Johannes and Gao, Nicholas and Fuchsgruber, Dominik},
  title = {{SEML: Slurm Experiment Management Library}},
  url = {https://github.com/TUM-DAML/seml},
  version = {0.4.0},
  year = {2023}
}

Copyright (C) 2023 Daniel Zügner, Johannes Gasteiger, Nicholas Gao, Dominik Fuchsgruber, Technical University of Munich
