Set up a remote cluster for computational chemistry
Cluster Setup :tornado:
This package sets up a remote cluster for computational chemistry workflows. For a more detailed explanation, see below.
Description
In setting up the cluster, this code performs the following steps in order:
- Create directories for software, support files, and modules in folders specified by the corresponding CLI options
- Create a Python virtual environment with specified packages
- Optionally, configure git (e.g., sign commits with your ssh key)
- Optionally, clone git repositories for development.
- Write Bash startup files (.bash_profile and .bashrc). By default, a minimal .bash_profile file is written that simply checks if ~/.bashrc exists and then sources it if it does. The default .bashrc file defines an alias, activate_env, for activating the Python virtual environment, and adds the module home directory to the Lmod module path.
- Install specified software via "software scripts". See How to Write a Software Script for more details.
- Create modulefiles for installed software, virtual environment activation, and scripts passed via the --module-scripts option.
If cluster-setup fails for any reason during execution, all files and folders
created by cluster-setup are removed. (This does not include files and
folders that are created by custom software scripts that reside outside of the
software, support file, and module home directories.) This means that if
any of the software, support file, or module home directories existed and
contained files prior to calling cluster-setup, then these files may be
unintentionally deleted. For this reason, it is not recommended to use
existing directories for any of the aforementioned directories.
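Because a failed run may remove pre-existing files, it can help to check the target directories before calling cluster-setup. The following is a minimal sketch of such a check; the helper name find_nonempty_dirs is hypothetical and is not part of cluster-setup, and the paths are the defaults listed in the CLI help.

```python
from pathlib import Path


def find_nonempty_dirs(*dirs: str) -> list[Path]:
    """Return the given directories that already exist and contain entries."""
    nonempty = []
    for d in dirs:
        path = Path(d).expanduser()
        # A directory is "at risk" if it exists and holds at least one entry
        if path.is_dir() and any(path.iterdir()):
            nonempty.append(path)
    return nonempty


if __name__ == "__main__":
    # Default locations taken from the CLI help text
    for path in find_nonempty_dirs("~/software", "~/support_files", "~/modules"):
        print(f"Warning: {path} already exists and is not empty")
```

If the script prints any warnings, consider pointing the corresponding options at fresh directories instead.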
Quickstart
Requirements
This package requires:
All other requirements are installed by pip. For more details,
see the dependencies and optional-dependencies keys in pyproject.toml.
Installation
cluster-setup can be installed via pip. It is recommended to install
cluster-setup in a fresh Python virtual environment,
python -m venv .venv && source .venv/bin/activate
pip install cluster-setup
It is also recommended to run the tests prior to running the CLI. However, to
do so, you must first install the test extra:
pip install .[test]
Basic Usage
cluster-setup can be called from the command-line via the cluster-setup
command.
cluster-setup <options>
or from within Python code via the function, cluster_setup.main.run.
from cluster_setup.main import run
args = [...]
run(args)
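As an illustration of what args might contain, the sketch below assembles a list from flags documented in the CLI help. The helper build_args is hypothetical, and the assumption that run accepts the same tokens as the command line is not confirmed here, so the call itself is left commented out.

```python
def build_args(software_home, packages=(), repos=()):
    """Assemble a CLI-style argument list from documented options."""
    args = ["--software-home", software_home]
    for package in packages:
        # --package may be repeated, once per package
        args += ["--package", package]
    for repo in repos:
        # --clone-repo may be repeated, once per repository
        args += ["--clone-repo", repo]
    return args


args = build_args(
    "~/software",
    packages=["numpy", "scipy"],
    repos=["github.com:cclib/cclib.git"],
)
print(args)

# With cluster-setup installed, the list could then be handed to run
# (assumption: run takes the same tokens as the command line):
# from cluster_setup.main import run
# run(args)
```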
To run the tests prior to running the CLI, use the --test option:
cluster-setup --test
A test report will be written to a text file after the tests run. If any tests fail, please file an issue and attach the report.
[!NOTE] This may take up to a minute.
Options can be supplied to the program either via command-line options
or in a configuration file using the --config-file CLI option. Configuration
files must be in the toml format. A basic configuration
file can be generated with the --config-gen option. Details on writing a
configuration file can be found below.
[!NOTE] Options supplied on the command line override those specified in the configuration file (except in the case of --python-packages and --requirements, where option values are combined).
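The precedence rule can be pictured with a small sketch. This is not the package's actual implementation: the function merge_options, the use of None for "not given on the command line", and the order in which combined values are concatenated are all assumptions made for illustration. The key names follow the example configuration file.

```python
# Options whose values are combined rather than overridden
COMBINED_KEYS = {"python_packages", "python_requirements"}


def _as_list(value):
    """Normalize a missing value, a single string, or a sequence to a list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def merge_options(config: dict, cli: dict) -> dict:
    """Merge config-file options with CLI options (CLI wins, lists combine)."""
    merged = dict(config)
    for key, value in cli.items():
        if value is None:
            continue  # option was not given on the command line
        if key in COMBINED_KEYS:
            merged[key] = _as_list(config.get(key)) + _as_list(value)
        else:
            merged[key] = value
    return merged


config = {"software_home": "/from/config", "python_packages": ["numpy"]}
cli = {"software_home": "/from/cli", "python_packages": ["scipy"], "module_home": None}
print(merge_options(config, cli))
```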
Command-Line Flags
All the command-line flags can be obtained by running cluster-setup --help:
usage: cluster-setup [-h|-v|-V|-g|-t] [more options; see below]
Set up a Digital Research Alliance of Canada cluster account.
General options:
-h, --help Show this help message and exit.
-v, --verbose More verbose messages.
-V, --version Show the program version number and exit.
--dry-run Perform the setup steps for the installation but don't
actually install anything. (Not yet supported)
--check Perform pre-installation checks. This is the default.
Forego checks with '--no-check'.
-t [PYTEST_ARGS ...], --test [PYTEST_ARGS ...]
Run the test suite and exit. Option arguments are forwarded to pytest.
Config file:
Use a config file instead of command line arguments.
This is useful if you are using many flags or want to
transfer a configuration from one cluster to another.
--config-file CONFIG_FILE
Specify a configuration TOML file, must have a [cluster-setup]
or [tool.cluster-setup] section.
-g, --config-gen Generate a configuration file with default values and exit.
Installation directories:
Specify installation directories
--software-home SOFTWARE_HOME
The location in which to install software.
Defaults to ~/software.
--support-file-home SUPPORT_FILE_HOME
The location in which to install support files.
Defaults to ~/support_files.
--module-home MODULE_HOME
The location in which to install modules.
Defaults to ~/modules.
Python virtual environment:
Configure the Python virtual environment
--venv VENV, --python-venv VENV
The name of the Python virtual environment to create.
The environment will be created in a subdirectory relative
to the software home. Defaults to python_venv.
-p PACKAGE, --package PACKAGE
The name of a Python package to install into the virtual
environment. This option may be repeated.
-r PATTERN, --requirements PATTERN
The path to a requirements.txt file specifying Python packages
to install. Relative glob patterns are supported. This option
may be repeated.
-c REPO, --clone-repo REPO
A repository to clone. For example: DOMAIN:USER/REPO.git
This option may be repeated.
Git configuration:
Configure git version control
--git-config-file GIT_CONFIG_FILE
The file to be used to configure git. Defaults to the global
config file.
--git-user-name GIT_USER_NAME
Set your git user name.
--git-email GIT_EMAIL
Set your git user email.
--git-editor GIT_EDITOR
Set your git editor.
--git-rebase-on-pull Rebase git branches when pulling from upstream branch.
Defaults to False.
--sign-with-ssh, --git-sign-with-ssh
Sign git commits with ssh. You must specify a file containing
an ssh public key with the --ssh-key option.
Do not sign commits with ssh with the '--no-sign-with-ssh',
which is the default.
--ssh-key SSH_KEY The file containing your ssh public key. Make sure you have
created an ssh key with 'ssh-keygen' first.
Software:
Configure software installation
--bashrc BASHRC A Jinja2 template file to be used to write the .bashrc file. A
minimal .bashrc file will be written if omitted.
-s SOFTWARE_SCRIPTS, --software-script SOFTWARE_SCRIPTS
Specify software to be installed via scripts and optionally
create modulefiles from templates. See 'Specifying Software Scripts'
below for a detailed description of formatting.
Specifying Software Scripts
---------------------------
Software scripts are specified as five-component, colon-separated strings
structured as
SCRIPT[:[TEMPLATE]:[MODULE]:[VERSION]:[ARGS]]
SCRIPT must be a path to an executable script. TEMPLATE, MODULE,
VERSION, and ARGS are optional. TEMPLATE must point to a Jinja2 template file
for the modulefile; the template context will contain the software and support
file home directories as variables in addition to VERSION. If TEMPLATE is
omitted, then no modulefile will be created. MODULE should be the desired name
of the module. If omitted, then the stem of TEMPLATE will be used. VERSION
should be the version used for the module. If omitted, '0.0.1' is used. ARGS
should be the command line arguments to be passed to the script. The software
and support file home directories and Python environment directory can be
specified using Python template string syntax. For example,
cluster-setup --software-script install_vasp.sh:vasp.j2:vasp:6.3.2:{software_home} vasp.tar.gz
will run the 'install_vasp.sh' script (with the default software home, and
'vasp.tar.gz' as arguments) and create a modulefile (using the 'vasp.j2'
template) for 'vasp' version '6.3.2', and
cluster-setup --software-script install_vasp.sh:vasp.j2:::{python_venv} vasp.tar.gz
will run the install_vasp.sh script (with the Python environment directory and
'vasp.tar.gz' as arguments) and create a modulefile
(using the vasp.j2 template) for vasp version 0.0.1.
How-Tos
Write a Software Script
This guide will describe how to write a software script for use with the
--software-script option. In particular, our script will copy files from
a directory, sources/custom_scripts/, to a subdirectory of the software
home directory. We will also specify a generic template that will be used to
write a modulefile for the software.
The --software-script option can be used to install arbitrary software
during the cluster setup process and create a corresponding module. Software
scripts are specified in the format SCRIPT[:[TEMPLATE]:[MODULE]:[VERSION]:[ARGS]].
SCRIPT must point to an executable script.
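To make the format concrete, here is a minimal sketch of how such a string could be split into its five components, applying the documented defaults (module name from the template stem, version 0.0.1). The names SoftwareScript and parse_software_script are hypothetical, and the shell-like splitting of ARGS is an assumption; cluster-setup's own parser may behave differently.

```python
import shlex
from pathlib import Path
from typing import NamedTuple, Optional


class SoftwareScript(NamedTuple):
    script: str
    template: Optional[str]
    module: Optional[str]
    version: str
    args: list


def parse_software_script(spec: str, **context: str) -> SoftwareScript:
    """Split SCRIPT[:[TEMPLATE]:[MODULE]:[VERSION]:[ARGS]] and apply defaults."""
    # Split into at most five fields so that colons inside ARGS are preserved
    fields = spec.split(":", 4)
    fields += [""] * (5 - len(fields))
    script, template, module, version, args = fields
    # MODULE defaults to the stem of TEMPLATE (e.g. vasp.j2 -> vasp)
    if template and not module:
        module = Path(template).stem
    # Assumption: ARGS is split like a shell command line, then placeholders
    # such as {software_home} are filled in from the keyword arguments
    arg_list = [token.format(**context) for token in shlex.split(args)]
    return SoftwareScript(
        script, template or None, module or None, version or "0.0.1", arg_list
    )


parsed = parse_software_script(
    "install_vasp.sh:vasp.j2:::{software_home} vasp.tar.gz",
    software_home="/home/user/software",
)
print(parsed)
```

Here the module name and version are left empty, so the sketch falls back to vasp and 0.0.1, matching the behaviour described in the help text.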
Software scripts should be written with the understanding that they will be
executed in the same directory from which cluster-setup is called. As an
example, check out the contents of install_custom_scripts.sh below:
#!/usr/bin/bash
# The software home directory is passed as the first argument
software_home=$1
if [[ -z "$software_home" ]]; then
    echo "Error: No software home directory specified" >&2
    exit 1  # non-zero status so that the failure is reported to the caller
fi
if ! test -e "$software_home"; then
    echo "Error: Software home directory $software_home does not exist" >&2
    exit 1
fi
# Create software subdirectory
dest=${software_home}/custom_scripts
mkdir -p "$dest"
# Copy sources into new directory
cp -v sources/custom_scripts/* "$dest"
# Make the copied scripts executable
chmod +x "$dest"/*
This software script will copy several scripts into a subdirectory of the software home directory.
If specified, TEMPLATE must point to a Jinja2 template for the
modulefile. An example of a suitable template (custom_scripts.j2) for the
modulefile corresponding to the software installed by install_custom_scripts.sh
is shown below:
help([[
Custom commands that can be executed from the command-line
]])
whatis("Version: {{ version }}")
prepend_path("PATH", "{{ software_home }}/custom_scripts")
Note that {{ version }} and {{ software_home }} will be replaced by the
module version (0.0.1 if not specified) and the software home directory,
respectively.
We have thus defined all necessary components in order for cluster-setup to
install our software correctly. When we eventually call cluster-setup, we
should either specify the following option:
--software-script="install_custom_scripts.sh:custom_scripts.j2:custom_scripts:0.0.1:{software_home}"
or place the following in the configuration file:
...
software_scripts = [
"install_custom_scripts.sh:custom_scripts.j2:custom_scripts:0.0.1:{software_home}",
]
...
Equivalently, we could allow cluster-setup to infer the module name and set
the default version like so from the command-line:
--software-script="install_custom_scripts.sh:custom_scripts.j2:::{software_home}"
or in the configuration file:
...
software_scripts = [
"install_custom_scripts.sh:custom_scripts.j2:::{software_home}",
]
...
Note that ARGS must be quoted if it contains spaces. For
example, one might pass both the software and support file homes to the
--software-script option like so:
--software-script="install_custom_scripts.sh:custom_scripts.j2:::'{software_home} {support_file_home}'"
or in the configuration file:
...
software_scripts = [
"install_custom_scripts.sh:custom_scripts.j2:::'{software_home} {support_file_home}'",
]
...
Write a Configuration File
cluster-setup configuration files must be written in the TOML format.
Configuration settings can be placed under the cluster-setup table or the
tool.cluster-setup table. The latter choice enables users to include a
configuration from cluster-setup in the pyproject.toml file.
An example configuration file is shown below. Note that the names of the Python
options differ from the CLI options in that they include the python_ prefix
(e.g., python_requirements).
# config.toml
[cluster-setup]
verbosity = 0
# General Setup
software_home = "/Users/USER/software"
support_file_home = "/Users/USER/support_files"
module_home = "/Users/USER/modules"
# Python
python_venv = "python_venv"
python_packages = [
"pymatgen",
"numpy",
"scipy",
"maggma",
"fireworks",
"matplotlib",
]
python_requirements = "/Users/USER/requirements.txt"
python_repos = [
"github.com:zadorlab/sella.git",
"github.com:cclib/cclib.git",
"gitlab.com:ase/ase.git",
]
# Git
git_user_name = "John Doe"
git_email = "john.doe@example.com"
git_editor = "vi"
git_rebase_on_pull = false
git_sign_with_ssh = true
ssh_key = "/Users/USER/.ssh/id_ecdsa.pub"
Tips
Before you call cluster-setup
Prior to execution, ensure that any required modules are loaded. For example,
if you would like to use Python 3.12 for virtual environment creation, ensure
that the appropriate module is loaded. Or if the installation of Python packages
specified with the --package or --requirements options requires specific
software to be available, ensure that these will be accessible when pip is
called during installation.
--clone-repo
- Setup will fail unless you have read access to the repository
- Repositories will be cloned using SSH
- Because repositories will be cloned using SSH, make sure to add your SSH key to the ssh-agent prior to executing cluster-setup
--software-script
- If your script requires additional dependencies, ensure that these are available prior to running cluster-setup or load them within your script
- Note that software scripts must be executable and that the first line of the file must be a shebang.
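The two requirements on software scripts can be checked ahead of time. The following is a rough sketch of such a check; check_software_script is a hypothetical helper and is not provided by cluster-setup.

```python
import os
from pathlib import Path


def check_software_script(path: str) -> list[str]:
    """Return a list of problems found with a software script (empty if none)."""
    script = Path(path)
    if not script.is_file():
        return [f"{path}: not a file"]
    problems = []
    # The script must carry execute permission for the current user
    if not os.access(script, os.X_OK):
        problems.append(f"{path}: not executable (try chmod +x)")
    # The first two bytes of a shebang line are always "#!"
    with script.open("rb") as handle:
        if handle.read(2) != b"#!":
            problems.append(f"{path}: first line is not a shebang")
    return problems


if __name__ == "__main__":
    for problem in check_software_script("install_custom_scripts.sh"):
        print(problem)
```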