Build large-scale task workflows using luigi, remote job submission, remote targets, and environment sandboxing.
[!NOTE]
This project is currently under development.
- Version 1.0.0 will support Python ≥3.7 and is developed in the release_prep branch.
- There will be a legacy branch with Python 2.7 and ≤3.6 support that, however, won't receive any new features.
- The release is targeted for spring 2024.
Use law to build complex and large-scale task workflows. It is built on top of luigi and adds abstractions for run locations, storage locations and software environments. Law strictly disentangles these building blocks and ensures they remain interchangeable and resource-opportunistic.
Key features:
- CLI with auto-completion and interactive status and dependency inspection
- Remote targets with automatic retries and local caching
  - WebDAV, HTTP, Dropbox, SFTP, all WLCG protocols (srm, xrootd, dcap, gsiftp, webdav, ...)
- Automatic submission to batch systems from within tasks
  - HTCondor, LSF, gLite, ARC, Slurm, CMS-CRAB
- Environment sandboxing, configurable on task level
  - Docker, Singularity, Sub-Shells, Virtual envs
First steps
Installation and dependencies
Install via pip
pip install law
or via conda / (micro)mamba
conda install -c conda-forge law
If you plan to use remote targets, the (default) implementation also requires gfal2 and gfal2-python (optional) to be installed, either via pip or conda / (micro)mamba.
conda install -c conda-forge gfal2 gfal2-util
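Before running tasks that use remote targets, it can help to confirm that the optional dependencies are visible to the current interpreter. The following is a minimal sketch using only the standard library. The helper name `check_modules` is hypothetical, and the sketch assumes that gfal2-python exposes a top-level module named `gfal2`.

```python
import importlib.util


def check_modules(names):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # "law" is the package itself; "gfal2" is assumed to be the module
    # provided by gfal2-python (needed for the default remote targets only)
    for name, found in check_modules(["law", "gfal2"]).items():
        print(f"{name}: {'found' if found else 'missing'}")
```

The check only looks the modules up and does not import them, so it has no side effects on the environment.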
Usage at CERN
See the wiki.
Overcomplete example config
See law.cfg.example.
Projects using law
- CMS Di-Higgs Inference Tools:
- columnflow (+ all analyses using it):
- Python based, fully automated, columnar framework, including job submission, resolution of systematics and ML pipelines, starting at NanoAOD-level with an optimized multi-threaded column reader
- repo, docs, task structure
- CMS B-Tag SF Measurement:
- Automated workflow for deriving shape-calibrating b-tag scale factors, starting at MiniAOD-level
- repo
- CMS Tau POG ML Tools:
- Preprocessing pipeline for ML trainings in the TAU group
- repo
- CMS HLT Config Parser:
- Collects information from various databases (HLT, bril, etc.) and shows menus, triggers paths, filter names for configurable MC datasets or data runs
- repo
- RWTH-CMS Analysis Framework:
- Basis for multiple CMS analyses ranging from Di-Higgs, to single Higgs and b-tag SF measurements, starting at NanoAOD-level and based on coffea processors
- repo
- CIEMAT-CMS Analysis Framework:
- Python and RDataFrame based framework starting from NanoAOD and targeting multiple CMS analyses
- repo
- CMS 3D Z+jet 13 TeV analysis
- Analysis workflow management from NTuple production to final plots and fits
- repo
- NP-correction derivation tool
- MC generation with Herwig and analysis of generated events with Rivet
- repo
- CMS SUSY Searches at DESY
- Analysis framework for CMS SUSY searches going from custom NanoAODs -> NTuple production -> DNN-based inference -> final plots and fits
- repo
- Kingmaker (CMS Ntuple Production with CROWN)
If your project uses law but is not yet listed here, feel free to open a pull request or mention your project details in a new issue and it will be added.
Examples
All examples can be run either in a Jupyter notebook or in a dedicated Docker container. For the latter, run
docker run -ti riga/law:example <example_name>
- loremipsum: The hello world example of law.
- workflows: Law workflows.
- workflow_parameters: Alternative way of parametrizing workflows with explicit branch parameters.
- notebooks: Examples showing how to use and work with law in notebooks.
- dropbox_targets: Working with targets that are stored on Dropbox.
- wlcg_targets: Working with targets that are stored on WLCG storage elements (dCache, EOS, ...). TODO.
- htcondor_at_vispa: HTCondor workflows at the VISPA service.
- htcondor_at_cern: HTCondor workflows at the CERN batch infrastructure.
- CMS Crab at CERN: CMS Crab workflows executed from lxplus at CERN.
- sequential_htcondor_at_cern: Continuation of the htcondor_at_cern example, showing sequential jobs that eagerly start once jobs running previous requirements succeeded.
- htcondor_at_naf: HTCondor workflows at the German National Analysis Facility (NAF).
- slurm_at_maxwell: Slurm workflows at the DESY Maxwell cluster.
- grid_at_cern: Workflows that run jobs and store data on the WLCG.
- lsf_at_cern: LSF workflows at the CERN batch infrastructure.
- docker_sandboxes: Environment sandboxing using Docker. TODO.
- singularity_sandboxes: Environment sandboxing using Singularity. TODO.
- subshell_sandboxes: Environment sandboxing using Subshells. TODO.
- parallel_optimization: Parallel optimization using scikit optimize.
- notifications: Demonstration of Slack and Telegram task status notifications.
- CMS Single Top Analysis: Simple physics analysis using law.
Further topics
Auto completion on the command-line
bash
source "$( law completion )"
zsh
zsh is able to load and evaluate bash completion scripts via bashcompinit. In order for bashcompinit to work, you should run compinstall to enable completion scripts:
autoload -Uz compinstall && compinstall
After following the instructions, these lines should be present in your ~/.zshrc:
# The following lines were added by compinstall
zstyle :compinstall filename '~/.zshrc'
autoload -Uz +X compinit && compinit
autoload -Uz +X bashcompinit && bashcompinit
# End of lines added by compinstall
If this is the case, just source the law completion script (which internally enables bashcompinit) and you're good to go:
source "$( law completion )"
Development
- Source hosted at GitHub
- Report issues, questions, feature requests on GitHub Issues
Tests
To run and test law, various riga/law Docker images are available on Docker Hub, corresponding to different OS and Python versions (based on micromamba).
Start them via
docker run -ti riga/law:<the_tag>
OS | Python | Tags |
---|---|---|
AlmaLinux 9 | 3.11 | a9-py311, a9-py3, a9, py311, py3, latest |
AlmaLinux 9 | 3.10 | a9-py310, py310 |
AlmaLinux 9 | 3.9 | a9-py39, py39 |
AlmaLinux 9 | 3.8 | a9-py38, py38 |
AlmaLinux 9 | 3.7 | a9-py37, py37 |
CentOS 8 | 3.11 | c8-py311, c8-py3, c8 |
CentOS 8 | 3.10 | c8-py310 |
CentOS 8 | 3.9 | c8-py39 |
CentOS 8 | 3.8 | c8-py38 |
CentOS 8 | 3.7 | c8-py37 |
CentOS 7 | 3.10 | c7-py310, c7-py3, c7 |
CentOS 7 | 3.9 | c7-py39 |
CentOS 7 | 3.8 | c7-py38 |
CentOS 7 | 3.7 | c7-py37 |
CentOS 7 | 3.6 | c7-py36, py36 (removed soon) |
CentOS 7 | 2.7 | c7-py27, c7-py2, py27, py2 (removed soon) |
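The most specific tag in each row of the table above appears to follow the pattern `<os prefix>-py<major><minor>`. As a hedged illustration, the sketch below builds the image name and the matching `docker run` argument list from an OS and Python version. The names `SUPPORTED`, `image_name` and `docker_command` are hypothetical helpers and not part of law; the supported combinations are transcribed from the table.

```python
# OS prefixes and Python versions, transcribed from the tag table above
SUPPORTED = {
    "AlmaLinux 9": ("a9", {"3.11", "3.10", "3.9", "3.8", "3.7"}),
    "CentOS 8": ("c8", {"3.11", "3.10", "3.9", "3.8", "3.7"}),
    "CentOS 7": ("c7", {"3.10", "3.9", "3.8", "3.7", "3.6", "2.7"}),
}


def image_name(os_name, python_version):
    """Return the riga/law image name for an OS / Python pair listed in the table."""
    if os_name not in SUPPORTED:
        raise ValueError(f"unknown OS: {os_name!r}")
    prefix, versions = SUPPORTED[os_name]
    if python_version not in versions:
        raise ValueError(f"no {os_name} image listed for Python {python_version}")
    # tags follow the pattern <os prefix>-py<major><minor>, e.g. a9-py311
    return f"riga/law:{prefix}-py{python_version.replace('.', '')}"


def docker_command(os_name, python_version):
    """Return the argument list for an interactive docker run of that image."""
    return ["docker", "run", "-ti", image_name(os_name, python_version)]


if __name__ == "__main__":
    print(" ".join(docker_command("AlmaLinux 9", "3.11")))
```

The returned list can be passed to `subprocess.run` as is. Combinations that are not in the table raise a `ValueError` instead of producing a tag that does not exist.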
Contributors
- Marcel Rieger 💻 👀 🚧 📖
- Peter Fackeldey 💻
- Yannik Rath 💻
- Jaime Leon Holgado 💻
- Louis Moureaux 💻
- Lukas Geiger 💻
- Valentin Iovene 💻
This project follows the all-contributors specification.
Source Distribution
File details
Details for the file law-0.1.19.tar.gz.
File metadata
- Download URL: law-0.1.19.tar.gz
- Upload date:
- Size: 300.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.20
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | de539fd4daa9ddb3bbf82490e6c187bf297be9fc289cff2e5da98ede76bb7e43 |
MD5 | f37b84e0f9b2f44d473d2754e8d63c23 |
BLAKE2b-256 | 3e72257f7a64f59c57aa8bee64c1b2e448787bfecdf204cd839c40ea43a2ee45 |
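The digests listed above can be used to check a downloaded source distribution before installing it. The following is a minimal sketch with Python's standard `hashlib` module. The function names `sha256_of` and `verify_sha256` are hypothetical, and the file is assumed to have been downloaded to a local path beforehand.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # read fixed-size chunks until an empty bytes object signals EOF
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path, expected_hex):
    """Return True if the file's SHA256 digest matches the expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

With the archive in the current directory, `verify_sha256("law-0.1.19.tar.gz", "de539fd4daa9ddb3bbf82490e6c187bf297be9fc289cff2e5da98ede76bb7e43")` should return `True` for an intact download.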