Utilities for writing concise snakemake workflows
gardnersnake
Introduction
Snakemake is a powerful workflow manager that lets computational biologists build clear, reproducible, and modular analysis pipelines using a familiar Python-based grammar. Unfortunately, the bioinformatics tools that we'd like to use inside our Snakemake workflows are often less well-behaved. Gardnersnake is a small package built on the Python standard library (Python 3.6) that aims to make handling this wide variety of tools easier and more compact, especially when working on cluster-based systems.
Class Objects
gardnersnake.ConfigurationHelper()
The foundational object defined in gardnersnake is the ConfigurationHelper class. At instantiation, ConfigurationHelper takes a single argument, cfg_dict, which should be the config Snakemake variable capturing the passed workflow configuration file.
Command Line Tools
check_directory
Many bioinformatics tools produce directories of varying structure containing large numbers of output files. Rather than require Snakemake to keep track of all of these files as global outputs, the check_directory command-line utility validates an output directory against a known set of files and, if the directory is successfully validated, writes a small file containing a return code (0). If check_directory is unable to validate the contents according to the given requirements, it throws an error and does not write the return code file.
The options and requirements are specified in the usage message, which can be retrieved using the -h or --help flags.
check_directory --help

usage: check_directory [-h] [--strict] [-o OUT] FILES [FILES ...] DIR

validates dynamic directory contents against expectations

positional arguments:
  FILES                 set of filepaths to check against dir contents
  DIR                   filepath of directory to verify

optional arguments:
  -h, --help            show this help message and exit
  --strict              directory should contain only the passed files
  -o OUT, --output OUT  name of return code output file
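For readers who want to see how such an interface fits together, here is a minimal argparse sketch that mirrors the usage message above. It is an illustration only and not gardnersnake's actual source; the function name build_parser and the attribute names files and dir are assumptions made for this example.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the usage message (illustrative only)."""
    parser = argparse.ArgumentParser(
        prog="check_directory",
        description="validates dynamic directory contents against expectations",
    )
    # One or more file names, followed by exactly one directory.
    parser.add_argument("files", metavar="FILES", nargs="+",
                        help="set of filepaths to check against dir contents")
    parser.add_argument("dir", metavar="DIR",
                        help="filepath of directory to verify")
    parser.add_argument("--strict", action="store_true",
                        help="directory should contain only the passed files")
    parser.add_argument("-o", "--output", metavar="OUT",
                        help="name of return code output file")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is
# easy to try interactively.
args = build_parser().parse_args(
    ["-o", "rc.out", "--strict", "output1.txt", "output2.txt", "outputs/"]
)
print(args.files, args.dir, args.strict, args.output)
```

Because FILES accepts one or more values and DIR exactly one, argparse assigns the last positional value to DIR and everything before it to FILES.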
Positional Options
FILES [required]: a whitespace-separated list of files to search for in the passed directory. These file names should be specified without their directory path (i.e. a file whose full path is /home/user/analysis/myoutputs/output1.txt should be passed as output1.txt if DIR is /home/user/analysis/myoutputs/).
DIR [required]: the full path of the directory to verify. ~/ conventions are acceptable, but shell variable syntax such as $WORKDIR is not supported. Relative path functionality remains in active development and is not guaranteed to work as of the current version (0.1.0).
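The difference between ~/ and $WORKDIR handling can be illustrated with the standard library. The sketch below assumes, without confirmation from the gardnersnake source, that only user-directory expansion is applied to DIR; resolve_dir is a hypothetical helper name.

```python
import os


def resolve_dir(path):
    """Expand a leading ~ only (hypothetical helper, not gardnersnake's API)."""
    # os.path.expanduser rewrites a leading "~" to the user's home directory
    # and leaves shell variables such as "$WORKDIR" untouched.
    return os.path.expanduser(path)


print(resolve_dir("~/myanalysis/outputs/"))
print(resolve_dir("$WORKDIR/outputs/"))
```

Under this assumption, a shell variable that the shell has not already expanded (for example, one written inside single quotes) would reach the tool as a literal string, so it is safest to pass a fully resolved path.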
Flagged Options
--output, -o [required]: specifies the name of the file (containing the return code) that is generated when the passed directory is successfully validated.
--strict [optional]: indicates that the passed directory should contain only the files listed in the FILES positional argument, and no other files or subdirectories. The default, non-strict setting validates directories containing extra files so long as the required ones are present. This lets the user be more or less permissive with their checks.
Typical usage may look like:
check_directory -o rc.out --strict output1.txt output2.txt ~/myanalysis/outputs/
which should produce a file called rc.out if the folder ~/myanalysis/outputs/ contains exactly two files: output1.txt and output2.txt.
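The strict and non-strict behaviour described above can be sketched with a few lines of standard-library Python. This is a rough model of the documented behaviour under stated assumptions, not the package's implementation; the function names validate_directory and write_return_code are hypothetical.

```python
import os


def validate_directory(directory, files, strict=False):
    """Return True if `directory` contains every name in `files`.

    With strict=True the directory must contain exactly those entries
    and nothing else (no extra files or subdirectories).
    """
    present = set(os.listdir(os.path.expanduser(directory)))
    expected = set(files)
    if strict:
        return present == expected
    # Non-strict: extra entries are tolerated as long as none are missing.
    return expected <= present


def write_return_code(directory, files, output, strict=False):
    """Write "0" to `output` on success; raise and write nothing otherwise."""
    if not validate_directory(directory, files, strict=strict):
        raise RuntimeError("could not validate contents of " + directory)
    with open(output, "w") as handle:
        handle.write("0\n")
```

In a Snakemake rule, the return code file can then serve as the single declared output in place of the many files inside the validated directory.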