A wrapper for gwf for easy generation of output file paths
gwf target group
This Python package automatically generates systematic output filenames for your gwf jobs, which makes the job definitions considerably terser. Compare this:
from gwf import Workflow

gwf = Workflow()

foo_file = 'first_step/foo.csv'
bar_file = 'first_step/bar.csv'
plot_file = 'second_step/plot.png'
summary_file = 'second_step/summary.txt'

gwf.target(
    'target_group.first_step',
    inputs = [],
    outputs = [ foo_file, bar_file ],
) << f"first_step_command -f {foo_file} > {bar_file}"

gwf.target(
    'target_group.second_step',
    inputs = [ foo_file, bar_file ],
    outputs = [ plot_file, summary_file ]
) << f"second_step_command -f {foo_file} -b {bar_file} -p {plot_file} > {summary_file}"
to this:
from gwf import Workflow
from gwf_target_group import TargetGroup

gwf = Workflow()
target_group = TargetGroup( gwf, 'target_group', 'output_prefix/' )

target_group(
    'first_step',
    "first_step_command -f {foo.csv} > {bar.csv}"
) # No input files here. Only 2 output files

target_group(
    'second_step',
    "second_step_command -f {foo_file} -b {bar_file} -p {plot.png} > {summary.txt}",
    foo_file = target_group.first_step.foo,
    bar_file = target_group.first_step.bar
) # Two input files, two output files
With this package you never specify the paths of output files, only those of
input files. To refer to an output file of an earlier step, use the automatic
attributes of the target group, for example target_group.first_step.foo. As the
examples show, the attribute name is the placeholder without its file extension.
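To make the naming rule concrete, here is a minimal sketch of how such placeholder handling could work. It is not the package's actual implementation: the function name expand_template, the regular expression, and the output layout (prefix, then step name, then placeholder) are all assumptions made for illustration.

```python
import re

def expand_template(template, prefix, step, **inputs):
    """Hypothetical helper: split the placeholders of a command template
    into inputs (names passed as keyword arguments) and outputs (all others)."""
    outputs = {}

    def substitute(match):
        name = match.group(1)
        if name in inputs:
            # Placeholder was supplied by the caller: treat it as an input path.
            return inputs[name]
        # Assumed layout: every other placeholder becomes prefix/step/name.
        path = f"{prefix}{step}/{name}"
        outputs[name] = path
        return path

    command = re.sub(r"\{([^{}]+)\}", substitute, template)
    return command, list(inputs.values()), list(outputs.values())

command, input_paths, output_paths = expand_template(
    "second_step_command -f {foo_file} -p {plot.png} > {summary.txt}",
    "output_prefix/",
    "second_step",
    foo_file="output_prefix/first_step/foo.csv",
)
```

In this sketch, foo_file is recognised as an input because it was passed as a keyword argument, while plot.png and summary.txt become generated output paths.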
Installation
Install via pip:
pip install gwf_target_group
(or alternatively copy the __init__.py from this repository and save it as
gwf_target_group.py at a convenient location)
Advanced usage
Passing gwf options
If you need to fine-tune the options for a gwf job, you can use the
gwf_options parameter:
target_group(
    'my_special_processing_step',
    'do_special_things {data} > {result.tsv}',
    gwf_options = { # gwf_options is a reserved keyword
        'memory': '64g',
        'walltime': 'unlimited'
    },
    data = 'path/to/data.tsv'
)
This is roughly equivalent to the following gwf-only code:
gwf.target(
    'target_group.my_special_processing_step',
    inputs = [ 'path/to/data.tsv' ],
    outputs = [ 'path/to/result.tsv' ],
    options = {
        'memory': '64g',
        'walltime': 'unlimited'
    }
) << 'do_special_things path/to/data.tsv > path/to/result.tsv'
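If several steps share most of their options, one possible pattern is to keep the defaults in a plain dictionary and override them per step. The sketch below only shows the dictionary handling. DEFAULT_OPTIONS and options_for are hypothetical names, not part of gwf or gwf_target_group, and the values are purely illustrative.

```python
# Hypothetical defaults shared by most steps (illustrative values only).
DEFAULT_OPTIONS = {'memory': '8g', 'walltime': '01:00:00'}

def options_for(**overrides):
    """Return a new options dict: the defaults with per-step overrides applied."""
    return {**DEFAULT_OPTIONS, **overrides}

# Only the memory differs from the defaults for this step.
big_job_options = options_for(memory='64g')
```

The resulting dictionary could then be passed to a step as gwf_options = big_job_options. Because options_for builds a new dictionary each time, the shared defaults are never modified.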
Running workflows with different datasets
Sometimes you want to do the same thing with different datasets. For example, you might have a human and a mouse dataset that you want to analyse. Then you can do the following:
def define_analysis( target_group ):
    target_group(
        'sort_genes_by_length',
        'gene_sorter --by length {genome_file} > {list}',
        genome_file = target_group.genome_file # this value was attached previously
    )
    target_group(
        'split_into_test_and_training_datasets',
        'split_list -1 list1 -2 list2 {sorted_genes}',
        sorted_genes = target_group.sort_genes_by_length.list
    )
    # more steps can be added here

human = TargetGroup( gwf, 'human', 'human_results/' )
mouse = TargetGroup( gwf, 'mouse', 'mouse_results/' )

# explicitly attach the path to the genome files to the TargetGroups
human.genome_file = 'data/genomes/human.fa'
mouse.genome_file = 'data/genomes/mouse.fa'

# and then define the analysis for both datasets
define_analysis( human )
define_analysis( mouse )
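With more than a couple of datasets, the same pattern can be driven by a loop over a dictionary. The sketch below runs without gwf installed because it uses RecordingGroup, a hypothetical stand-in that only records the steps defined on it. In a real workflow you would presumably create TargetGroup( gwf, species, ... ) objects instead.

```python
class RecordingGroup:
    """Hypothetical stand-in for TargetGroup: records the steps defined on it."""

    def __init__(self, name, prefix):
        self.name = name
        self.prefix = prefix
        self.steps = []

    def __call__(self, step, command, **inputs):
        self.steps.append((step, command, inputs))

def define_analysis(group):
    # Same shape as the analysis above, reduced to a single step.
    group(
        'sort_genes_by_length',
        'gene_sorter --by length {genome_file} > {list}',
        genome_file=group.genome_file,  # attached by the loop below
    )

genomes = {
    'human': 'data/genomes/human.fa',
    'mouse': 'data/genomes/mouse.fa',
}

groups = {}
for species, genome in genomes.items():
    group = RecordingGroup(species, f'{species}_results/')
    group.genome_file = genome  # attach the dataset-specific input
    define_analysis(group)
    groups[species] = group
```

Adding another dataset then only requires a new entry in the genomes dictionary.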