bidsarray
Compile a tabular output of input and output files from a BIDS dataset. The output can be fed into GNU parallel to run arbitrary commands on an entire BIDS dataset with parallelization!
Command syntax
Prelude
The first part of a bidsarray call consists of prelude arguments that initialize the input and output dirs:
bidsarray INPUT_DIR OUTPUT_DIR [--derivatives] \
[--participant-label LABEL ...] [--exclude-participant-label LABEL ...] \
[--pybidsdb-dir DIR] [--pybidsdb-reset]
--derivatives enables indexing of derivative datasets (in the derivatives/ folder). --participant-label specifies one or more subject labels (the LABEL in sub-LABEL); only files from these subjects will be produced. --exclude-participant-label does the same but excludes the listed subjects. --pybidsdb-dir specifies a pybids database created via the database_path argument of bids.BIDSLayout; this can speed up indexing for large datasets. --pybidsdb-reset forces reindexing of the database.
Components
Components are specified after the prelude, each separated by ::: (just like in GNU parallel!). So a complete command looks like this:
bidsarray <PRELUDE...> ::: <COMPONENT 1> ::: <COMPONENT 2> ::: ...
Each component may be specified as an input or an output. Inputs are read from an existing dataset, outputs are created based on the provided inputs.
Inputs
Input components are specified as follows:
::: --input [label] [--groupby ENTITY ...] [--aggregate ENTITY ...] [--filter ENTITY[:METHOD]=VALUE ...]
The label is optional: if provided, it will add a header row to the top of the tabular output. If a label is provided to any one component, all components must receive a label!
--filter narrows the set of selected paths to those containing the selected entity-values. For example, to select diffusion images, one might use:
::: --input --filter suffix=dwi datatype=dwi extension=.nii.gz
METHOD tells the filter how to perform the selection. By default, it looks for an exact match, but regular expressions can also be used via :match and :search:
::: --input --filter 'suffix:match=[Tt][12]w?'
Overall, --filter tends to REDUCE the number of rows in the output.
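Assuming :match and :search follow Python's re.match and re.search semantics (anchored at the start of the value vs. found anywhere in it), the difference can be sketched in plain Python:

```python
import re

suffixes = ["T1w", "T2w", "dwi", "FLAIR", "T1rho"]

# ":match" anchors the pattern at the start of the value, like re.match
matched = [s for s in suffixes if re.match(r"[Tt][12]w?", s)]
print(matched)  # ['T1w', 'T2w', 'T1rho'] -- note T1rho also matches, since w? is optional

# ":search" finds the pattern anywhere in the value, like re.search
searched = [s for s in suffixes if re.search(r"1w", s)]
print(searched)  # ['T1w']
```

The T1rho surprise above is a good reason to anchor patterns explicitly (e.g. end the pattern with $) when values share a prefix.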
--groupby and --aggregate both mark variable parts of the path. Often, they'll be entities such as subject, session, and run. These variable entities are referred to generically as wildcards.
--groupby will create a separate row per path matched by the filters and wildcards. This is used if you want to run a tool separately on each file. For instance, if you have a group of images, and you want to apply a smoothing function to each one separately, you may use:
::: --input --filter suffix=T1w extension=.nii.gz --groupby subject session
--aggregate joins multiple paths into the same row. This is useful if you want to perform an aggregation, such as an average or standard deviation. For example, if you want to get an average fractional anisotropy map from all subjects, you may use:
::: --input --filter desc=FA suffix=mdp extension=.nii.gz --aggregate subject session
These two flags can be combined. So to get an average FA map for each subject across all sessions, you could use:
::: --input --filter desc=FA suffix=mdp extension=.nii.gz --groupby subject --aggregate session
Outputs
Outputs are generated using the bids function from snakebids. The syntax is:
::: --output [label] --entities [ENTITY=VALUE ...]
Each ENTITY=VALUE pair provided to --entities specifies a static value applied to every generated output path. For instance, continuing our smoothing example, you may use:
::: --input --filter suffix=T1w extension=.nii.gz --groupby subject session ::: --output --entities datatype=anat suffix=T1w desc=smoothed
Importantly, any wildcards specified using --groupby are AUTOMATICALLY provided to each output. So in the above example, our table will include a row for each subject-session combination, with a correctly formatted output path.
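As a rough illustration of how each output path is assembled (a toy stand-in, not snakebids' actual bids() function, which handles entity ordering per the BIDS spec), the row's --groupby wildcards are combined with the static --entities values:

```python
# hypothetical, simplified path builder; the real snakebids bids() is more thorough
def build_path(subject, session, datatype, suffix, **entities):
    parts = [f"sub-{subject}", f"ses-{session}"]
    parts += [f"{key}-{value}" for key, value in entities.items()]
    filename = "_".join(parts + [suffix])
    return f"sub-{subject}/ses-{session}/{datatype}/{filename}"

# wildcards from --groupby (subject, session) plus static --entities values
out = build_path("01", "a", datatype="anat", suffix="T1w.nii.gz", desc="smoothed")
print(out)  # sub-01/ses-a/anat/sub-01_ses-a_desc-smoothed_T1w.nii.gz
```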
Feeding outputs to parallel
Basics
While the tabular output could be used for any number of purposes, it really shines in combination with GNU parallel. Parallel is a powerful and complicated tool; its full usage can be found in its documentation. Here we show just the basics of using it with bidsarray.
bidsarray <PRELUDE...> ::: <COMPONENT 1> ::: <COMPONENT 2> | parallel --colsep '\t' echo 1={1} 2={2}
The --colsep argument allows parallel to read incoming columnar data. We pass \t so that bidsarray's tab-separated output is split into numbered columns.
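If you just want to inspect what each numbered column would contain, the same tab-separated stream can be read with a plain while-read loop, no parallel required (the filenames here are stand-ins):

```shell
# simulate one row of bidsarray output and print each column
# (IFS is set to a literal tab so spaces inside a cell are preserved)
printf 'in.nii.gz\tout.nii.gz\n' |
while IFS="$(printf '\t')" read -r col1 col2; do
    echo "1=$col1 2=$col2"
done
```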
Examples
In this example, we apply a transform to all of the fractional anisotropy maps of a preprocessed diffusion dataset:
bidsarray . derivatives/template --derivatives \
::: --input --filter suffix=mdp desc=FA extension=.nii.gz space=participant --groupby subject session \
::: --input --filter suffix=xfm from=participant to=MNI6 extension=.nii.gz --groupby subject session \
::: --input --filter suffix=T1w extension=.nii.gz space=MNI6 \
::: --output --entities desc=FA datatype=dwi space=MNI6 suffix=mdp.nii.gz |
parallel --bar --colsep '\t' antsApplyTransforms -d 3 -i {1} -o {4} -r {3} -t {2}
Using labels
The above examples use numeric ids for argument substitution, which may be difficult when handling many components. It's possible to use labels instead:
bidsarray . derivatives/template --derivatives \
::: --input image --filter suffix=mdp desc=FA extension=.nii.gz space=participant --groupby subject session \
::: --input transform --filter suffix=xfm from=participant to=MNI6 extension=.nii.gz --groupby subject session \
::: --input reference --filter suffix=T1w extension=.nii.gz space=MNI6 \
::: --output out --entities desc=FA datatype=dwi space=MNI6 suffix=mdp.nii.gz |
parallel --bar --colsep '\t' --header : antsApplyTransforms -d 3 -i {image} -o {out} -r {reference} -t {transform}
Note that the use of labels in parallel is not compatible with all features.
Aggregation commands with parallel
Parallel automatically shell-escapes each column when using --colsep, spaces included, but bidsarray uses spaces to separate files that should be aggregated. So if we try to average a set of files, parallel will pass all the filenames as one giant filename, and we need a trick to get this to work. There are two basic approaches (both courtesy of this SO Q/A):
The simplest is to prepend the command with eval to remove all escapes:
bidsarray . derivatives/average \
::: --input --filter suffix=T1w extension=.nii.gz --groupby subject session --aggregate run \
::: --output --entities suffix=T1w.nii.gz datatype=anat |
parallel --bar --colsep '\t' eval mrcalc {1} -add {2}
This may not work with complex commands, as eval will aggressively strip away quotes. For a more surgical approach, you can use the uq function:
bidsarray . derivatives/average \
::: --input --filter suffix=T1w extension=.nii.gz --groupby subject session --aggregate run \
::: --output --entities suffix=T1w.nii.gz datatype=anat |
parallel --bar --colsep '\t' mrmath {=1 uq=} mean {2}
uq will only apply to the targeted variable. This approach is not compatible with labels.
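To see why the workaround is needed at all, here is a small Python illustration of the quoting problem, with shlex standing in for the shell escaping parallel applies to each column:

```python
import shlex

# an aggregated cell: two paths separated by a space
cell = "sub-01_T1w.nii.gz sub-02_T1w.nii.gz"

# escaping the whole cell, as parallel does per column, yields ONE shell word
quoted = shlex.quote(cell)
print(quoted)  # 'sub-01_T1w.nii.gz sub-02_T1w.nii.gz'

# so the downstream command sees a single "filename" containing a space
args = shlex.split(quoted)
print(len(args))  # 1
```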
Creating output folders
Note that for most commands (especially any involving --groupby), you will likely need to create the output folders as part of the command. Use a command like this:
bidsarray . derivatives/template --derivatives \
::: --input --filter suffix=mdp desc=FA extension=.nii.gz space=participant --groupby subject session \
::: --input --filter suffix=xfm from=participant to=MNI6 extension=.nii.gz --groupby subject session \
::: --input --filter suffix=T1w extension=.nii.gz space=MNI6 \
::: --output --entities desc=FA datatype=dwi space=MNI6 suffix=mdp.nii.gz |
parallel --bar --colsep '\t' mkdir -p {4//} \&\& antsApplyTransforms -d 3 -i {1} -o {4} -r {3} -t {2}
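Outside of parallel, the equivalent shell idiom is mkdir -p on the dirname of the output path (the path below is a hypothetical example):

```shell
# create the parent directory of an output file before writing to it
out="derivatives/template/sub-01/dwi/sub-01_space-MNI6_desc-FA_mdp.nii.gz"
mkdir -p "$(dirname "$out")"
ls -d "$(dirname "$out")"  # derivatives/template/sub-01/dwi
```

GNU parallel's replacement string {//} (positionally, e.g. {4//}) similarly expands to the dirname of an argument, which is why it pairs naturally with mkdir -p.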
Motivation and Design
bidsarray was built both as a useful tool and as a demonstration of the bidsapp module in snakebids. The parsing, components, and CLI are all provided by snakebids, so bidsarray can produce its tabular output with just a few small files.
snakebids.bidsapp uses a system of hooks and plugins to build an app. In bidsarray/run.py, three hooks can be seen:

- get_argv: Retrieves the provided CLI arguments and splits them at :::. This allows the app to separately parse the command prelude and each of the components. The prelude arguments are returned to be parsed by the bidsapp, and the components are saved into the config to be parsed later.
- finalize_config: The prelude has now been parsed by bidsapp. We retrieve the component arguments we saved earlier and parse them (using functions in bidsarray/component.py). The results are saved back into the config.
- run: We use the configured components we calculated earlier and call generate_inputs, which uses pybids to parse the input dataset and create a BidsDataset. The methods on this dataset are used to retrieve and organize the indexed paths.
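The argv splitting performed in the first hook can be sketched roughly like this (a simplification with illustrative names, not bidsarray's actual code):

```python
# split argv into the prelude and the :::-separated components
def split_argv(argv):
    groups = [[]]
    for arg in argv:
        if arg == ":::":
            groups.append([])
        else:
            groups[-1].append(arg)
    return groups[0], groups[1:]

prelude, components = split_argv(
    [".", "out", "--derivatives", ":::", "--input", "--filter", "suffix=T1w", ":::", "--output"]
)
print(prelude)     # ['.', 'out', '--derivatives']
print(components)  # [['--input', '--filter', 'suffix=T1w'], ['--output']]
```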
snakebids.bidsapp.app is used to create the app with several plugins, providing the basic functionality. The last plugin, sys.modules[__name__], loads the hooks defined in run.py so that the app will work. Finally, the entrypoint app.run() is called behind an if __name__ == "__main__" block; it is also registered as a script in the pyproject.toml file.