Immport upload preparation

Development Status
- 5 - Production/Stable
Environment
- Console
Intended Audience
- Science/Research
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python
- Python :: 3
Topic
- Scientific/Engineering :: Bio-Informatics

Project description

Immpload - Immport upload preparation

immpload converts input data files into upload files that conform to the Immport upload templates.

Prerequisites

Python 3 with the pip package installer

Installation

Make a workspace directory.
Install immpload:
```
pip install immpload
```

Usage

In the simplest case, immpload copies each input column whose name matches an output Immport template column:

$ immpload subjectAnimals /path/to/input/subjects.txt

which will create the Immport upload file subjectAnimals.txt in the current directory. To place the output in a different directory, use the -o or --outDir option:

$ immpload -o /path/to/output subjectAnimals /path/to/input/subjects.xlsx

Note that the input can be either an .xlsx Excel spreadsheet or a tab-delimited text file.

The command:

$ immpload --help

shows all immpload arguments and options.

It is often useful to specify the conversion mapping in a YAML configuration file. For example, the following configuration:

columns:
    Subject ID: ID
    Arm Or Cohort ID: Cohort

copies the ID and Cohort input values to the Subject ID and Arm Or Cohort ID output columns, respectively. The configuration file is passed with the -c or --config option, e.g.:

$ immpload -o /path/to/output --config /path/to/conf/subjects.yaml \
           subjectAnimals /path/to/input/subjects.xlsx
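
The column mapping can be pictured as a lookup from output column name to input column name. The following is a minimal sketch of that idea, not immpload's actual implementation: the `map_columns` helper, the dict-based rows and the sample values are all hypothetical, and the `COLUMNS` dict simply mirrors the `columns` section of the YAML above.

```python
# Hypothetical illustration only; immpload's internals may differ.
# Mirrors the `columns` section: output column name -> input column name.
COLUMNS = {"Subject ID": "ID", "Arm Or Cohort ID": "Cohort"}


def map_columns(in_row, out_columns, columns=COLUMNS):
    """Build an output row dict from an input row dict.

    An output column takes its value from the configured input column
    if there is one, otherwise from the input column of the same name.
    Output columns without a matching input column are set to None.
    """
    out_row = {}
    for out_col in out_columns:
        in_col = columns.get(out_col, out_col)
        out_row[out_col] = in_row.get(in_col)
    return out_row


in_row = {"ID": "M01", "Cohort": "Control", "Species": "Mus musculus"}
out_row = map_columns(in_row, ["Subject ID", "Arm Or Cohort ID", "Species"])
print(out_row)
```

In this sketch Species is copied by name, as in the simplest case, while Subject ID and Arm Or Cohort ID are filled through the mapping.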

The configuration can include value mappings, e.g.:

values:
    Species: Mus musculus

sets the output Species to Mus musculus for all rows.

The configuration:

columns:
    Gender: Sex
values:
    Gender:
        n/a: Not Specified

transforms the input Sex value n/a to the output Gender value Not Specified. Other input values are copied without change.
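
The two kinds of `values` entries shown above (a static value and a lookup table) can be sketched as follows. This is an assumption about the behavior described in the text, not immpload's code; the `apply_values` helper and the dict-based row are hypothetical.

```python
# Hypothetical illustration only; immpload's internals may differ.
# Mirrors the `values` sections shown above.
VALUES = {
    "Species": "Mus musculus",           # static: same value for every row
    "Gender": {"n/a": "Not Specified"},  # lookup: translate listed values
}


def apply_values(out_row, values=VALUES):
    """Return a copy of the output row with the value mappings applied."""
    result = dict(out_row)
    for col, spec in values.items():
        if isinstance(spec, dict):
            # Translate a listed value; copy any other value unchanged.
            current = result.get(col)
            result[col] = spec.get(current, current)
        else:
            # A plain value is set for all rows.
            result[col] = spec
    return result


print(apply_values({"Gender": "n/a"}))
print(apply_values({"Gender": "Female"}))
```

The sketch assumes the Gender value has already been copied from the input Sex column by the `columns` mapping.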

immpload can flatten each input row into several output rows based on matching input column names against a pattern. The configuration:

columns:
    Subject ID: ID
    Arm Or Cohort ID: Cohort
    Study Day: day
patterns:
    Result Value Reported: D(?P<day>\d+)$

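converts an input row with columns D1, D2 and D3 into three output rows with Study Day values 1, 2 and 3 and Result Value Reported values taken from the D1, D2 and D3 input values, respectively. The named group day captured from each matching column name supplies the day value that the Study Day column refers to.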
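
To make the flattening concrete, here is a minimal sketch using the same regular expression with Python's standard `re` module. It is an illustration under stated assumptions rather than immpload's implementation: the `flatten` helper, the dict-based row and the sample values are hypothetical, and matching column names with `re.match` is an assumption.

```python
import re

# Hypothetical illustration only; immpload's internals may differ.
# The same pattern as in the configuration above.
PATTERN = re.compile(r"D(?P<day>\d+)$")


def flatten(in_row):
    """Yield one output row per input column whose name matches PATTERN."""
    for in_col, value in in_row.items():
        match = PATTERN.match(in_col)
        if match:
            yield {
                "Subject ID": in_row["ID"],
                "Arm Or Cohort ID": in_row["Cohort"],
                # The named group supplies the Study Day value.
                "Study Day": match.group("day"),
                "Result Value Reported": value,
            }


in_row = {"ID": "M01", "Cohort": "Control", "D1": "0.5", "D2": "0.7", "D3": "0.9"}
rows = list(flatten(in_row))
for row in rows:
    print(row)
```

Note that the captured day is a string here; whether immpload converts it to a number is not stated in this document.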

immpload supplies certain required output columns with a reasonable default, as follows:

Assessments
- Planned Visit ID - Study ID followed by d and the Study Day
- Panel Name Reported - copied from the Assessment Type
- Assessment Panel ID - derived from the Panel Name Reported
- User Defined ID - derived from the Subject ID, Planned Visit ID and Component Name Reported
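
The first two defaults in the list above are specific enough to sketch. The following is a hedged illustration only: the `add_defaults` helper and the sample values are hypothetical, the exact string format is an assumption based on the wording "Study ID followed by d and the Study Day", and the two "derived from" defaults are left out because their derivation is not described here.

```python
# Hypothetical illustration only; immpload's internals may differ.
def add_defaults(out_row):
    """Return a copy of the row with two of the described defaults filled in."""
    result = dict(out_row)
    if not result.get("Planned Visit ID"):
        # Study ID followed by "d" and the Study Day (assumed format).
        result["Planned Visit ID"] = f"{result['Study ID']}d{result['Study Day']}"
    if not result.get("Panel Name Reported"):
        # Copied from the Assessment Type.
        result["Panel Name Reported"] = result.get("Assessment Type")
    return result


row = add_defaults({"Study ID": "SDY1", "Study Day": 3, "Assessment Type": "Body Weight"})
print(row["Planned Visit ID"], "|", row["Panel Name Reported"])
```

Values that are already present are left untouched in this sketch, on the assumption that a default applies only when the column would otherwise be empty.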

For advanced usage, the immpload Python module can be used directly in a Python script with a callback function, e.g.:

from immpload import munger

def add_results(in_row, in_col_ndx_map, out_col_ndx_map, out_row):
    """
    Modifies the output row after the configuration-based conversion.

    :param in_row: the input data row
    :param in_col_ndx_map: the input {column: index} dictionary
    :param out_col_ndx_map: the output {column: index} dictionary
    :param out_row: the output row
    :return: a list of rows derived from the given output row
    """
    # Modify out_row or create new output rows...
    return [out_row]

# Convert the input file.
munger.munge('assessments', '/path/to/input.xlsx', callback=add_results)
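
As a more concrete callback, the sketch below copies one input cell into one output cell by looking up both positions in the {column: index} dictionaries. It assumes that rows are index-addressable sequences such as lists, which the index maps suggest but this document does not state; the `copy_notes` name and the Notes and Comments columns are hypothetical.

```python
# Hypothetical callback; assumes rows are lists indexed via the ndx maps.
def copy_notes(in_row, in_col_ndx_map, out_col_ndx_map, out_row):
    """Copy the input Notes cell to the output Comments cell, if both exist."""
    in_ndx = in_col_ndx_map.get("Notes")
    out_ndx = out_col_ndx_map.get("Comments")
    if in_ndx is not None and out_ndx is not None:
        out_row[out_ndx] = in_row[in_ndx]
    # The callback returns a list of rows derived from the output row.
    return [out_row]


result = copy_notes(
    ["M01", "sedated"],
    {"ID": 0, "Notes": 1},
    {"Subject ID": 0, "Comments": 1},
    ["M01", None],
)
print(result)
```

Such a function would be passed as the `callback` argument in the same way as `add_results` above.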

The munger.munge function signature is as follows:

def munge(template, *in_files, config=None, out_dir=None,
          sheet=None, input_filter=None, callback=None, **kwargs):
    """
    Builds the Immport upload file for the given input file.
    The template is a supported Immport template name, e.g.
    `assessments`. The output is the Immport upload file,
    e.g. `assessments.txt`, placed in the output directory.

    The keyword arguments (_kwargs_) are static output
    _column_`=`_value_ definitions that are applied to every
    output row. The column name can be underscored, e.g.
    `Study_ID`.

    :param template: the required Immport template name
    :param in_files: the input file(s) to munge
    :param config: the configuration dictionary or file name
    :param out_dir: the target location (default current directory)
    :param sheet: for an Excel workbook input file, the sheet to open
    :param input_filter: optional generator function that takes
        an input row and yields the valid rows
    :param callback: optional callback with parameters
        in_row, in_col_ndx_map, out_col_ndx_map and out_row returning
        a list of rows to write to the output file
    :param kwargs: the optional static _column_`=`_value_ definitions
    :return: the output file name
    """
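
Putting the documented parameters together, a call might look like the sketch below. It uses only the parameters listed in the signature above; everything else is an assumption for illustration: the `skip_blank_rows` and `build_upload` names, the idea that an input row is an iterable of cells, the configuration contents and the `SDY1` study identifier. The import is deferred so that the filter can be tried without immpload installed.

```python
# Hypothetical input filter: assumes an input row is an iterable of cells.
def skip_blank_rows(row):
    """Yield the row unless every cell is empty."""
    if any(cell not in (None, "") for cell in row):
        yield row


# A configuration dictionary in place of a YAML file name (illustrative content).
CONFIG = {"columns": {"Subject ID": "ID"}}


def build_upload(in_file, out_dir):
    """Build the assessments upload file and return the output file name."""
    from immpload import munger  # requires `pip install immpload`

    return munger.munge(
        "assessments",
        in_file,
        config=CONFIG,
        out_dir=out_dir,
        input_filter=skip_blank_rows,
        Study_ID="SDY1",  # static column=value applied to every output row
    )


# Example call, with placeholder paths:
# build_upload("/path/to/input.xlsx", "/path/to/output")
```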

Release history

- 1.1.2 - Sep 4, 2019
- 1.1.1 - Sep 4, 2019
- 1.0.5 - Aug 19, 2019
- 1.0.4 - Jul 31, 2019
- 1.0.3 - Jul 31, 2019
- 1.0.2 - Jul 31, 2019 (this version)
- 1.0.1 - Jul 31, 2019

Download files

Source Distribution

immpload-1.0.2.tar.gz (9.5 kB)

Uploaded Jul 31, 2019 Source

Hashes for immpload-1.0.2.tar.gz
Algorithm	Hash digest
SHA256	`577dfdffd7c4c965592441a98b697547bcfa74dab3452cd6642852171deeadf9`
MD5	`97d7914cf467b8b62f3f13e5c54da763`
BLAKE2b-256	`d6e87cee3e47572e4ccaca12a1144b713bf02a8a5e86647f65a50b160dd9d30f`