fmrib-unpack

The FMRIB UKBiobank Normalisation, Parsing And Cleaning Kit



https://zenodo.org/badge/DOI/10.5281/zenodo.1997626.svg


FUNPACK is a Python application for pre-processing of UK Biobank phenotype data.

FUNPACK is developed at the Oxford Centre for Integrative Neuroimaging (OxCIN@FMRIB), University of Oxford. FUNPACK is in no way endorsed, sanctioned, or validated by the UK Biobank.

Installation

Install FUNPACK from conda-forge:

conda install -c conda-forge fmrib-unpack

Or using pip:

pip install fmrib-unpack

The FUNPACK source code can be found at https://git.fmrib.ox.ac.uk/fsl/funpack/.

Introductory notebook

The fmrib_unpack_demo command will start a Jupyter Notebook which introduces the main features provided by FUNPACK. A non-interactive version of this notebook can be found here.

If you are using pip, you need to install a few additional dependencies:

pip install fmrib-unpack[demo]

You can then start the demo by running fmrib_unpack_demo.

Usage

General usage is as follows:

fmrib_unpack [options] output.tsv input1.tsv input2.tsv

You can get information on all of the options by typing fmrib_unpack --help.

The fmrib_unpack command was called funpack in older versions of FUNPACK, but was changed to fmrib_unpack in 3.0.0 to avoid a naming conflict with an unrelated software package.

Options can be specified on the command line, and/or stored in a configuration file. For example, the options in the following command line:

fmrib_unpack                     \
  --overwrite                    \
  --write_log                    \
  --icd10_map_file icd_codes.tsv \
  --category 10                  \
  --category 11                  \
  output.tsv input1.tsv input2.tsv

Could be stored in a configuration file config.txt:

overwrite
write_log
icd10_map_file icd_codes.tsv
category       10
category       11

And then executed as follows:

fmrib_unpack -cfg config.txt output.tsv input1.tsv input2.tsv
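The correspondence between the two forms can be sketched in Python. The helper below, config_to_args, is hypothetical and is not part of FUNPACK. It assumes only what the example above shows: each configuration line is an option name without the leading dashes, optionally followed by its value.

```python
import shlex


def config_to_args(text):
    """Translate config-file lines into equivalent command-line arguments.

    Hypothetical helper for illustration; not FUNPACK's own parser.
    """
    args = []
    for line in text.splitlines():
        tokens = shlex.split(line)
        if not tokens:
            continue  # skip blank lines
        name, values = tokens[0], tokens[1:]
        args.append("--" + name)
        args.extend(values)
    return args


config = """\
overwrite
write_log
icd10_map_file icd_codes.tsv
category       10
category       11
"""

args = config_to_args(config)
command = ["fmrib_unpack"] + args + ["output.tsv", "input1.tsv", "input2.tsv"]
print(" ".join(command))
```

Running this prints the long-form command line from the first example on a single line, which shows that the configuration file is only a different spelling of the same options.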

Features

FUNPACK allows you to perform various data sanitisation and processing steps on your data, such as:

NA value replacement: Specific values for some data-fields can be replaced with NA, for example, data-fields where a value of -1 indicates Do not know.

Categorical recoding: Certain categorical data-fields can be re-coded. For example, data-fields where a value of 555 represents half can be recoded so that 555 is replaced with 0.5.

Child value replacement: NA values within some data-fields which are dependent upon other data-fields may have values inserted based on the values of their parent data-fields.

See the overview for a more comprehensive description of the features available in FUNPACK.
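The three steps listed above can be illustrated on toy columns. This is a minimal sketch in plain Python, not FUNPACK's implementation: the function names, the example values, and the parent/child pairing are all made up, and None stands in for NA.

```python
NA = None  # stand-in for a missing value


def replace_na(values, na_values):
    """Replace any value listed in na_values with NA."""
    return [NA if v in na_values else v for v in values]


def recode(values, raw_levels, new_levels):
    """Replace each raw level with the corresponding new level."""
    mapping = dict(zip(raw_levels, new_levels))
    return [mapping.get(v, v) for v in values]


def fill_from_parent(child, parent, condition, fill_value):
    """Fill NA child values where the parent value satisfies condition."""
    return [
        fill_value if c is NA and p is not NA and condition(p) else c
        for c, p in zip(child, parent)
    ]


# NA value replacement: -1 means "Do not know"
print(replace_na([3, -1, 7], na_values={-1}))   # [3, None, 7]

# Categorical recoding: 555 means "half"
print(recode([1, 555, 2], [555], [0.5]))        # [1, 0.5, 2]

# Child value replacement: a hypothetical child field is filled with 0
# wherever it is missing and its (hypothetical) parent field equals 0
parent = [0, 1, 0]
child = [NA, 4, NA]
print(fill_from_parent(child, parent, lambda p: p == 0, 0))  # [0, 4, 0]
```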

Built-in rules

FUNPACK contains a large number of built-in rules which have been specifically written to pre-process UK Biobank data-fields (also referred to as variables). These rules are stored in the following files [*]:

funpack/configs/fmrib/datacodings_*.tsv: Cleaning rules for data-codings

funpack/configs/fmrib/variables_*.tsv: Cleaning rules for individual data-fields

funpack/configs/fmrib/processing.tsv: Processing steps

funpack/configs/fmrib/categories.tsv: Data-field categories

You can use these rules by using the FMRIB configuration profile:

fmrib_unpack -cfg fmrib output.tsv input.tsv

You can customise or replace these files as you see fit. You can also pass your own versions of these files to FUNPACK via the --variable_file, --datacoding_file, --type_file, --processing_file, and --category_file command-line options respectively. FUNPACK will load all data-field and data-coding files, and merge them into a single table which contains the cleaning rules for each data-field.

FUNPACK also comes bundled with a copy of the UK Biobank schema, containing metadata about all UK Biobank data-fields. The schema can be obtained from the UK Biobank online data showcase.

Creating your own rule files

To define rules at the data-coding level, create one or more .tsv files with an ID column containing the data-coding ID, and any of the following columns:

NAValues: A comma-separated list of values to replace with NA

RawLevels: A comma-separated list of values to be replaced with corresponding values in NewLevels.

NewLevels: A comma-separated list of replacement values for each of the values listed in RawLevels.

To apply these rules, pass your .tsv file(s) to fmrib_unpack with the --datacoding_file option. They will be applied to all data-fields which use the data-coding(s) listed in the file(s).

To define rules at the data-field level, create one or more .tsv files with an ID column containing the data-field ID, and any of the following columns:

NAValues: As above

RawLevels: As above

NewLevels: As above

ParentValues: A comma-separated list of expressions on parent data-fields, defining conditions which should trigger child-value replacement.

ChildValues: A comma-separated list of values to insert into the data-field when the corresponding expression in ParentValues evaluates to true.

Clean: A comma-separated list of cleaning functions to apply to the data-field.
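As a rough illustration, a data-field rule file with the columns described above could be generated with Python's csv module. The IDs and values below are placeholders, the file name my_variables.tsv is hypothetical, and the use of tab delimiters and empty cells for unused columns is an assumption based on the .tsv extension. The ParentValues, ChildValues, and Clean columns are left out because their expression syntax is not described in this section.

```python
import csv
import io

# Column names are taken from the list above.
columns = ["ID", "NAValues", "RawLevels", "NewLevels"]

# Placeholder rules; the IDs do not refer to real data-fields.
rules = [
    {"ID": "1", "NAValues": "-1,-3", "RawLevels": "", "NewLevels": ""},
    {"ID": "2", "NAValues": "", "RawLevels": "555", "NewLevels": "0.5"},
]


def write_rules(stream, rows):
    """Write the rule rows to stream as a tab-separated table with a header."""
    writer = csv.DictWriter(
        stream, fieldnames=columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


# Write to an in-memory buffer so the result can be inspected.
buffer = io.StringIO()
write_rules(buffer, rules)
print(buffer.getvalue())

# To write a real file instead (hypothetical file name):
# with open("my_variables.tsv", "w", newline="") as f:
#     write_rules(f, rules)
```

A file written this way would then be passed to fmrib_unpack with the --variable_file option mentioned earlier.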

Output

The main output of FUNPACK is a plain-text file [†] which contains the input data, after cleaning and processing, potentially with some columns removed, and new columns added.

If you used the --suppress_non_numerics option, the main output file will only contain the numeric columns. You can combine this with the --write_non_numerics option to save non-numeric columns to a separate file.

You can use any tool of your choice to load this output file, such as Python, MATLAB, or Excel. It is also possible to pass the output back into FUNPACK.
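For instance, loading the output in Python might look like the sketch below. It assumes the output is tab-separated with a header row, as the output.tsv name used in the examples suggests. The column names and values are made up, and an in-memory string stands in for the real file.

```python
import csv
import io


def load_table(stream):
    """Read a tab-separated table into a list of dicts keyed by column name."""
    return list(csv.DictReader(stream, delimiter="\t"))


# Stand-in for open("output.tsv", newline=""); the columns are made up.
example = io.StringIO("eid\tcol_a\tcol_b\n1\t0.5\t\n2\t\t3\n")
rows = load_table(example)

for row in rows:
    # Assumption: empty cells represent missing values.
    print({k: (v if v != "" else None) for k, v in row.items()})
```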

Tests

To run the test suite, you need to install some additional dependencies:

pip install fmrib-unpack[test]

Then you can run the test suite using pytest:

pytest

macOS issues

FUNPACK makes extensive use of the Python multiprocessing module to speed up certain steps in its processing pipeline. FUNPACK relies on the POSIX fork() mechanism, so that worker processes may inexpensively inherit the memory space of the main process (often referred to as copy-on-write). This is to avoid having to serialise the data set being processed (stored internally as a pandas.DataFrame).
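The fork-based inheritance described above can be sketched as follows. This is a generic illustration of the technique rather than FUNPACK's code: a small list stands in for the pandas.DataFrame, and the names DATA, column_sum, and main are made up. It runs only on POSIX systems where the fork start method is available.

```python
import multiprocessing as mp

# Set in the parent before the pool is created; a list of rows stands in
# for the data set.
DATA = None


def column_sum(index):
    """Runs in a worker; reads DATA as inherited from the parent via fork."""
    return sum(row[index] for row in DATA)


def main():
    global DATA
    DATA = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

    # Explicitly request the fork start method (POSIX only).
    ctx = mp.get_context("fork")
    with ctx.Pool(2) as pool:
        # Only the column index is sent to each worker; DATA itself is
        # never pickled, because forked workers already share it.
        return pool.map(column_sum, range(3))


if __name__ == "__main__":
    print(main())  # [12, 15, 18]
```

With the spawn start method, each worker would re-import the module and see DATA as None, so the data would have to be serialised and sent to the workers explicitly.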

In Python 3.8 on macOS, the default method used by the multiprocessing module was changed from fork to spawn, due to changes in macOS 10.13 restricting the use of fork() for safety reasons. Some background information on this change can be found at https://bugs.python.org/issue33725, and at this blog post.

FUNPACK therefore explicitly sets the method used by the multiprocessing module to fork, to take advantage of copy-on-write semantics. Using fork() on macOS should be safe for single-threaded parent processes, but as FUNPACK calls fork() numerous times (by creating and discarding multiprocessing.Pool() objects on an as-needed basis), this assumption may not be valid, and FUNPACK may crash with an error message resembling the following:

+[SomeClass initialize] may have been in progress in another thread
when fork() was called. We cannot safely call it or ignore it in the
fork() child process. Crashing instead.

You might be able to work around this error by setting an environment variable before calling FUNPACK, like so:

export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES
fmrib_unpack ...

Citing

If you would like to cite FUNPACK, please refer to its Zenodo page.



