Biobanking data processing, annotation, and association workflows

Biobanking

Systematic collection, processing, storage, and analysis of biological samples and associated health records for medical research.

Supported pipelines

Preprocess

Contains biobank-specific modules for EHR data collection, cleaning, and processing.

QC (Under construction)

Will contain biobank-specific modules for variant quality control and filtering.

Annotation (Under construction)

Will contain biobank-specific modules for variant annotation.

Association

Contains biobank-specific modules for genotype-phenotype association tests.

Supported biobanks

All of Us

The All of Us biobank couples whole genome sequencing with electronic health record data for more than 400k individuals and continues to expand.

UK Biobank (Under construction)

The UK Biobank couples whole genome sequencing with electronic health record data for ~500k participants.

AoU REGENIE workflow

The All of Us association utilities currently support a packaged regenie workflow with three Step 2 modes:

Burden association testing
Mask-only runs for writing burden-mask PLINK datasets
Interaction testing using the same burden inputs and optional interaction flags

The workflow implementation lives in src/biobanking/workflows/regenie.wdl, and the Python utilities live in src/biobanking/association/aou.py.

The tracking model is phenotype-centered:

Step 1 is tracked once per phenotype prefix
Step 2 runs are tracked separately by mode
Workflow metadata is written locally and synced to the workspace bucket

This keeps reuse of the Step 1 LOCO and prediction outputs aligned with the phenotype definition rather than with any specific burden or interaction run.
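
As a rough illustration of the phenotype-centered tracking model, the sketch below keeps one record per phenotype prefix, with a single Step 1 flag and separate Step 2 run lists per mode. It is not the package's implementation: `Step2Mode`, `PhenotypeRecord`, its fields, and the example prefix and run IDs are all hypothetical names chosen for this example, and the JSON layout is an assumption about what locally written metadata might look like.

```python
import json
from dataclasses import dataclass, field
from enum import Enum


class Step2Mode(Enum):
    """The three Step 2 modes named in the workflow description."""
    BURDEN = "burden"
    MASK = "mask"
    INTERACTION = "interaction"


@dataclass
class PhenotypeRecord:
    """Hypothetical tracking record for one phenotype prefix."""
    phenotype_prefix: str
    step1_done: bool = False  # Step 1 is tracked once per prefix
    step2_runs: dict[Step2Mode, list[str]] = field(default_factory=dict)

    def add_step2_run(self, mode: Step2Mode, run_id: str) -> None:
        # Step 2 runs are kept in separate lists, one list per mode
        self.step2_runs.setdefault(mode, []).append(run_id)

    def to_json(self) -> str:
        # Assumed metadata layout; the real files may look different
        return json.dumps(
            {
                "phenotype_prefix": self.phenotype_prefix,
                "step1_done": self.step1_done,
                "step2_runs": {m.value: ids for m, ids in self.step2_runs.items()},
            },
            indent=2,
        )


record = PhenotypeRecord("ldl_cholesterol")  # hypothetical prefix
record.step1_done = True
record.add_step2_run(Step2Mode.BURDEN, "burden-001")
record.add_step2_run(Step2Mode.INTERACTION, "interaction-001")
print(record.to_json())
```

Keying everything on the phenotype prefix means a new burden or interaction run only appends to a list and never touches the Step 1 state.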

Recommended usage pattern

Run or reuse Step 1 once per phenotype prefix.
Use burden runs for standard gene-based tests.
Use mask runs to materialize chromosome-wide or gene-specific burden-mask PLINK files.
Use interaction runs only after Step 1 exists for the phenotype prefix you are testing.
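
The ordering rules above could be encoded as a small planning check. This is a sketch under stated assumptions, not the package's API: `plan_actions`, its arguments, and the action strings are hypothetical, and treating a missing Step 1 as an error only for interaction runs is one possible reading of the pattern.

```python
def plan_actions(prefix: str, mode: str, step1_prefixes: set[str]) -> list[str]:
    """Return the ordered actions for one requested Step 2 run (hypothetical helper)."""
    if mode not in {"burden", "mask", "interaction"}:
        raise ValueError(f"unknown Step 2 mode: {mode!r}")

    actions = []
    if prefix in step1_prefixes:
        # Step 1 already exists for this phenotype prefix, so reuse it
        actions.append(f"reuse step1 for {prefix}")
    elif mode == "interaction":
        # Interaction runs are only started once Step 1 already exists
        raise RuntimeError(f"run Step 1 for {prefix!r} before interaction testing")
    else:
        actions.append(f"run step1 for {prefix}")

    actions.append(f"run step2 ({mode}) for {prefix}")
    return actions


print(plan_actions("ldl", "burden", set()))
print(plan_actions("ldl", "interaction", {"ldl"}))
```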

More detailed usage examples are in docs/workflows.md.

Internal use

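# Install or upgrade the packaging tools
python -m pip install -U pip build twine
# Clean previous build artifacts (Linux/macOS)
rm -rf dist build *.egg-info src/*.egg-info
# Clean previous build artifacts (Windows PowerShell)
Remove-Item -Recurse -Force -ErrorAction SilentlyContinue dist, build, *.egg-info, src\*.egg-info
# Build the source distribution and wheel into dist/
python -m build
# Install the freshly built wheel (adjust the version to match the build)
pip install dist/biobanking-0.0.12-py3-none-any.whl
# Smoke-test the key imports
python -c "from biobanking.association.aou import REGENIE; regenie = REGENIE(); from biobanking.preprocess.aou.measurements import save_measurements_in_wide_format; print('import ok')"
# Upload the built distributions to PyPI
twine upload dist/*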
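
The install command above hard-codes the wheel version, which goes stale with each release. One way to avoid that, sketched here with only the standard library, is to pick the most recently modified wheel in `dist/`. The helper name `newest_wheel` is hypothetical and not part of the package.

```python
from pathlib import Path
from typing import Optional


def newest_wheel(dist_dir: str = "dist") -> Optional[Path]:
    """Return the most recently modified .whl file in dist_dir, or None if there is none."""
    wheels = list(Path(dist_dir).glob("*.whl"))
    return max(wheels, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    wheel = newest_wheel()
    print(wheel if wheel else "no wheel found; run `python -m build` first")
```

The printed path can then be passed to `pip install` in place of the fixed file name.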

Release history

0.0.17: Apr 22, 2026
0.0.16: Apr 14, 2026
0.0.15: Apr 14, 2026
0.0.14: Apr 14, 2026
0.0.13: Apr 13, 2026
0.0.12: Apr 10, 2026 (this version)
0.0.11: Apr 10, 2026
0.0.10: Apr 9, 2026
0.0.9: Mar 5, 2026
0.0.8: Mar 5, 2026
0.0.7: Mar 5, 2026
0.0.6: Mar 5, 2026
0.0.5: Feb 26, 2026
0.0.4: Feb 26, 2026
0.0.3: Feb 25, 2026
0.0.2: Feb 24, 2026
0.0.1: Feb 23, 2026

Download files

Source Distribution

biobanking-0.0.12.tar.gz (283.9 kB), uploaded Apr 10, 2026, Source

Built Distribution

biobanking-0.0.12-py3-none-any.whl (328.6 kB), uploaded Apr 10, 2026, Python 3

File details

Details for the file biobanking-0.0.12.tar.gz.

File metadata

Download URL: biobanking-0.0.12.tar.gz
Upload date: Apr 10, 2026
Size: 283.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for biobanking-0.0.12.tar.gz
Algorithm	Hash digest
SHA256	`d1deef5bd140f1e9091a104e2cbb2ed07bb3dd287d91b925987951445fcedccf`
MD5	`f702ffabe34d4b992d4b1a42cfc76fc6`
BLAKE2b-256	`258f2c20ffcfddd85120efc5aed188a4de04ee20b811ea9217dfc3577a220a0c`
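
A downloaded file can be checked against the SHA256 digest listed above using Python's standard `hashlib` module. The sketch below assumes the source distribution was saved to the current directory under its published file name; the helper name `sha256_of` is chosen for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 digest copied from the table above
EXPECTED_SHA256 = "d1deef5bd140f1e9091a104e2cbb2ed07bb3dd287d91b925987951445fcedccf"

if __name__ == "__main__":
    sdist = Path("biobanking-0.0.12.tar.gz")  # assumed local download path
    if sdist.exists():
        print("match" if sha256_of(sdist) == EXPECTED_SHA256 else "MISMATCH")
    else:
        print(f"{sdist} not found; download it first")
```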

File details

Details for the file biobanking-0.0.12-py3-none-any.whl.

File metadata

Download URL: biobanking-0.0.12-py3-none-any.whl
Upload date: Apr 10, 2026
Size: 328.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for biobanking-0.0.12-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a2aebe5049d339468c3b14ce1d2539e826fd61fdb1a59a579f09bc5ec51dbae7`
MD5	`8ca271b11d9c06b42c927080f9ec78b7`
BLAKE2b-256	`971b775ed6ac90715e67162fe9cceedc36d84762072af07e87643e06ff99860b`
