Models nonlinear interactions between covariates and phenotypes

DeepNull: Modeling non-linear covariate effects improves phenotype prediction and association power

This repository contains code implementing nonlinear covariate modeling to increase power in genome-wide association studies, as described in "DeepNull: Modeling non-linear covariate effects improves phenotype prediction and association power" (Hormozdiari et al., 2021). The code is written using Python 3.7 and TensorFlow 2.4.

Installation

Installation is not required to run DeepNull end-to-end; you can open DeepNull_e2e.ipynb in Colab to try it out.

To install DeepNull locally, run

pip install --upgrade pip
pip install --upgrade deepnull

on a machine with Python 3.7+. This installs a CPU-only version, as there are typically few enough covariates that using accelerators does not provide meaningful speedups.

Verify that the installation is working properly by executing all tests:

python -m deepnull.config_test
python -m deepnull.data_test
python -m deepnull.metrics_test
python -m deepnull.main_test
python -m deepnull.model_test
python -m deepnull.train_eval_test

How to run DeepNull

To run locally, there is a single required input file. This file contains the phenotype of interest and covariates used to predict the phenotype, formatted as a tab-separated file suitable for GWAS analysis with PLINK or BOLT-LMM.

Briefly, the file must contain a single header line. The first two columns must be FID and IID, and all IID values must be unique.
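The format requirements above can be checked before running DeepNull. The following is a minimal sketch using only the Python standard library; the helper name `check_phenocovar_tsv` and the sample rows are hypothetical and are not part of the deepnull package or its own input validation.

```python
import csv
import io


def check_phenocovar_tsv(handle):
    """Return a list of problems found in a phenotype/covariate TSV.

    Hypothetical helper for illustration; not part of the deepnull package.
    """
    problems = []
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header is None:
        return ["file is empty"]
    if header[:2] != ["FID", "IID"]:
        problems.append("first two columns must be FID and IID")
    seen = set()
    # Data rows start on line 2 because line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        if len(row) != len(header) or len(row) < 2:
            problems.append(f"line {line_number}: unexpected number of fields")
            continue
        iid = row[1]
        if iid in seen:
            problems.append(f"line {line_number}: duplicate IID {iid!r}")
        seen.add(iid)
    return problems


# Made-up sample data, for illustration only.
sample = (
    "FID\tIID\tage\tsex\tgenotyping_array\tpheno\n"
    "1\t1\t54\t0\t1\t1.7\n"
    "2\t2\t61\t1\t0\t2.3\n"
)
print(check_phenocovar_tsv(io.StringIO(sample)))
```

For a real file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="")`. An empty list means none of the checked conditions were violated.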

An example command to train DeepNull to predict the phenotype pheno from covariates age, sex, and genotyping_array is the following:

python -m deepnull.main \
  --input_tsv=/input/YOUR_PHENOCOVAR_TSV \
  --output_tsv=/output/YOUR_OUTPUT_TSV \
  --target=pheno \
  --covariates="age,sex,genotyping_array"

To see all available flags, run

python -m deepnull.main --help 2> /dev/null

Of particular note is the --model_config flag. DeepNull uses the ml_collections library to specify all parameters related to the model and training regimen. The supported configuration code is located in config.py, and parameters can be modified as described in detail in the ml_collections README. As a brief example, to use the DeepNull architecture with the elu activation and train with batch size 4096, the above example command would be modified as follows:

python -m deepnull.main \
  --input_tsv=/input/ORIGINAL_PHENOCOVAR_TSV \
  --output_tsv=/output/PHENOCOVAR_WITH_DEEPNULL_PREDICTION_TSV \
  --target=pheno \
  --covariates="age,sex,genotyping_array" \
  --model_config=/path/to/config.py:deepnull \
  --model_config.model_config.mlp_activation=elu \
  --model_config.training_config.batch_size=4096

where /path/to/config.py provides the path to config.py on your machine.
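In the override flags above, the leading `model_config` is the flag name and the remainder is a dotted path into the nested configuration. The sketch below illustrates that addressing with plain dictionaries only. It is a stand-in for explanation, not the ml_collections API, and the helper `apply_override` and the placeholder default values are hypothetical; the real defaults are defined in config.py.

```python
import copy


def apply_override(config, dotted_key, value):
    """Return a copy of a nested dict with one dotted-path field replaced.

    Hypothetical helper; ml_collections handles this for DeepNull itself.
    """
    updated = copy.deepcopy(config)
    node = updated
    *parents, leaf = dotted_key.split(".")
    for name in parents:
        node = node[name]  # Raises KeyError if the path does not exist.
    if leaf not in node:
        raise KeyError(dotted_key)
    node[leaf] = value
    return updated


# Placeholder values only; they do not reflect DeepNull's actual defaults.
config = {
    "model_config": {"mlp_activation": "PLACEHOLDER"},
    "training_config": {"batch_size": 0},
}

# Mirrors --model_config.model_config.mlp_activation=elu
config = apply_override(config, "model_config.mlp_activation", "elu")
# Mirrors --model_config.training_config.batch_size=4096
config = apply_override(config, "training_config.batch_size", 4096)
print(config)
```

Rejecting unknown keys in the sketch is a design choice for illustration: it makes a mistyped field name fail loudly instead of silently adding a new entry.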

Incorporating DeepNull into a GWAS analysis

The above section, "How to run DeepNull", shows that the DeepNull software adds a single column to the phenotype+covariate file: a nonlinear prediction of the target phenotype. To incorporate this prediction into a GWAS analysis, include the new column as one more covariate. A concrete example with BOLT-LMM, using the same file, phenotype pheno, and covariates age, sex, and genotyping_array as above, is shown below:

Original example GWAS command

# N.B. Data loading flags are omitted for brevity.

bolt \
  --phenoFile /input/ORIGINAL_PHENOCOVAR_TSV \
  --covarFile /input/ORIGINAL_PHENOCOVAR_TSV \
  --qCovarCol age \
  --qCovarCol sex \
  --qCovarCol genotyping_array \
  --phenoCol pheno

After running DeepNull on /input/ORIGINAL_PHENOCOVAR_TSV to create the new TSV /output/PHENOCOVAR_WITH_DEEPNULL_PREDICTION_TSV, which includes the column pheno_deepnull, the updated command is given below:

Updated GWAS command to incorporate DeepNull

# N.B. Data loading flags are omitted for brevity.
# Note the addition of the single `--qCovarCol pheno_deepnull` line.

bolt \
  --phenoFile /output/PHENOCOVAR_WITH_DEEPNULL_PREDICTION_TSV \
  --covarFile /output/PHENOCOVAR_WITH_DEEPNULL_PREDICTION_TSV \
  --qCovarCol age \
  --qCovarCol sex \
  --qCovarCol genotyping_array \
  --qCovarCol pheno_deepnull \
  --phenoCol pheno
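As a rough sketch of the step between the two commands, the snippet below compares the header of the original TSV with the header of the DeepNull output, finds the added column, and assembles the corresponding `--qCovarCol` arguments. The helper names are hypothetical, the headers are in-memory samples, and placing `pheno_deepnull` as the last column is an assumption made for the example.

```python
import csv
import io


def read_header(handle):
    """Return the column names from the first line of a TSV."""
    return next(csv.reader(handle, delimiter="\t"))


def added_columns(original, updated):
    """Return columns present in `updated` but not in `original`, in order."""
    original_set = set(original)
    return [name for name in updated if name not in original_set]


# Sample headers standing in for the original and DeepNull output files.
original = read_header(
    io.StringIO("FID\tIID\tage\tsex\tgenotyping_array\tpheno\n"))
updated = read_header(
    io.StringIO("FID\tIID\tage\tsex\tgenotyping_array\tpheno\tpheno_deepnull\n"))

new_columns = added_columns(original, updated)
covariates = ["age", "sex", "genotyping_array"] + new_columns

# One "--qCovarCol NAME" pair per covariate, as in the command above.
bolt_flags = [part for name in covariates for part in ("--qCovarCol", name)]
print(new_columns)
print(" ".join(bolt_flags))
```

Checking that exactly one column was added before launching the GWAS is a cheap way to catch a run that pointed at the wrong output file.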

Data

Datasets used to reproduce the results from the above publication are available to researchers with approved access to the UK Biobank.

NOTE: the content of this research code repository (i) is not intended to be a medical device; and (ii) is not intended for clinical use of any kind, including but not limited to diagnosis or prognosis.

This is not an officially supported Google product.
