
A Python package that adjusts GWAS summary statistics for the effects of linkage disequilibrium (LD)


# LDpred #

LDpred is a Python-based software package that adjusts GWAS summary statistics for the effects of linkage disequilibrium (LD). The details of the method are described in Vilhjalmsson et al. (AJHG 2015) [http://www.cell.com/ajhg/abstract/S0002-9297(15)00365-1].

  • The current version is 1.0.11

### News ###

Recent improvements have focused on making LDpred more robust, addressing issues highlighted by recent publications (Ge et al., Nat Comm 2019; Choi and O’Reilly, GigaScience 2019; Privé et al., AJHG 2019).

  • Nov 20th, 2019, v. 1.0.11: Implemented LDpred-fast, Joel Mefford’s sparsified BLUP prediction (Mefford, thesis 2018). LDpred-fast is suitable for polygenic diseases/traits when LDpred-gibbs fails to converge or is too slow.

  • Oct 21st, 2019, v. 1.0.10: LDpred-gibbs now reports LDpred-inf effects for SNPs in long-range LD regions (Price et al., AJHG 2008). This improves convergence of the algorithm substantially when applied to large datasets.

  • Oct 17th, 2019, v. 1.0.8: Fixed a bug in LDpred; the fix could improve convergence for gibbs.

  • Oct 11th, 2019, v. 1.0.7: Improved accuracy and robustness.
      - Now able to handle variants with p-values rounded down to 0.
      - Fixed a serious bug that caused sample sizes in the summary stats file to not always be used correctly when provided.
      - LDpred gibbs can now handle differing sample sizes per variant effect, if they are parsed in the summary stats.
      - LDpred now estimates the heritability separately for each chromosome by default.

## Getting Started ##

LDpred can be installed using pip on most systems by typing

pip install ldpred

### Requirements ###

LDpred currently requires three Python packages to be installed and in path: [h5py](http://www.h5py.org/), [scipy](http://www.scipy.org/), and [libplinkio](https://github.com/mfranberg/libplinkio). LDpred has currently only been tested with Python 3.6+.

The first two packages, h5py and scipy, are commonly used Python packages and come pre-installed on many computer systems. The last package, libplinkio, can be installed using pip (https://pip.pypa.io/en/latest/quickstart.html), which is also pre-installed on many systems.

With pip, one can install libplinkio using the following command:

pip install plinkio

or if you need to install it locally you can try

pip install --user plinkio

With these three packages in place, you should be all set to install and use LDpred.
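If you want to confirm that the requirements are visible to your Python interpreter before installing LDpred, a small check along the following lines may help. This is only a sketch: the function name `check_requirements` is hypothetical, and it assumes the import names `h5py`, `scipy`, and `plinkio` (the name under which libplinkio is imported).

```python
import importlib.util
import sys

# Import names for the three required packages named above.
# Assumption: libplinkio is imported under the name "plinkio".
REQUIRED_MODULES = ("h5py", "scipy", "plinkio")


def check_requirements(modules=REQUIRED_MODULES, min_python=(3, 6)):
    """Return a dict mapping each requirement to True (found) or False."""
    status = {"python>=%d.%d" % min_python: sys.version_info[:2] >= min_python}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        status[name] = importlib.util.find_spec(name) is not None
    return status


if __name__ == "__main__":
    for requirement, ok in check_requirements().items():
        print("%-12s %s" % (requirement, "OK" if ok else "MISSING"))
```

A package reported as MISSING can then be installed with pip as described above.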

### Installing LDpred ###

As with most Python packages, configuring LDpred is simple. You can use pip to install it by typing

pip install ldpred

This should automatically take care of dependencies. The examples below assume ldpred has been installed using pip.

Alternatively you can use git (which is installed on most systems) and clone this repository using the following git command:

git clone https://github.com/bvilhjal/ldpred.git

Finally, you can also download the source files and place them in a directory of your choice.

With the Python source code in place and the three packages h5py, scipy, and libplinkio installed, you should be ready to use LDpred.

### How to run tests ###

A couple of simulated data examples can be found in the test_data directory. These datasets were simulated using two different values of p (fraction of causal markers) and with heritability set to 0.1. The sample size used when simulating the summary statistics is 10,000.
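To give a feel for what such a simulation involves, here is a minimal sketch. It is not the script used to create the files in test_data. It assumes unlinked, standardized markers and the point-normal effect prior described in the LDpred paper, where each marker is causal with probability p and causal effects have variance h2 / (M p). The function name and the example values of p are hypothetical.

```python
import math
import random


def simulate_summary_stats(num_snps, p, h2=0.1, n=10000, seed=0):
    """Draw true and estimated effects for unlinked, standardized markers.

    Assumption (not taken from the LDpred test scripts): each marker is
    causal with probability p, and causal effects are N(0, h2 / (num_snps * p)).
    """
    rng = random.Random(seed)
    causal_sd = math.sqrt(h2 / (num_snps * p))
    noise_sd = math.sqrt(1.0 / n)  # sampling noise of a marginal estimate
    true_betas, est_betas = [], []
    for _ in range(num_snps):
        beta = rng.gauss(0.0, causal_sd) if rng.random() < p else 0.0
        true_betas.append(beta)
        est_betas.append(beta + rng.gauss(0.0, noise_sd))
    return true_betas, est_betas


if __name__ == "__main__":
    for p in (0.01, 1.0):  # hypothetical values of p, for illustration only
        true_betas, _ = simulate_summary_stats(num_snps=5000, p=p)
        num_causal = sum(1 for b in true_betas if b != 0.0)
        print("p=%g: %d causal markers" % (p, num_causal))
```

Real GWAS data have markers in LD, so the estimated effects are correlated across nearby markers; the sketch ignores this.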

### Code Contributions ###

I encourage users to extend the code and adapt it to their needs. Currently there are no formal guidelines set for contributions, and pull requests will be reviewed on a case-by-case basis.

### Who do I talk to? ###

If you have any questions or trouble getting the method to work, first look through the issues to see whether the problem has already been reported there. You can also check whether any of the cloned LDpred repos have addressed your issue.

In emergencies, please contact Bjarni Vilhjalmsson (bjarni.vilhjalmsson@gmail.com), but expect slow replies.

## Using LDpred ##

A typical LDpred workflow consists of 3 steps:

### Step 1: Coordinate data ###

The first step is a data synchronization step, where two or three data sets (genotypes and summary statistics) are synchronized. This generates an HDF5 file which contains the synchronized genotypes. This step can be done by running

ldpred coord

Use --help for detailed options. This step requires at least one genotype file (the LD reference genotypes), for which we recommend at least 1000 unrelated individuals with the same ancestry make-up as the individuals from whom the summary statistics were obtained. Another genotype file can also be given if the user intends to validate the predictions using a separate set of genotypes.
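Conceptually, synchronizing summary statistics with genotypes means keeping only the SNPs present in both data sets and making sure the effect sizes refer to the same allele. The sketch below illustrates that idea on plain dictionaries. It is not the code that ldpred coord runs, the function name and data layout are hypothetical, and a real pipeline has more to deal with (for example strand differences and the HDF5 output).

```python
def coordinate(sum_stats, ld_ref):
    """Keep SNPs present in both data sets and align effect alleles.

    sum_stats: {snp_id: (effect_allele, other_allele, beta)}
    ld_ref:    {snp_id: (allele1, allele2)}
    """
    coordinated = {}
    for snp_id, (a1, a2, beta) in sum_stats.items():
        if snp_id not in ld_ref:
            continue  # not genotyped in the LD reference
        ref_alleles = ld_ref[snp_id]
        if (a1, a2) == ref_alleles:
            coordinated[snp_id] = beta
        elif (a2, a1) == ref_alleles:
            coordinated[snp_id] = -beta  # alleles swapped: flip the sign
        # any other allele combination is discarded as a mismatch
    return coordinated


if __name__ == "__main__":
    sum_stats = {
        "rs1": ("A", "G", 0.10),
        "rs2": ("C", "T", -0.05),
        "rs3": ("A", "C", 0.02),
        "rs4": ("G", "T", 0.30),
    }
    ld_ref = {
        "rs1": ("A", "G"),
        "rs2": ("T", "C"),
        "rs4": ("A", "C"),
        "rs5": ("A", "G"),
    }
    print(coordinate(sum_stats, ld_ref))  # {'rs1': 0.1, 'rs2': 0.05}
```

In this toy example rs3 is dropped because it is missing from the reference, rs4 because its alleles do not match, and rs2 has its sign flipped because the alleles are listed in the opposite order.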

### Step 2: Generate LDpred SNP weights ###

After generating the coordinated data file, one can run LDpred on the synchronized dataset. This step can be done by running

ldpred gibbs

Use --help for detailed options. This step generates two files: an LD file with LD information for the given LD radius, and a file with the re-weighted (LDpred-adjusted) effect estimates. The LD file saves the user from having to regenerate the LD information when trying, e.g., different values of p (the fraction of causal variants). However, it is regenerated if a different LD radius is given.
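To illustrate what adjusting effect estimates for LD means, the sketch below works through the infinitesimal-model special case (LDpred-inf) for just two SNPs. Following the LDpred paper, the posterior mean effects are approximately (M / (N h2) I + D)^-1 times the marginal estimates, where D is the LD matrix, M the number of SNPs, N the sample size, and h2 the heritability. This is a toy calculation under those assumptions, not the Gibbs sampler and not the package's implementation. The function name is hypothetical.

```python
def ldpred_inf_two_snps(beta_hat, r, n, h2, m):
    """Posterior mean effects for two SNPs with LD correlation r.

    Solves (m / (n * h2) * I + D) beta = beta_hat for D = [[1, r], [r, 1]],
    using the closed-form inverse of a symmetric 2x2 matrix.
    """
    shrink = m / (n * h2)
    diag = 1.0 + shrink
    det = diag * diag - r * r
    b1, b2 = beta_hat
    return ((diag * b1 - r * b2) / det, (diag * b2 - r * b1) / det)


if __name__ == "__main__":
    marginal = (0.04, 0.02)
    # Without LD, both estimates are shrunk by the same factor.
    print(ldpred_inf_two_snps(marginal, r=0.0, n=10000, h2=0.1, m=1000))
    # With LD, part of each marginal effect is attributed to the neighbor.
    print(ldpred_inf_two_snps(marginal, r=0.8, n=10000, h2=0.1, m=1000))
```

When r is 0 the adjustment reduces to uniform shrinkage; when the SNPs are correlated, the two adjusted effects differ from a simple rescaling because each marginal estimate partly reflects the other SNP.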

### Step 3: Generating individual risk scores ###

Individual risk scores can be generated using the following command

ldpred score

Use --help for detailed options. This calculates polygenic risk scores for the individuals in the validation data if given; otherwise it treats the LD reference genotypes as validation genotypes. A phenotype file, a covariate file, and a plink-formatted principal components file can also be provided.
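A polygenic risk score is, at its core, a weighted sum of allele counts. The following sketch shows that calculation for one individual. The function name and the dictionary layout are hypothetical, and skipping missing genotypes is a simplification chosen for this example rather than a description of what ldpred score does.

```python
def polygenic_score(genotypes, weights):
    """Sum of allele counts (0, 1 or 2) times per-SNP weights.

    genotypes: {snp_id: allele_count} for one individual; None = missing
    weights:   {snp_id: adjusted_effect}
    """
    score = 0.0
    for snp_id, weight in weights.items():
        count = genotypes.get(snp_id)
        if count is None:
            continue  # skip SNPs that are missing for this individual
        score += weight * count
    return score


if __name__ == "__main__":
    weights = {"rs1": 0.5, "rs2": -0.25, "rs3": 1.0}
    genotypes = {"rs1": 2, "rs2": 1, "rs3": None}
    print(polygenic_score(genotypes, weights))  # 0.75
```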

### Additional methods: LD-pruning + Thresholding ###

In addition to the LDpred gibbs sampler and infinitesimal model methods, the package also implements LD-pruning + Thresholding as an alternative method. You can run this using the following command

ldpred p+t

This method often yields better predictions than LDpred when the LD reference panel is small, or when the training data is very large (due to problems with gibbs sampler convergence).
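For readers unfamiliar with the approach, here is a minimal sketch of greedy LD-pruning followed by p-value thresholding. It illustrates the general technique only and may differ from what ldpred p+t does internally. The function name, the default cutoffs, and the toy LD lookup are all hypothetical.

```python
def prune_and_threshold(snps, r2, p_threshold=1e-3, max_r2=0.2):
    """Greedy LD-pruning followed by p-value thresholding.

    snps: list of (snp_id, p_value, beta)
    r2:   function (snp_id_a, snp_id_b) -> squared correlation
    """
    kept = []
    # Visit SNPs from most to least significant.
    for snp_id, p_value, beta in sorted(snps, key=lambda s: s[1]):
        # Keep a SNP only if it is not in high LD with any SNP kept so far.
        if all(r2(snp_id, other) < max_r2 for other, _, _ in kept):
            kept.append((snp_id, p_value, beta))
    # Thresholding: SNPs at or above the p-value cutoff are dropped.
    return {snp_id: beta for snp_id, p, beta in kept if p < p_threshold}


if __name__ == "__main__":
    ld_table = {frozenset(("rs1", "rs2")): 0.9}  # toy LD values

    def toy_r2(a, b):
        return ld_table.get(frozenset((a, b)), 0.0)

    snps = [
        ("rs1", 1e-8, 0.3),
        ("rs2", 1e-6, 0.25),
        ("rs3", 1e-4, -0.1),
        ("rs4", 0.2, 0.01),
    ]
    print(prune_and_threshold(snps, toy_r2))  # {'rs1': 0.3, 'rs3': -0.1}
```

Here rs2 is pruned because it is in high LD with the more significant rs1, and rs4 is dropped by the p-value threshold.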

### Tests ###

You can check whether LDpred works on your system by running the following test

ldpred-unittest

Note that passing this test does not guarantee that LDpred works in all situations.
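If the test command is not found, the pip install location may not be on your PATH. The short sketch below looks up the two commands mentioned in this README, ldpred and ldpred-unittest, from Python. The function name is hypothetical.

```python
import shutil


def find_ldpred_commands(names=("ldpred", "ldpred-unittest")):
    """Map each command name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_ldpred_commands().items():
        print("%-16s %s" % (name, path if path else "not found on PATH"))
```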

### Citation ###

Please cite [this paper](https://doi.org/10.1016/j.ajhg.2015.09.001).

### Acknowledgements ###

Thanks to all who provided bug reports and contributed code.
