Impute GWAS summary statistics using reference genotype data
Functionally-informed Z-score Imputation (FIZI)
FIZI leverages functional information together with reference linkage-disequilibrium (LD) to impute GWAS summary statistics (Z-scores).
This README is a working draft and will be expanded soon.
The easiest way to install pyfizi is through conda and conda-forge:
conda config --add channels conda-forge
conda install pyfizi
Alternatively you can use pip for installation:
pip install pyfizi
Or directly from the github repository:
git clone git@github.com:bogdanlab/fizi.git
cd fizi
pip install .
Check that FIZI was installed by typing one of its help commands, for example fizi munge -h.
If that did not work, and pip install pyfizi --user was specified, please check that your local user path is included in the $PATH environment variable. On Unix-like systems, scripts installed with --user are placed in the bin/ directory under the user base location, which can be added to $PATH by executing
export PATH=`python -m site --user-base`/bin/:$PATH
which can be saved in ~/.bashrc or ~/.bash_profile. To reload the environment type source ~/.bashrc or source ~/.bash_profile, depending on where you entered it.
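The $PATH check described above can also be done from Python. The following is a minimal sketch, assuming a Unix-style layout where user scripts live in bin/ under the user base; the helper names user_bin_dir and is_on_path are illustrative and not part of pyfizi.

```python
import os
import site


def user_bin_dir(user_base=None):
    """Return the bin/ directory under the per-user base (Unix layout assumed)."""
    base = user_base if user_base is not None else site.getuserbase()
    return os.path.join(base, "bin")


def is_on_path(directory, path_env=None):
    """Return True if `directory` is one of the entries of a PATH-style string."""
    if path_env is None:
        path_env = os.environ.get("PATH", "")
    target = os.path.normpath(directory)
    # Normalize each entry so trailing slashes do not cause a false negative.
    entries = [os.path.normpath(p) for p in path_env.split(os.pathsep) if p]
    return target in entries


if __name__ == "__main__":
    bin_dir = user_bin_dir()
    status = "is" if is_on_path(bin_dir) else "is NOT"
    print(f"{bin_dir} {status} on PATH")
```

If the script reports that the directory is not on PATH, the export line above is the fix.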
We currently support only Python 3.7+. Python 2.7 and earlier are not supported.
fizi has two main functions: munge and impute. The munge subcommand is a pared-down version of the LDSC munge_sumstats software with a few extras needed for our imputation algorithm. The impute subcommand performs summary statistic imputation using either the functionally informed algorithm (i.e. fizi) or the reference-LD-only algorithm (i.e. ImpG). For a full list of features please refer to the help commands: fizi munge -h or fizi impute -h.
Imputing summary statistics using only reference LD
When functional annotations and LDSC estimates are not provided to fizi, it will fall back to the classic ImpG algorithm described in ref 1. To impute missing summary statistics only for chromosome 1 using the ImpG algorithm, simply enter the commands
1. fizi munge gwas.sumstat.gz --out cleaned.gwas
2. fizi impute cleaned.gwas.sumstat.gz plink_data_path --chr 1 --out imputed.cleaned.gwas.chr1.sumstat
By default, fizi requires at least 50% of SNPs to be observed for imputation at a region. This can be changed with the --min-prop PROP flag in step 2.
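To make the idea behind reference-LD imputation concrete, here is a small pure-Python sketch of the Gaussian conditional-mean step, with the observed-proportion check applied first. It is an illustration under stated assumptions, not fizi's implementation: the function names, the ridge parameter and its default of 0.1, and the choice to return None for skipped regions are all hypothetical, and any variance normalization or imputation-quality metric is omitted. A real implementation would use a linear algebra library rather than the hand-written solver.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def impute_region(ld, z_obs, obs_idx, min_prop=0.5, ridge=0.1):
    """Return {snp_index: imputed z} for unobserved SNPs, or None if too few are observed.

    ld      -- full LD (correlation) matrix for the region, as nested lists
    z_obs   -- Z-scores at the observed SNPs, in the order given by obs_idx
    obs_idx -- indices of the observed SNPs within ld
    """
    n_snps = len(ld)
    # Skip the region when the observed proportion is below the threshold.
    if len(obs_idx) / n_snps < min_prop:
        return None
    # Regularized LD among observed SNPs: Sigma_oo + ridge * I (ridge is an assumption).
    sigma_oo = [[ld[i][j] + (ridge if i == j else 0.0) for j in obs_idx]
                for i in obs_idx]
    weights = solve_linear(sigma_oo, z_obs)
    observed = set(obs_idx)
    # Conditional mean: Sigma_io (Sigma_oo + ridge * I)^-1 z_o.
    return {
        i: sum(ld[i][j] * w for j, w in zip(obs_idx, weights))
        for i in range(n_snps) if i not in observed
    }
```

With three SNPs of which only one is observed, the observed proportion is 1/3, so the sketch skips the region under the default threshold of 0.5; lowering min_prop lets the imputation proceed.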
Incorporating functional data to improve summary statistics imputation
Usage consists of several steps. We outline the general workflow here for the case where the intention is to perform imputation on chromosome 1 of our data:
1. Munge/clean all GWAS summary data before imputation:
   fizi munge gwas.sumstat.gz --out cleaned.gwas
2. Partition the cleaned GWAS summary data into chr1 and everything else (loco-chr1).
3. Run LDSC on the loco-chr1 data to obtain tau estimates.
4. Perform functionally-informed imputation on the chr1 data using the tau estimates from loco-chr1.
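The partitioning step can be sketched in a few lines of Python. This assumes the cleaned summary statistics are tab-delimited text with a header row containing a chromosome column named CHR; that column name, the other columns in the example, and the helper names are assumptions for illustration rather than a documented fizi format.

```python
import csv
import io


def partition_by_chrom(lines, chrom, chrom_col="CHR"):
    """Split tab-delimited summary-statistic rows into (target, rest) by chromosome."""
    reader = csv.DictReader(lines, delimiter="\t")
    target, rest = [], []
    for row in reader:
        # Compare as strings so that 1 and "1" are treated alike.
        (target if row[chrom_col] == str(chrom) else rest).append(row)
    return reader.fieldnames, target, rest


def write_rows(fieldnames, rows):
    """Serialize rows back to tab-delimited text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Toy data standing in for a cleaned summary-statistics file.
    example = "CHR\tSNP\tZ\n1\trs1\t0.5\n2\trs2\t-1.2\n1\trs3\t2.0\n"
    fields, chr1, loco_chr1 = partition_by_chrom(io.StringIO(example), 1)
    print(write_rows(fields, chr1), end="")
    print(write_rows(fields, loco_chr1), end="")
```

For a gzipped file, the same function can be given a handle opened with gzip.open(path, "rt") from the standard library.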
Software and support
For performing various inferences using summary data from large-scale GWASs please find the following useful software: