pitools: A Python program for phasing and imputing NGS data.

Introduction

pitools is a phasing and imputation tool for NGS data and the core of the imputation server at https://imputation.cngb.org/. You can use pitools as your own imputation pipeline on a local Linux cluster.

Quick start

pitools uses Eagle for phasing and Minimac3 for imputation.

Installation

Install the released version with pip:

pip install pitools

Alternatively, install the development version from GitHub (the https:// transport is used here because GitHub no longer serves the unauthenticated git:// protocol):

pip install git+https://github.com/ShujiaHuang/pitools.git#egg=pitools

Either command installs pitools on your system and makes the pitools command available on the command line.
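To confirm that the installation worked, one option is to check that the pitools executable can be found on PATH. The helper below is a small sketch that uses only the Python standard library; the function name missing_executables is hypothetical and is not part of pitools.

```python
import shutil


def missing_executables(names):
    """Return the names from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_executables(["pitools"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("pitools is available on PATH")
```

The same helper can be given the names of your phasing and imputation executables, whatever they are called on your cluster.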

Usage

You can list all parameters of the imputation process by running pitools impute --help:

usage: pitools impute [-h] -C CONFIG [-M IMPUTE_METHOD] [-P PHASE_METHOD] -I IN_VCF
                 -O OUT_PREFIX --refpanel-version REFPANEL --reference-build
                 REFBUILD [--unprephase] [--regions chr:start-end]
                 [--nCPU NCPU]

optional arguments:
  -h, --help            show this help message and exit
  -C CONFIG, --conf CONFIG
                        YAML configuration file specifying details information
                        for imputation
  -M IMPUTE_METHOD, --methods IMPUTE_METHOD
                        Tool for imputation. [minimac]
  -P PHASE_METHOD, --prephase-method PHASE_METHOD
                        Tool for pre-phase before imputation. [eagle]
  -I IN_VCF, --input IN_VCF
                        Input one VCF file to analyze. Required
  -O OUT_PREFIX, --outprefix OUT_PREFIX
                        Prefix for output files. Required
  --refpanel-version REFPANEL
                        The version of haplotype data for reference panel.
                        Required
  --reference-build REFBUILD
                        The build version of reference, e.g: GRCh37
  --unprephase          Do not perform pre-phased before the imputation
                        process.
  --regions chr:start-end
                        Skip positions which not in these regions. This
                        parameter could be a list of comma deleimited genome
                        regions(e.g.: chr:start-end,chr:start-end)
  --nCPU NCPU           Number of threads. [1]

Configuration file

pitools needs a configuration file that sets the paths of the phasing program, the imputation program, the reference version and the reference panel. For an example of how to create a configuration file, see config.yaml.

Once the configuration is in place, you can use pitools as your imputation pipeline.

Examples

This command is enough for most jobs:

pitools impute -C config.yaml \
    -I your.vcf.gz \
    -O test_outprefix \
    --refpanel-version 1000G_P3_GRCh37 \
    --reference-build GRCh37 \
    --nCPU 4
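If you launch pitools from a Python script (for example, to submit many jobs on a cluster), the command above can be assembled programmatically. The following is a minimal sketch that relies only on the flags listed in the help output; build_impute_command and run_impute are hypothetical helper names, not part of pitools.

```python
import subprocess


def build_impute_command(config, in_vcf, out_prefix, refpanel, refbuild,
                         regions=None, unprephase=False, ncpu=1):
    """Assemble the argument list for `pitools impute` from the documented flags."""
    cmd = [
        "pitools", "impute",
        "-C", config,
        "-I", in_vcf,
        "-O", out_prefix,
        "--refpanel-version", refpanel,
        "--reference-build", refbuild,
    ]
    if regions:
        # Comma-delimited list such as "chr:start-end,chr:start-end"
        cmd += ["--regions", regions]
    if unprephase:
        # Skip pre-phasing for input that is already phased
        cmd.append("--unprephase")
    cmd += ["--nCPU", str(ncpu)]
    return cmd


def run_impute(**kwargs):
    """Run pitools; subprocess raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_impute_command(**kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.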

To restrict imputation to specific regions, pass them with --regions. This example runs pitools on the genome regions 21:38347375-38500731 and 22:17203103-17439826:

pitools impute -C config.yaml \
    -I your.vcf.gz \
    -O test_outprefix \
    --refpanel-version 1000G_P3_GRCh37 \
    --reference-build GRCh37 \
    --regions 21:38347375-38500731,22:17203103-17439826 \
    --nCPU 4
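When the regions come from a table or another program, the comma-delimited --regions value can be generated instead of typed by hand. The sketch below assumes 1-based, inclusive coordinates (the VCF convention); format_regions is a hypothetical helper name, not a pitools function.

```python
def format_regions(regions):
    """Join (chrom, start, end) tuples into a `--regions` value.

    Each region becomes `chrom:start-end`; regions are comma-separated.
    """
    parts = []
    for chrom, start, end in regions:
        start, end = int(start), int(end)
        # Assumption: coordinates are 1-based and start <= end
        if start < 1 or end < start:
            raise ValueError(f"invalid region {chrom}:{start}-{end}")
        parts.append(f"{chrom}:{start}-{end}")
    return ",".join(parts)


print(format_regions([("21", 38347375, 38500731), ("22", 17203103, 17439826)]))
# prints 21:38347375-38500731,22:17203103-17439826
```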

pitools performs pre-phasing automatically before imputation. If your input VCF file is already phased and you do not want to phase it again, set the --unprephase argument to skip that step:

pitools impute -C config.yaml \
    -I your.vcf.gz \
    -O test_outprefix \
    --refpanel-version 1000G_P3_GRCh37 \
    --reference-build GRCh37 \
    --unprephase \
    --nCPU 4
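If you are unsure whether a VCF file is already phased, you can inspect its genotypes before deciding to use --unprephase. In VCF, phased genotypes separate alleles with "|" and unphased genotypes use "/". The following sketch is independent of pitools and uses hypothetical function names; it reports a file as phased only if no sample genotype contains "/".

```python
import gzip


def is_fully_phased(lines):
    """Return True if no genotype in the given VCF lines uses '/'.

    `lines` is any iterable of VCF text lines. Genotypes without a
    separator (haploid calls or a bare '.') are ignored; './.' counts
    as unphased.
    """
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # header or blank line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no sample columns
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue
        gt_index = fmt.index("GT")
        for sample in fields[9:]:
            values = sample.split(":")
            if gt_index < len(values) and "/" in values[gt_index]:
                return False
    return True


def vcf_is_fully_phased(path):
    """Open a plain or gzip/bgzip-compressed VCF and check its genotypes."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return is_fully_phased(handle)
```

For large files it may be enough to check only the first few thousand records, for example by passing itertools.islice(handle, 5000) to is_fully_phased.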

