An automated tool for processing whole-exome sequencing data

Project description

Whole-exome sequencing is widely used in clinical applications to identify the genetic causes of disease. HPexome automates many of the data processing tasks involved in exome-sequencing analysis of large-scale cohorts. Given analysis-ready alignment files, it breaks the input data into small genomic regions that can be processed efficiently in parallel on cluster-computing environments. It relies on the Queue workflow execution engine, the GATK variant calling tool, and the GATK best practices to produce a high-confidence unified variant call file. The workflow is shipped as a Python command-line tool, which makes it easy to install and use.
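The idea of breaking input data into small genomic regions can be illustrated with a minimal sketch. This is not how Queue or GATK partition their work internally; the function name `split_region` and the near-equal splitting rule are assumptions made purely for illustration.

```python
def split_region(contig, start, end, parts):
    """Split the 1-based, inclusive region contig:start-end into
    `parts` contiguous pieces of near-equal length.

    Illustrative only: a hypothetical helper, not part of HPexome,
    Queue, or GATK.
    """
    length = end - start + 1
    if parts < 1 or length < parts:
        raise ValueError("parts must be between 1 and the region length")
    # Spread any remainder over the first `extra` pieces.
    base, extra = divmod(length, parts)
    pieces = []
    pos = start
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        pieces.append((contig, pos, pos + size - 1))
        pos += size
    return pieces


if __name__ == "__main__":
    for piece in split_region("20", 1, 100, 4):
        print(piece)
```

Each piece could then be handed to an independent job, and the per-region results gathered afterwards into a single output.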

Requirements

BAM files must be sorted by coordinate. See sort bam files script.
BAM files must have @RG tags with ID, SM, LB, PL and PU information. See fix rg tag script.
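Both requirements can be checked from the text header of a BAM file before a run is submitted. The sketch below is a hypothetical helper, not part of HPexome; it assumes the header text has already been extracted (for example with `samtools view -H`, if samtools is available) and only inspects the `@HD` and `@RG` lines.

```python
REQUIRED_RG_FIELDS = ("ID", "SM", "LB", "PL", "PU")


def check_sam_header(header_text):
    """Return a list of problems found in a SAM/BAM text header.

    An empty list means the header declares coordinate sorting and
    every @RG line carries ID, SM, LB, PL and PU.
    """
    problems = []
    sort_order = None
    read_groups = []
    for line in header_text.splitlines():
        fields = line.split("\t")
        # Header fields after the record type look like TAG:value.
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        if fields[0] == "@HD":
            sort_order = tags.get("SO")
        elif fields[0] == "@RG":
            read_groups.append(tags)
    if sort_order != "coordinate":
        problems.append(f"sort order is {sort_order!r}, expected 'coordinate'")
    if not read_groups:
        problems.append("no @RG line found")
    for rg in read_groups:
        missing = [t for t in REQUIRED_RG_FIELDS if t not in rg]
        if missing:
            problems.append(
                f"@RG {rg.get('ID', '?')} is missing: {', '.join(missing)}"
            )
    return problems


if __name__ == "__main__":
    header = (
        "@HD\tVN:1.6\tSO:unsorted\n"
        "@RG\tID:rg1\tSM:sample1\n"
    )
    for problem in check_sam_header(header):
        print(problem)
```

The check reads only the header, so it says nothing about whether the reads themselves are actually in coordinate order.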

Example

The following command takes a list of analysis-ready BAM files stored in the alignment_files directory, together with reference genome files (version b37). It breaks the input data into smaller parts (--scatter_count 16) and submits them to the SGE batch system (--job_runner GridEngine). All samples are merged into a single VCF file (--unified_vcf), and output files are written to the result_files directory.

hpexome \
    --bam alignment_files \
    --genome references/b37/human_g1k_v37_decoy.fasta  \
    --dbsnp references/b37/dbsnp_138.b37.vcf \
    --indels references/b37/Mills_and_1000G_gold_standard.indels.b37.vcf \
    --indels references/b37/1000G_phase1.indels.b37.vcf \
    --sites references/b37/1000G_phase1.snps.high_confidence.b37.vcf \
    --sites references/b37/1000G_omni2.5.b37.vcf \
    --unified_vcf \
    --scatter_count 16 \
    --job_runner GridEngine \
    result_files

For more information see http://bcblab.org/hpexome.

Project details

Release history

1.2.1

Jan 30, 2020

1.2.0

Jan 30, 2020

1.1.2

Oct 11, 2019

This version

1.1.1

Aug 19, 2019

Download files

Source Distribution

HPexome-1.1.1.tar.gz (5.9 kB)

Uploaded Aug 19, 2019 Source

Built Distribution

HPexome-1.1.1-py3-none-any.whl (19.0 kB)

Uploaded Aug 19, 2019 Python 3

File details

Details for the file HPexome-1.1.1.tar.gz.

File metadata

Download URL: HPexome-1.1.1.tar.gz
Upload date: Aug 19, 2019
Size: 5.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.34.0 CPython/3.7.4

File hashes

Hashes for HPexome-1.1.1.tar.gz
Algorithm	Hash digest
SHA256	`c3ab4c7f0e74f0a935309be0000bedb78317d77e157eb86ede3160d1af745667`
MD5	`9d6a2b11d8bfe24c4f048f1ae00dd61c`
BLAKE2b-256	`0b43f31ba34b8324d4a380d707d91752e5e659006902b5f6e85337fb0ddb7d66`

File details

Details for the file HPexome-1.1.1-py3-none-any.whl.

File metadata

Download URL: HPexome-1.1.1-py3-none-any.whl
Upload date: Aug 19, 2019
Size: 19.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.34.0 CPython/3.7.4

File hashes

Hashes for HPexome-1.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f19e8eac01f74128c777187308533385939584e2038bcc8a0e52f26061ecbe82`
MD5	`9ed206bbb0c017dc5d3c54ae9454adce`
BLAKE2b-256	`bc33f8fed33247a98a0c7577bc030781b2cd9cc740709116a8f35a11fbdedbf6`
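
The SHA256 digests listed above can be compared against a downloaded file with Python's standard `hashlib` module. The following is a minimal sketch; the helper names `sha256_of_file` and `matches_expected` are hypothetical, and the file path is assumed to point at a local copy of the distribution.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`,
    reading it in chunks so large files need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_expected("HPexome-1.1.1.tar.gz", "c3ab4c7f0e74f0a935309be0000bedb78317d77e157eb86ede3160d1af745667")` should return `True` for an intact copy of the source distribution saved under that name in the current directory.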