An automated tool for processing whole-exome sequencing data
Whole-exome sequencing has been widely used in clinical applications to identify the genetic causes of several diseases. HPexome automates many data processing tasks in exome-sequencing analysis of large-scale cohorts. Given analysis-ready alignment files, it breaks the input data into small genomic regions that can be processed efficiently in parallel on cluster-computing environments. It relies on the Queue workflow execution engine and on the GATK variant calling tool and its best practices to output a high-confidence unified variant calling file. The workflow is shipped as a Python command-line tool, making it easy to install and use.
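To illustrate the idea of breaking input data into small genomic regions, here is a minimal sketch in Python. It is not the algorithm used by HPexome, Queue, or GATK; the function name `scatter_regions` and the contig lengths in the usage note are hypothetical. It assumes the reference is described by a mapping of contig names to lengths and splits it into a fixed number of groups of roughly equal size.

```python
from typing import Dict, List, Tuple

# (contig, 1-based start, inclusive end)
Region = Tuple[str, int, int]


def scatter_regions(contig_lengths: Dict[str, int],
                    scatter_count: int) -> List[List[Region]]:
    """Split contigs into `scatter_count` groups of roughly equal total size.

    Hypothetical helper for illustration only.
    """
    if scatter_count < 1:
        raise ValueError("scatter_count must be at least 1")
    total = sum(contig_lengths.values())
    groups: List[List[Region]] = [[] for _ in range(scatter_count)]
    offset = 0  # number of bases already assigned to a group
    for contig, length in contig_lengths.items():
        start = 1
        while start <= length:
            # Group that the next unassigned base falls into.
            index = min(offset * scatter_count // total, scatter_count - 1)
            # Exclusive global end of that group (ceiling division).
            boundary = -(-(index + 1) * total // scatter_count)
            # Take bases up to the end of the contig or of the group.
            take = min(length - start + 1, boundary - offset)
            groups[index].append((contig, start, start + take - 1))
            start += take
            offset += take
    return groups
```

For example, `scatter_regions({"1": 100, "2": 50}, 3)` yields three groups of 50 bases each. Each group could then be handed to a separate cluster job.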
Requirements
- BAM files must be sorted in coordinate mode. See sort bam files script.
- BAM files must have @RG tags with ID, SM, LB, PL and PU information. See fix rg tag script.
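The two requirements above can be checked before running the workflow. The following is a minimal sketch, not part of HPexome: the function name `check_bam_header` is hypothetical, and it assumes the header has already been extracted as text (for instance with `samtools view -H sample.bam`).

```python
from typing import List

REQUIRED_RG_FIELDS = ("ID", "SM", "LB", "PL", "PU")


def check_bam_header(header_text: str) -> List[str]:
    """Return a list of problems found in a SAM/BAM header.

    An empty list means both requirements are met.
    Hypothetical helper for illustration only.
    """
    problems = []
    sort_order = None
    read_groups = 0
    for line in header_text.splitlines():
        fields = line.split("\t")
        # Header fields after the record type have the form TAG:VALUE.
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        if fields[0] == "@HD":
            sort_order = tags.get("SO")
        elif fields[0] == "@RG":
            read_groups += 1
            missing = [k for k in REQUIRED_RG_FIELDS if k not in tags]
            if missing:
                problems.append("@RG %s is missing: %s"
                                % (tags.get("ID", "?"), ", ".join(missing)))
    if sort_order != "coordinate":
        problems.append("sort order is %r, expected 'coordinate'" % sort_order)
    if read_groups == 0:
        problems.append("no @RG line found")
    return problems
```

A file whose header yields a non-empty list would need to be re-sorted or have its read groups fixed before being passed to the tool.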
Example
The following command line takes a list of analysis-ready BAM files stored in the alignment_files directory and the reference genome files (version b37).
It then breaks the input data into smaller parts (--scatter_count 16) and submits them to the SGE batch system (--job_runner GridEngine).
All samples are merged into a single VCF file (--unified_vcf), and output files are written to the result_files directory.
hpexome \
--bam alignment_files \
--genome references/b37/human_g1k_v37_decoy.fasta \
--dbsnp references/b37/dbsnp_138.b37.vcf \
--indels references/b37/Mills_and_1000G_gold_standard.indels.b37.vcf \
--indels references/b37/1000G_phase1.indels.b37.vcf \
--sites references/b37/1000G_phase1.snps.high_confidence.b37.vcf \
--sites references/b37/1000G_omni2.5.b37.vcf \
--unified_vcf \
--scatter_count 16 \
--job_runner GridEngine \
result_files
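When the same command has to be launched for many cohorts, it can help to assemble it programmatically. The sketch below is an assumption-laden illustration, not an HPexome API: the function `build_hpexome_command` and its parameter names are hypothetical, and it uses only the command-line options that appear in the example above.

```python
import shlex
from typing import List, Sequence


def build_hpexome_command(bam_dir: str, genome: str, dbsnp: str,
                          indels: Sequence[str], sites: Sequence[str],
                          destination: str, scatter_count: int = 16,
                          job_runner: str = "GridEngine",
                          unified_vcf: bool = True) -> List[str]:
    """Assemble an argument list mirroring the example command.

    Hypothetical helper; only options shown in the example are used.
    """
    cmd = ["hpexome", "--bam", bam_dir, "--genome", genome, "--dbsnp", dbsnp]
    for path in indels:  # --indels may be given more than once
        cmd += ["--indels", path]
    for path in sites:  # --sites may be given more than once
        cmd += ["--sites", path]
    if unified_vcf:
        cmd.append("--unified_vcf")
    cmd += ["--scatter_count", str(scatter_count), "--job_runner", job_runner]
    cmd.append(destination)
    return cmd


command = build_hpexome_command(
    bam_dir="alignment_files",
    genome="references/b37/human_g1k_v37_decoy.fasta",
    dbsnp="references/b37/dbsnp_138.b37.vcf",
    indels=["references/b37/Mills_and_1000G_gold_standard.indels.b37.vcf",
            "references/b37/1000G_phase1.indels.b37.vcf"],
    sites=["references/b37/1000G_phase1.snps.high_confidence.b37.vcf",
           "references/b37/1000G_omni2.5.b37.vcf"],
    destination="result_files",
)
# Print a shell-safe rendering of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where `hpexome` is installed.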
For more information see http://bcblab.org/hpexome.
File details
Details for the file HPexome-1.1.2.tar.gz.
File metadata
- Download URL: HPexome-1.1.2.tar.gz
- Upload date:
- Size: 6.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.34.0 CPython/3.7.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 80d94e639c5906cef3eb9da0eb2d8a851320e351c2634a3f344a726dcaf281d1
MD5 | 13be2ec4d071d75782d3391e4f852e9a
BLAKE2b-256 | 5a049d708ba02d768be257d93349e03ce9ad1890669752ed2d14760c8d7bc9b5
File details
Details for the file HPexome-1.1.2-py2-none-any.whl.
File metadata
- Download URL: HPexome-1.1.2-py2-none-any.whl
- Upload date:
- Size: 19.0 kB
- Tags: Python 2
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.34.0 CPython/3.7.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 961f906562ce8b4cd583aa771dbd2df93bf71b0976b4dd09f778f909bcf6f8c7
MD5 | a94fcfe3e5ec8ac7c27d1b1c3ad0f737
BLAKE2b-256 | 42708f4f507350060f332b8eb202c65b0bc8691ec2e51f1ce26840d1cb8f9a49