HAPPE

A tool to visualize haplotype patterns and related information in Excel. Please cite this paper when using HAPPE in your publications:

Cong Feng, Xingwei Wang, Shishi Wu, Weidong Ning, Bo Song, Jianbin Yan, and Shifeng Cheng. 2022. “HAPPE: A Tool for Population Haplotype Analysis and Visualization in Editable Excel Tables.” Frontiers in Plant Science 13 (July): 927407. https://doi.org/10.3389/fpls.2022.927407.


Installing HAPPE

The easiest way to install HAPPE is to use pip3.

pip3 install HAPPE

Alternatively, clone the project to a local directory and install it with:

python3 setup.py install --record log.txt
# To uninstall the package:
# cat log.txt | xargs rm -rf

You should then have the HAPPE command available:

$ HAPPE -h

usage: HAPPE [-h] -g CONFIG -v GZVCF [-k KEEP] [-r REGION]
                          [-s SNPLIST] -i INF -c COLOR [-I SNPINF] [-R REF]
                          [-F FUNCANN] [-f | -x | -n] [-D DEPTH] [-d DEPTHBIN]
                          -o OUTPUT

show haplotype patterns in excel file./fengcong@caas.cn

optional arguments:
  -h, --help            show this help message and exit
  -g CONFIG, --config CONFIG
                        config file.[required]
  -v GZVCF, --gzvcf GZVCF
                        gzvcf, bcftools indexed.use to annotation and get
                        ref/alt basepair.[required]
  -k KEEP, --keep KEEP  keep sample, if u wana plot a subset of
                        --gzvcf.[optional]
  -r REGION, --region REGION
                        if u wana plot a subset of --gzvcf, u can use this
                        option. if u use this option , ucant use -s
                        option[optional]
  -s SNPLIST, --snplist SNPLIST
                        snp id list(format:chr_pos). if u use this option , u
                        cant use -r option.[optional]
  -w TREEWIDTH, --treewidth TREEWIDTH
                        How many columns do you want to occupy for this tree
                        topology.(default=1000)[optional]
  -i INF, --inf INF     the information of each sample.[required]
  -c COLOR, --color COLOR
                        the color of each sample.[required]
  -I SNPINF, --snpinf SNPINF
                        more information about SNP.[optional]
  -R REF, --Ref REF     change Reference and color system.[optional]
  -F FUNCANN, --FuncAnn FUNCANN
                        functional annotation file.[optional]
  -f, --functional      only functional SNP
  -x, --coding          only coding region SNP
  -n, --noncoding       only noncoding region SNP
  -D DEPTH, --Depth DEPTH
                        depth dir for each sample.[optional]
  -d DEPTHBIN, --Depthbin DEPTHBIN
                        Depth bin size.[optional,default:50]
  -o OUTPUT, --output OUTPUT
                        output prefix

Preparing config file

[software]
bgzip=
bcftools=
tabix=
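
The three entries point HAPPE at the external tools it calls. As a minimal sketch (not part of HAPPE), the file could be generated from Python by looking each tool up on PATH. The helper name render_config and the fallback to an empty value for a tool that is not found are assumptions made for illustration.

```python
import configparser
import io
import shutil

TOOLS = ("bgzip", "bcftools", "tabix")


def render_config(tool_paths=None):
    """Return the text of a config file with a [software] section.

    tool_paths optionally maps a tool name to an explicit path. Tools that
    are neither given nor found on PATH get an empty value, like the blank
    template, so they can be filled in by hand.
    """
    tool_paths = tool_paths or {}
    # interpolation=None keeps any '%' in a path literal.
    config = configparser.ConfigParser(interpolation=None)
    config["software"] = {
        tool: tool_paths.get(tool) or shutil.which(tool) or ""
        for tool in TOOLS
    }
    buffer = io.StringIO()
    # Write "key=value" without spaces around "=", as in the template.
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_config(), end="")
```

Redirecting the printed text to config.ini gives a file that can be passed with -g. Any empty entry still has to be completed manually.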

Preparing the vcf file

  1. The SNP/INDEL ID must be in the format Chromosome_position.
  2. Only bi-allelic variants may remain in the VCF file.
  3. Compress the VCF to vcf.gz using bgzip.
  4. Use bcftools index to create an index for the vcf.gz file.
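
The first two requirements can be checked before running HAPPE. The sketch below is not part of HAPPE. It assumes that "Chromosome_position" means the ID equals the record's own CHROM and POS joined by an underscore, and that a comma in the ALT column marks a multi-allelic site. The function names are hypothetical.

```python
import gzip


def check_record(line):
    """Return a list of problems found in one tab-separated VCF data line."""
    chrom, pos, var_id, ref, alt = line.rstrip("\n").split("\t")[:5]
    problems = []
    # Assumption: the ID must match this record's CHROM and POS.
    if var_id != f"{chrom}_{pos}":
        problems.append(f"ID {var_id!r} is not {chrom}_{pos}")
    # A comma in ALT means more than one alternate allele.
    if "," in alt:
        problems.append(f"{chrom}_{pos}: more than one ALT allele ({alt})")
    return problems


def check_vcf(path):
    """Collect problems from every data line of a plain or bgzipped VCF."""
    # bgzip output is a valid multi-member gzip stream, so gzip can read it.
    opener = gzip.open if str(path).endswith(".gz") else open
    problems = []
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.startswith("#"):
                problems.extend(check_record(line))
    return problems
```

An empty list from check_vcf("test.vcf.gz") means no record violated either rule.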

Preparing the depth information

If you want to integrate depth information, prepare the depth files as follows:

  1. Create a directory for each sample, named after the sample.
  2. Use mosdepth to calculate the depth at each position for each sample.
# example
mosdepth -f ref.fa -Q 0 sample1/sample1.Q0  path/to/sample1.bam
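
For many samples, the same command can be assembled in a loop. This is only a sketch: it reuses the flags from the example above unchanged, and the function name, the depth_root argument and the sample-to-BAM mapping are hypothetical.

```python
from pathlib import Path


def mosdepth_command(sample, bam, reference, depth_root):
    """Build the mosdepth argument list for one sample.

    The output prefix is <depth_root>/<sample>/<sample>.Q0, so every sample
    gets its own directory named after the sample.
    """
    prefix = Path(depth_root) / sample / f"{sample}.Q0"
    return ["mosdepth", "-f", str(reference), "-Q", "0", str(prefix), str(bam)]


if __name__ == "__main__":
    # Hypothetical sample-to-BAM mapping, for illustration only.
    bams = {"sample1": "path/to/sample1.bam", "sample2": "path/to/sample2.bam"}
    for name, bam_path in bams.items():
        print(" ".join(mosdepth_command(name, bam_path, "ref.fa", "depth_data")))
```

Each list can be passed to subprocess.run once the sample directory has been created.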

Usage

"-g CONFIG", required parameter, the CONFIG file giving the paths to bcftools, bgzip and tabix.

"-v GZVCF", required parameter, input VCF file, compressed with bgzip and indexed with bcftools.

"-k SAMPLELIST", optional parameter, list of samples to be retained, one sample per line.

"-r REGION", optional parameter, the genomic region to be displayed, format: chromosome:start-end. Cannot be combined with -s.

"-s VARIANTLIST", optional parameter, the list of variant IDs to keep (format: Chromosome_position). Cannot be combined with -r.

"-w TREEWIDTH", optional parameter, the number of columns occupied by the tree topology (default: 1000).

"-i INFORMATION", required parameter, additional sample information; the first column must be the sample ID.

"-c COLOR", required parameter, the color of each sample; the first column is the sample ID and the second column is the color hex code.

"-I VARINFORMATION", optional parameter, additional variant annotation information, such as GWAS p-values; the first column is the variant ID and each following column is an annotation with a header.

"-F FUNCANN", optional parameter, functional annotation file; the first column is the gene name and the fourth column is the functional annotation.

"-f", optional parameter, only variants that change the amino acid are retained. (Requires that the input VCF file has been annotated with SnpEff.)

"-x", optional parameter, only variants in the CDS region are retained. (Requires that the input VCF file has been annotated with SnpEff.)

"-n", optional parameter, only variants in the non-coding region are retained. (Requires that the input VCF file has been annotated with SnpEff.)

"-D DIRECTORY", optional parameter, the directory containing the depth information calculated with mosdepth, one subdirectory per sample.

"-d WINDOWSIZE", optional parameter, window size for calculating normalized depth (default: 50).

"-o PREFIX", required parameter, output prefix.

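The -r region string and the Chromosome_position IDs used by -s describe the same coordinates in two forms. The sketch below, which is not HAPPE's own parser, shows one way to relate them. It assumes that both ends of the region are inclusive and that the position is the part of the ID after the last underscore. Both function names are hypothetical.

```python
import re

# chromosome:start-end, e.g. chr7A:71669854-71670886
REGION_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_region(region):
    """Split 'chromosome:start-end' into (chromosome, start, end)."""
    match = REGION_RE.match(region)
    if match is None:
        raise ValueError(f"region {region!r} is not chromosome:start-end")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError(f"region {region!r} has start greater than end")
    return match["chrom"], start, end


def variant_in_region(variant_id, region):
    """Tell whether a Chromosome_position ID falls inside the region."""
    # rpartition keeps chromosome names that contain underscores intact.
    chrom, _, pos = variant_id.rpartition("_")
    region_chrom, start, end = parse_region(region)
    # Assumption: both region ends are inclusive.
    return chrom == region_chrom and start <= int(pos) <= end
```

For example, parse_region("chr7A:71669854-71670886") returns ("chr7A", 71669854, 71670886).
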
Example

HAPPE \
-g config.ini \
-v test.vcf.gz \
-r chr7A:71669854-71670886 \
-i 1000_Inf.txt \
-c 1000.pop.color \
-F FunctionalAnnotation_v1__HCgenes_v1.0.TAB \
-D path/to/depth_data/ \
-f \
-o test
## Format of the file given to each parameter
## -g config.ini
# [software]
# bgzip=path_to/bgzip
# bcftools=path_to/bcftools
# tabix=path_to/tabix

## -i 1000_Inf.txt
## Just make sure the first column is the sample name.
# Sample_ID	... ...
# sample1   ... ...

## -c 1000.pop.color
## Just make sure the first column is the sample name and the second column is color code.
# Sample_ID	color
# sample1	FF0000
# ...       ...

## -F FunctionalAnnotation_v1__HCgenes_v1.0.TAB
## Just make sure the first column is the gene name and the fourth column is the functional annotation.
# Gene_name	XXX XXX function ... ...
# gene1     XXX XXX func1    ... ...

## -D path/to/depth_data/
## Make sure that the files *mosdepth.summary.txt and *per-base.bed.gz are present in each sample's directory under this directory.
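
The depth directory layout described above can be verified before a run. The following is a sketch rather than HAPPE code: it only checks that each sample directory exists and contains at least one file matching each of the two patterns named above. The function name is hypothetical.

```python
from pathlib import Path

REQUIRED_PATTERNS = ("*mosdepth.summary.txt", "*per-base.bed.gz")


def missing_depth_files(depth_root, samples):
    """Map each incomplete sample to the file patterns it is missing."""
    missing = {}
    for sample in samples:
        sample_dir = Path(depth_root) / sample
        if not sample_dir.is_dir():
            # No directory at all: every required file is missing.
            absent = list(REQUIRED_PATTERNS)
        else:
            absent = [
                pattern
                for pattern in REQUIRED_PATTERNS
                if not any(sample_dir.glob(pattern))
            ]
        if absent:
            missing[sample] = absent
    return missing
```

An empty dictionary means every listed sample has both files. Otherwise the keys name the samples for which mosdepth needs to be rerun.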
