A pipeline for analyzing fungal genomic data
vbtools: a variant benchmark tool to compare VCFs with a consensus set
    # clone this repo
    git clone email@example.com:broadinstitute/vbtools.git

    # set up the conda environment
    cd vbtools
    conda env create -f env.yml    # this will take a few minutes
    conda env list                 # verify the new environment was installed correctly

    # activate the environment
    conda activate vbtools

    # deactivate the environment when done
    conda deactivate

    # completely remove the virtual environment
    conda remove --name vbtools --all

Use the following command to benchmark a VCF against a reference/consensus VCF:
vcfbench.py -v <input.vcf> -b <reference.vcf>
The optional --prefix argument sets a prefix for the output file names.
Currently, the analysis supports only haploid VCFs. Diploid VCFs are converted to haploid before comparison. Input VCFs should follow the VCF specification v4.2.
The following pre-processing steps are performed on the input VCF before the analysis:
- remove unused alleles
- remove monomorphic sites
- remove sites with heterozygous genotypes
- remove non-SNP sites
- remove sites with asterisk (*) alleles, which mark spanning deletions
- convert diploid genotypes to haploid
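The pre-processing steps above can be sketched for a single VCF data line as follows. This is a minimal illustration, not the code used by vcfbench.py: the function name `preprocess_record` is hypothetical, the checks run in an order chosen for convenience, a site is assumed to be monomorphic when no ALT allele is called in any sample, and a half-missing genotype such as `./1` is assumed to collapse to its called allele.

```python
import re

def preprocess_record(line):
    """Return a haploid, SNP-only copy of one VCF data line, or None to drop the site.

    Hypothetical helper for illustration; only the GT field is rewritten, so
    per-allele INFO/FORMAT values (e.g. AD) are left untouched.
    """
    fields = line.rstrip("\n").split("\t")
    ref, alts = fields[3], fields[4].split(",")
    gt_pos = fields[8].split(":").index("GT")

    # Collapse each genotype to one allele; any heterozygous call drops the site.
    haploid = []
    for sample in fields[9:]:
        alleles = set(re.split(r"[/|]", sample.split(":")[gt_pos])) - {"."}
        if len(alleles) > 1:
            return None
        haploid.append(alleles.pop() if alleles else ".")

    # Keep only ALT alleles that are actually called; none left means monomorphic.
    used = sorted({int(a) for a in haploid if a not in (".", "0")})
    if not used:
        return None
    kept_alts = [alts[i - 1] for i in used]

    # Drop asterisk alleles and anything that is not a single-base substitution.
    if "*" in kept_alts:
        return None
    bases = set("ACGT")
    if ref.upper() not in bases or any(a.upper() not in bases for a in kept_alts):
        return None

    # Renumber the remaining alleles and write the haploid genotypes back.
    renumber = {".": ".", "0": "0"}
    renumber.update({str(old): str(new) for new, old in enumerate(used, start=1)})
    fields[4] = ",".join(kept_alts)
    for i, (sample, allele) in enumerate(zip(fields[9:], haploid), start=9):
        parts = sample.split(":")
        parts[gt_pos] = renumber[allele]
        fields[i] = ":".join(parts)
    return "\t".join(fields)
```

For example, a record with ALT `G,T` and genotypes `2/2`, `0/0`, `./.` would come back with ALT `T` and genotypes `1`, `0`, `.`, while a record containing a `0/1` call would be dropped.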
The script outputs:
- Site-level comparison:
  - a TSV file with the number of unique and shared sites.
- Sample-level comparison:
  - not yet available; this functionality will be added to the script soon.
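To make the site-level output concrete, here is a rough sketch of how shared and unique sites could be counted with set operations. It is not the script's actual implementation: the function names, the choice of (CHROM, POS, REF, ALT) as the site key, and the TSV column names are all assumptions made for illustration.

```python
import csv

def read_sites(vcf_lines):
    """Collect (CHROM, POS, REF, ALT) keys from VCF lines, skipping headers."""
    sites = set()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        sites.add((chrom, int(pos), ref, alt))
    return sites

def compare_sites(input_sites, reference_sites):
    """Count sites shared by both sets and sites unique to each one."""
    return {
        "shared": len(input_sites & reference_sites),
        "unique_to_input": len(input_sites - reference_sites),
        "unique_to_reference": len(reference_sites - input_sites),
    }

def write_summary(counts, handle):
    """Write the counts as a two-column TSV (hypothetical column names)."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["category", "n_sites"])
    for category, n_sites in counts.items():
        writer.writerow([category, n_sites])
```

A caller would pass open file handles for the pre-processed input and reference VCFs to `read_sites`, then hand the result of `compare_sites` to `write_summary` along with an output file handle.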