A set of tools to aid in the identification of false-positive variants in Variant Call Format (VCF) files.
# ExtremeVariantFilter
Extreme Variant Filter (EVF) is a set of XGBoost-based tools developed to aid in the identification of false-positive variants in genomic Variant Call Format (VCF) files.
### Functions
__apply_filter__
- Usage:
apply_filter (--vcf STR) (--snp-model STR) (--indel-model STR) [--verbose]
- Description:
Apply models generated by train_model to a VCF.
- Arguments:
- --vcf STR
VCF to be filtered
- --snp-model STR
Model for applying to SNPs
- --indel-model STR
Model for applying to InDels
- Options:
- -h, --help
Show this help message and exit.
- -v, --version
Show version and exit.
- --verbose
Log output
- Examples:
apply_filter --vcf <input.vcf> --snp-model <snp.pickle.dat> --indel-model <indel.pickle.dat>
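When many VCFs are filtered from a script, the command line above can be assembled programmatically. The following is a minimal sketch and not part of the package: the helper names `build_apply_filter_cmd` and `run_apply_filter` are hypothetical, only the flags documented above are used, and `apply_filter` is assumed to be on the `PATH` after installation.

```python
import subprocess


def build_apply_filter_cmd(vcf, snp_model, indel_model, verbose=False):
    """Assemble an apply_filter argument list from the documented flags.

    Hypothetical helper for illustration; not part of extremevariantfilter.
    """
    cmd = [
        "apply_filter",
        "--vcf", str(vcf),
        "--snp-model", str(snp_model),
        "--indel-model", str(indel_model),
    ]
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_apply_filter(vcf, snp_model, indel_model, verbose=False):
    """Run apply_filter; raises CalledProcessError on a non-zero exit code."""
    cmd = build_apply_filter_cmd(vcf, snp_model, indel_model, verbose)
    return subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.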
__train_model__
- Usage:
train_model (--true-pos STR) (--false-pos STR) (--type STR) [--out STR] [--njobs INT] [--verbose]
- Description:
Train a model to be saved and used to filter VCFs.
- Arguments:
- --true-pos STR
Path to true-positive VCF from VCFeval or comma-separated list of paths
- --false-pos STR
Path to false-positive VCF from VCFeval or comma-separated list of paths
- --type STR
SNP or INDEL
- Options:
- -o, --out <STR>
Outfile name for writing model [default: (type).filter.pickle.dat]
- -n, --njobs <INT>
Number of threads to run in parallel [default: 2]
- -h, --help
Show this help message and exit.
- -v, --version
Show version and exit.
- --verbose
Log output
- Examples:
train_model --true-pos <path/to/tp/vcf(s)> --false-pos <path/to/fp/vcf(s)> --type [SNP, INDEL] --njobs 20
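Because the true-positive and false-positive arguments accept either a single path or a comma-separated list, a training run over several VCFeval outputs needs the paths joined first. The sketch below illustrates that step under stated assumptions: `build_train_model_cmd` is a hypothetical helper that is not part of the package, it uses only the flags documented above, and it upper-cases the type to match the documented `SNP` and `INDEL` values.

```python
import os


def _join_paths(paths):
    """Accept one path or an iterable of paths; return a comma-separated string."""
    if isinstance(paths, (str, os.PathLike)):
        return str(paths)
    return ",".join(str(p) for p in paths)


def build_train_model_cmd(true_pos, false_pos, variant_type,
                          out=None, njobs=None, verbose=False):
    """Assemble a train_model argument list (hypothetical helper)."""
    variant_type = variant_type.upper()
    if variant_type not in ("SNP", "INDEL"):
        raise ValueError("variant_type must be 'SNP' or 'INDEL'")
    cmd = [
        "train_model",
        "--true-pos", _join_paths(true_pos),
        "--false-pos", _join_paths(false_pos),
        "--type", variant_type,
    ]
    # Optional flags are only added when set, so the tool's defaults apply otherwise.
    if out is not None:
        cmd += ["--out", str(out)]
    if njobs is not None:
        cmd += ["--njobs", str(int(njobs))]
    if verbose:
        cmd.append("--verbose")
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once `train_model` is installed.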
### Install
To install and run EVF simply type:
pip install extremevariantfilter
Alternatively, clone this repo and build using the following commands:
git clone https://github.com/stLFR/extremevariantfilter.git
cd extremevariantfilter
pip install .
### stLFR Paper Results
If you'd like to use this tool to corroborate the results from the [stLFR paper on bioRxiv](https://www.biorxiv.org/content/early/2018/05/17/324392.1), the models used for variant filtering are available within the `models/` directory. To get identical results, run `pip install -r requirements.txt` from within this directory after installation so that your environment matches the one we used for our results. Different versions of certain packages will produce different results.
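Since results depend on package versions, it can help to check whether the current environment already matches the pinned requirements. The following is a rough sketch, not part of the package: `parse_pins` and `find_mismatches` are hypothetical helpers, and the sketch assumes the requirements file uses plain `name==version` lines (unpinned entries and other specifier forms are skipped). It requires Python 3.8 or later for `importlib.metadata`.

```python
from importlib import metadata  # standard library from Python 3.8


def parse_pins(lines):
    """Return {package: version} for lines of the form 'name==version'."""
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and unpinned requirements
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins, installed_version=metadata.version):
    """Return {package: (pinned, installed or None)} for every mismatch."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = installed_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

For example, `find_mismatches(parse_pins(open("requirements.txt")))` returns an empty dictionary when every pinned package is installed at the pinned version.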