
A set of tools to aid in the identification of false positive variants in Variant Call Files.


# ExtremeVariantFilter

Extreme Variant Filter (EVF) is a set of XGBoost-based tools developed to aid in identifying false-positive variants in genomic Variant Call Format (VCF) files.

### Functions

__apply_filter__

Usage:

    apply_filter (--vcf STR) (--snp-model STR) (--indel-model STR) [--verbose]

Description:

Apply models generated by train_model to a VCF.

Arguments:
--vcf STR

VCF to be filtered

--snp-model STR

Model for applying to SNPs

--indel-model STR

Model for applying to InDels

Options:
-h, --help

Show this help message and exit.

-v, --version

Show version and exit.

--verbose

Log output

Examples:

    apply_filter --vcf <path/to/vcf> --snp-model <snp.pickle.dat> --indel-model <indel.pickle.dat>
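For scripted pipelines, the same call can be assembled from Python. The sketch below only builds the argument list from the flags documented above; the helper name `build_apply_filter_cmd` and the file names are hypothetical, and it assumes the `apply_filter` executable is on `PATH` after installation.

```python
def build_apply_filter_cmd(vcf, snp_model, indel_model, verbose=False):
    """Assemble an apply_filter command line as a list of arguments.

    Hypothetical helper: only the flag names are taken from the usage
    text above; nothing here is part of the package itself.
    """
    cmd = [
        "apply_filter",
        "--vcf", str(vcf),
        "--snp-model", str(snp_model),
        "--indel-model", str(indel_model),
    ]
    if verbose:
        cmd.append("--verbose")
    return cmd


# Placeholder file names; substitute real paths.
example_cmd = build_apply_filter_cmd(
    "sample.vcf", "snp.pickle.dat", "indel.pickle.dat", verbose=True
)
print(" ".join(example_cmd))
```

The resulting list can be passed to `subprocess.run(example_cmd, check=True)`, which raises an error if the tool exits with a non-zero status.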

__train_model__

Usage:

    train_model (--true-pos STR) (--false-pos STR) (--type STR) [--out STR] [--njobs INT] [--verbose]

Description:

Train a model to be saved and used to filter VCFs.

Arguments:
--true-pos STR

Path to a true-positive VCF from VCFeval, or a comma-separated list of paths

--false-pos STR

Path to a false-positive VCF from VCFeval, or a comma-separated list of paths

--type STR

SNP or INDEL

Options:
-o, --out <STR>

Outfile name for writing model [default: (type).filter.pickle.dat]

-n, --njobs <INT>

Number of threads to run in parallel [default: 2]

-h, --help

Show this help message and exit.

-v, --version

Show version and exit.

--verbose

Log output

Examples:

    train_model --true-pos <path/to/tp/vcf(s)> --false-pos <path/to/fp/vcf(s)> --type [SNP, INDEL] --njobs 20
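Since `--true-pos` and `--false-pos` accept either one path or a comma-separated list, a small wrapper can take care of the joining when training is driven from Python. This is a minimal sketch: `build_train_model_cmd` is a hypothetical helper, the file names are placeholders, and restricting `--type` to the exact strings `SNP` and `INDEL` is an assumption based on the argument description above.

```python
import os


def _join_paths(paths):
    """Return a single path unchanged, or join several with commas."""
    if isinstance(paths, (str, os.PathLike)):
        return str(paths)
    return ",".join(str(p) for p in paths)


def build_train_model_cmd(true_pos, false_pos, variant_type,
                          out=None, njobs=None, verbose=False):
    """Assemble a train_model command line from the documented flags."""
    # Assumption: the tool expects exactly "SNP" or "INDEL".
    if variant_type not in ("SNP", "INDEL"):
        raise ValueError("variant_type must be 'SNP' or 'INDEL'")
    cmd = [
        "train_model",
        "--true-pos", _join_paths(true_pos),
        "--false-pos", _join_paths(false_pos),
        "--type", variant_type,
    ]
    if out is not None:
        cmd += ["--out", str(out)]
    if njobs is not None:
        cmd += ["--njobs", str(int(njobs))]
    if verbose:
        cmd.append("--verbose")
    return cmd


# Placeholder file names; one command per variant type.
for variant_type in ("SNP", "INDEL"):
    print(" ".join(build_train_model_cmd(
        ["tp_run1.vcf", "tp_run2.vcf"], "fp_run1.vcf", variant_type, njobs=20
    )))
```

Leaving `out` unset lets the tool fall back to its own default output name.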

### Install

To install EVF from PyPI, run:

    pip install extremevariantfilter

Alternatively, clone this repo and build using the following commands:

    git clone https://github.com/stLFR/extremevariantfilter.git
    cd extremevariantfilter
    pip install .

### stLFR Paper Results

If you'd like to use this tool to corroborate the results from the [stLFR paper on bioRxiv](https://www.biorxiv.org/content/early/2018/05/17/324392.1), the models used for variant filtering are available in the `models/` directory. To get identical results, run `pip install -r requirements.txt` from within this directory after installation so that your environment matches the one we used for our results. Different versions of certain packages will produce different results.
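Because package versions affect the results, it can help to confirm that the active environment really matches the pinned versions before filtering. The sketch below is one way to do that with the standard library (`importlib.metadata`, Python 3.8+). It assumes `requirements.txt` uses plain `name==version` pins; the function names and the package names in the example are hypothetical.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Return {package: version} for lines of the form name==version."""
    pins = {}
    for raw in requirements_text.splitlines():
        # Drop trailing comments and environment markers.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue  # skip blanks, comments, and unpinned requirements
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a package, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {package: (pinned, installed)} for every mismatch."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# Hypothetical pins and a fake lookup, for illustration only.
sample = "examplepkg==1.2.3\notherpkg==0.9  # pinned\n"
fake_env = {"examplepkg": "1.2.3", "otherpkg": "1.0"}
print(find_mismatches(parse_pins(sample), lookup=fake_env.get))
```

To check a real environment, read the file contents and call `find_mismatches(parse_pins(text))` with the default lookup; an empty dictionary means every pinned package is installed at the pinned version.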
