ExtremeVariantFilter

Extreme Variant Filter (EVF) is a set of XGBoost-based tools developed to aid in the identification of false-positive variants in genomic Variant Call Format (VCF) files.

Functions

apply_filter

Usage:
    apply_filter (--vcf STR) (--snp-model STR) (--indel-model STR) [--verbose]

Description:
    Apply models generated by train_model to a VCF.

Arguments:
    --vcf STR                     VCF to be filtered
    --snp-model STR               Model for applying to SNPs
    --indel-model STR             Model for applying to InDels

Options:
    -h, --help                      Show this help message and exit.
    -v, --version                   Show version and exit.
    --verbose                       Log output

Examples:
    apply_filter --vcf <path/to/vcf> --snp-model <snp.pickle.dat> --indel-model <indel.pickle.dat>
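
When apply_filter is driven from a script, it can help to assemble the argument list in one place. The following is a minimal sketch, not part of EVF itself: the helper name build_apply_filter_cmd and the file names in the usage comment are hypothetical, and only the flags documented above are used.

```python
import subprocess


def build_apply_filter_cmd(vcf, snp_model, indel_model, verbose=False):
    """Return the argv list for apply_filter (hypothetical helper).

    Only the flags listed in the usage text above are emitted.
    """
    cmd = [
        "apply_filter",
        "--vcf", vcf,
        "--snp-model", snp_model,
        "--indel-model", indel_model,
    ]
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_apply_filter(vcf, snp_model, indel_model, verbose=False):
    """Run apply_filter; assumes extremevariantfilter is installed and on PATH."""
    subprocess.check_call(
        build_apply_filter_cmd(vcf, snp_model, indel_model, verbose)
    )


# Example (placeholder file names):
# run_apply_filter("sample.vcf", "snp.pickle.dat", "indel.pickle.dat")
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.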

train_model

Usage:
    train_model (--true-pos STR) (--false-pos STR) (--type STR) [--out STR] [--njobs INT] [--verbose]

Description:
    Train a model to be saved and used to filter VCFs.

Arguments:
    --true-pos STR          Path to true-positive VCF from VCFeval or comma-separated list of paths
    --false-pos STR         Path to false-positive VCF from VCFeval or comma-separated list of paths
    --type STR              SNP or INDEL

Options:
    -o, --out <STR>                 Outfile name for writing model [default: (type).filter.pickle.dat]
    -n, --njobs <INT>               Number of threads to run in parallel [default: 2]
    -h, --help                      Show this help message and exit.
    -v, --version                   Show version and exit.
    --verbose                       Log output

Examples:
    train_model --true-pos <path/to/tp/vcf(s)> --false-pos <path/to/fp/vcf(s)> --type <SNP|INDEL> --njobs 20
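
Because --true-pos and --false-pos each accept a comma-separated list of paths, a script that trains on several VCFs needs to join them before calling train_model. The sketch below shows one way to do that. The helper name build_train_model_cmd and the file names in the usage comment are hypothetical; the flags and the default of 2 threads come from the usage text above.

```python
import subprocess


def build_train_model_cmd(true_pos, false_pos, variant_type,
                          out=None, njobs=2, verbose=False):
    """Return the argv list for train_model (hypothetical helper).

    true_pos and false_pos are lists of VCF paths; each list is joined
    with commas, as the --true-pos and --false-pos arguments expect.
    """
    if variant_type not in ("SNP", "INDEL"):
        raise ValueError("variant_type must be 'SNP' or 'INDEL'")
    cmd = [
        "train_model",
        "--true-pos", ",".join(true_pos),
        "--false-pos", ",".join(false_pos),
        "--type", variant_type,
        "--njobs", str(njobs),
    ]
    if out is not None:
        # Without --out, train_model falls back to its documented default name.
        cmd += ["--out", out]
    if verbose:
        cmd.append("--verbose")
    return cmd


# Example (placeholder file names; requires extremevariantfilter on PATH):
# subprocess.check_call(build_train_model_cmd(
#     ["run1_tp.vcf", "run2_tp.vcf"], ["run1_fp.vcf", "run2_fp.vcf"],
#     "SNP", njobs=20))
```

Since apply_filter takes separate SNP and InDel models, the helper would typically be called twice, once per variant type.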

Install

To install EVF, run:

pip install extremevariantfilter

Alternatively, clone this repo and build using the following commands:

git clone https://github.com/stLFR/extremevariantfilter.git
cd extremevariantfilter
pip install .

stLFR Paper Results

If you'd like to use this tool to corroborate the results from the stLFR paper on bioRxiv, the models used for variant filtering are available within the models/ directory. To get identical results, after installation run pip install -r requirements.txt from within this directory so that your environment matches the one we used for our results. Different versions of certain packages will produce different results.
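
Since results depend on package versions, it may be worth confirming that the installed versions match the pins before comparing output. The following is a rough sketch under stated assumptions: it assumes requirements.txt uses simple name==version lines (environment markers and other specifiers are ignored), the function names are hypothetical, and the package names in the usage comment are placeholders rather than EVF's actual dependencies.

```python
def parse_pins(requirements_text):
    """Map package name -> pinned version for 'name==version' lines.

    Comments and lines without '==' are skipped (assumed format).
    """
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, installed):
    """Return {name: (pinned, installed_or_None)} for every package that differs."""
    mismatches = {}
    for name, wanted in pins.items():
        found = installed.get(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# Example with placeholder package names and versions:
# pins = parse_pins(open("requirements.txt").read())
# installed = {"pkg-a": "1.0", "pkg-b": "2.0"}  # e.g. collected from `pip freeze`
# print(find_mismatches(pins, installed))
```

An empty result means every pinned package is present at the pinned version.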

