A set of tools to aid in the identification of false positive variants in Variant Call Files.

ExtremeVariantFilter

Extreme Variant Filter (EVF) is a set of XGBoost-based tools for identifying false-positive variants in genomic Variant Call Format (VCF) files.

Functions

apply_filter

Usage:
    apply_filter (--vcf STR) (--snp-model STR) (--indel-model STR) [--verbose]

Description:
    Apply models generated by train_model to a VCF.

Arguments:
    --vcf STR                     VCF to be filtered
    --snp-model STR               Model for applying to SNPs
    --indel-model STR             Model for applying to InDels

Options:
    -h, --help                      Show this help message and exit.
    -v, --version                   Show version and exit.
    --verbose                       Log output

Examples:
    apply_filter --vcf <path/to/vcf> --snp-model <snp.pickle.dat> --indel-model <indel.pickle.dat>
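
When filtering is one step in a larger Python pipeline, the command can be assembled and launched with `subprocess`. The following is a minimal sketch, assuming `apply_filter` is on the `PATH` after installation. The helper names `build_apply_filter_cmd` and `run_apply_filter` are hypothetical and not part of EVF; only flags listed in the usage text above are used.

```python
import subprocess


def build_apply_filter_cmd(vcf, snp_model, indel_model, verbose=False):
    """Return the argument list for apply_filter (hypothetical helper)."""
    cmd = [
        "apply_filter",
        "--vcf", vcf,
        "--snp-model", snp_model,
        "--indel-model", indel_model,
    ]
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_apply_filter(vcf, snp_model, indel_model, verbose=False):
    """Run apply_filter; raises CalledProcessError on a non-zero exit code."""
    # Assumption: the apply_filter entry point was installed by pip and is on PATH.
    subprocess.check_call(
        build_apply_filter_cmd(vcf, snp_model, indel_model, verbose)
    )
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.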

train_model

Usage:
    train_model (--true-pos STR) (--false-pos STR) (--type STR) [--out STR] [--njobs INT] [--verbose]

Description:
    Train a model to be saved and used to filter VCFs.

Arguments:
    --true-pos STR          Path to true-positive VCF from VCFeval or comma-separated list of paths
    --false-pos STR         Path to false-positive VCF from VCFeval or comma-separated list of paths
    --type STR              SNP or INDEL

Options:
    -o, --out <STR>                 Outfile name for writing model [default: (type).filter.pickle.dat]
    -n, --njobs <INT>               Number of threads to run in parallel [default: 2]
    -h, --help                      Show this help message and exit.
    -v, --version                   Show version and exit.
    --verbose                       Log output

Examples:
    train_model --true-pos <path/to/tp/vcf(s)> --false-pos <path/to/fp/vcf(s)> --type [SNP, INDEL] --njobs 20
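
Because `--true-pos` and `--false-pos` accept either a single path or a comma-separated list, a small helper can join several VCF paths into one argument. This is a sketch only: `build_train_model_cmd` is a hypothetical name, and joining with a bare comma (no spaces) is an assumption about how the list is parsed.

```python
def build_train_model_cmd(true_pos, false_pos, variant_type,
                          out=None, njobs=None, verbose=False):
    """Return the argument list for train_model (hypothetical helper).

    true_pos and false_pos are lists of VCF paths.
    """
    if variant_type not in ("SNP", "INDEL"):
        raise ValueError("variant_type must be 'SNP' or 'INDEL'")
    cmd = [
        "train_model",
        # Assumption: paths are joined with a bare comma, without spaces.
        "--true-pos", ",".join(true_pos),
        "--false-pos", ",".join(false_pos),
        "--type", variant_type,
    ]
    if out is not None:
        cmd += ["--out", out]
    if njobs is not None:
        cmd += ["--njobs", str(njobs)]
    if verbose:
        cmd.append("--verbose")
    return cmd
```

The resulting list can be passed to `subprocess.check_call`. Leaving `out` and `njobs` unset lets `train_model` fall back to the defaults shown under Options.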

Install

To install EVF, run:

pip install extremevariantfilter

stLFR Paper Results

To corroborate the results from the stLFR paper on bioRxiv, use the models in the models/ directory; these are the models used for variant filtering in the paper. To reproduce the results exactly, run pip install -r requirements.txt from within this directory after installation so that your environment matches the one we used. Different versions of some packages can produce different results.
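
Since package versions affect the results, it can help to confirm that the active environment matches the pinned versions before filtering. The sketch below assumes that requirements.txt pins packages as `name==version` and that the interpreter is Python 3.8 or later (for `importlib.metadata`); the function names are hypothetical.

```python
from importlib import metadata


def parse_pins(lines):
    """Map package name -> pinned version for lines of the form name==version."""
    pins = {}
    for raw in lines:
        # Drop comments and environment markers, then surrounding whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue  # unpinned or empty line: nothing to compare
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (pinned, installed)} for every package that differs."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Calling `find_mismatches(parse_pins(open("requirements.txt")))` returns an empty dictionary when every pinned package is installed at the pinned version.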

Download files

Source Distribution

extremevariantfilter-0.0a1.tar.gz (7.7 kB)

Uploaded Source

Built Distribution

extremevariantfilter-0.0a1-py2-none-any.whl (11.5 kB)

Uploaded Python 2
