A set of tools to aid in the identification of false positive variants in Variant Call Files.

ExtremeVariantFilter

Extreme Variant Filter (EVF) is a set of XGBoost-based tools for identifying false-positive variants in genomic Variant Call Format (VCF) files.

Functions

apply_filter

Usage:
    apply_filter (--vcf STR) (--snp-model STR) (--indel-model STR) [--verbose]

Description:
    Apply models generated by train_model to a VCF.

Arguments:
    --vcf STR                     VCF to be filtered
    --snp-model STR               Model for applying to SNPs
    --indel-model STR             Model for applying to InDels

Options:
    -h, --help                      Show this help message and exit.
    -v, --version                   Show version and exit.
    --verbose                       Log output

Examples:
    apply_filter --vcf <path/to/vcf> --snp-model <snp.pickle.dat> --indel-model <indel.pickle.dat>
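
When apply_filter is driven from a pipeline script, the call above can be wrapped as follows. This is a minimal sketch, assuming apply_filter is on the PATH after installation; the helper names build_apply_filter_cmd and run_apply_filter are hypothetical and not part of EVF. Only the flags listed in the usage text are used.

```python
import subprocess


def build_apply_filter_cmd(vcf, snp_model, indel_model, verbose=False):
    """Return the argument list for one apply_filter call (hypothetical helper)."""
    cmd = [
        "apply_filter",
        "--vcf", vcf,
        "--snp-model", snp_model,
        "--indel-model", indel_model,
    ]
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_apply_filter(vcf, snp_model, indel_model, verbose=False):
    """Run apply_filter; raises CalledProcessError if it exits non-zero.

    Assumes the apply_filter entry point is installed and on the PATH.
    """
    subprocess.check_call(
        build_apply_filter_cmd(vcf, snp_model, indel_model, verbose)
    )
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.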

train_model

Usage:
    train_model (--true-pos STR) (--false-pos STR) (--type STR) [--out STR] [--njobs INT] [--verbose]

Description:
    Train a model to be saved and used to filter VCFs.

Arguments:
    --true-pos STR          Path to true-positive VCF from VCFeval or comma-separated list of paths
    --false-pos STR         Path to false-positive VCF from VCFeval or comma-separated list of paths
    --type STR              SNP or INDEL

Options:
    -o, --out <STR>                 Outfile name for writing model [default: (type).filter.pickle.dat]
    -n, --njobs <INT>               Number of threads to run in parallel [default: 2]
    -h, --help                      Show this help message and exit.
    -v, --version                   Show version and exit.
    --verbose                       Log output

Examples:
    train_model --true-pos <path/to/tp/vcf(s)> --false-pos <path/to/fp/vcf(s)> --type [SNP, INDEL] --njobs 20
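
Because --true-pos and --false-pos accept a comma-separated list of paths, a script that collects several VCFeval outputs has to join them before calling train_model. The sketch below shows one way to do that. The helper name build_train_model_cmd is hypothetical, and the check on --type assumes the exact spellings SNP and INDEL given in the help text.

```python
def _join_paths(paths):
    """Accept a single path or a list of paths; return the comma-separated form."""
    if isinstance(paths, str):
        return paths
    return ",".join(paths)


def build_train_model_cmd(true_pos, false_pos, variant_type,
                          out=None, njobs=None, verbose=False):
    """Return the argument list for one train_model call (hypothetical helper)."""
    # Assumption: the type must be spelled exactly as in the help text.
    if variant_type not in ("SNP", "INDEL"):
        raise ValueError("variant_type must be 'SNP' or 'INDEL'")
    cmd = [
        "train_model",
        "--true-pos", _join_paths(true_pos),
        "--false-pos", _join_paths(false_pos),
        "--type", variant_type,
    ]
    # Optional flags are left out so train_model falls back to its own defaults.
    if out is not None:
        cmd += ["--out", out]
    if njobs is not None:
        cmd += ["--njobs", str(njobs)]
    if verbose:
        cmd.append("--verbose")
    return cmd
```

The resulting list can be passed to subprocess.check_call to run the training.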

Install

To install EVF, run:

pip install extremevariantfilter

stLFR Paper Results

If you'd like to use this tool to corroborate the results from the stLFR paper on bioRxiv, the models used for variant filtering are available within the models/ directory. To get identical results, after installation, run pip install -r requirements.txt from within this directory so that your environment matches the one we used for our results. Different versions of certain packages can produce different results.
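
Since package versions affect the results, it can help to confirm that the active environment matches the pinned versions before filtering. The following is a rough sketch, assuming requirements.txt uses plain name==version pins (lines in any other form are skipped). The function names are hypothetical, and installed versions are read from pip freeze.

```python
import re
import subprocess
import sys


def parse_pins(lines):
    """Map normalized package name to version for 'name==version' lines."""
    pins = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments
        if "==" not in line:
            continue  # skip blank lines and anything that is not a plain pin
        name, version = line.split("==", 1)
        # Treat '-', '_' and '.' in names as equivalent, case-insensitively.
        key = re.sub(r"[-_.]+", "-", name.strip()).lower()
        pins[key] = version.strip()
    return pins


def find_mismatches(required, installed):
    """Return {name: (required, installed_or_None)} for packages that differ."""
    return {
        name: (want, installed.get(name))
        for name, want in required.items()
        if installed.get(name) != want
    }


def installed_pins():
    """Read installed versions by running pip freeze in this interpreter."""
    out = subprocess.check_output([sys.executable, "-m", "pip", "freeze"])
    return parse_pins(out.decode("utf-8").splitlines())


def check_requirements(path="requirements.txt"):
    """Compare the pins in the given file against the current environment."""
    with open(path) as handle:
        required = parse_pins(handle)
    return find_mismatches(required, installed_pins())
```

An empty dictionary from check_requirements means every pinned package is installed at the pinned version.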

Download files

Source Distribution

extremevariantfilter-0.0a2.tar.gz (2.7 MB)

Uploaded Source

Built Distribution

extremevariantfilter-0.0a2-py2-none-any.whl (11.5 kB)

Uploaded Python 2

File details

Details for the file extremevariantfilter-0.0a2.tar.gz.

File metadata

  • Download URL: extremevariantfilter-0.0a2.tar.gz
  • Size: 2.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.19.1 setuptools/40.4.3 requests-toolbelt/0.8.0 tqdm/4.26.0 CPython/2.7.10

File hashes

Hashes for extremevariantfilter-0.0a2.tar.gz

    Algorithm      Hash digest
    SHA256         52ca3978c715a7e315f485cc9e904c916a1a4a72bb84bd771076daa333a6d716
    MD5            118caa3b45396f3d53ccb33b54c00984
    BLAKE2b-256    0cc5d5aa8d395db6ddd2bc136ed276bdb18fb6332119117dca9baa99b5c0e55c

File details

Details for the file extremevariantfilter-0.0a2-py2-none-any.whl.

File metadata

  • Download URL: extremevariantfilter-0.0a2-py2-none-any.whl
  • Size: 11.5 kB
  • Tags: Python 2
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.19.1 setuptools/40.4.3 requests-toolbelt/0.8.0 tqdm/4.26.0 CPython/2.7.10

File hashes

Hashes for extremevariantfilter-0.0a2-py2-none-any.whl

    Algorithm      Hash digest
    SHA256         5c06e8b3effe0846a9774db20e922d5cb78c95c9fffb27d18bce15a861fc4609
    MD5            eb2f6ca9c51f55748406aa9ad3b01c9f
    BLAKE2b-256    4843b4d6755e82163d701f2c9af120435d850f74327c3b6027dce56204517e49
