A set of tools to aid in the identification of false-positive variants in Variant Call Format (VCF) files.

ExtremeVariantFilter

Extreme Variant Filter (EVF) is a set of XGBoost-based tools developed to help identify false-positive variants in genomic Variant Call Format (VCF) files.

Functions

apply_filter

Usage:
    apply_filter (--vcf STR) (--snp-model STR) (--indel-model STR) [--verbose]

Description:
    Apply models generated by train_model to a VCF.

Arguments:
    --vcf STR                     VCF to be filtered
    --snp-model STR               Model for applying to SNPs
    --indel-model STR             Model for applying to InDels

Options:
    -h, --help                      Show this help message and exit.
    -v, --version                   Show version and exit.
    --verbose                       Log output

Examples:
    apply_filter --vcf <path/to/vcf> --snp-model <snp.pickle.dat> --indel-model <indel.pickle.dat>
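
When apply_filter is called from a script, it can help to check that the VCF and both model files exist before the tool is launched. The following is a minimal sketch, not part of EVF: the helper names build_apply_filter_cmd, missing_inputs and run_apply_filter are hypothetical, and only the flags listed in the usage text above are used. It assumes apply_filter is on the PATH after installation.

```python
import os
import subprocess


def build_apply_filter_cmd(vcf, snp_model, indel_model, verbose=False):
    """Return the apply_filter argument list for one VCF and two saved models."""
    cmd = [
        "apply_filter",
        "--vcf", vcf,
        "--snp-model", snp_model,
        "--indel-model", indel_model,
    ]
    if verbose:
        cmd.append("--verbose")
    return cmd


def missing_inputs(cmd):
    """Return the file arguments in cmd that do not exist on disk."""
    # The three file paths sit at positions 2, 4 and 6 of the argument list.
    return [path for path in cmd[2:7:2] if not os.path.isfile(path)]


def run_apply_filter(vcf, snp_model, indel_model, verbose=False):
    """Run apply_filter after checking inputs; return its exit code."""
    cmd = build_apply_filter_cmd(vcf, snp_model, indel_model, verbose)
    missing = missing_inputs(cmd)
    if missing:
        raise IOError("missing input file(s): " + ", ".join(missing))
    # Assumption: the apply_filter executable is on the PATH.
    return subprocess.call(cmd)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when a path contains spaces.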

train_model

Usage:
    train_model (--true-pos STR) (--false-pos STR) (--type STR) [--out STR] [--njobs INT] [--verbose]

Description:
    Train a model to be saved and used to filter VCFs.

Arguments:
    --true-pos STR          Path to true-positive VCF from VCFeval or comma-separated list of paths
    --false-pos STR         Path to false-positive VCF from VCFeval or comma-separated list of paths
    --type STR              SNP or INDEL

Options:
    -o, --out <STR>                 Outfile name for writing model [default: (type).filter.pickle.dat]
    -n, --njobs <INT>               Number of threads to run in parallel [default: 2]
    -h, --help                      Show this help message and exit.
    -v, --version                   Show version and exit.
    --verbose                       Log output

Examples:
    train_model --true-pos <path/to/tp/vcf(s)> --false-pos <path/to/fp/vcf(s)> --type [SNP, INDEL] --njobs 20
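
Because --true-pos and --false-pos each accept a comma-separated list of paths, training on several VCFeval outputs means joining the paths into one argument. The sketch below shows one way to assemble that command line. The helper name build_train_model_cmd is hypothetical, and joining with a bare comma (no spaces) is an assumption about the format the tool expects.

```python
def build_train_model_cmd(true_pos, false_pos, variant_type, out=None, njobs=None):
    """Return the train_model argument list for lists of TP and FP VCF paths."""
    # Normalise to the spelling used in the help text ("SNP or INDEL").
    variant_type = variant_type.upper()
    if variant_type not in ("SNP", "INDEL"):
        raise ValueError("--type must be SNP or INDEL, got %r" % variant_type)

    cmd = [
        "train_model",
        # Assumption: multiple paths are joined with a bare comma.
        "--true-pos", ",".join(true_pos),
        "--false-pos", ",".join(false_pos),
        "--type", variant_type,
    ]
    # Optional flags are only added when set, so the tool's defaults apply.
    if out is not None:
        cmd += ["--out", out]
    if njobs is not None:
        cmd += ["--njobs", str(njobs)]
    return cmd
```

The returned list can be passed to subprocess.call. Calling the helper once with "SNP" and once with "INDEL" gives the two models that apply_filter requires.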

Install

To install EVF, run:

pip install extremevariantfilter

stLFR Paper Results

If you'd like to use this tool to corroborate the results from the stLFR paper on bioRxiv, the models used for variant filtering are available within the models/ directory. To get identical results, after installation run pip install -r requirements.txt from within this directory so that your environment matches the one we used for our results. Other versions of certain packages can produce different results.
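
Since results depend on package versions, it can be useful to confirm that the active environment matches the pinned requirements. The following is a rough sketch of such a check. It assumes requirements.txt uses name==version pins (its actual contents are not shown here) and compares them with the text printed by pip freeze. The function names parse_pins and version_mismatches are hypothetical.

```python
def parse_pins(text):
    """Map package name -> version for every 'name==version' line in text."""
    pins = {}
    for line in text.splitlines():
        # Drop trailing comments and environment markers, then whitespace.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue  # skip blank lines, options and unpinned entries
        name, version = line.split("==", 1)
        # Compare names case-insensitively, treating "_" and "-" alike.
        pins[name.strip().lower().replace("_", "-")] = version.strip()
    return pins


def version_mismatches(requirements_text, freeze_text):
    """Return {name: (required, installed)} for pins that are not satisfied.

    The installed version is None when the package is absent.
    """
    required = parse_pins(requirements_text)
    installed = parse_pins(freeze_text)
    return {
        name: (wanted, installed.get(name))
        for name, wanted in required.items()
        if installed.get(name) != wanted
    }
```

One way to use it is to save the output of pip freeze to a file, read that file and requirements.txt as text, and pass both strings to version_mismatches. An empty result means every pinned package is installed at the pinned version.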
