Structural variant caller using low-depth long reads

NanoVar - Structural variant caller using low-depth long-read sequencing

NanoVar is a neural-network-based genomic structural variant (SV) caller that uses low-depth long-read sequencing, such as that produced by Oxford Nanopore Technologies (ONT). It characterizes SVs with high accuracy and speed using only 4x sequencing depth for homozygous SVs and 8x depth for heterozygous SVs. By reducing sequencing cost and computational requirements, NanoVar is suited to large-cohort SV-association studies or routine clinical SV investigations.

Basic capabilities

  • Performs long-read mapping (HS-Blastn, Chen et al., 2015) and SV discovery in a single rapid pipeline.
  • Accurately characterizes SVs using long sequencing reads (high SV recall and precision in simulation datasets, overall F1 score >0.9).
  • Characterizes six classes of SVs: novel-sequence insertion, deletion, inversion, tandem duplication, sequence transposition and translocation.
  • Requires 4x and 8x sequencing depth for detecting homozygous and heterozygous SVs, respectively.
  • Rapid computation (takes <3 hours to map and analyze a 12-gigabase dataset (4x) using 24 CPU threads).
  • Approximates SV genotype.
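
The depth figures above follow from simple arithmetic: depth is the total number of sequenced bases divided by the genome size, so a 12-gigabase dataset over a human genome of roughly 3 gigabases gives about 4x. The sketch below estimates depth for a read file before running NanoVar. It is an illustration only and not part of NanoVar: the function name `estimate_depth` is hypothetical, and it assumes FASTA input (FASTQ would need different handling).

```shell
# Hypothetical helper (not part of NanoVar): estimate sequencing depth
# as total read bases divided by genome size. Assumes FASTA input.
estimate_depth() {
    reads_fasta=$1   # path to a FASTA file of reads
    genome_size=$2   # genome size in bases, e.g. 3100000000 for human
    awk -v g="$genome_size" '
        !/^>/ { total += length($0) }        # sum sequence lines, skip headers
        END   { printf "%.1f\n", total / g } # depth to one decimal place
    ' "$reads_fasta"
}

# Example call (file name is a placeholder):
#   estimate_depth reads.fa 3100000000
```

A result near 4 or above suggests enough data for homozygous SVs, and near 8 or above for heterozygous SVs, according to the requirements listed above.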

Getting Started

Operating system:

  • Linux (x86_64 architecture, tested on Ubuntu 14.04, 16.04 and 18.04)

Installation:

There are three ways to install NanoVar:

Option 1: Conda (Recommended)

# Installing from bioconda automatically installs all dependencies 
conda install -c bioconda nanovar

Option 2: Pip (See dependencies below)

# Installing from PyPI requires you to install the dependencies yourself, see below
pip3 install nanovar

Option 3: GitHub (See dependencies below)

# Installing from GitHub requires you to install the dependencies yourself, see below
git clone https://github.com/cytham/nanovar.git 
cd nanovar 
pip install .

Installation of dependencies

  • bedtools >=2.26.0
  • makeblastdb and windowmasker
  • hs-blastn

Please make sure each executable binary is in PATH.

1. bedtools

Please visit here for instructions to install.

2. makeblastdb and windowmasker

# Download NCBI-BLAST v2.3.0+ from NCBI FTP server
wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.3.0/ncbi-blast-2.3.0+-x64-linux.tar.gz

# Extract tar.gz
tar zxf ncbi-blast-2.3.0+-x64-linux.tar.gz

# Copy makeblastdb and windowmasker binaries to a directory in PATH (e.g. ~/bin)
mkdir -p ~/bin
cp ncbi-blast-2.3.0+/bin/makeblastdb ~/bin && cp ncbi-blast-2.3.0+/bin/windowmasker ~/bin

3. hs-blastn

# Download and compile
git clone https://github.com/chenying2016/queries.git
cd queries/hs-blastn-src/
make

# Copy hs-blastn binary to a directory in PATH (e.g. ~/bin)
cp hs-blastn ~/bin
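
Once the dependencies are installed, it can help to confirm that each executable is actually found on PATH before running NanoVar. The following is a minimal sketch of such a check. The function name `check_deps` is hypothetical and not part of NanoVar, and the check does not verify tool versions (for example, bedtools >=2.26.0).

```shell
# Hypothetical helper (not part of NanoVar): report whether each
# named executable can be found on PATH.
check_deps() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"   # 0 if all tools were found, 1 otherwise
}

# Check the four executables listed under "Installation of dependencies".
check_deps bedtools makeblastdb windowmasker hs-blastn \
    || echo "Install the missing tools or add them to PATH before running NanoVar."
```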

Quick run

nanovar [Options] -t 24 -f hg38 read.fa ref.fa working_dir 

Parameter   Argument      Comment
-t          num_threads   Number of CPU threads to use
-f          gap_file      Built-in gap BED file used to exclude gap regions in the reference genome; built-in gap files include hg19, hg38 and mm10 (optional)
-           read.fa       Input long-read FASTA/FASTQ file
-           ref.fa        Input reference genome in FASTA format
-           working_dir   Working directory
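
The quick-run command can be wrapped in a small function that checks the input files exist before starting a multi-hour run. This is a sketch under stated assumptions, not an official wrapper: the function name `run_nanovar` and the default of 24 threads are illustrative, and `-f hg38` is only appropriate when the reference is hg38.

```shell
# Hypothetical wrapper (not part of NanoVar): validate inputs, then
# invoke nanovar with the positional arguments shown in the quick run.
run_nanovar() {
    reads=$1            # long-read FASTA/FASTQ file
    ref=$2              # reference genome FASTA
    workdir=$3          # working directory
    threads=${4:-24}    # CPU threads; 24 is an illustrative default

    for f in "$reads" "$ref"; do
        if [ ! -f "$f" ]; then
            echo "error: input file not found: $f" >&2
            return 1
        fi
    done

    # -f hg38 selects the built-in hg38 gap file; change or drop it
    # for other reference genomes.
    nanovar -t "$threads" -f hg38 "$reads" "$ref" "$workdir"
}

# Example call (file names are placeholders):
#   run_nanovar read.fa ref.fa working_dir 24
```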

Documentation

See Wiki for more information.

Versioning

See CHANGELOG

Citation

NanoVar: Accurate Characterization of Patients’ Genomic Structural Variants Using Low-Depth Nanopore Sequencing (Tham et al., 2019) https://www.biorxiv.org/content/10.1101/662940v1

Authors

License

This project is licensed under the GNU General Public License. See LICENSE.txt for details.

Simulation datasets

SV-simulated datasets used for evaluating SV calling accuracy can be downloaded here.


Files for nanovar, version 1.2.6

  • nanovar-1.2.6.tar.gz (298.3 kB), source distribution