nanovar

Structural variant caller using low-depth long reads

NanoVar - Structural variant caller using low-depth long-read sequencing

NanoVar is a genomic structural variant (SV) caller that uses low-depth long-read sequencing data, such as reads from Oxford Nanopore Technologies (ONT). It characterizes SVs with high accuracy and speed using only 4x sequencing depth for homozygous SVs and 8x depth for heterozygous SVs. NanoVar reduces sequencing cost and computational requirements, which makes it suitable for large-cohort SV-association studies or routine clinical SV investigations.

Basic capabilities

Performs long-read mapping (Minimap2 and HS-BLASTN) and SV discovery in a single rapid pipeline.
Accurately characterizes SVs using long sequencing reads (high SV recall and precision in simulation datasets, overall F1 score >0.9).
Characterizes six classes of SVs: novel-sequence insertion, deletion, inversion, tandem duplication, sequence transposition and translocation.
Requires 4x and 8x sequencing depth to detect homozygous and heterozygous SVs, respectively.
Runs quickly (takes <3 hours to map and analyze a 12-gigabase dataset (4x) using 24 CPU threads).
Approximates SV genotype.
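
The depth figures above relate to dataset size through simple arithmetic: mean depth is roughly the total number of sequenced bases divided by the genome size. The following is a minimal sketch of that calculation. The helper name `estimate_depth` and the genome size of about 3 gigabases are assumptions made for illustration and are not part of NanoVar.

```shell
# Hypothetical helper (not part of NanoVar): estimate mean sequencing depth.
# depth = total sequenced bases / genome size
estimate_depth() {
    total_bases=$1
    genome_size=$2
    awk -v b="$total_bases" -v g="$genome_size" 'BEGIN { printf "%.1f\n", b / g }'
}

# 12 gigabases of reads over an assumed ~3-gigabase genome
estimate_depth 12000000000 3000000000   # prints 4.0
```

Doubling the read total to 24 gigabases under the same assumption gives about 8x, the depth quoted for heterozygous SVs.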

Getting Started

Quick run

nanovar [Options] -t 24 -f hg38 sample.fq/sample.bam ref.fa working_dir

Parameter	Argument	Comment
`-t`	num_threads	Indicate number of CPU threads to use
`-f` (Optional)	gap_file	Choose a built-in gap BED file or specify your own file to exclude gap regions in the reference genome. Built-in gap files: hg19, hg38 and mm10
-	sample.fq/sample.bam	Input long-read FASTA/FASTQ file or mapped BAM file
-	ref.fa	Input reference genome in FASTA format
-	working_dir	Specify working directory
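
When several samples share one reference, it can help to print the quick-run commands and review them before launching anything. The sketch below only prints commands and does not run NanoVar. The function name `print_nanovar_cmds` and the `nanovar_out/` directory layout are hypothetical. The `-t` and `-f hg38` arguments are taken from the quick-run line above.

```shell
# Hypothetical helper: print (do not run) one quick-run command per input file.
# Usage: print_nanovar_cmds REF THREADS READS...
print_nanovar_cmds() {
    ref=$1
    threads=$2
    shift 2
    for reads in "$@"; do
        # Derive a sample name by dropping the directory and the last extension
        sample=$(basename "${reads%.*}")
        # nanovar_out/ is an assumed location for per-sample working directories
        printf 'nanovar -t %s -f hg38 %s %s %s\n' \
            "$threads" "$reads" "$ref" "nanovar_out/$sample"
    done
}

print_nanovar_cmds ref.fa 24 sampleA.fq sampleB.bam
```

Each printed line can be checked and then executed by hand or piped to a shell once the paths look right.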

Output

Output file	Comment
${sample}.nanovar.pass.vcf	Final VCF filtered output file (1-based)
${sample}.nanovar.pass.report.html	HTML report showing run summary and statistics
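
A quick way to inspect the filtered VCF is to tally records per SV class. The sketch below assumes each record carries the standard VCF `SVTYPE` key in the INFO column (column 8). The function name `count_svtypes` is hypothetical, and the records in the toy file are made up purely to exercise the function; they are not NanoVar output.

```shell
# Hypothetical helper: count VCF records per SVTYPE value.
count_svtypes() {
    awk -F'\t' '
        /^#/ { next }                        # skip header lines
        match($8, /SVTYPE=[^;]+/) {          # INFO is column 8
            n[substr($8, RSTART + 7, RLENGTH - 7)]++
        }
        END { for (t in n) printf "%s\t%d\n", t, n[t] }
    ' "$1" | sort
}

# Toy VCF with made-up records, for demonstration only
toy_vcf=$(mktemp)
printf '##fileformat=VCFv4.2\n' > "$toy_vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$toy_vcf"
printf 'chr1\t1000\tsv1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=1500\n' >> "$toy_vcf"
printf 'chr2\t2000\tsv2\tN\t<INS>\t.\tPASS\tSVTYPE=INS\n' >> "$toy_vcf"
printf 'chr3\t3000\tsv3\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=3900\n' >> "$toy_vcf"
count_svtypes "$toy_vcf"
rm -f "$toy_vcf"
```

On a real run, the argument would be the `${sample}.nanovar.pass.vcf` file from the working directory.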

For more information, see the wiki.

Operating system:

Linux (x86_64 architecture, tested in Ubuntu 14.04, 16.04, 18.04)

Installation:

There are three ways to install NanoVar:

Option 1: Conda (Recommended)

# Installing from bioconda automatically installs all dependencies 
conda install -c bioconda nanovar

Option 2: Pip (See dependencies below)

# Installing from PyPI requires installing the dependencies yourself, see below
pip install nanovar

Option 3: GitHub (See dependencies below)

# Installing from GitHub requires installing the dependencies yourself, see below
git clone https://github.com/cytham/nanovar.git 
cd nanovar 
pip install .

Installation of dependencies

bedtools >=2.26.0
samtools >=1.3.0
minimap2 >=2.17
makeblastdb and windowmasker
hs-blastn ==0.0.5

Please make sure each executable binary is in PATH.
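
Before the first run, it may be worth confirming that every dependency resolves in PATH. The following is a minimal sketch using the POSIX `command -v` builtin. The function name `missing_tools` is hypothetical, and the check only confirms presence, not the minimum versions listed above.

```shell
# Hypothetical helper: print each listed executable that is not found in PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Executable names taken from the dependency list above
missing_tools bedtools samtools minimap2 makeblastdb windowmasker hs-blastn
```

No output means all six executables were found; any name printed still needs to be installed or added to PATH.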

1. bedtools

Please visit here for instructions to install.

2. samtools

Please visit here for instructions to install.

3. minimap2

Please visit here for instructions to install.

4. makeblastdb and windowmasker

# Download NCBI-BLAST v2.3.0+ from NCBI FTP server
wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.3.0/ncbi-blast-2.3.0+-x64-linux.tar.gz

# Extract tar.gz
tar zxf ncbi-blast-2.3.0+-x64-linux.tar.gz

# Copy makeblastdb and windowmasker binaries to PATH (e.g. ~/bin)
cp ncbi-blast-2.3.0+/bin/makeblastdb ~/bin && cp ncbi-blast-2.3.0+/bin/windowmasker ~/bin

5. hs-blastn

# Download and compile the 0.0.5 version
git clone https://github.com/chenying2016/queries.git
cd queries/hs-blastn-src/v0.0.5
make

# Copy hs-blastn binary to path (e.g. ~/bin)
cp hs-blastn ~/bin

If you encounter an "isnan" error during compilation, please refer to this.

Documentation

See the wiki for more information.

Versioning

See CHANGELOG

Citation

If you use NanoVar, please cite:

Tham, CY., Tirado-Magallanes, R., Goh, Y. et al. NanoVar: accurate characterization of patients’ genomic structural variants using low-depth nanopore sequencing. Genome Biol. 21, 56 (2020). https://doi.org/10.1186/s13059-020-01968-7

Authors

Tham Cheng Yong - cytham
Roberto Tirado Magallanes - rtmag
Touati Benoukraf - benoukraflab

License

This project is licensed under the GNU General Public License - see LICENSE.txt for details.

Simulation datasets and scripts used in the manuscript

SV simulation datasets used in the manuscript can be downloaded here. Scripts used for simulation dataset generation and tool performance comparison are available here.

Although NanoVar is provided with a universal model and threshold score, instructions for building a custom neural-network model are available here.

Limitations

Inaccurate basecalling of large homopolymer or low-complexity DNA regions may result in false deletion SV calls. We advise using up-to-date ONT basecallers such as Guppy to minimize this possibility.
For BND SVs, NanoVar is unable to calculate the actual number of SV-opposing reads (normal reads) at the novel adjacency because there are two breakends from distant locations. It is not clear whether the novel adjacency is derived from both breakends or only one of them, in cases of balanced and unbalanced variants, and it is therefore not possible to know which breakend location(s) to consider when counting normal reads. Currently, NanoVar approximates the normal read count by the minimum count from either breakend location. Although this helps in capturing unbalanced BNDs, it might lead to some false positives.
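
The BND approximation described above amounts to taking the smaller of the two per-breakend normal read counts. The sketch below illustrates that rule only; it is not NanoVar's code, and the function name `approx_normal_reads` and the example counts are hypothetical.

```shell
# Illustration only (not NanoVar's implementation): approximate the normal read
# count of a BND as the minimum of the counts at its two breakend locations.
approx_normal_reads() {
    if [ "$1" -le "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

# Made-up counts: 7 normal reads at one breakend, 2 at the other
approx_normal_reads 7 2   # prints 2
```

Taking the minimum keeps the normal read count low when only one breakend is covered by normal reads, which is why unbalanced BNDs are retained and also why some false positives can slip through.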

Release history

1.6.1

Mar 31, 2024

1.6.0

Jan 16, 2024

1.5.1

Jan 5, 2024

1.5.0

Sep 8, 2023

1.4.1

Oct 7, 2021

1.4.0

Sep 8, 2021

This version

1.3.9

Mar 24, 2021

1.3.8

May 24, 2020

1.3.7

May 23, 2020

1.3.6

Apr 17, 2020

1.3.5

Apr 1, 2020

1.3.4

Mar 19, 2020

1.3.2

Mar 4, 2020

1.3.1

Feb 29, 2020

1.2.7

Dec 15, 2019

1.2.6

Nov 28, 2019

1.2.5

Nov 25, 2019

1.2.4

Nov 25, 2019

1.2.3

Nov 25, 2019

1.2.2

Nov 25, 2019

1.2.1

Nov 24, 2019

1.2.0

Nov 21, 2019

Download files

Source Distribution

nanovar-1.3.9.tar.gz (396.5 kB)

Uploaded Mar 24, 2021 Source

Hashes for nanovar-1.3.9.tar.gz
Algorithm	Hash digest
SHA256	`472b3f9da25ba903bf5dfb222129cd89fbcfec94dacecf28e0b9ab65492fcba1`
MD5	`5b1b8d2356e08f80cabb7676d48e1b8c`
BLAKE2b-256	`6ee3b0d36d0acbd6b2b6877c700b2d7ec7dafe3f6338429b5c161458bd54db83`
