A structural variant caller for long reads.

Project description

SVIM (pronounced SWIM) is a structural variant caller for long reads. It is able to detect and classify five different classes of structural variants. Unlike existing methods, SVIM integrates information from across the genome to precisely distinguish similar events, such as tandem and interspersed duplications and novel element insertions. In our experiments on simulated data and real datasets from PacBio and Nanopore sequencing machines, SVIM reached consistently better results than competing methods. Furthermore, it is unique in its capability of extracting both the genomic origin and destination of duplications.

Background on Structural Variants and Long Reads

https://raw.githubusercontent.com/eldariont/svim/master/docs/SVclasses.png

Structural variants (SVs) are typically defined as genomic variants larger than 50 bp (e.g. deletions, duplications, inversions). Studies have shown that they affect more bases in any given genome than SNPs and small indels taken together. Consequently, they have a large impact on genes and regulatory regions. This is reflected in the large number of genetic diseases that are caused by SVs.

Common sequencing technologies from providers such as Illumina generate short reads with high accuracy. However, they exhibit weaknesses in repeat and low-complexity regions. This negatively affects SV detection because SVs are associated with such regions. Single-molecule long-read sequencing technologies from Pacific Biosciences and Oxford Nanopore produce reads with error rates of up to 15% but with lengths of several kb. These long reads can span entire repeats and SVs, which facilitates SV detection.

Installation

#Install via conda: easiest option, installs all dependencies including read alignment dependencies
conda install --channel bioconda svim

#Install via pip (requires Python 3.6.*): installs all dependencies except those necessary for read alignment (ngmlr, minimap2, samtools)
pip3 install svim

#Install from github (requires Python 3.6.*): installs all dependencies except those necessary for read alignment (ngmlr, minimap2, samtools)
git clone https://github.com/eldariont/svim.git
cd svim
pip3 install .
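
The pip and GitHub installs leave out the read alignment tools, so it can help to check that they are reachable before running SVIM on raw reads. The following is a minimal sketch, assuming the executables are named exactly as in the notes above (ngmlr, minimap2, samtools). The helper find_missing_tools is hypothetical and not part of SVIM.

```python
import shutil

# Executable names taken from the installation notes above; assumed to match
# the names of the binaries on your system.
ALIGNMENT_TOOLS = ("ngmlr", "minimap2", "samtools")


def find_missing_tools(tools=ALIGNMENT_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All read alignment tools found.")
```

If the conda install was used, all three tools should already be present and the list of missing tools should be empty.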

Input

SVIM analyzes long reads given as a FASTA/FASTQ file (uncompressed or gzipped) or a file list. Alternatively, it can analyze an alignment file in BAM format. SVIM was tested on both PacBio and Nanopore data. It works best for alignment files produced by NGMLR but also supports the faster read mapper minimap2.
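
To illustrate the input kinds listed above, here is a small sketch that guesses the kind of an input file from its name. The function classify_input and the suffix sets are assumptions made for this example; SVIM's own input detection may work differently, and file lists are not covered because their naming is not specified here.

```python
from pathlib import Path

# Common file name suffixes; assumed for illustration, not taken from SVIM.
FASTA_SUFFIXES = {".fa", ".fasta"}
FASTQ_SUFFIXES = {".fq", ".fastq"}


def classify_input(path):
    """Guess (kind, gzipped) from the file name alone.

    kind is one of "fasta", "fastq", "bam" or "unknown".
    """
    suffixes = [s.lower() for s in Path(path).suffixes]
    gzipped = bool(suffixes) and suffixes[-1] == ".gz"
    if gzipped:
        suffixes = suffixes[:-1]  # look at the suffix before ".gz"
    last = suffixes[-1] if suffixes else ""
    if last in FASTA_SUFFIXES:
        return "fasta", gzipped
    if last in FASTQ_SUFFIXES:
        return "fastq", gzipped
    if last == ".bam":
        return "bam", gzipped
    return "unknown", gzipped


if __name__ == "__main__":
    for name in ("reads.fastq.gz", "reads.fa", "alignments.bam"):
        print(name, classify_input(name))
```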

Output

SVIM distinguishes five different SV classes (see the schema above): deletions, inversions, tandem duplications, interspersed duplications and novel insertions. For detected interspersed duplications, SVIM also indicates whether the genomic origin location seems to be deleted in at least one haplotype (indicating a cut-and-paste insertion) or not (indicating a canonical interspersed duplication). For each of these SV classes, SVIM produces a BED file with the SV coordinates. In addition, it produces a VCF file containing all SVs found.
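
The per-class BED files can be read with a few lines of Python. The sketch below relies only on the general BED convention (tab-separated columns, with chromosome, 0-based start and exclusive end in the first three). It makes no assumption about any further columns SVIM may write, and the two example rows are made up for illustration.

```python
import io


def read_bed_intervals(handle):
    """Yield (chrom, start, end) from the first three columns of a BED file."""
    for line in handle:
        line = line.rstrip("\n")
        # Skip empty lines and the optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        yield fields[0], int(fields[1]), int(fields[2])


def interval_length(start, end):
    """Length of a BED interval (start is 0-based, end is exclusive)."""
    return end - start


# Made-up example rows standing in for one of the BED output files.
example = io.StringIO("chr1\t1000\t1500\nchr2\t200\t260\n")
intervals = list(read_bed_intervals(example))
lengths = [interval_length(start, end) for _, start, end in intervals]

if __name__ == "__main__":
    for (chrom, start, end), length in zip(intervals, lengths):
        print(chrom, start, end, length)
```

For a real file, pass an open file handle instead of the in-memory example, e.g. `open(path)` for whichever BED file you want to inspect.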

Usage

Please see our wiki.

Contact

If you experience problems or have suggestions, please create an issue or a pull request, or contact heller_d@molgen.mpg.de.

License

The project is licensed under the GNU General Public License.

