A sensitive Mitochondrial variant detection pipeline from WGS data

mity

mity is a bioinformatic analysis pipeline designed to call mitochondrial SNV and INDEL variants from Whole Genome Sequencing (WGS) data. mity can:

  • identify very low-heteroplasmy variants, even <1% heteroplasmy, when there is sufficient read depth (e.g. >1000x)
  • filter out common artefacts that arise from high-depth sequencing
  • easily integrate with existing nuclear DNA analysis pipelines (mity merge)
  • provide an annotated report, designed for clinicians and researchers to interrogate
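
To see why read depth matters for the first point, a back-of-the-envelope sketch can help. It assumes each read samples the variant allele independently (a plain binomial model, which is a simplification and not the model mity's caller uses). The function name and the requirement of at least 5 supporting reads are illustrative assumptions only.

```python
from math import comb


def prob_at_least_k_alt_reads(depth: int, heteroplasmy: float, k: int) -> float:
    """Return P(X >= k) for X ~ Binomial(depth, heteroplasmy).

    Assumption: every read independently carries the variant allele
    with probability equal to the heteroplasmy level.
    """
    p_fewer_than_k = sum(
        comb(depth, i) * heteroplasmy**i * (1 - heteroplasmy) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - p_fewer_than_k


# A 1% heteroplasmy variant at three depths; 5 supporting reads is a
# hypothetical evidence threshold chosen for illustration.
for depth in (30, 100, 1000):
    expected_alt = depth * 0.01
    p = prob_at_least_k_alt_reads(depth, 0.01, 5)
    print(f"{depth}x: expected alt reads = {expected_alt:.1f}, P(>=5 alt reads) = {p:.3f}")
```

Under this simplified model, a 1% variant is expected to appear in about 10 reads at 1000x but in fewer than one read at 30x, which is why very low heteroplasmy calls need high depth.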

Usage

mity -h

Dependencies

Installation

Installation instructions via Docker, pip, or manually are available in INSTALL.md

Example Usage

This is an example of calling variants in the Ashkenazim Trio.

mity-call

First, run mity call on the three MT BAMs provided in mity/test_in.

We recommend always using --normalise, because mity report requires a normalised VCF:

mity call \
--prefix ashkenazim \
--out-folder-path test_out \
--region MT:1-500 \
--normalise \
test_in/HG002.hs37d5.2x250.small.MT.RG.bam \
test_in/HG003.hs37d5.2x250.small.MT.RG.bam \
test_in/HG004.hs37d5.2x250.small.MT.RG.bam 

This will create test_out/normalised/ashkenazim.mity.vcf.gz (and tbi file).

mity-report

We can create a mity report on the normalised VCF:

mity report \
--prefix ashkenazim \
--min_vaf 0.01 \
--out-folder-path test_out \
test_out/ashkenazim.mity.vcf.gz

This will create: test_out/ashkenazim.annotated_variants.csv and test_out/ashkenazim.annotated_variants.xlsx.

mity-normalise

High-depth sequencing and sensitive variant calling can create many variants with more than two alleles and, in some cases, join two nearby variants separated by shared REF sequence into a multi-nucleotide polymorphism, as discussed in the manuscript. Here, variant normalisation means decomposing multi-allelic variants and, where possible, splitting multi-nucleotide polymorphisms into their cognate smaller variants. At the time of writing, all of the variant decomposition tools we tried failed to propagate the metadata of a multi-allelic variant to the split variants, which caused problems when reporting the quality scores associated with each variant.

Technically, you can run mity call and mity normalise separately, but since mity report requires a normalised VCF file, we recommend running mity call --normalise.
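
The decomposition step described above can be sketched as follows. This is a minimal illustration, not mity's implementation: the record layout, the field names (alt_depth, qual_per_alt) and the example values are all made up, and real normalisation would also trim and left-align alleles and split multi-nucleotide polymorphisms, which is omitted here.

```python
from typing import Dict, List


def split_multiallelic(record: Dict) -> List[Dict]:
    """Split one multi-allelic record into one biallelic record per ALT allele.

    Per-allele fields (one value per ALT) are carried over to the matching
    split record; fields shared by all alleles are copied unchanged.
    """
    split = []
    for i, alt in enumerate(record["alt"]):
        split.append({
            "chrom": record["chrom"],
            "pos": record["pos"],
            "ref": record["ref"],
            "alt": [alt],
            "depth": record["depth"],                     # shared by all alleles
            "alt_depth": [record["alt_depth"][i]],        # per-allele metadata
            "qual_per_alt": [record["qual_per_alt"][i]],  # per-allele metadata
        })
    return split


# Hypothetical multi-allelic insertion with two ALT alleles (invented values).
record = {
    "chrom": "MT", "pos": 302, "ref": "A", "alt": ["AC", "ACC"],
    "depth": 2000, "alt_depth": [400, 30], "qual_per_alt": [900.0, 45.0],
}

for r in split_multiallelic(record):
    vaf = r["alt_depth"][0] / r["depth"]
    print(r["pos"], r["ref"], r["alt"][0], f"VAF={vaf:.3f}")
```

The point of the sketch is that each split record keeps its own allele depth and quality, so a heteroplasmy and a quality score can still be reported per variant after decomposition.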

mity-merge

You can merge a nuclear vcf.gz file with a mity.vcf.gz file, replacing the MT calls in the nuclear VCF (presumably from a caller such as HaplotypeCaller, which cannot sensitively call mitochondrial variants) with the calls from mity.

mity merge \
--prefix ashkenazim \
--mity_vcf test_out/ashkenazim.mity.vcf.gz \
--nuclear_vcf todo-create-example-nuclear.vcf.gz
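
Conceptually, the replacement described above amounts to dropping the mitochondrial records from the nuclear call set and substituting the mity records. The sketch below shows that idea on in-memory tuples; it is not how mity merge is implemented, the function name is hypothetical, the contig names "MT" and "chrM" are assumptions, and VCF header, sample-column and sort-order handling are left out.

```python
from typing import List, Sequence, Tuple

Record = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)


def replace_mt_calls(
    nuclear: List[Record],
    mito: List[Record],
    mt_names: Sequence[str] = ("MT", "chrM"),  # assumed contig names
) -> List[Record]:
    """Drop mitochondrial records from `nuclear`, then append those from `mito`."""
    kept = [rec for rec in nuclear if rec[0] not in mt_names]
    return kept + [rec for rec in mito if rec[0] in mt_names]


# Invented toy records for illustration.
nuclear_calls = [("1", 12345, "A", "G"), ("MT", 73, "A", "G")]
mity_calls = [("MT", 73, "A", "G"), ("MT", 302, "A", "AC")]

for rec in replace_mt_calls(nuclear_calls, mity_calls):
    print(rec)
```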

Recommendations for interpreting the report

Assuming that you are looking for a pathogenic variant in a patient with a rare genetic disorder potentially caused by a mitochondrial mutation, we recommend the following strategy:

  1. tier 1 or 2 variants included in the 'commercial_panels' column
  2. tier 1 or 2 variants that match the clinical presentation and the phenotype in 'disease_mitomap', preferably those that are annotated with Confirmed evidence in the 'status_mitomap' column
  3. exclude common variants: anything linked to a 'phylotree_haplotype', or with a high 'MGRB_frequency' or a high 'GenBank_frequency_mitomap'.
  4. consider any remaining tier 1 or 2 variants that may have a predicted impact on tRNA
  5. consider any remaining variants with high numbers of 'variant_references_mitomap'
  6. if you have analysed multiple family members, consider variants whose level of 'variant_heteroplasmy' matches the disease burden
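
The first three steps above can be roughly sketched as a labelling function over report rows. Several things here are assumptions rather than facts about the mity report: that the tier is stored in a column named 'tier' with values "1" and "2", that an empty string means "no annotation", that frequencies are fractions, and that 0.01 is a sensible cut-off for "high". Steps 2 and 4 to 6 need clinical judgement and are only flagged for manual review.

```python
from typing import Dict


def triage_label(row: Dict[str, str], max_freq: float = 0.01) -> str:
    """Assign a coarse review label to one report row.

    `row` maps column names to string values, as csv.DictReader would yield.
    The 'tier' column name and the `max_freq` cut-off are assumptions.
    """
    tier_ok = row["tier"] in ("1", "2")
    if tier_ok and row["commercial_panels"]:
        return "1: in commercial panel"
    if tier_ok and row["disease_mitomap"]:
        return "2: MITOMAP disease, check phenotype match"
    is_common = (
        bool(row["phylotree_haplotype"])
        or float(row["MGRB_frequency"] or 0) > max_freq
        or float(row["GenBank_frequency_mitomap"] or 0) > max_freq
    )
    if is_common:
        return "3: common, exclude"
    return "4-6: review manually"


# Invented example rows; real rows would come from the annotated_variants CSV.
rows = [
    {"tier": "1", "commercial_panels": "panel_x", "disease_mitomap": "",
     "phylotree_haplotype": "", "MGRB_frequency": "", "GenBank_frequency_mitomap": ""},
    {"tier": "3", "commercial_panels": "", "disease_mitomap": "",
     "phylotree_haplotype": "H2a", "MGRB_frequency": "0.4", "GenBank_frequency_mitomap": "0.3"},
    {"tier": "2", "commercial_panels": "", "disease_mitomap": "",
     "phylotree_haplotype": "", "MGRB_frequency": "0.0001", "GenBank_frequency_mitomap": ""},
]

for row in rows:
    print(triage_label(row))
```

Labelling rows instead of deleting them keeps the excluded variants available in case a later step, such as comparing heteroplasmy across family members, makes them relevant again.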

Acknowledgements

We would like to thank:

  • The Kinghorn Centre for Clinical Genomics and collaborators, who helped with feedback for running mity.
  • The Genome in a Bottle consortium for providing the test data used here.
  • Eric Talevich, whose CNVkit helped us structure mity as a package.
  • Erik Garrison for developing FreeBayes and for his early feedback on optimising FreeBayes for sensitive variant detection.
  • Brent Pedersen for developing gsort.
