Sexual inference based on mitochondrial genome content

Description

MyToSex is a novel tool for in silico sex determination based on mitochondrial genome content. Some mussel and clam species have an unusual system of mitochondrial inheritance termed doubly uniparental inheritance (DUI), in which two sex-associated mitogenome haplotypes are transmitted to the offspring. Females contain only the F-type mitogenome (mtF), whereas males carry both the F-type and the M-type (mtM) haplotypes. The tool works in two stages:

  1. Mitogenome detection and quantification: To detect which mitogenomes are present, all reads are mapped to both mitotypes. Metrics extracted from this alignment are then used to determine the sex.

  2. Additional analyses: Two additional analyses are implemented to complement and further support the results.

    1. Sample clustering: Multiple metrics are extracted from the read alignments to the mitogenomes, and a dimensionality reduction (UMAP) is applied to verify whether the resulting clustering agrees with the sex-determination results obtained previously.
    2. Phylogenetic analysis of protein-coding mitogenes: The reads that mapped to the mitogenomes are used to assemble the protein-coding genes de novo. These genes are then used in a phylogenetic analysis that incorporates the reference mitogenes as well as information from other species.
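
The first stage can be illustrated with a minimal sketch. It assumes that per-sample counts of reads mapped to each mitotype are already available; the function name `infer_sex`, the length normalisation, and the 0.05 threshold are illustrative assumptions, not the metrics or cut-offs that MyToSex actually uses.

```python
def infer_sex(mtf_reads, mtm_reads, mtf_length, mtm_length, min_m_fraction=0.05):
    """Return 'male', 'female', or 'undetermined' from mapped read counts.

    Hypothetical helper: the metric and the threshold are assumptions
    chosen for illustration only.
    """
    if mtf_length <= 0 or mtm_length <= 0:
        raise ValueError("mitogenome lengths must be positive")

    # Normalise by reference length so that a longer mitogenome
    # does not appear more abundant just because it attracts more reads.
    f_density = mtf_reads / mtf_length
    m_density = mtm_reads / mtm_length

    total = f_density + m_density
    if total == 0:
        # No mitochondrial reads at all: no call can be made.
        return "undetermined"

    # Under DUI, only males are expected to carry the M-type mitogenome.
    m_fraction = m_density / total
    return "male" if m_fraction >= min_m_fraction else "female"


if __name__ == "__main__":
    print(infer_sex(1000, 0, 16000, 17000))
    print(infer_sex(1000, 800, 16000, 17000))
```

A single ratio like this is deliberately simplistic; in practice a low but non-zero M-type signal may need a more careful treatment, which is one reason the supporting analyses exist.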

Installation

MyToSex is an open-source tool written in Python 3 that requires the following module: PyYAML. It also calls third-party software.
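
Because the third-party programs are invoked as external commands, it can help to confirm that they are on the PATH before running an analysis. The following is a minimal sketch, assuming the executables are named `bowtie2` and `samtools` (the two tools cited for the core analysis); the helper `find_missing_tools` is hypothetical, and the full list of programs MyToSex expects is not stated here.

```python
import shutil


def find_missing_tools(tools=("bowtie2", "samtools")):
    """Return the names in `tools` that cannot be found on the PATH.

    The default tool names are an assumption based on the citations
    in this README, not a list taken from MyToSex itself.
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing third-party tools:", ", ".join(missing))
    else:
        print("All checked tools were found on the PATH.")
```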

Citation

If you use only MyToSex, cite us as follows:

Mendoza M. and Canchaya A., MyToSex: Sexual inference based on mitochondrial genome content [...]

Please also cite:

  • Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie2. Nat Methods. 2012;9(4):357-359. doi:10.1038/nmeth.1923.

    @article{langmead2012bowtie2, 
      title={Fast gapped-read alignment with Bowtie2},
      author={Langmead, Ben and Salzberg, Steven L},
      journal={Nature methods},
      volume={9},
      number={4},
      pages={357--359},
      year={2012},
      publisher={Nature Publishing Group}
    }
    
  • Li H, Handsaker B, Wysoker A, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078-2079. doi:10.1093/bioinformatics/btp352.

    @article{li2009samtools,
      title={The sequence alignment/map format and SAMtools},
      author={Li, Heng and Handsaker, Bob and Wysoker, Alec and Fennell, Tim and Ruan, Jue and Homer, Nils and Marth, Gabor and Abecasis, Goncalo and Durbin, Richard},
      journal={Bioinformatics},
      volume={25},
      number={16},
      pages={2078--2079},
      year={2009},
      publisher={Oxford University Press}
    }
    

If you also run the supporting analyses, please cite the following as well.

  • Sample clustering:

    • McInnes, L., Healy, J. and Melville, J. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv preprint, arXiv:1802.03426.
      @misc{mcinnes2020umap,
        title={UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction}, 
        author={Leland McInnes and John Healy and James Melville},
        year={2020},
        eprint={1802.03426},
        archivePrefix={arXiv}
      } 
      
  • Phylogenetic analysis:

    • Haas BJ, Papanicolaou A, Yassour M, et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc. 2013;8(8):1494-1512. doi:10.1038/nprot.2013.084.
      @article{haas2013trinity,
        title={De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis},
        author={Haas, Brian J and Papanicolaou, Alexie and Yassour, Moran and Grabherr, Manfred and Blood, Philip D and Bowden, Joshua and Couger, Matthew Brian and Eccles, David and Li, Bo and Lieber, Matthias and others},
        journal={Nature protocols},
        volume={8},
        number={8},
        pages={1494--1512},
        year={2013},
        publisher={Nature Publishing Group}
      }
      
    • Camacho C, Coulouris G, Avagyan V, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421. doi:10.1186/1471-2105-10-421.
      @article{camacho2009blast+,
        title={BLAST+: architecture and applications},
        author={Camacho, Christiam and Coulouris, George and Avagyan, Vahram and Ma, Ning and Papadopoulos, Jason and Bealer, Kevin and Madden, Thomas L},
        journal={BMC bioinformatics},
        volume={10},
        number={1},
        pages={421},
        year={2009},
        publisher={Springer}
      }
      
    • Katoh K, Kuma K, Toh H, Miyata T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 2005;33(2):511-518. doi:10.1093/nar/gki198.
      @article{katoh2005mafft,
        title={MAFFT version 5: improvement in accuracy of multiple sequence alignment},
        author={Katoh, Kazutaka and Kuma, Kei-ichi and Toh, Hiroyuki and Miyata, Takashi},
        journal={Nucleic acids research},
        volume={33},
        number={2},
        pages={511--518},
        year={2005},
        publisher={Oxford University Press}
      }
      
    • Darriba D, Posada D, Kozlov AM, Stamatakis A, Morel B, Flouri T. ModelTest-NG: A New and Scalable Tool for the Selection of DNA and Protein Evolutionary Models. Mol Biol Evol. 2020;37(1):291-294. doi:10.1093/molbev/msz189.
      @article{darriba2020modeltest,
        title={ModelTest-NG: a new and scalable tool for the selection of DNA and protein evolutionary models},
        author={Darriba, Diego and Posada, David and Kozlov, Alexey M and Stamatakis, Alexandros and Morel, Benoit and Flouri, Tomas},
        journal={Molecular Biology and Evolution},
        volume={37},
        number={1},
        pages={291--294},
        year={2020},
        publisher={Oxford University Press}
      }
      
    • Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30(9):1312-1313. doi:10.1093/bioinformatics/btu033.
      @article{stamatakis2014raxml,
        title={RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies},
        author={Stamatakis, Alexandros},
        journal={Bioinformatics},
        volume={30},
        number={9},
        pages={1312--1313},
        year={2014},
        publisher={Oxford University Press}
      }
      

Acknowledgement

This work was supported by the European Social Fund and the Government of Xunta de Galicia (Scholarship reference ED481A-2018/305 awarded to Manuel Mendoza).

We developed this tool using the computational resources of the Supercomputing Center of Galicia (CESGA) and PyCharm, under an academic license provided free of charge by JetBrains.
