Global Hybridization detection using phylogenetic invariants

GHDet: Global Hybridization Detection using Phylogenetic Invariants

GHDet is a Python package that tests whether any hybridization event has occurred among a given set of taxa. Its main module is pyghdet (Pythonic Global Hybridization Detection). The outcome of the test helps decide whether a tree analysis or a network analysis is appropriate for the data.

Installation

Requirements:

  • Python 3.6 or above

  • Python Modules:

    • Cython

    • Numpy

    • Matplotlib

    • Seaborn

    • Multiprocess

    • math (Python standard library)

    • scipy

    • phyde

    • pandas

    • itertools (Python standard library)

    • typing (Python standard library)

  • C++ compiler

# To install dependencies (math, itertools, and typing ship with Python and need no installation)
pip install cython numpy matplotlib seaborn multiprocess scipy phyde pandas

# To install GHDet
pip install pyghdet

Usage

The package has two main modules: comb_species and comb_indiv. The comb_species module performs a global hypothesis test to detect whether any of the species is a hybrid. Similarly, the comb_indiv module performs a global hypothesis test to detect whether there is any hybrid individual in the data. Both modules take the following arguments:

Arguments
---------

    - infile           <string> : name of the DNA sequence data file.
    - mapfile          <string> : name of the taxon map file.
    - outgroup         <string> : name of the outgroup.
    - nindiv           <int>    : number of sampled individuals.
    - ntaxa            <int>    : number of sampled taxa/populations.
    - nsites           <int>    : number of sampled sites.
    - sus_hyb          <string> : list of suspected hybrid species.
    - alpha            <float>  : intended level of significance.
    - ignore_amb_sites <flag>   : ignore missing/ambiguous sites.

Examples

Detect if any species is hybrid:

## No suspected hybrid:

  import pyghdet as ghd
  res = ghd.comb_species("data.txt", "map.txt", "out", 16, 4, 50000)
  res.p_value
  res.detailed


## With suspected hybrid:

  import pyghdet as ghd
  res = ghd.comb_species("data.txt", "map.txt", "out", 16, 4, 50000, ['sp1', 'sp2'])
  res.p_value
  res.detailed

Detect if any individual is hybrid:

## No suspected hybrid:

  import pyghdet as ghd
  res = ghd.comb_indiv("data.txt", "map.txt", "out", 16, 4, 50000)
  res.p_value
  res.detailed


## With suspected hybrid:

  import pyghdet as ghd
  res = ghd.comb_indiv("data.txt", "map.txt", "out", 16, 4, 50000, ['sp1', 'sp2'])
  res.p_value
  res.detailed
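
The p-value returned by either test can be compared against the chosen significance level to pick the downstream analysis. The snippet below is a minimal sketch of that decision, assuming res.p_value is a plain float; the helper name recommend_analysis and the default alpha of 0.05 are illustrative assumptions and are not part of pyghdet.

```python
def recommend_analysis(p_value: float, alpha: float = 0.05) -> str:
    """Suggest a downstream analysis from a global-test p-value.

    Hypothetical helper, not part of pyghdet. The default alpha is an
    assumption chosen only for illustration.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    # A p-value below alpha rejects the null hypothesis of no hybridization,
    # which points towards a network analysis rather than a tree analysis.
    return "network" if p_value < alpha else "tree"


# Stand-in values; in practice pass res.p_value from comb_species or comb_indiv.
print(recommend_analysis(0.003))
print(recommend_analysis(0.42))
```

With real data, the argument would be the p_value attribute of the result object shown in the examples above.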

Running GHDet from command line

To facilitate analyses using the Python module, two scripts are provided to conduct hybridization detection analyses directly from the command line:

  • hdet_species.py: runs a standard hybridization detection analysis on all species in the data. The output is the p-value of the global test. If the global test is significant, the script also reports the results of the individual tests. A list of suspected hybrids can also be provided, in which case only those species are tested for evidence of hybridization.

  • hdet_indiv.py: tests whether any individual in the data is a hybrid. If a list of suspected hybrid species is provided, only individuals from those species are tested.

Examples

## species level

hdet_species.py -i data.txt -m map.txt -o out -n 16 -t 4 -s 50000

## individual level

hdet_indiv.py -i data.txt -m map.txt -o out -n 16 -t 4 -s 50000
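
When the scripts are driven from a larger pipeline, the command line can be assembled programmatically. The following is a rough sketch that only builds and prints the command. It assumes the flags map to the Python arguments in the order used above (-i infile, -m mapfile, -o outgroup, -n nindiv, -t ntaxa, -s nsites), which is inferred from the examples rather than documented. The helper name build_hdet_command is hypothetical.

```python
import shlex
from typing import List


def build_hdet_command(script: str, infile: str, mapfile: str, outgroup: str,
                       nindiv: int, ntaxa: int, nsites: int) -> List[str]:
    """Assemble an argument list for hdet_species.py or hdet_indiv.py.

    Hypothetical helper; the flag-to-argument mapping is an assumption
    inferred from the command-line examples.
    """
    return [
        script,
        "-i", infile,
        "-m", mapfile,
        "-o", outgroup,
        "-n", str(nindiv),
        "-t", str(ntaxa),
        "-s", str(nsites),
    ]


cmd = build_hdet_command("hdet_species.py", "data.txt", "map.txt", "out", 16, 4, 50000)
# Quote each argument so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell-quoting problems with file names that contain spaces.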

Performing Combination tests

The package can also be used to perform combination tests, which merge several p-values into a single one. Three combination tests are implemented: the Cauchy Combination Test (CCT), the MinP-CCT-MinP (MCM) test, and the CCT-MinP-CCT (CMC) test. They can be run as follows:

import pyghdet as ghd

## CCT with different weights
ghd.cct([0.01,0.05,0.10, 0.53], [1,1,2,1])

## CCT with equal weights
ghd.cct([0.01, 0.10, 0.05, 0.54])

## CMC test
ghd.cmc([0.01,0.05,0.10,0.45])

## MCM test
ghd.mcm([0.01, 0.05, 0.40, 0.33])
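
For readers unfamiliar with the Cauchy combination test, the sketch below shows the standard textbook form of the statistic: each p-value is mapped to a Cauchy-distributed quantity, the quantities are averaged with weights, and the average is mapped back to a p-value. This is an independent illustration and not pyghdet's implementation; the function name cauchy_combination and the choice to normalize the weights to sum to one are assumptions, and ghd.cct may handle weights or edge cases differently.

```python
import math
from typing import Optional, Sequence


def cauchy_combination(p_values: Sequence[float],
                       weights: Optional[Sequence[float]] = None) -> float:
    """Combine p-values with the standard Cauchy combination formula.

    Illustrative sketch only; not the code used by ghd.cct.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("at least one p-value is required")
    if weights is None:
        weights = [1.0] * n  # equal weights by default
    if len(weights) != n:
        raise ValueError("weights and p_values must have the same length")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")

    total = float(sum(weights))
    # Weighted average of tan((0.5 - p) * pi); under the null hypothesis
    # each term follows a standard Cauchy distribution.
    statistic = sum((w / total) * math.tan((0.5 - p) * math.pi)
                    for w, p in zip(weights, p_values))
    # Upper tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(statistic) / math.pi


print(cauchy_combination([0.01, 0.05, 0.10, 0.53], [1, 1, 2, 1]))
print(cauchy_combination([0.01, 0.10, 0.05, 0.54]))
```

A useful sanity check of the formula is that a single p-value is returned unchanged, and that rescaling all weights by the same factor does not alter the result.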
