Global Hybridization detection using phylogenetic invariants
Project description
GHDet: Global Hybridization Detection using Phylogenetic Invariants
GHDet is a Python package that tests whether any hybridization event has occurred among a given set of taxa. Its main module is pyghdet (Pythonic Global Hybridization Detection). The result of the test helps decide whether a tree analysis is sufficient or a network analysis is needed.
Installation
Requirements:
Python 3.6 or above
Python Modules:
Cython
Numpy
Matplotlib
Seaborn
Multiprocess
scipy
phyde
pandas
math, itertools, typing (part of the Python standard library; no installation needed)
C++ compiler
# To install dependencies
pip install cython numpy matplotlib seaborn multiprocess scipy phyde pandas
# To install GHDet
pip install pyghdet
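After installation, it can help to confirm that the third-party modules listed under Requirements are importable before running an analysis. The following is a minimal sketch, not part of GHDet: the helper name missing_modules is hypothetical, and the import names in REQUIRED_MODULES are assumed to match the packages installed above.

```python
from importlib.util import find_spec

# Import names of the third-party dependencies. These are assumed to match
# the names used in `import` statements; adjust them if your setup differs.
REQUIRED_MODULES = [
    "Cython", "numpy", "matplotlib", "seaborn",
    "multiprocess", "scipy", "phyde", "pandas",
]


def missing_modules(names):
    """Return the names in `names` that cannot be located for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

An empty result only means that the modules can be found; it does not check their versions or that a C++ compiler is available.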
Usage
The package has two main modules: comb_species and comb_indiv. The comb_species module performs a global hypothesis test to detect whether any of the species is a hybrid. Similarly, the comb_indiv module performs a global hypothesis test to detect whether the data contain any hybrid individual. Both modules take the following arguments:
Arguments
---------
- infile <string> : name of the DNA sequence data file.
- mapfile <string> : name of the taxon map file.
- outgroup <string> : name of the outgroup.
- nindiv <int> : number of sampled individuals.
- ntaxa <int> : number of sampled taxa/populations.
- nsites <int> : number of sampled sites.
- sus_hyb <string> : list of suspected hybrid species.
- alpha <float> : intended level of significance.
- ignore_amb_sites <flag> : ignore missing/ambiguous sites.
Examples
Detect if any species is hybrid:
## No suspected hybrid:
import pyghdet as ghd
res = ghd.comb_species("data.txt", "map.txt", "out", 16, 4, 50000)
res.p_value
res.detailed
## With suspected hybrid:
import pyghdet as ghd
res = ghd.comb_species("data.txt", "map.txt", "out", 16, 4, 50000, ['sp1', 'sp2'])
res.p_value
res.detailed
Detect if any individual is hybrid:
## No suspected hybrid:
import pyghdet as ghd
res = ghd.comb_indiv("data.txt", "map.txt", "out", 16, 4, 50000)
res.p_value
res.detailed
## With suspected hybrid:
import pyghdet as ghd
res = ghd.comb_indiv("data.txt", "map.txt", "out", 16, 4, 50000, ['sp1', 'sp2'])
res.p_value
res.detailed
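Since the purpose of the global test is to help choose between a tree and a network analysis, the returned p-value can be compared against the chosen significance level. The sketch below illustrates that decision rule only. The function recommend_analysis is hypothetical and not part of pyghdet, and the default alpha of 0.05 is an illustrative assumption rather than the package's documented default.

```python
def recommend_analysis(p_value, alpha=0.05):
    """Suggest a downstream analysis from a global-test p-value.

    alpha=0.05 is an illustrative default, not necessarily the one
    used by pyghdet. A p-value below alpha is read as evidence of
    hybridization, which points towards a network analysis.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return "network" if p_value < alpha else "tree"
```

With a result object from the examples above, this would be called as recommend_analysis(res.p_value).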
Running GHDet from command line
To facilitate analyses using the Python module, two scripts are provided to conduct hybridization detection analyses directly from the command line:
hdet_species.py: runs a standard hybridization detection analysis on all the species in the data. The output is the p-value of the global test. If the global test is significant, the script also reports the results of the individual tests. A list of suspected hybrid species can also be provided, in which case the script tests only whether any of the suspected species is a hybrid.
hdet_indiv.py: tests whether any of the individuals in the data is a hybrid. If a list of suspected hybrid species is provided, the test considers only individuals from those species.
Examples
## species level
hdet_species.py -i data.txt -m map.txt -o out -n 16 -t 4 -s 50000
## individual level
hdet_indiv.py -i data.txt -m map.txt -o out -n 16 -t 4 -s 50000
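When many datasets need to be processed, the command-line scripts can be driven from Python. The sketch below only assembles the argument list shown in the examples above and optionally runs it. The helper names build_hdet_command and run_hdet are hypothetical, and the mapping of the flags -i, -m, -o, -n, -t and -s to infile, mapfile, outgroup, nindiv, ntaxa and nsites is inferred from the examples rather than taken from the scripts' documentation.

```python
import subprocess


def build_hdet_command(script, infile, mapfile, outgroup, nindiv, ntaxa, nsites):
    """Build the argument list for hdet_species.py or hdet_indiv.py.

    The flag meanings are assumed from the command-line examples.
    """
    return [
        script,
        "-i", infile,
        "-m", mapfile,
        "-o", outgroup,
        "-n", str(nindiv),
        "-t", str(ntaxa),
        "-s", str(nsites),
    ]


def run_hdet(script, infile, mapfile, outgroup, nindiv, ntaxa, nsites):
    """Run the script and return whatever it prints to standard output."""
    cmd = build_hdet_command(script, infile, mapfile, outgroup,
                             nindiv, ntaxa, nsites)
    completed = subprocess.run(cmd, stdout=subprocess.PIPE,
                               universal_newlines=True, check=True)
    return completed.stdout


# Example (requires the GHDet scripts to be on the PATH):
# print(run_hdet("hdet_species.py", "data.txt", "map.txt", "out", 16, 4, 50000))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.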
Performing Combination tests
The package can also be used to perform combination tests. Three combination tests are implemented: the Cauchy Combination test (CCT), the MinP-CCT-MinP (MCM) test, and the CCT-MinP-CCT (CMC) test. These tests can be performed as follows:
import pyghdet as ghd
## CCT with different weights
ghd.cct([0.01, 0.05, 0.10, 0.53], [1, 1, 2, 1])
## CCT with equal weights
ghd.cct([0.01, 0.10, 0.05, 0.54])
## CMC test
ghd.cmc([0.01, 0.05, 0.10, 0.45])
## MCM test
ghd.mcm([0.01, 0.05, 0.40, 0.33])
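For readers unfamiliar with the Cauchy Combination test, the sketch below shows the commonly published form of the statistic: each p-value is transformed with tan((0.5 - p) * pi), the transformed values are averaged with weights that sum to one, and the combined p-value is read from the standard Cauchy distribution. It is an illustration under those assumptions and not necessarily how ghd.cct is implemented. The function name cauchy_combination is hypothetical.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p-values with the standard Cauchy Combination test.

    Weights are normalised to sum to one; equal weights are used when
    none are given. This is a sketch of the textbook formula, not a
    copy of pyghdet's implementation.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("at least one p-value is required")
    if weights is None:
        weights = [1.0] * n
    if len(weights) != n:
        raise ValueError("weights and p_values must have the same length")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")

    total = float(sum(weights))
    # Weighted sum of Cauchy-transformed p-values.
    statistic = sum((w / total) * math.tan((0.5 - p) * math.pi)
                    for w, p in zip(weights, p_values))
    # Upper tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(statistic) / math.pi
```

A single p-value is returned unchanged (up to floating-point error), which is a quick sanity check for the formula.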
File details
Details for the file pyghdet-0.3.5.tar.gz.
File metadata
- Download URL: pyghdet-0.3.5.tar.gz
- Upload date:
- Size: 20.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.10.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d40c2eb7b8070ea433f8d5062098e3f130ae6e2e076baabfd8650fb6d822a744 |
| MD5 | 54ab63cd1e2e06293b433390ab30d4ff |
| BLAKE2b-256 | 5fffe354a6d36aaf96a0873a197f9fd9493fa973c2bcd64f073290126a22f94e |
File details
Details for the file pyghdet-0.3.5-py3-none-any.whl.
File metadata
- Download URL: pyghdet-0.3.5-py3-none-any.whl
- Upload date:
- Size: 21.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.10.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d1cd1b982bc35d26161cb48cb95387cede12fb03702f5dd9065ab55f2d48be2e |
| MD5 | 02dd0e2187d0c403ce2ce6342b379e1e |
| BLAKE2b-256 | 263af096e6e3c8bbfbd49df1fd46dc1882cd7284f3ba1c7232a06af4811c7735 |