phylogemetric
A Python library for calculating the delta score (Holland et al. 2002) and Q-Residual (Gray et al. 2010) for phylogenetic data.
Installation:
Installation is only a pip install away:
pip install phylogemetric
Usage: Command line
Basic usage:
> phylogemetric
usage: phylogemetric [-h] method filename
Calculate the delta score for the file example.nex:
> phylogemetric delta example.nex
taxon1 0.2453
taxon2 0.2404
taxon3 0.2954
...
Calculate the Q-Residual score for the file example.nex:
> phylogemetric qresidual example.nex
taxon1 0.0030
taxon2 0.0037
taxon3 0.0063
...
Note: to save the results to a file, use shell output redirection, e.g.:
> phylogemetric qresidual example.nex > qresidual.txt
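Once the results are saved, they can be read back into Python for further analysis. The following is a minimal sketch, assuming the saved file has the same layout as the output shown above (one taxon label and one score per line, separated by whitespace). The function name parse_scores is hypothetical and is not part of phylogemetric.

```python
def parse_scores(text):
    """Parse 'taxon score' lines into a dict mapping taxon -> float.

    Assumes one whitespace-separated taxon/score pair per line, as in the
    example output above. Lines that do not fit this shape (blank lines,
    a literal '...') are skipped.
    """
    scores = {}
    for line in text.splitlines():
        parts = line.strip().rsplit(None, 1)  # split off the last column
        if len(parts) != 2:
            continue
        taxon, value = parts
        try:
            scores[taxon] = float(value)
        except ValueError:
            continue  # last column was not a number
    return scores


if __name__ == "__main__":
    sample = "taxon1 0.0030\ntaxon2 0.0037\ntaxon3 0.0063\n...\n"
    print(parse_scores(sample))
```

To use it on a real file, pass the file contents, for example parse_scores(open("qresidual.txt").read()).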
Speeding things up by using multiple processes.
You can tell phylogemetric to use multiple cores with the -w/--workers argument:
> phylogemetric -w 4 qresidual example.nex
Usage: Library
Calculate scores:
from nexus import NexusReader
from phylogemetric import DeltaScoreMetric
from phylogemetric import QResidualMetric
# load data from a nexus file:
nex = NexusReader("filename.nex")
qres = QResidualMetric(nex.data.matrix)
# Or construct a data matrix directly:
matrix = {
    'A': [
        '1', '1', '1', '1', '0', '0', '1', '1', '1', '0', '1', '1',
        '1', '1', '0', '0', '1', '1', '1', '0'
    ],
    'B': [
        '1', '1', '1', '1', '0', '0', '0', '1', '1', '1', '1', '1',
        '1', '1', '1', '0', '0', '1', '1', '1'
    ],
    'C': [
        '1', '1', '1', '1', '1', '1', '1', '0', '1', '1', '1', '0',
        '0', '0', '0', '1', '0', '1', '1', '1'
    ],
    'D': [
        '1', '0', '0', '0', '0', '1', '0', '1', '1', '1', '1', '0',
        '0', '0', '0', '1', '0', '1', '1', '1'
    ],
    'E': [
        '1', '0', '0', '0', '0', '1', '0', '1', '0', '1', '1', '0',
        '0', '0', '0', '1', '1', '1', '1', '1'
    ],
}
delta = DeltaScoreMetric(matrix)
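For readers who want to see what a delta score measures, here is a minimal pure-Python sketch of the quartet-based definition from Holland et al. (2002). It is an illustration only and is not phylogemetric's implementation: the function names hamming, quartet_delta and delta_scores are hypothetical, and the use of a normalised Hamming distance is an assumption, so values may differ from what the library reports.

```python
from itertools import combinations


def hamming(a, b):
    """Proportion of sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def quartet_delta(dxy, dxu, dxv, dyu, dyv, duv):
    """Delta for one quartet {x, y, u, v} given its six pairwise distances.

    The three pairings of the quartet give three distance sums. With the
    sums sorted as m1 >= m2 >= m3, delta = (m1 - m2) / (m1 - m3), and 0
    when all three sums are equal. Tree-like quartets give 0.
    """
    m1, m2, m3 = sorted([dxy + duv, dxu + dyv, dxv + dyu], reverse=True)
    if m1 == m3:
        return 0.0
    return (m1 - m2) / (m1 - m3)


def delta_scores(matrix):
    """Mean quartet delta for each taxon over all quartets containing it."""
    taxa = sorted(matrix)
    dist = {
        frozenset(pair): hamming(matrix[pair[0]], matrix[pair[1]])
        for pair in combinations(taxa, 2)
    }

    def d(a, b):
        return dist[frozenset((a, b))]

    totals = {t: 0.0 for t in taxa}
    counts = {t: 0 for t in taxa}
    for x, y, u, v in combinations(taxa, 4):
        delta = quartet_delta(d(x, y), d(x, u), d(x, v),
                              d(y, u), d(y, v), d(u, v))
        for t in (x, y, u, v):
            totals[t] += delta
            counts[t] += 1
    return {t: (totals[t] / counts[t] if counts[t] else 0.0) for t in taxa}


if __name__ == "__main__":
    # Distances from a perfect tree ((x,y),(u,v)) with unit branch lengths:
    print(quartet_delta(2, 3, 3, 3, 3, 2))  # 0.0
    # One pairing sum strictly larger than two equal ones:
    print(quartet_delta(3, 2, 2, 2, 2, 3))  # 1.0
```

Scores near 0 indicate tree-like signal and scores near 1 indicate strongly conflicting signal. The loop over combinations(taxa, 4) is also why the cost rises steeply as taxa are added.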
Methods:
m = DeltaScoreMetric(matrix)
# calculates the number of quartets in the data:
m.nquartets()
# returns the distance between two sequences:
m.dist(['1', '1', '0'], ['0', '1', '0'])
# gets a dictionary of metric scores:
m.score()
m.score(workers=4) # with multiple processes.
# pretty prints the metric scores:
m.pprint()
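A common next step is to rank taxa by their score. The sketch below works on a plain dictionary and assumes that score() returns a mapping of taxon label to a numeric score, which matches the description above but should be checked against your installed version. The helper name rank_taxa is hypothetical, and the example values are taken from the command line output shown earlier.

```python
def rank_taxa(scores, top=None):
    """Return (taxon, score) pairs sorted from highest to lowest score.

    `scores` is assumed to be a dict of taxon label -> numeric score.
    `top` optionally limits the result to the first `top` entries.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked if top is None else ranked[:top]


if __name__ == "__main__":
    example = {"taxon1": 0.2453, "taxon2": 0.2404, "taxon3": 0.2954}
    for taxon, value in rank_taxa(example):
        print(f"{taxon}\t{value:.4f}")
```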
Requirements:
- python-nexus >= 1.1
Performance Notes:
Currently phylogemetric is implemented in Python, and the Delta/Q-Residual algorithms are O(n⁴) in the number of taxa, because every quartet of taxa is evaluated. This means that performance is not optimal, and it may take a while to calculate these metrics for datasets with more than 100 taxa or so. To help speed this up, use multiple processes, either with the -w/--workers argument at the command line or by passing workers=n to the score method.
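To get a feel for how quickly the workload grows, note that both metrics are computed over quartets (4-taxon subsets), and the number of quartets among n taxa is the binomial coefficient C(n, 4). The following sketch prints that count for a few dataset sizes. The function name quartet_count is hypothetical, and the figures are counts of quartets, not timings.

```python
from math import comb


def quartet_count(n_taxa):
    """Number of distinct 4-taxon subsets among n_taxa taxa."""
    return comb(n_taxa, 4)


if __name__ == "__main__":
    for n in (10, 50, 100, 200):
        print(n, quartet_count(n))
    # 10 210
    # 50 230300
    # 100 3921225
    # 200 64684950
```

Doubling the number of taxa from 100 to 200 increases the number of quartets roughly sixteen-fold.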
I hope to improve performance in the near future, but in the meantime, if this is an issue for you then try using the implementations available in SplitsTree.
Citation:
If you use phylogemetric, please cite:
Greenhill, SJ. 2016. Phylogemetric: A Python library for calculating phylogenetic network metrics. Journal of Open Source Software.
http://dx.doi.org/10.21105/joss.00028
Changelog:
- 1.1.0:
  - Added support for multiple processes.
  - Removed Python 2 support.
Acknowledgements:
- Thanks to David Bryant for clarifying the Q-Residual code.
- Thanks to Kristian Rother for code quality suggestions.