Command-line tool for simulating predictive datasets from MrBayes' output.
predsim is a command-line tool for simulating predictive datasets from MrBayes output files. Datasets can be simulated under the GTR+G+I substitution model or any nested variant available in MrBayes (e.g. JC69 or HKY85). The script uses Seq-Gen to simulate the DNA sequences and builds on the third-party libraries DendroPy and pandas.
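To give a feel for the kind of input involved, here is a minimal sketch of reading parameter samples from a MrBayes p-file, which is a tab-delimited table preceded by a bracketed ID line. This is not predsim's own code: predsim builds on pandas, whereas this sketch uses only the standard library, and the function name `read_pfile`, its `skip` parameter and the sample values are made up for illustration.

```python
import csv
import io


def read_pfile(handle, skip=0):
    """Return parameter samples from a p-file-like table as a list of dicts.

    Hypothetical helper: drops the bracketed ID line, parses the
    tab-delimited table, and discards the first `skip` records.
    """
    lines = [line for line in handle if not line.startswith("[")]
    reader = csv.DictReader(lines, delimiter="\t")
    return list(reader)[skip:]


# Tiny made-up sample in the p-file layout (values are illustrative only).
sample = io.StringIO(
    "[ID: 0123456789]\n"
    "Gen\tLnL\tTL\talpha\tpinvar\n"
    "0\t-5000.0\t1.20\t0.50\t0.10\n"
    "100\t-4800.0\t1.10\t0.55\t0.12\n"
    "200\t-4750.0\t1.05\t0.60\t0.11\n"
)

records = read_pfile(sample, skip=1)
for record in records:
    # Values are kept as strings here; convert with float() as needed.
    print(record["Gen"], record["alpha"], record["pinvar"])
```

Each remaining record holds one posterior sample of the model parameters, which is the information a simulator needs alongside the matching tree from the t-file.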
The code has been tested with Python 2.7, 3.3, 3.4 and 3.5.
Source repository: https://github.com/jmenglund/predsim
Requirements
Seq-Gen must be installed on your system.
Installation
For most users, the easiest way is probably to install the latest version hosted on PyPI:
$ pip install predsim
The project is hosted at https://github.com/jmenglund/predsim and can also be installed using git:
$ git clone https://github.com/jmenglund/predsim.git
$ cd predsim
$ python setup.py install
You may consider installing predsim and its required Python packages within a virtual environment in order to avoid cluttering your system’s Python path. See for example the environment management system conda or the package virtualenv.
Usage
$ predsim --help
usage: predsim [-h] [-V] [-l INT] [-g INT] [-c FILE] [-s INT] [-p FILE]
               pfile tfile [outfile]

A command-line utility that reads posterior output of MrBayes and simulates
predictive datasets with Seq-Gen.

positional arguments:
  pfile                 path to a MrBayes p-file
  tfile                 path to a MrBayes t-file
  outfile               path to output file (default: <stdout>)

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  -l INT, --length INT  sequence length (default: 1000)
  -g INT, --gamma-cats INT
                        number of gamma rate categories (default: continuous)
  -c FILE, --commands-file FILE
                        path to output file with used Seq-Gen commands
  -s INT, --skip INT    number of records (trees) to skip at the beginning of
                        the sample (default: 0)
  -p FILE, --seqgen-path FILE
                        path to a Seq-Gen executable (default: "seq-gen")
It is strongly recommended that you use the -c FILE option to check the commands run by Seq-Gen.
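As an illustration of combining these options, the following sketch assembles a predsim command line that includes the -c option. The helper `build_predsim_command` and all file names are hypothetical; only the flags themselves come from the help text above.

```python
def build_predsim_command(pfile, tfile, outfile, length=1000,
                          gamma_cats=None, skip=0, commands_file=None):
    """Assemble an argument list for predsim (hypothetical helper)."""
    cmd = ["predsim", "-l", str(length), "-s", str(skip)]
    if gamma_cats is not None:
        # Omitting -g keeps the default (continuous gamma).
        cmd += ["-g", str(gamma_cats)]
    if commands_file is not None:
        # Ask predsim to write the Seq-Gen commands it used to this file.
        cmd += ["-c", commands_file]
    cmd += [pfile, tfile, outfile]
    return cmd


# File names below are placeholders, not files shipped with predsim.
cmd = build_predsim_command(
    "analysis.p", "analysis.t", "predictive.out",
    length=1500, gamma_cats=4, skip=250,
    commands_file="seqgen_commands.txt",
)
print(" ".join(cmd))

# To actually run it (requires predsim and Seq-Gen to be installed):
#     import subprocess
#     subprocess.run(cmd, check=True)
```

After a run, the file passed to -c can be inspected to confirm that the Seq-Gen commands match the intended model and sequence length.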
Depending on your Python version, you might need to specify the full path to your Seq-Gen executable with the -p FILE option.
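One way to find a full path to pass to -p is to look the executable up on your PATH. The sketch below uses `shutil.which` from the Python standard library; the helper name `resolve_seqgen` is an assumption for illustration, and the default name "seq-gen" is taken from the help text above.

```python
import os
import shutil


def resolve_seqgen(candidates=("seq-gen",)):
    """Return an absolute path to the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return os.path.abspath(path)
    return None


seqgen = resolve_seqgen()
if seqgen is None:
    print("Seq-Gen not found on PATH; install it or pass its location with -p")
else:
    print("Use: predsim -p", seqgen, "pfile tfile")
```

If Seq-Gen is installed under a different name on your system, pass that name in `candidates`.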
Running tests
Testing is carried out with pytest. Here is an example of how to run the test suite and generate a coverage report:
$ cd predsim
$ pip install pytest pytest-cov pytest-pep8
$ py.test -v --cov-report term-missing --cov predsim.py --pep8
License
predsim is distributed under the MIT license.
Citing
If you use results produced with this package in a scientific publication, please mention the package name in the text and cite the Zenodo DOI of this project.
You can select a citation style from the dropdown menu in the “Cite as” section on the Zenodo page.
predsim relies on other software that should also be cited. Below are suggested citations for Seq-Gen, DendroPy and pandas, respectively:
Rambaut A, Grassly NC. 1997. Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. Comput. Appl. Biosci. 13:235–238.
Sukumaran J, Holder MT. 2010. DendroPy: a Python library for phylogenetic computing. Bioinformatics 26:1569–1571.
McKinney W. 2010. Data structures for statistical computing in Python. In Proceedings of the 9th Python in Science Conference (van der Walt S, Millman J, editors), pages 51–56.