Mycorrhiza population assignment tools.
Mycorrhiza
Combining phylogenetic networks and Random Forests for prediction of ancestry from multilocus genotype data.
Running an analysis from command line (OPTION 1)
-
Install Docker
Instructions can be found here.
-
(On Linux, optional) Give Docker root access.
-
Get the Mycorrhiza image.
docker pull jgeofil/mycorrhiza:latest
-
Run an analysis.
Example data can be found here.
docker run -v [WORKING DIRECTORY]:/temp/ jgeofil/mycorrhiza crossvalidate -i /temp/[INPUT FILE] -o /temp
For example, from a folder containing the input file gipsy.myc:
docker run -v "$PWD":/temp/ jgeofil/mycorrhiza crossvalidate -i /temp/gipsy.myc -o /temp
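When the same analysis has to be repeated over several input files, the Docker command above can be assembled programmatically. The following is a minimal sketch and not part of Mycorrhiza: the helper name `docker_crossvalidate_cmd` is hypothetical, and the image name is assumed to be the one used in the pull step.

```python
from pathlib import Path


def docker_crossvalidate_cmd(workdir, input_name, image="jgeofil/mycorrhiza"):
    """Build the argument list for the docker run command shown above.

    Hypothetical helper: it only assembles the command and does not run it.
    """
    workdir = Path(workdir).resolve()
    return [
        "docker", "run",
        "-v", f"{workdir}:/temp/",     # mount the working directory at /temp/
        image,                          # assumed image name (from the pull step)
        "crossvalidate",
        "-i", f"/temp/{input_name}",   # input file as seen inside the container
        "-o", "/temp",                  # output goes back into the mounted folder
    ]


if __name__ == "__main__":
    import shlex
    print(shlex.join(docker_crossvalidate_cmd(".", "gipsy.myc")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to launch the container.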
Running an analysis from command line (OPTION 2)
-
Make sure you have the latest version of Python 3.x
python --version
-
Install pip
-
Install Mycorrhiza
pip3 install --upgrade mycorrhiza
-
Install SplitsTree
Installation executables for SplitsTree4 can be found here.
-
Install matplotlib
Instructions can be found here.
-
Run an analysis.
crossvalidate -h
crossvalidate -i gipsy.myc -o out/
If the crossvalidate command is not found, it may be necessary to add the directory where it was installed to your PATH, for example:
export PATH=$PATH:$HOME/bin
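A quick way to confirm that the PATH is set up correctly is to ask Python where it resolves the required commands. This is a minimal sketch: the function name `find_executables` is hypothetical, and the executable names `crossvalidate` and `SplitsTree` are assumed from the commands and default setting shown in this document; your installation may use different names.

```python
import shutil


def find_executables(names=("crossvalidate", "SplitsTree")):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_executables().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

Any entry reported as not found points to a directory that still needs to be added to the PATH.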
Running an analysis in a script
Installing Mycorrhiza with pip
-
Make sure you have the latest version of Python 3.x
python --version
-
Install pip
-
Install Mycorrhiza
pip3 install --upgrade mycorrhiza
-
Install SplitsTree
Installation executables for SplitsTree4 can be found here.
-
Install matplotlib
Instructions can be found here.
Running an analysis in a script
-
Import the necessary modules.
from mycorrhiza.dataset import Myco
from mycorrhiza.analysis import CrossValidate
from mycorrhiza.plotting.plotting import mixture_plot
-
(Optional) By default Mycorrhiza will look for SplitsTree in your PATH. If you wish to specify a different path for the SplitsTree executable, you can do so in the settings module.
from mycorrhiza.settings import const
const['__SPLITSTREE_PATH__'] = 'SplitsTree'
-
Load some data. Here data is loaded in the Mycorrhiza format from the Gipsy moth sample data file. Example data can be found here.
myco = Myco(file_path='data/gipsy.myc')
myco.load()
-
Run an analysis. Here a simple 5-fold cross-validation analysis is executed on all available loci, without partitioning.
cv = CrossValidate(dataset=myco, out_path='data/')
cv.run(n_partitions=1, n_loci=0, n_splits=5, n_estimators=60, n_cores=1)
-
Plot the results.
mixture_plot(cv)
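To make the n_splits=5 setting used above more concrete, the sketch below shows how a generic k-fold scheme divides samples into training and test sets. It is an illustration of the general technique only, not Mycorrhiza's internal implementation, which may shuffle or stratify samples differently; the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, n_splits=5):
    """Yield (train, test) index lists for plain, unshuffled k-fold splitting."""
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds get one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


if __name__ == "__main__":
    for fold, (train, test) in enumerate(k_fold_indices(12, n_splits=5), start=1):
        print(f"fold {fold}: {len(train)} training samples, test indices {test}")
```

Each sample appears in exactly one test set, so every genotype is predicted once by a model that was not trained on it.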
Documentation
https://jgeofil.github.io/mycorrhiza/
File formats
Myco
Diploid genotypes occupy two rows, which must share the same sample identifier. M is the number of loci.
Column(s) | Content | Type |
---|---|---|
1 | Sample identifier | string |
2 | Population | string or integer |
3 | Learning flag | {0,1} |
4 to M+3 | Loci | {A, T, G, C, N} |
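The column layout above can be checked with a small reader that groups rows by sample identifier. This is a sketch under stated assumptions, not the loader used by Mycorrhiza: columns are assumed to be separated by whitespace (the delimiter is not specified here), and the function name `parse_myco` is hypothetical.

```python
VALID_ALLELES = set("ATGCN")


def parse_myco(lines):
    """Group Myco-format rows by sample identifier.

    Assumes whitespace-separated columns (an assumption, see above).
    Returns {sample_id: {"population": str, "flag": int, "rows": [[allele, ...], ...]}}.
    """
    samples = {}
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) < 4:
            raise ValueError(f"line {lineno}: expected at least 4 columns")
        sample_id, population, flag, loci = fields[0], fields[1], fields[2], fields[3:]
        if flag not in ("0", "1"):
            raise ValueError(f"line {lineno}: learning flag must be 0 or 1")
        bad = set(loci) - VALID_ALLELES
        if bad:
            raise ValueError(f"line {lineno}: unexpected allele(s) {sorted(bad)}")
        entry = samples.setdefault(
            sample_id, {"population": population, "flag": int(flag), "rows": []}
        )
        entry["rows"].append(loci)  # a diploid sample ends up with two rows
    return samples


if __name__ == "__main__":
    example = ["s1 popA 1 A T G", "s1 popA 1 A C G"]
    print(parse_myco(example))
```

For a real file, the function can be called as `parse_myco(open(path))`.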
STRUCTURE
Diploid genotypes occupy two rows, which must share the same sample identifier. O is the number of optional columns and M is the number of loci.
Column(s) | Content | Type |
---|---|---|
1 | Sample identifier | string |
2 | Population | integer |
3 | Learning flag | {0,1} |
4 to O+3 | Optional (Ignored) | |
O+4 to M+O+3 | Loci | integer or -9 |
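Because the position of the locus columns depends on the number of optional columns, it can help to see the offsets spelled out. The sketch below splits one already-tokenized STRUCTURE row; the function name `split_structure_row` is hypothetical, and treating -9 as a missing-data code follows the usual STRUCTURE convention rather than anything stated in this document.

```python
def split_structure_row(fields, n_optional):
    """Split one STRUCTURE row into (sample_id, population, flag, optional, loci).

    `fields` is a list of column strings; `n_optional` is the number of
    optional columns that sit between the learning flag and the loci.
    """
    if len(fields) < 3 + n_optional:
        raise ValueError("row is shorter than the fixed and optional columns")
    sample_id = fields[0]
    population = int(fields[1])
    flag = int(fields[2])
    optional = fields[3:3 + n_optional]                # ignored columns
    loci = [int(x) for x in fields[3 + n_optional:]]   # locus columns follow the optional ones
    return sample_id, population, flag, optional, loci


if __name__ == "__main__":
    row = "ind1 2 1 x y 101 -9 103".split()
    print(split_structure_row(row, n_optional=2))
```

Here -9 is kept as an integer so that downstream code can decide how to handle it.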