Phylodynamic parameter and model inference using pretrained deep neural networks

PhyloDeep

PhyloDeep is a Python library for parameter estimation and model selection from phylogenetic trees, based on deep learning.

For more information on the method, please refer to the preprint: Voznica J, Zhukova A, Boskova V, Saulnier E, Lemoine F, Moslonka-Lefebvre M, Gascuel O (2021) Deep learning from phylogenies to uncover the transmission dynamics of epidemics. bioRxiv

Installation

PhyloDeep is available for Python 3.6 and can be installed with pip.

Windows

For Windows users, we recommend installing phylodeep via the Cygwin environment. First install Python 3.6 and pip3 from the Cygwin packages. Then install phylodeep:

pip3 install phylodeep

All other platforms

You can install phylodeep for Python 3.6 with or without conda, following the procedures described below:

Installing with conda

Once you have conda installed, create an environment for phylodeep with Python 3.6 (here we name it phyloenv):

conda create --name phyloenv python=3.6

Then activate it:

conda activate phyloenv

Then install phylodeep in it:

pip install phylodeep

Installing without conda

Make sure that Python 3.6 and pip3 are installed, then install phylodeep:

pip3 install phylodeep
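
Whichever installation route you use, a quick check from Python can confirm that the interpreter version and the package are what you expect. The sketch below is only an illustration: the helper name `check_environment` is hypothetical (it is not part of phylodeep) and it relies solely on the standard library.

```python
import importlib.util
import sys


def check_environment(package="phylodeep", expected_python=(3, 6)):
    """Return (python_matches, package_found) for the current interpreter.

    Hypothetical helper for illustration; not part of phylodeep.
    """
    # Compare only the major and minor version numbers, e.g. (3, 6).
    python_matches = tuple(sys.version_info[:2]) == tuple(expected_python)
    # find_spec returns None when a top-level package cannot be located.
    package_found = importlib.util.find_spec(package) is not None
    return python_matches, package_found
```

Calling `check_environment()` inside the activated environment returns two booleans: whether the interpreter is Python 3.6 and whether phylodeep can be imported.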

Usage

If you installed phylodeep with conda, do not forget to activate the corresponding environment (e.g. phyloenv) before using PhyloDeep:

conda activate phyloenv

We recommend performing an a priori model adequacy check first, to assess whether the input data resemble the simulations on which the neural networks were trained.

Python

from phylodeep import BD, BDEI, BDSS, SUMSTATS, FULL
from phylodeep.checkdeep import checkdeep
from phylodeep.modeldeep import modeldeep
from phylodeep.paramdeep import paramdeep


path_to_tree = './Zurich.trees'

# set presumed sampling probability
sampling_proba = 0.25

# a priori check for models BD, BDEI, BDSS
checkdeep(path_to_tree, model=BD, outputfile_png='BD_a_priori_check.png')
checkdeep(path_to_tree, model=BDEI, outputfile_png='BDEI_a_priori_check.png')
checkdeep(path_to_tree, model=BDSS, outputfile_png='BDSS_a_priori_check.png')


# model selection
model_BDEI_vs_BD_vs_BDSS = modeldeep(path_to_tree, sampling_proba, vector_representation=FULL)

# the selected model is BDSS

# parameter inference
param_BDSS = paramdeep(path_to_tree, sampling_proba, model=BDSS, vector_representation=FULL,
                       ci_computation=True)

# for the interpretation of results, please see below
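
The three `checkdeep` calls above differ only in the model and the output file name, so they can be folded into a loop. The following is a minimal sketch under that assumption: `run_adequacy_checks` is a hypothetical helper, and the checking function is passed in as an argument so that the keyword names (`model`, `outputfile_png`) are exactly those used in the example above.

```python
def run_adequacy_checks(check, tree_path, models, suffix="_a_priori_check.png"):
    """Run an a priori adequacy check for each model and return the PNG paths.

    Hypothetical helper; `check` is expected to accept the same arguments as
    the checkdeep calls shown above: (tree_path, model=..., outputfile_png=...).
    `models` maps a label used in the file name to the model constant.
    """
    outputs = {}
    for label, model in models.items():
        png_path = label + suffix
        check(tree_path, model=model, outputfile_png=png_path)
        outputs[label] = png_path
    return outputs


# Intended usage (requires phylodeep to be installed):
# from phylodeep import BD, BDEI, BDSS
# from phylodeep.checkdeep import checkdeep
# run_adequacy_checks(checkdeep, './Zurich.trees',
#                     {'BD': BD, 'BDEI': BDEI, 'BDSS': BDSS})
```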

Command line

# here we use a tree with 200 tips

# a priori model adequacy check: highly recommended
checkdeep -t ./Zurich.trees -m BD -o BD_model_adequacy.png
checkdeep -t ./Zurich.trees -m BDEI -o BDEI_model_adequacy.png
checkdeep -t ./Zurich.trees -m BDSS -o BDSS_model_adequacy.png

# model selection
modeldeep -t ./Zurich.trees -p 0.25 -v CNN_FULL_TREE -o model_selection.csv

# parameter inference
paramdeep -t ./Zurich.trees -p 0.25 -m BDSS -v CNN_FULL_TREE -o HIV_Zurich_BDSS_CNN.csv
paramdeep -t ./Zurich.trees -p 0.25 -m BDSS -v FFNN_SUMSTATS -o HIV_Zurich_BDSS_FFNN_CI.csv -c
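
If you want to drive the command-line tools from a script, one option is to assemble the argument list in Python and hand it to `subprocess`. This is a sketch, not part of PhyloDeep: the helper names are hypothetical, and only the flags that appear in the commands above (`-t`, `-p`, `-m`, `-v`, `-o`, `-c`) are used.

```python
import subprocess


def paramdeep_command(tree, sampling_proba, model, representation, output,
                      with_ci=False):
    """Build the argument list for the paramdeep command shown above.

    Hypothetical helper; it only mirrors the flags used in this README.
    """
    command = ["paramdeep", "-t", tree, "-p", str(sampling_proba),
               "-m", model, "-v", representation, "-o", output]
    if with_ci:
        # -c is the last flag of the second paramdeep example above.
        command.append("-c")
    return command


def run_command(command):
    """Run the command; check=True raises CalledProcessError on failure."""
    return subprocess.run(command, check=True)
```

For example, `paramdeep_command('./Zurich.trees', 0.25, 'BDSS', 'FFNN_SUMSTATS', 'HIV_Zurich_BDSS_FFNN_CI.csv', with_ci=True)` reproduces the last command above as a list that can be passed to `run_command`.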

Example of output and interpretations

Here, we use an HIV tree reconstructed from 200 sequences, published in "Phylodynamics on local sexual contact networks" by Rasmussen et al. in PLoS Computational Biology in 2017, which you can find on GitHub.

The a priori model adequacy check results in the following figures:

[Figures: BD model adequacy test; BDEI model adequacy test; BDSS model adequacy test]

For all three models (BD, BDEI and BDSS), the HIV tree data point (represented by a red star) lies well inside the data cloud of simulations, where warm colors correspond to a high density of simulations. The simulations and the HIV tree data point were represented as summary statistics before PCA was applied. All three models thus pass the model adequacy check.

We then apply model selection using the full tree representation and obtain the following result:

| Model | Probability BDEI | Probability BD | Probability BDSS |
| --- | --- | --- | --- |
| Predicted probability | 0.00 | 0.00 | 1.00 |

The BDSS probability is by far the highest, so the BDSS model is confidently selected.
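
The selection rule described here amounts to picking the model with the largest predicted probability. A minimal sketch, using the probabilities reported above; the function name `select_model` and the dictionary layout are illustrative assumptions, not the return format of `modeldeep`.

```python
def select_model(probabilities):
    """Return (model_name, probability) for the most probable model.

    Hypothetical helper; `probabilities` maps model names to predicted
    probabilities.
    """
    best = max(probabilities, key=probabilities.get)
    return best, probabilities[best]


# Probabilities reported for the Zurich HIV tree in the table above.
zurich_probabilities = {"BDEI": 0.00, "BD": 0.00, "BDSS": 1.00}
selected, probability = select_model(zurich_probabilities)
print(selected, probability)
```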

Finally, under the selected model BDSS, we predict parameter values together with 95% CIs:

|  | R naught | Infectious period | X transmission | Superspreading fraction |
| --- | --- | --- | --- | --- |
| predicted value | 1.69 | 9.78 | 9.34 | 0.079 |
| CI 2.5% | 1.40 | 8.12 | 6.65 | 0.050 |
| CI 97.5% | 2.08 | 12.26 | 10 | 0.133 |

The point estimates for parameters that are not time related (R naught, X transmission and Superspreading fraction) are well inside the parameter ranges of simulations and thus seem valid.
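
One simple way to read the table of estimates is to check that each predicted value falls between its bounds and to compare the width of the interval to the estimate. The sketch below does this for the 95% CIs reported above; `summarize_interval` is a hypothetical helper, and the same comparison could be applied to the simulation parameter ranges, which are not listed here.

```python
def summarize_interval(value, lower, upper):
    """Return (inside, relative_width) for an estimate and its bounds.

    Hypothetical helper: `inside` tells whether lower <= value <= upper,
    and `relative_width` is the interval width divided by the estimate.
    """
    inside = lower <= value <= upper
    relative_width = (upper - lower) / value
    return inside, relative_width


# (predicted value, CI 2.5%, CI 97.5%) copied from the table above.
bdss_estimates = {
    "R naught": (1.69, 1.40, 2.08),
    "Infectious period": (9.78, 8.12, 12.26),
    "X transmission": (9.34, 6.65, 10.0),
    "Superspreading fraction": (0.079, 0.050, 0.133),
}

for name, (value, lower, upper) in bdss_estimates.items():
    inside, relative_width = summarize_interval(value, lower, upper)
    print(f"{name}: inside={inside}, relative width={relative_width:.2f}")
```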

Preprint

Voznica J, Zhukova A, Boskova V, Saulnier E, Lemoine F, Moslonka-Lefebvre M, Gascuel O (2021) Deep learning from phylogenies to uncover the transmission dynamics of epidemics. bioRxiv

License

The package is available under the GPL v3 license; please refer to its full statement on GitHub.
