Ancestral sequence reconstruction using a tree-structured Ornstein-Uhlenbeck variational autoencoder

Reason this release was yanked:

old version

Project description

DRAUPNIR: "Beta library version for performing ASR using a tree-structured Variational Autoencoder"

#Extra requirements for tree inference

IQ-Tree: http://www.iqtree.org/doc/Quickstart

conda install -c bioconda iqtree

RapidNJ: https://birc.au.dk/software/rapidnj

conda config --add channels bioconda
conda install rapidnj
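
After installing, it can help to confirm that the tree-inference tools are reachable from your environment. The following is a minimal sketch using only the Python standard library; the executable names `iqtree` and `rapidnj` are assumptions (some installations expose a differently named binary), and `find_tools` is a hypothetical helper, not part of Draupnir.

```python
import shutil

# Executable names are assumptions; adjust them to match your installation.
REQUIRED_TOOLS = ("iqtree", "rapidnj")


def find_tools(names=REQUIRED_TOOLS):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

A `NOT FOUND` entry usually means the conda environment that holds the tool is not active.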

#Extra requirements for fast patristic matrix construction

Install R (R version 4.1.2 (2021-11-01) -- "Bird Hippie")

sudo apt update && sudo apt upgrade
sudo apt -y install r-base

together with ape 5.5 and TreeDist 2.3 libraries

install.packages(c("ape","TreeDist"))
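
For orientation, the patristic distance between two nodes of a tree is the sum of the branch lengths on the path that connects them. The pure-Python sketch below illustrates that definition on a toy tree. It is not how Draupnir or the R packages build the matrix; the tree encoding (a child-to-parent dictionary) and all function names are hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def distances_to_ancestors(node, parent):
    """Return {ancestor: distance from `node`} for `node` itself and every ancestor."""
    distances = {node: 0.0}
    total = 0.0
    while node in parent:
        node, length = parent[node]
        total += length
        distances[node] = total
    return distances


def patristic_distance(a, b, parent):
    """Sum of branch lengths on the path between nodes `a` and `b`."""
    up_a = distances_to_ancestors(a, parent)
    up_b = distances_to_ancestors(b, parent)
    # With non-negative branch lengths, the most recent common ancestor
    # is the shared ancestor that minimizes the summed distance.
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


def patristic_matrix(leaves, parent):
    """Pairwise patristic distances, keyed by (leaf, leaf) tuples."""
    return {(a, b): patristic_distance(a, b, parent) for a, b in combinations(leaves, 2)}


# Toy tree ((A:0.1,B:0.2)N1:0.3,C:0.4)root, stored as child -> (parent, branch length)
toy_parent = {
    "A": ("N1", 0.1),
    "B": ("N1", 0.2),
    "N1": ("root", 0.3),
    "C": ("root", 0.4),
}

if __name__ == "__main__":
    for pair, dist in patristic_matrix(["A", "B", "C"], toy_parent).items():
        print(pair, round(dist, 3))
```

This walk-to-the-root approach is quadratic in the number of leaves and slow for large trees, which is why a dedicated library is worth installing for real data sets.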

#Draupnir Install

pip install draupnir

#Example: See Draupnir_example.py

    import argparse
    import os

    import pyro
    import torch
    import draupnir

    # Minimal parser (an assumption): the snippet uses `args` without defining it.
    # draupnir.run may read further options from `args`; see Draupnir_example.py for the full parser.
    parser = argparse.ArgumentParser(description="Draupnir example")
    parser.add_argument("--dataset-name", type=str, default="simulations_blactamase_1")  # default dataset
    args = parser.parse_args()

    script_dir = os.path.dirname(os.path.abspath(__file__))
    pyro.enable_validation(False)
    use_cuda = True  # set to False to force the CPU
    if use_cuda and torch.cuda.is_available():
        torch.set_default_tensor_type(torch.cuda.DoubleTensor)
        device = torch.device("cuda")
    else:
        # Also reached when CUDA was requested but no GPU is available
        torch.set_default_tensor_type(torch.DoubleTensor)
        device = torch.device("cpu")
    draupnir.available_datasets(print_dict=True)
    build_config, settings_config, root_sequence_name = draupnir.create_draupnir_dataset(
        args.dataset_name,
        use_custom=False,  # default dataset
        script_dir=script_dir,
        build=False,  # True: construct the dataset, False: use the stored dataset
        fasta_file=None,  # in this case, it will be read from /draupnir/src/data
        tree_file=None,  # in this case, it will be read from /draupnir/src/data
        alignment_file=None)  # in this case, it will be read from /draupnir/src/data
    # draupnir.draw_tree_simple(args.dataset_name, settings_config)  # to draw a tree, only after the dataset has been built
    draupnir.run(args.dataset_name, root_sequence_name, args, device, settings_config, build_config, script_dir)

#How long should I run my model?

  1. While it is training:
    • Check the Percent_ID.png plot: once the training accuracy has peaked at almost 100%, run for at least ~1000 more epochs so that learning can complete
    • Check for stabilization of the error loss: ELBO_error.png
    • Check for stabilization of the entropy: Entropy_convergence.png
  2. After training:
    • Observe the latent space:
      1. t_SNE, UMAP and PCA plots: Is it organized by clades? Not every data set will show tight clustering of the tree clades, but there should be some organization
      2. Distances_GP_VAE_z_vs_branch_lengths_Pairwise_distance_INTERNAL_and_LEAVES plot: Is there a positive correlation? If the correlation is weak but the training percent identity is high, the run is still valid
    • Observe the sampled training (leaves) sequences and test (internal) sequences: Navigate to the Train_argmax and Test_argmax folders and look for the .fasta files
    • Calculate mutual information:
      • First: Run Draupnir with the MAP & Marginal version and the Variational version, or just the Variational version
      • Second: Call draupnir.calculate_mutual_information() with the paths to the folders containing the trained runs.
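
The checks above are visual, but two of them can also be approximated numerically if you have the underlying numbers at hand. The sketch below assumes you have collected the per-epoch loss values and the paired latent-space and branch-length distances yourself; Draupnir's own output files are not read here. The helper names, the window size and the tolerance are hypothetical choices, not recommendations from the library.

```python
import math


def has_plateaued(values, window=100, rel_tol=0.01):
    """True if the mean of the last `window` values is within `rel_tol`
    (relative change) of the mean of the `window` values before them."""
    if len(values) < 2 * window:
        return False  # not enough history to compare two windows
    recent = sum(values[-window:]) / window
    previous = sum(values[-2 * window:-window]) / window
    scale = max(abs(previous), 1e-12)  # avoid division by zero
    return abs(recent - previous) / scale < rel_tol


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

For example, `has_plateaued(loss_history)` could be evaluated every few hundred epochs, and `pearson(latent_distances, branch_lengths)` gives a single number to set beside the distance plot. Neither replaces looking at the plots themselves.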

Download files

Source Distribution

draupnir-0.0.21.tar.gz (3.6 MB)

Uploaded Source

Built Distribution

draupnir-0.0.21-py3-none-any.whl (4.0 MB)

Uploaded Python 3

File details

Details for the file draupnir-0.0.21.tar.gz.

File metadata

  • Download URL: draupnir-0.0.21.tar.gz
  • Upload date:
  • Size: 3.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.62.3 importlib-metadata/4.10.1 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.7.11

File hashes

Hashes for draupnir-0.0.21.tar.gz
Algorithm Hash digest
SHA256 afcbbc97fbf9b2c79fbe7d55e4e4e3d4a21a47c5ddf0b167a5742eefd8837e63
MD5 dbe93c389aec4b5e8b02b94ff4e0189c
BLAKE2b-256 0739d91393ed2c7f561f66241cc7e6ae5c0016b55c7b88b329fd7e55ca3cab1b


File details

Details for the file draupnir-0.0.21-py3-none-any.whl.

File metadata

  • Download URL: draupnir-0.0.21-py3-none-any.whl
  • Upload date:
  • Size: 4.0 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.62.3 importlib-metadata/4.10.1 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.7.11

File hashes

Hashes for draupnir-0.0.21-py3-none-any.whl
Algorithm Hash digest
SHA256 e33ab86c469548371da16669e245826f66b1d00bd9f3371fa2701c752323cb4c
MD5 4cf9a4d10d733720650cb0dc3a869044
BLAKE2b-256 fcaa1a85f077e66bc2c8a4fbb8145a67d42fb148e5ccb1a3f591c6e35c3adf36

