
This package was originally created by Andrew Low. Now maintained by Adam Koziol.



StrainChoosr

StrainChoosr examines a phylogenetic tree and reports the X strains that together represent the most diversity within it. With today's deluge of sequencing data, choosing which strains to analyze in more detail can be difficult. StrainChoosr picks a scaled-down set of sequences that retains the maximum possible diversity from the tree. The underlying algorithm is described in Pardi2005 and Steel2005, so please cite them if you use StrainChoosr.
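
To give a feel for how a greedy selection of this kind can work, the sketch below picks tips from a small hand-built tree: it seeds the selection with the two tips that are farthest apart, then repeatedly adds the tip that contributes the most new branch length. This is an illustration under stated assumptions, not StrainChoosr's own code: the tree is a plain adjacency dictionary rather than an ete3 object, the tip names and branch lengths are made up, and `build_tree`, `walk_from`, and `greedy_pd` are hypothetical names invented for this example.

```python
from itertools import combinations


def build_tree(edges):
    """Turn (node, node, branch_length) triples into an adjacency dict."""
    tree = {}
    for a, b, length in edges:
        tree.setdefault(a, {})[b] = length
        tree.setdefault(b, {})[a] = length
    return tree


def walk_from(tree, start):
    """Return path lengths from start to every node, plus parent pointers."""
    dist, parent, stack = {start: 0.0}, {start: None}, [start]
    while stack:
        node = stack.pop()
        for neighbour, length in tree[node].items():
            if neighbour not in dist:
                dist[neighbour] = dist[node] + length
                parent[neighbour] = node
                stack.append(neighbour)
    return dist, parent


def greedy_pd(tree, number_tips):
    """Greedily pick tips that maximize the total branch length they span."""
    tips = sorted(node for node, nbrs in tree.items() if len(nbrs) == 1)
    if not 2 <= number_tips <= len(tips):
        raise ValueError("number_tips must be between 2 and the number of tips")
    walks = {tip: walk_from(tree, tip) for tip in tips}

    def cover_path(node, parent):
        # Follow parent pointers back to the walk's start, marking each node.
        while node is not None:
            covered.add(node)
            node = parent[node]

    # Seed with the two tips separated by the longest path.
    first, second = max(combinations(tips, 2), key=lambda p: walks[p[0]][0][p[1]])
    chosen, covered = [first, second], set()
    cover_path(second, walks[first][1])

    while len(chosen) < number_tips:
        # A tip's gain is its distance to the nearest already-covered node.
        def gain(tip):
            return min(walks[tip][0][node] for node in covered)

        best = max((t for t in tips if t not in chosen), key=gain)
        dist, parent = walks[best]
        cover_path(min(covered, key=dist.get), parent)
        chosen.append(best)
    return chosen


# Hypothetical four-tip tree: internal nodes "x" and "y" joined by one branch.
example_tree = build_tree([
    ("A", "x", 1.0), ("B", "x", 2.0), ("x", "y", 2.0),
    ("C", "y", 3.0), ("D", "y", 0.5),
])
selected = greedy_pd(example_tree, 3)
print(selected)  # ['B', 'C', 'A']
```

Here B and C are chosen first because they are 7.0 apart, and A is added next because it contributes 1.0 of new branch length while D would contribute only 0.5.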

Installation

StrainChoosr lives on PyPI, so you can install it with pip/pip3:

pip install strainchoosr

Alternatively, you can install the latest version directly from the git repository. The repository may contain breaking changes, so this is not generally recommended.

pip install git+https://github.com/OLC-Bioinformatics/StrainChoosr.git

Quickstart

To use StrainChoosr, all you need is a Newick-formatted tree file. To pick the 5 most diverse strains from a tree, type the following on the command line:

strainchoosr --treefile /path/to/tree.nwk --number 5

This will print the names of the 5 most diverse strains in your tree to the terminal, as well as create a file called strainchoosr_output.html in your current working directory that lets you visualize the output in any web browser.
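
If you do not have a tree on hand, a tiny Newick file for experimenting can be written as sketched below. The strain names and branch lengths are invented for illustration, and `write_example_tree` and the output filename are hypothetical, not part of StrainChoosr.

```python
from pathlib import Path

# A made-up four-tip tree in Newick format: name:branch_length pairs,
# nested in parentheses and terminated by a semicolon.
EXAMPLE_NEWICK = "((strainA:0.1,strainB:0.2):0.3,(strainC:0.4,strainD:0.5):0.6);"


def write_example_tree(path):
    """Write the example tree to `path` and return it as a Path object."""
    path = Path(path)
    path.write_text(EXAMPLE_NEWICK + "\n")
    return path


tree_path = write_example_tree("example_tree.nwk")
print(f"Wrote {tree_path}")
```

The resulting file can then be passed to --treefile, with --number set to 4 or less since the tree only has four tips.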

To do the same within Python:

from strainchoosr import strainchoosr
strainchoosr.run_strainchoosr(treefile='/path/to/tree.nwk', number_representatives=[5])

In addition to printing the strains to the terminal, run_strainchoosr returns a dictionary whose keys are the numbers of representatives requested and whose values are lists of the strains selected for each number.
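
A dictionary of that shape can be turned into a simple tab-separated report, as in the sketch below. It assumes the values are lists of strain names; the dictionary literal is a made-up stand-in for a real return value, and `format_selection` is a hypothetical helper, not part of the StrainChoosr API.

```python
def format_selection(results):
    """Format {number_of_representatives: [strain, ...]} as tab-separated lines."""
    lines = []
    for number in sorted(results):
        strains = results[number]
        lines.append(f"{number}\t{','.join(strains)}")
    return "\n".join(lines)


# Stand-in for the dictionary described above (hypothetical strain names).
example_results = {
    3: ["strainB", "strainC", "strainA"],
    2: ["strainB", "strainC"],
}
report = format_selection(example_results)
print(report)
```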

Alternatively, if all you want is the list of strains, without generating HTML reports:

from strainchoosr import strainchoosr
import ete3
# Load the Newick tree with ete3
tree = ete3.Tree('/path/to/tree.nwk')
# Select 5 tips, starting from an empty list of pre-selected strains
diverse_strains = strainchoosr.pd_greedy(tree=tree, number_tips=5, starting_strains=[])

This returns a list of ete3.TreeNode objects representing the 5 most diverse strains. You can then call strainchoosr.get_leaf_names_from_nodes(diverse_strains) to get their names as a list.

Complete documentation on the strainchoosr API can be found at https://strainchoosr.readthedocs.io/api.html.

Issues and Pull Requests

If you have any problems or want a feature implemented, please feel free to open an issue. Similarly, if you want to add a feature or otherwise improve things, feel free to open a pull request.
