A python Environment for phylogenetic Tree Exploration
The Environment for Tree Exploration (ETE) is a Python programming toolkit that assists in the automated manipulation, analysis and visualization of phylogenetic trees (although clustering trees or any other tree-like data structure could be used).
ETE is currently developed as a tool for researchers working in phylogenetics and genomics. If you use ETE for a published work, please cite:
Jaime Huerta-Cepas, Joaquín Dopazo and Toni Gabaldón. ETE: a python Environment for Tree Exploration. BMC Bioinformatics 2010, 11:24.
Currently, the following tree data formats are supported, both for reading and writing trees:
- Newick (including several sub-types)
- Extended Newick / New Hampshire Extended (NHX)
With ETE, trees are loaded as a succession of TreeNode objects connected in a hierarchical way. Each TreeNode instance contains methods to operate with it independently. This is, although the top-most TreeNode instance represents the whole tree structure, any child node can be used independently as a subtree instance.
Available (per node) operations include:
- Iteration over descendant or leaf nodes.
- Tree traversing: post-order, pre-order, level-order-
- Search (descendant) nodes by their properties.
- Root / Unroot
- Calculate branch-length and topological distances among nodes.
- Node annotation (add custom features and properties to nodes)
- Automatic tree pruning
- Tree structure manipulation (add/remove parent, children, sister nodes, etc.).
- Newick and extended newick (including annotations) conversion
- shortcuts and checks: “len(Node)”, “for leaf in Node”, “if node in Tree”, etc.
A programmatic tree rendering engine is fully integrated with the Tree objects. It allows to draw trees in both rectangular and circular modes. The aspect of nodes, branches and other tree items are fully configurable and can be dynamically controlled (this is, certain graphical properties of nodes can be linked to internal node values).
Trees can be visualized interactively using a built-in Graphical User Interface (GUI) or exported as PNG images or SVG/PDF vector graphics images.
ETE is not a phylogenetic reconstruction program, as it only provides methods to load, analyze and manipulate phylogenetic results. Thus, a PhyloTree instance is provided, which extends the standard Tree functionality with phylogenetics related methods. Most notably:
- Link trees with Multiple Sequence Alignments (MSAs).
- Automatic detection of species codes within family gene-trees
- Node monophyly checks.
- Orthology and paralogy detection based on tree reconciliation or species overlap.
- Relative dating of speciation and duplication events.
- Combined visualization of trees and MSA.
- Integrated with the phylomeDB database
Full info and documentation can be found at: http://ete.cgenomics.org
ETE requires python>=2.5 (python 3 is not supported) as well as several dependencies (required only to enable particular functions, but highly recommended):
- python-qt4 (>=4.5) [Enables tree visualization and image rendering]
- python-mysqldb (>=1.2) [Enables programmatic access to PhylomeDB]
- python-numpy (Required to work with clustering trees)
- python-lxml [Required to work whit NexML and PhyloXML phylogenetic formats]
- python-setuptools [Optional. Allows to install and upgrade ETE]
ETE is developed and tested under Debian based distributions (i.e. Ubuntu). All the above mentioned dependencies are in the official repositories of Debian and can be installed using a package manager such as APT, Synaptics or Adept. For instance, in Ubuntu you can use the following shell command to install all dependencies:
$ apt-get install python-setuptools python-numpy python-qt4 python-scipy python-mysqldb python-lxml
In Non Debian based distributions, dependencies may not be necessarily found in their official repositories. If this occurs, libraries should be downloaded separately or installed from third part repositories. In general, this process should not entail important difficulties, except for PyQt4, which is a python binding to the new Qt4 libraries. Some distributions (i.e. CentOS, Fefora) do not include recent packages and cross-dependencies for such libraries yet. In such cases, manual compilation of libraries could be required.
Easy Install is the best way to install ETE and keep it up to date. EasyInstall is a python module bundled with setuptools that lets you automatically download, build, install, and manage Python packages. If the easy_install command is available in your system, you can execute this shell command to install/update ETE.
$ easy_install -U ete2
Alternatively, you can download the last version of ETE from http://pypi.python.org/pypi/ete2/, decompress the file and install the package by executing the setup installer:
$ python setup.py install
ETE and all its dependencies are supported by MacOS Intel environments, however you will need to install some libraries from the external GNU/open-source repositories. This can be done easily by using MacPorts.
The following recipe has been reported to work in MacOS 10.5.8 (thanks to Marco Mariotti and Alexis Grimaldi at the CRG):
- Install Mac Developer tools and X11 (required by Macports)
- Install Macports in your system: http://www.macports.org/install.php
- Install the following packages from the macports repository by using the “sudo port install [package_name]” syntax (note that some packages may take a long time to be built and that you will need to have an active internet connection during the installation process): * python26 * py26-numpy * py26-scipy * py26-pyqt4 * py26-mysql * py26-lxml
- Download the setup installer of the last ETE version (http://ete.cgenomics.org/releases/ete2), uncompress it, enter its folder and run: “sudo python setup.py install” Once the installation has finished, you will be able to load ETE (import ete2) when running the “right” python binary.
If step 4 doesn’t work, make sure that the python version your are using to install ETE is the one installed by MacPorts. This is usually located in /opt/local/Library/Frameworks/Python.framework/Versions/2.6/bin/python2.6. By contrast, non-Macport python version is the one located in /Library/Frameworks/Python.framework/Versions/2.6/bin/python2.6, so check that you are using the correct python executable.