A bioinformatics toolkit for phylogenetic analysis.

Introduction

Nextstrain is an open-source project to harness the scientific and public health potential of pathogen genome data. We provide a continually-updated view of publicly available data with powerful analytics and visualizations showing pathogen evolution and epidemic spread. Our goal is to aid epidemiological understanding and improve outbreak response.

Resulting data and inferences are available live at the website nextstrain.org. Documentation is available at nextstrain.org/docs.

Augur

Definition: One held to foretell events by omens.

Augur is the bioinformatics toolkit we use to track evolution from sequence and serological data. It provides a collection of commands which are designed to be composable into larger processing pipelines. Documentation for augur is available at nextstrain.org/docs/bioinformatics.

Installation

Augur is written in Python 3 and requires at least Python 3.4. It's published on PyPI as nextstrain-augur, so you can install it with pip (or pip3) like so:

pip install nextstrain-augur

You can also install from a git clone or other copy of the source code by running:

pip install .

If your system has both Python 2 and Python 3 installed side-by-side, you may need to use pip3 or python3 -m pip instead of just pip (which often defaults to Python 2 when both Python versions are installed).
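
One way to sidestep the pip/pip3 ambiguity is to call pip through the interpreter you actually intend to use. The following is a minimal sketch, not part of Augur: the helper names pip_install_command and meets_minimum are hypothetical, and the script only builds and prints the command rather than running it.

```python
import sys

MIN_VERSION = (3, 4)  # minimum Python version stated in this README


def pip_install_command(package="nextstrain-augur", python=sys.executable):
    """Build a pip command tied to one specific interpreter (hypothetical helper)."""
    # "python -m pip" uses the pip that belongs to that interpreter,
    # so it cannot silently fall back to a Python 2 installation.
    return [python, "-m", "pip", "install", package]


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the given version tuple satisfies the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not meets_minimum():
        sys.exit("Augur needs Python %d.%d or newer" % MIN_VERSION)
    print(" ".join(pip_install_command()))
```

Running the script with python3 prints a command that installs into that same Python 3 interpreter.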

Augur uses some common external bioinformatics programs which you'll need to install to have a fully functioning toolkit:

  • augur align requires mafft

  • augur tree requires at least one of:

  • Bacterial data (or any VCF usage) requires vcftools
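
Before running a pipeline, it can help to confirm that the external programs are on your PATH. The sketch below is an illustration rather than anything shipped with Augur: missing_tools is a hypothetical helper, and the executable names mafft and vcftools are assumptions about how those programs are installed on your system. Tree builders are not included; add an entry for whichever one you install.

```python
import shutil

# Executable names are assumptions; adjust them to match your installation.
REQUIRED_TOOLS = {
    "augur align": "mafft",
    "VCF input": "vcftools",
}


def missing_tools(tools=REQUIRED_TOOLS):
    """Return {feature: executable} for executables not found on PATH."""
    return {
        feature: exe
        for feature, exe in tools.items()
        if shutil.which(exe) is None
    }


if __name__ == "__main__":
    for feature, exe in sorted(missing_tools().items()):
        print("%s: '%s' not found on PATH" % (feature, exe))
```

The script prints nothing when every listed program is found.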

Alternatively, all these dependencies (as well as augur itself) can be installed via Conda by running:

conda env create -f environment.yml

Once installed, the Conda environment needs to be activated whenever augur is to be used, by running:

conda activate augur

Usage

All of Augur's commands are accessed through the augur program. For example, to infer ancestral sequences from a tree, you'd run augur ancestral. If you've installed the nextstrain-augur package, you can just run augur. Otherwise, you can run ./bin/augur from a copy of the source code.

usage: augur [-h] {parse,filter,mask,align,tree,refine,ancestral,translate,clades,traits,sequence-traits,titers,export,validate,version} ...

Augur: A bioinformatics toolkit for phylogenetic analysis.

positional arguments:
  {parse,filter,mask,align,tree,refine,ancestral,translate,clades,traits,sequence-traits,titers,export,validate,version}
    parse               Parse delimited fields from FASTA sequence names into
                        a TSV and FASTA file.
    filter              Filter and subsample a sequence set.
    mask                Mask specified sites from a VCF file.
    align               Align multiple sequences from FASTA or VCF.
    tree                Build a tree using a variety of methods.
    refine              Refine an initial tree using sequence metadata.
    ancestral           Infer ancestral sequences based on a tree.
    translate           Translate gene regions from nucleotides to amino
                        acids.
    clades              Assign clades to nodes in a tree based on amino-acid
                        or nucleotide signatures.
    traits              Infer ancestral traits based on a tree.
    sequence-traits     Annotate sequences based on amino-acid or nucleotide
                        signatures.
    titers              Annotate a tree with actual and inferred titer
                        measurements.
    export              Export JSON files suitable for visualization with
                        auspice.
    validate            Validate a set of JSON files intended for
                        visualization in auspice.
    version             Print the version of augur.

optional arguments:
  -h, --help            show this help message and exit

For more information on a specific command, you can run it with the --help option, for example, augur tree --help.
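
If you drive augur from a Python script, the choice between the installed program and ./bin/augur can be made automatically. This is a rough sketch under stated assumptions: augur_command and show_help are hypothetical helpers, not part of Augur, and the only option used is --help, which appears in the usage text above.

```python
import shutil
import subprocess


def augur_command(subcommand, *args, source_dir="."):
    """Build an argv list for an augur subcommand (hypothetical helper)."""
    # Prefer the installed "augur" program; otherwise fall back to the
    # bin/augur script inside a copy of the source code.
    program = shutil.which("augur") or source_dir + "/bin/augur"
    return [program, subcommand] + list(args)


def show_help(subcommand):
    """Run e.g. `augur tree --help` and return its text output."""
    # check_output raises CalledProcessError if the command fails.
    return subprocess.check_output(
        augur_command(subcommand, "--help"),
        universal_newlines=True,
    )


if __name__ == "__main__":
    # Only print the command here; call show_help("tree") to actually run it.
    print(" ".join(augur_command("tree", "--help")))
```

Building the command as a list avoids shell quoting problems when further arguments are added.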

Development

Development of augur happens at https://github.com/nextstrain/augur.

We currently target compatibility with Python 3.4 and higher. This minimum may be increased in the future.

Versions for this project from 3.0.0 onwards aim to follow the Semantic Versioning rules.
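
Under Semantic Versioning, version numbers take the form MAJOR.MINOR.PATCH, and an increase in the major number signals incompatible changes. A minimal sketch of what that means when deciding whether to upgrade follows. The helper names are hypothetical, the version strings are examples only, and the parser handles only plain numeric components followed by an optional extra suffix such as ".dev1".

```python
def parse_version(version):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of ints, ignoring later components."""
    major, minor, patch = version.split(".")[:3]
    return int(major), int(minor), int(patch)


def is_breaking_upgrade(installed, candidate):
    """Return True if the candidate has a higher major version than the installed one."""
    return parse_version(candidate)[0] > parse_version(installed)[0]


if __name__ == "__main__":
    # Example versions only; they do not refer to specific releases.
    print(is_breaking_upgrade("3.0.5", "3.1.0"))
    print(is_breaking_upgrade("3.0.5", "4.0.0"))
```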

Running with local changes

From within a clone of the git repository you can run ./bin/augur to test your local changes without installing them. (Note that ./bin/augur is not the script that gets installed by pip as augur; that script is generated by the entry_points configuration in setup.py.)

You can also install augur from source as an "editable" package so that your global augur command always uses your local source code copy:

pip install -e .[dev]

This is not recommended if you want to be able to compare output from a stable version of augur to a development version (e.g. comparing output of augur installed with pip and ./bin/augur from your local source code).
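
When both a stable and a development copy are around, it is easy to lose track of which one a given interpreter imports. The sketch below uses the standard library's importlib to report where the augur package would be loaded from. The helper name module_location is hypothetical. Note that a script's own directory is placed first on the import path, so keep this script outside your git clone if you want to see the installed location.

```python
import importlib.util


def module_location(name="augur"):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    print(module_location() or "augur is not importable from this interpreter")
```

With an editable install, the printed path should point into your local source code copy rather than into site-packages.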

Releasing

New releases are tagged in git using a signed tag. The release branch should always point to the latest release tag. Source and wheel (binary) distributions are uploaded to the nextstrain-augur project on PyPI.

There is a ./devel/release script which will prepare a new release from your local repository. It ends with instructions on how to push the release commit/tag/branch and how to upload the built distributions to PyPI. You'll need a PyPI account and twine installed to do the latter.

Travis CI

Branches and PRs are tested by Travis CI jobs configured in .travis.yml.

New releases, via pushes to the release branch, trigger a new docker-base build to keep the Docker image up-to-date.

Older version (original, non-modular)

The older, original version of Augur is non-modular and doesn't use the augur command. It is a set of Python 2 modules which can be used in scripts. Currently, it's still lightly maintained to support older pathogen pipelines for Nextstrain that we have yet to move to the newest version of Augur. However, we're actively migrating our pipelines to use the newest Augur and eventually old Augur will be removed. For the time being, both versions live inside this git repository, which can be confusing.

These are the files associated with new, modular Augur:

augur/
  __init__.py
  align.py
  export.py
  ...
bin/
  augur
setup.py

and these are the files associated with the old Augur:

__init__.py
base/
  auspice_export.py
  colorLogging.py
  config.py
  ...
builds/
  flu/
  avian/
  ...
requirements.txt
requirements-locked.txt
scripts/
  beast-to-auspice-jsons-proof-of-principle.py
  json_tree_to_nexus.py
  plot_msa.py

Old Augur requires writing Python scripts which import functions and classes from the base/ directory. Examples of this are in some of our old pipelines in the builds/ directory; see the prepare.py and process.py scripts.

Unlike the latest version of Augur, old Augur requires Python 2.7. Its dependencies are listed in requirements.txt, and you can install them with a package manager like conda or pip:

pip install -r requirements.txt

You may choose to use requirements-locked.txt instead if you'd like a fixed set of known-good packages.

License and copyright

Copyright 2014-2018 Trevor Bedford and Richard Neher.

Source code to Nextstrain is made available under the terms of the GNU Affero General Public License (AGPL). Nextstrain is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

Download files

nextstrain_augur-3.0.5.dev1-py3-none-any.whl (93.4 kB), wheel, py3
nextstrain-augur-3.0.5.dev1.tar.gz (82.1 kB), source